From tim.one@home.com  Mon Jan  1 00:13:12 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 31 Dec 2000 19:13:12 -0500
Subject: [Python-Dev] Re: Most everything is busted
In-Reply-To: <14926.34447.60988.553140@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIGAA.tim.one@home.com>

[Barry A. Warsaw]
> There's a stupid, stupid bug in Mailman 2.0, which I've just fixed
> and (hopefully) unjammed things on the Mailman end[1].  We're still
> probably subject to the Postfix delays unfortunately; I think those
> are DNS related, and I've gotten a few other reports of DNS oddities,
> which I've forwarded off to the DC sysadmins.  I don't think that
> particular problem will be fixed until after the New Year.
>
> relax-and-enjoy-the-quiet-ly y'rs,

I would have, except you appear to have ruined it:  hundreds of msgs
disgorged overnight and into the afternoon.  And echoes of email to c.l.py
now routinely come back in minutes instead of days.

Overall, ya, I liked it better when it was broken -- jerk <wink>.

typical-user-ly y'rs  - tim



From tim.one@home.com  Mon Jan  1 01:31:18 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 31 Dec 2000 20:31:18 -0500
Subject: [Python-Dev] Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>

[Martin von Loewis]
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?

It's nigh unto impossible to get Guido to pay attention to these kinds of
issues until after it's too late -- guess who's still trying to get an FSF
approved license for Python 1.6 <wink>.

What I intend to push for is that nothing be accepted except under the
understanding that copyright is assigned to the Python Software Foundation;
but, since that doesn't exist yet, we're in limbo.

> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

Under U.S. law too.  The difference is that, without an explicit copyright
notice, it's a lot easier to get lawyers to ignore that reality <0.3 wink>.
When the PSF does come into being, the lawyers will doubtless make us hassle
everyone with an explicit copyright notice into signing reams of paperwork.
It's a drain on time and money for all concerned, IMO, with no real payback.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.

Understood, and with sympathy.  Since the status of JPython/Jython is still
muddy, I urged Finn Bock to put his own copyright notice on his Jython work
for exactly the same reason (i.e., to prevent CNRI claiming it later).

Seems to me, though, that it may simplify life down the road if, whenever an
author felt a similar need to assert copyright explicitly, they list Guido
as the copyright holder.  He's not going to screw Python!  And it's
inevitable that all Python copyrights will eventually be owned by him and/or
the PSF anyway.

But, for God's sake, whatever you do, *please* (anyone) don't make us look
at a unique license!  We're not lawyers, but we've been paying lawyers out
of our own pockets to do this crap, and it's expensive and time-consuming.
If you can't trust Guido to do a Right Thing with your code, Python is
better off without it over the long haul.

> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

It's no concern to me -- but then I'm not paranoid <wink>.

cnri-and-the-uc-can-fight-it-out-if-it-comes-to-that-ly y'rs  - tim



From moshez@zadka.site.co.il  Mon Jan  1 10:01:02 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon,  1 Jan 2001 12:01:02 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231105812.A12168@newcnri.cnri.reston.va.us>
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20010101100102.2360CA84F@darjeeling.zadka.site.co.il>

On Sun, 31 Dec 2000, Andrew Kuchling <akuchlin@cnri.reston.va.us> wrote:

> It also leads to one section of the FAQ (#3, I think) having something
> like 60 questions jumbled together.  IMHO the FAQ should be a text
> file, perhaps in the PEP format so it can be converted to HTML, and it
> should have an editor who'll arrange it into smaller sections.  Any
> volunteers?  (Must ... resist ...  urge to volunteer myself...  help
> me, Spock...)

Well, Andrew, I know if I leave you any more time, you won't be able
to resist the urge. OK, I'll volunteer. Can't do anything right now,
but expect to see an updated version posted on my site soon. If 
people will think it's a good idea, I'll move it to Misc/.
Fred, if the some-xml-format-to-HTML you're working on is in any
sort of readiness, I'll use that to format the FAQ. Having used Perl
in the last couple of weeks, I learned to appreciate the fact that
the FAQ is a standard part of the documentation.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From loewis@informatik.hu-berlin.de  Mon Jan  1 11:43:34 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 1 Jan 2001 12:43:34 +0100 (MET)
Subject: [Python-Dev] Re: Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
Message-ID: <200101011143.MAA11550@pandora.informatik.hu-berlin.de>

> Seems to me, though, that it may simplify life down the road if, whenever an
> author felt a similar need to assert copyright explicitly, they list Guido
> as the copyright holder.  He's not going to screw Python!  

That's a good solution, which I'll implement in a revised patch.

Thanks for the advice, and Happy New Year,

Martin


From mal@lemburg.com  Mon Jan  1 17:56:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jan 2001 18:56:20 +0100
Subject: [Python-Dev] Re: Copyright statements ([Patch #103002] Fix for #116285: Properly
 raise UnicodeErrors)
References: <E14Bhs3-0007uf-00@usw-sf-web3.sourceforge.net> <200012290957.KAA17936@pandora.informatik.hu-berlin.de> <3A4C757D.F64E9CEF@lemburg.com>
Message-ID: <3A50C4C4.76A1C5B6@lemburg.com>

Martin von Loewis wrote:
> 
> > My only problem with it is your copyright notice. AFAIK, patches to
> > the Python core cannot contain copyright notices without proper
> > license information. OTOH, I don't think that these minor changes
> > really warrant adding a complete license paragraph.
> 
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?
> 
> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

True.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.
>
> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

The copyright for the files and changes needed for the Unicode 
support was indeed transferred to CNRI earlier this year. This
was part of the contract I had with CNRI.

I don't know why the copyright notice wasn't subsequently removed from
the files after final checkin of the changes, though, because, as
I remember, the copyright line was only added as "search&replace"
token to the files in question in the sign over period.

The codec files were part of the Unicode support patch, even though
they were created by the gencodec.py tool I wrote to create them
from the Unicode mapping files. That's why they also carry the
copyright token.

Note that with strict reading of the CNRI license, there's no
problem with removing the notice from the files in question:

"""
...provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2000 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6 alone or in any derivative
version prepared by Licensee...
"""

The copyright line in the Unicode files is
"(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.", so this
does not match the definition they gave in their license text.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Mon Jan  1 18:58:36 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 13:58:36 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Fri, 29 Dec 2000 21:59:16 +0100."
 <20001229215915.L1281@xs4all.nl>
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>
 <20001229215915.L1281@xs4all.nl>
Message-ID: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>

Thomas just checked this in, using Tim's words:

> *** ref7.tex	2000/07/16 19:05:38	1.20
> --- ref7.tex	2000/12/31 22:52:59	1.21
> ***************
> *** 243,249 ****
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when no exception occurs
> ! in the \keyword{try} clause.  Exceptions in the \keyword{else} clause are
> ! not handled by the preceding \keyword{except} clauses.
>   \kwindex{else}
>   
> --- 243,251 ----
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when the \keyword{try} clause
> ! terminates by any means other than an exception or executing a
> ! \keyword{return}, \keyword{continue} or \keyword{break} statement.  
> ! Exceptions in the \keyword{else} clause are not handled by the preceding
> ! \keyword{except} clauses.
>   \kwindex{else}

How is this different from "when control flow reaches the end of the
try clause", which is what I really had in mind?  Using the current
wording, this paragraph would have to be changed each time a new
control-flow keyword is added.  Based upon the historical record
that's not a grave concern ;-), but I think the new wording relies too
much on accidentals such as the fact that these are the only control
flow altering events.  It may be that control flow is not rigidly
defined -- but as it is what was really intended, maybe the fix should
be to explain the right concept rather than the current ad-hoc
solution.  This also avoids concerns of readers who are trying to read
too much into the words and might become worried that there are other
ways of altering the control flow that *would* cause the else clause
to be executed; and guides implementors of other Python-like languages
(like vyper) that might have more control-flow altering statements or
events.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@loewis.home.cs.tu-berlin.de  Mon Jan  1 19:00:38 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 1 Jan 2001 20:00:38 +0100
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
Message-ID: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de>

> It appears that CNRI can only think about one thing at a time <0.5
> wink>.  For the last 6 months, that thing has been the license.  If
> they ever resolve the GPL compatibility issue, maybe they can be
> persuaded to think about the PSA.  In the meantime, I'd suggest you
> not renew <ahem>.

I think we need to find a better answer than that, and soon. While
everybody reading this list probably knows not to renew, the PSA is
the first thing that you see when selecting "Python Community" on
python.org. The first paragraph reads

# The continued, free existence of Python is promoted by the
# contributed efforts of many people. The Python Software Activity
# (PSA) supports those efforts by helping to coordinate them. The PSA
# operates web, ftp, and email services, organizes conferences, and
# engages in other activities that benefit the Python user
# community. In order to continue, the PSA needs the membership of
# people who value Python.

If you look at the current members list
(http://www.python.org/psa/Members.html), it appears that many
long-time members indeed have not renewed. This page was last updated
Nov 14 - so it appears that CNRI is still processing applications when
they come. It may well be that many of the newer members ask
themselves by now what happened to their money; it might not be easy
to get an answer to that question. However, there is clearly somebody
to blame here: The Python Community.

So I'd like to request that somebody with write permissions to these
pages changes the text, to something along the lines of replacing the
first paragraph with

# The Python community organizes itself in different ways; people
# interested in discussing development of and with Python usually
# participate in <a href="MailingLists.html">mailing lists</a>.
#
# <p>Organizations that wish to influence further directions of the
# Python language may join the <a href="/consortium">Python
# Consortium</a>.
#
# <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
# National Research Initiatives</a> hosts the Python Software
# Activity, which is described below. The PSA used to provide funding
# for the Python development; that is no longer the case.

If there is a factual error in this text, please let me
know.

Regards,
Martin


From tim.one@home.com  Mon Jan  1 19:20:53 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 1 Jan 2001 14:20:53 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <E14D9Ev-0007ac-00@usw-sf-web3.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>

[gvanrossum, in an SF patch comment]
> Bah.  I don't like this one bit.  More complexity for a little
> bit of extra speed.
> I'm keeping this open but expect to be closing it soon unless I
> hear a really good argument why more speed is really needed in
> this area.  Down with code bloat and creeping featurism!

Without judging "the solution" here, "the problem" is that everyone's first
attempt to use line-at-a-time file input in Perl:

    while (<F>) {
        ... $_ ...;
    }

runs 2-5x faster than everyone's first attempt in Python:

    while 1:
        line = f.readline()
        if not line:
            break
        ... line ...

It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
little bit"; and by the time you walk a newbie thru

    while 1:
        lines = f.readlines(hintsize)
        if not lines:
            break
        for line in lines:
            ... line ...

they feel like maybe Perl isn't so obscure after all <wink>.

Does someone have an elegant way to address this?  I believe Jeff's shot at
elegance was the other part of the patch, using (his new) xreadlines under
the covers to speed the fileinput module.

reading-text-files-is-very-common-ly y'rs  - tim
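
The three input idioms compared above can be put side by side.  This is
a minimal sketch only, written for present-day Python 3 (an assumption;
the thread is about Python 2.0), and it says nothing about relative
speed.  The helper names and the sample text are hypothetical, and
io.StringIO stands in for a real open file.

```python
import io

SAMPLE = "alpha\nbeta\ngamma\n"  # hypothetical sample input


def read_with_readline(f):
    """Line-at-a-time loop, as in the first Python example above."""
    out = []
    while True:
        line = f.readline()
        if not line:
            break
        out.append(line)
    return out


def read_with_readlines_hint(f, hintsize=8):
    """Double loop: each readlines() call returns a batch of whole lines."""
    out = []
    while True:
        lines = f.readlines(hintsize)
        if not lines:
            break
        for line in lines:
            out.append(line)
    return out


def read_with_iteration(f):
    """Direct iteration over the file object (added to Python after this thread)."""
    return [line for line in f]


results = [
    reader(io.StringIO(SAMPLE))
    for reader in (read_with_readline, read_with_readlines_hint, read_with_iteration)
]
print(results[0] == results[1] == results[2])
```

All three loops see the same lines in the same order; they differ only
in how much the reader has to type and in how often the underlying read
call is made.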



From guido@digicool.com  Mon Jan  1 19:25:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:25:07 -0500
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
In-Reply-To: Your message of "Mon, 01 Jan 2001 20:00:38 +0100."
 <200101011900.UAA01672@loewis.home.cs.tu-berlin.de>
References: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de>
Message-ID: <200101011925.OAA09669@cj20424-a.reston1.va.home.com>

> > It appears that CNRI can only think about one thing at a time <0.5
> > wink>.  For the last 6 months, that thing has been the license.  If
> > they ever resolve the GPL compatibility issue, maybe they can be
> > persuaded to think about the PSA.  In the meantime, I'd suggest you
> > not renew <ahem>.
> 
> I think we need to find a better answer than that, and soon. While
> everybody reading this list probably knows not to renew, the PSA is
> the first thing that you see when selecting "Python Community" on
> python.org. The first paragraph reads
> 
> # The continued, free existence of Python is promoted by the
> # contributed efforts of many people. The Python Software Activity
> # (PSA) supports those efforts by helping to coordinate them. The PSA
> # operates web, ftp, and email services, organizes conferences, and
> # engages in other activities that benefit the Python user
> # community. In order to continue, the PSA needs the membership of
> # people who value Python.
> 
> If you look at the current members list
> (http://www.python.org/psa/Members.html), it appears that many
> long-time members indeed have not renewed. This page was last updated
> Nov 14 - so it appears that CNRI is still processing applications when
> they come. It may well be that many of the newer members ask
> themselves by now what happened to their money; it might not be easy
> to get an answer to that question. However, there is clearly somebody
> to blame here: The Python Community.

I don't know how many memberships CNRI has received, but it can't be
many, since we sent out no reminders.  I'll see if I can get an
answer.

> So I'd like to request that somebody with write permissions to these
> pages changes the text, to something along the lines of replacing the
> first paragraph with
> 
> # The Python community organizes itself in different ways; people
> # interested in discussing development of and with Python usually
> # participate in <a href="MailingLists.html">mailing lists</a>.
> #
> # <p>Organizations that wish to influence further directions of the
> # Python language may join the <a href="/consortium">Python
> # Consortium</a>.
> #
> # <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
> # National Research Initiatives</a> hosts the Python Software
> # Activity, which is described below. The PSA used to provide funding
> # for the Python development; that is no longer the case.
> 
> If there is a factual error in this text, please let me
> know.

I've done something slightly different -- see
http://www.python.org/psa/.  I've kept only your first paragraph, and
inserted a boldface note before that about the obsolescence (or
deprecation :-) of the PSA membership.

I've removed the references to the consortium, since that's also about
to collapse under its own inactivity; instead, the PSF will be formed,
independent from CNRI, to hold the IP rights (insofar as they can be
assigned to the PSF) and for not much else.

I'll see if I can get some more news about the creation of the PSF
(which is supposed to be an initiative of ActiveState and Digital
Creations).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan  1 19:35:24 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:35:24 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 14:20:53 EST."
 <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>
Message-ID: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>

> [gvanrossum, in an SF patch comment]
> > Bah.  I don't like this one bit.  More complexity for a little
> > bit of extra speed.
> > I'm keeping this open but expect to be closing it soon unless I
> > hear a really good argument why more speed is really needed in
> > this area.  Down with code bloat and creeping featurism!
> 
> Without judging "the solution" here, "the problem" is that everyone's first
> attempt to use line-at-a-time file input in Perl:
> 
>     while (<F>) {
>         ... $_ ...;
>     }
> 
> runs 2-5x faster than everyone's first attempt in Python:
> 
>     while 1:
>         line = f.readline()
>         if not line:
>             break
>         ... line ...

But is everyone's first thought to time the speed of Python vs. Perl?
Why does it hurt so much that this is a bit slow?

> It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
> little bit"; and by the time you walk a newbie thru
> 
>     while 1:
>         lines = f.readlines(hintsize)
>         if not lines:
>             break
>         for line in lines:
>             ... line ...
> 
> they feel like maybe Perl isn't so obscure after all <wink>.
> 
> Does someone have an elegant way to address this?  I believe Jeff's shot at
> elegance was the other part of the patch, using (his new) xreadlines under
> the covers to speed the fileinput module.

But of course suggesting fileinput is also not a great solution --
it's relatively obscure (since it's not taught by most tutorials,
certainly not by the standard tutorial).

> reading-text-files-is-very-common-ly y'rs  - tim

So is worrying about performance without a good reason...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan  1 19:49:24 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:49:24 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: Your message of "Mon, 01 Jan 2001 12:01:02 +0200."
 <20010101100102.2360CA84F@darjeeling.zadka.site.co.il>
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
 <20010101100102.2360CA84F@darjeeling.zadka.site.co.il>
Message-ID: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>

[Moshe]
> Well, Andrew, I know if I leave you any more time, you won't be able
> to resist the urge. OK, I'll volunteer. Can't do anything right now,
> but expect to see an updated version posted on my site soon. If 
> people will think it's a good idea, I'll move it to Misc/.
> Fred, if the some-xml-format-to-HTML you're working on is in any
> sort of readiness, I'll use that to format the FAQ.

Moshe, if your solution is to turn the FAQ into a document with a
single editor again, I think you're not doing the community a favor.
Granted, we could add some more sections (easy enough for me if
someone tells me the new section headings and which existing questions
go where) and there is a lot of obsolete information.

But I would be very hesitant to drop the notion of maintaining the FAQ
as a group collaboration project.  There's nothing wrong with the FAQ
wizard except that the password (Spam) should be made publicly known...

I've also noticed that Bjorn Pettersen has made a whole slew of useful
updates to various sections, mostly updates about new 2.0 features or
syntax.

> Having used Perl
> in the last couple of weeks, I learned to appreciate the fact that
> the FAQ is a standard part of the documentation.

Does that mean more than that it should be linked to from
http://www.python.org/doc/ ?  It's already there in the side bar; does
it need a more prominent position?  I used to include the FAQ in Misc/
(Ping's Misc/faq2html.py script is a last remnant of that), but gave
up after realizing that the on-line FAQ is much more useful than a
single text file.

In my eyes, the best thing you (and everyone else) could do, if you
find the time, would be to use the FAQ wizard to fix or delete
out-of-date entries.  To delete an entry, change its subject to
"Deleted" and remove its body; I'll figure out a way to delete them
from the index.  Because FAQ entries can refer to each other (and are
referred to from elsewhere) by number, it's not safe to simply
renumber entries.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Mon Jan  1 20:27:37 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 1 Jan 2001 15:27:37 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com>

[Guido]
> Thomas just checked this in, using Tim's words:

[   The optional \keyword{else} clause is executed when no
    exception occurs in the \keyword{try} clause.  Exceptions in
    the \keyword{else} clause are not handled by the preceding
    \keyword{except} clauses.

vs
    The optional \keyword{else} clause is executed when the
    \keyword{try} clause terminates by any means other than an
    exception or executing a \keyword{return}, \keyword{continue}
    or \keyword{break} statement.  Exceptions in the \keyword{else}
    clause are not handled by the preceding \keyword{except} clauses.
]

> How is this different from "when control flow reaches the end of the
> try clause", which is what I really had in mind?

Only in that it doesn't appeal to a new undefined phrase, and is (I think)
unambiguous in the eyes of a non-specialist reader (like Robin's friend).
Note that "reaching the end of the try clause" is at best ambiguous, because
you *really* have in mind "falling off the end" of the try clause.  It
wouldn't be unreasonable to say that in:

    try:
         x = 1
         y = 2
         return 1

"x=1" is the beginning of the try clause and "return 1" is the end.  So if
the reader doesn't already know what you mean, saying "the end" doesn't nail
it (or, if like me, the reader does already know what you mean, it doesn't
matter one whit what it says <wink>).

> Using the current wording, this paragraph would have to be
> changed each time a new control-flow keyword is added.  Based
> upon the historical record that's not a grave concern ;-),

It was sure no concern of mine ...

> but I think the new wording relies too much on accidentals such
> as the fact that these are the only control flow altering events.
>
> It may be that control flow is not rigidly defined -- but as it is
> what was really intended, maybe the fix should be to explain the
> right concept rather than the current ad-hoc solution.
> ...

OK, except I don't know how to do that succinctly.  For example, if Java had
an "else" clause, the Java spec would say:

    If present, the "else block" is executed if and only if execution
    of the "try block" completes normally, and then there is a choice:

        If the "else block" completes normally, then the
        "try" statement completes normally.

        If the "else block" completes abruptly for reason S,
        then the "try" statement completes abruptly for reason S.

That is, they deal with control-flow issues via appeal to "complete
normally" and "complete abruptly" (which latter comes in several flavors
("reasons"), such as returns and exceptions), and there are pages and pages
and pages of stuff throughout the spec inductively defining when these
conditions obtain.  It's clear, precise and readable; but it's also wordy,
and we don't have anything similar to build on.

As a compromise, given that we're not going to take the time to be precise
(well, I'm sure not ...):

    The optional \keyword{else} clause is executed if and
    when control flows off the end of the \keyword{try}
    clause.\footnote{In Python 2.0, control "flows off the
    end" except in case of exception, or executing a
    \keyword{return}, \keyword{continue} or \keyword{break}
    statement.}
    Exceptions in the \keyword{else} clause are not handled by
    the preceding \keyword{except} clauses.

Now it's all of imprecise, almost precise, specific to Python 2.0, and
robust against any future changes <wink>.
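
The "flows off the end" rule can be checked with a small experiment.
This is a sketch under the assumption of present-day Python 3, not
Python 2.0; the function names and the action strings are hypothetical
and exist only for illustration.

```python
def run(action):
    """Return the clauses that ran for one way of leaving the try clause."""
    trace = []
    for _ in range(1):  # the loop exists only so break/continue are legal
        try:
            trace.append("try")
            if action == "raise":
                raise ValueError("boom")
            if action == "break":
                break
            if action == "continue":
                continue
            if action == "return":
                return trace
            # any other action: control flows off the end of the try clause
        except ValueError:
            trace.append("except")
        else:
            trace.append("else")
    return trace


def else_raises():
    """An exception raised in the else clause is not caught by the except clause."""
    try:
        pass
    except ValueError:
        return "handled"
    else:
        raise ValueError("raised in else")


for action in ("fall off", "raise", "break", "continue", "return"):
    print(action, run(action))
```

Only the "fall off" case records the else clause; the exception case
records the except clause instead, and the other three record neither.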



From akuchlin@mems-exchange.org  Mon Jan  1 20:35:27 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 1 Jan 2001 15:35:27 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:49:24PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com>
Message-ID: <20010101153527.A14116@newcnri.cnri.reston.va.us>

On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
>But I would be very hesitant to drop the notion of maintaining the FAQ
>as a group collaboration project.  There's nothing wrong with the FAQ
>wizard except that the password (Spam) should be made publicly known...

Why multiply the number of mechanisms required to maintain things?  We
already use CVS for other documentation; why not use it for the FAQ as 
well?  

--amk


From tim.one@home.com  Mon Jan  1 21:00:36 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 1 Jan 2001 16:00:36 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDLIGAA.tim.one@home.com>

[Andrew Kuchling]
> Why multiply the number of mechanisms required to maintain things?
> We already use CVS for other documentation; why not use it for the
> FAQ as well?

The search facilities of the FAQ wizard are invaluable, and so is the
ability for "just users" to update the info from within their browsers.
There are two problems with the FAQ in practice:

1. It doesn't get updated enough.  We can't fix that by making it harder to
update!

2. It's *only* available via the web interface.  We should ship a text or
HTML snapshot with releases; perhaps even do the usual Usenet periodic
FAQ-posting thing.



From tim.one@home.com  Mon Jan  1 22:34:03 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 1 Jan 2001 17:34:03 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEECIGAA.tim.one@home.com>

[Guido]
> But is everyone's first thought to time the speed of Python vs. Perl?

It's few people's first thought.  It's impossible for bilingual programmers
(or dabblers, or evaluators) not to notice *soon*, though, because:

> Why does it hurt so much that this is a bit slow?

Factors of 2 to 5 aren't "a bit" -- they're obvious when they happen, but
the *cause* is not.  To judge from a decade of c.l.py gripes, most people
write it off to "huh -- guess Python is just slow"; the rest eventually
figure out that their text input is the bottleneck (Tom Christiansen never
got this far <0.5 wink>), but then don't know what to do about it.

At this point I'm going to insert two anonymized pvt emails from last year:

-----Original Message #1 -----

From: TTT
Sent: Monday, March 13, 2000 2:29 AM
To: GGG
Subject: RE: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison

GGG, note especially figure 4 in Lutz Prechelt's report:

>   http://wwwipd.ira.uka.de/~prechelt/Biblio/#jccpprtTR

The submitted Python programs had by far the largest variability in how long
it took to load the dictionary.  My input loop is probably typical of the
"fast" Python programs, which indeed beat most (but not all) of the fastest
Perl ones here:

class Dictionary:
    ...

    def fill_from_file(self, f, BUFFERSIZE=500000):
        """f, BUFFERSIZE=500000 -> fill dictionary from file f.

        f must be an open file, or other object with a readlines()
        method.  It must contain one word per line.  Optional arg
        BUFFERSIZE is used to chunk up input for efficiency, and is
        roughly the # of bytes read at a time.
        """

        addword = self.addword
        while 1:
            lines = f.readlines(BUFFERSIZE)
            if not lines:
                break
            for line in lines:
                addword(line[:-1])  # chop trailing newline

Comparable Perl may have been the one-liner:

    grep(&addword, chomp(<>));

which may account for why Perl's memory use was uniformly higher than
Python's.

Whatever, you really need to be a Python expert to dream up "the fast way"
to do Python input!  Hire me, and I'll fix that <wink>.

nothing-like-blackmail-before-going-to-bed-ly y'rs  - TTT


-----Original Message #2 -----

From: GGG
Sent: Monday, March 13, 2000 7:08 AM
To: TTT
Subject: Re: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison


Agreed.  readlines(BUFFERSIZE) is a crock.  In fact, ``for i in
f.readlines()'' should use lazy evaluation -- but that will have to wait for
Py3K unless we add hints so that readlines knows it is being called from a
for loop.

--GGG


-----Back to 2001 -----

I took TTT's advice and read Lutz's report <wink>.  I agree with GGG that
hiding this in .readlines() would be maximally elegant.  xreadlines supplies
most of the lazy machinery GGG favored.  I don't know how hard it would be
to supply the rest of it, but it's such a frequent bitching point that I
would prefer pointing people to an explicit .xreadlines() hack than either
(a) try to convince them that they "shouldn't" care about the speed as much
as they claim to; or, (b) try to explain the double-loop buffering method.
I'd personally rather use an explicit .xreadlines() hack than code the
double-loop buffering too, and don't see an obvious way to do better than
that right now.
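
For anyone who hasn't met the double-loop buffering method, here is a minimal
sketch of it tucked behind a plain "for" loop.  It is written in present-day
Python 3 using a generator, which did not exist when this thread was written,
so it illustrates the idea only and is not Jeff's patch.  The name lazy_lines
and the default size hint are made up for the example.

```python
import io


def lazy_lines(f, sizehint=500000):
    """Yield the lines of f one at a time while reading them in chunks.

    Hypothetical helper: f is any object with a readlines(sizehint) method.
    """
    while True:
        # Outer loop: fetch roughly sizehint bytes' worth of whole lines.
        lines = f.readlines(sizehint)
        if not lines:          # an empty list means end of file
            break
        # Inner loop: hand the buffered lines out one by one.
        for line in lines:
            yield line


if __name__ == "__main__":
    sample = io.StringIO("alpha\nbeta\ngamma\n")
    for line in lazy_lines(sample, sizehint=8):
        print(line.rstrip("\n"))
```

The caller sees one loop; only a chunk of the file is held in memory at a
time, which is the trade the explicit double loop makes by hand.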

>> reading-text-files-is-very-common-ly y'rs  - tim

> So is worrying about performance without a good reason...

Indeed it is.  I'm persuaded that many people making this specific complaint
have a legitimate need for more speed, though, and that many don't persist
with Python long enough to find out how to address this complaint (because
the double-loop method is too obscure for a newbie to dream up).  That makes
this hack score extraordinarily high on my benefit/harm ratio scale (in P3K
xreadlines can be deprecated in favor of readlines <0.9 wink>).

heck-it-doesn't-even-require-a-new-keyword-ly y'rs  - tim



From thomas@xs4all.net  Mon Jan  1 22:46:45 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 1 Jan 2001 23:46:45 +0100
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:35:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010101234645.B5435@xs4all.nl>

On Mon, Jan 01, 2001 at 02:35:24PM -0500, Guido van Rossum wrote:

[ Python lacks a One True Way of doing Perl's 'while(<>)' ]

> > Does someone have an elegant way to address this?  I believe Jeff's shot at
> > elegance was the other part of the patch, using (his new) xreadlines under
> > the covers to speed the fileinput module.

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Is fileinput really obscure ? I personally quite like it. It is enough like
the perl idiom to be very useful for people thinking that way, and it
doesn't require special syntax or considerations. If tutorialization is the
only problem, I'd be happy to fix that, provided Fred or Moshe can TeX my
fix up.

As for speed (which stays a secondary or tertiary consideration at best) do
we really need the xreadlines method to accomplish that ? Couldn't fileinput
get almost the same performance using readlines() with a sizehint ? I
personally don't like the xreadlines because it adds yet another function to
do the same, with a slight, subtle and to the untrained programmer unclear
distinction from the rest. (I don't really like the range/xrange difference
either -- I think Python code shouldn't care whether they're dealing with a
real list or a generator, and as much as possible should just be generators.
And in the case of simple (x)range()es, I have yet to see a case where a
'real' list had significantly better performance than a generator.)

If we *do* start adding methods to (the public API of) filemethods, I think
we should consider more than just xreadlines() (I seem to recall other
proposals, but my memory is hazy at the moment -- I haven't slept since last
millennium), add whatever is necessary, and provide a UserFile in the std.
lib that 'emulates' all fileobject functionality using a single readline()
function.
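
A rough guess at what such a helper might look like, in present-day Python 3.
The class borrows the UserFile name from the paragraph above, but everything
in it is an assumption made for illustration; it covers only readlines() and
iteration, not the whole file interface.

```python
class UserFile:
    """Hypothetical wrapper: emulate readlines() using only readline()."""

    def __init__(self, f):
        self.f = f             # any object that provides readline()

    def readline(self):
        return self.f.readline()

    def readlines(self, sizehint=-1):
        """Return a list of lines; stop early once sizehint is reached."""
        lines = []
        total = 0
        while True:
            line = self.f.readline()
            if not line:       # an empty string signals end of file
                break
            lines.append(line)
            total += len(line)
            if sizehint > 0 and total >= sizehint:
                break
        return lines

    def __iter__(self):
        # Iterate line by line until readline() returns an empty string.
        return iter(self.f.readline, "")
```

Note that __iter__ as written assumes text-mode input, since it stops on the
empty str rather than on empty bytes.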

Now, if you'll excuse me, I have a date with a soft bed I haven't seen in
about 40 hours, a pair of aspirin my head is killing for and probably a
hangover that I don't want to think about, right now ;)

Gelukkig-Nieuwjaar-iedereen-ly y'rs

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From jepler@inetnebr.com  Tue Jan  2 01:49:35 2001
From: jepler@inetnebr.com (Jeff Epler)
Date: Mon, 1 Jan 2001 19:49:35 -0600
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>; from Tim Peters on Mon, Jan 01, 2001 at 02:20:53PM -0500
Message-ID: <20010101194935.19672@falcon.inetnebr.com>

I'd like to speak up about this patch I've submitted on sourceforge.

I consider the xreadlines function/object to be the core of my proposal.
The addition of a method to file objects, as well as the modifications
to fileinput, are secondary in my opinion.

The desire is to iterate over file contents in a way that satisfies the
following criteria:
	* Uses the "for" syntax, because this clearly captures the
	  underlying operation. (files can be viewed as sequences of
	  lines when appropriate)
	* Consumes small amounts of memory even when the file contents
	  are large.
	* Has the lowest overhead that can reasonably be attained.

I think that it is agreed that the ability to use the "for" syntax is
important, since it was the impetus for the xrange function/object.
After all, there's a "while" statement which will give the same effect,
without introducing xrange.

The point under debate, as I see it, is the utility of speeding up the
"benchmarks" of folks who compare the speed of Python and another
language doing a very simple loop over the lines in a file.  Since this
advantage disappears once real work is being done on the file, maybe an
XReadLines class, written in Python, would be more suitable.  In fact,
I've written such a class since I didn't know about fileinput and in
any case I find it less useful to me because of all the weird stuff it
does. (parsing argv, opening files by name, etc)

One shortcoming of my current patch, aside from the ones already named
in another person's response to it, is that it fails when working
on a file-like class which implements .readline but not .readlines.

In any case, I wrote xreadlines to learn how to write C extensions to
Python, and submitted it at the suggestion of a fellow Python user in a
private discussion.  I'd like to extinguish one of these eternal
comp.lang.python threads with it too, but maybe it's not to be.

Happy new year, all.

Jeff


From gstein@lyra.org  Tue Jan  2 03:34:31 2001
From: gstein@lyra.org (Greg Stein)
Date: Mon, 1 Jan 2001 19:34:31 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Mon, Jan 01, 2001 at 03:35:27PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com> <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <20010101193431.M10567@lyra.org>

On Mon, Jan 01, 2001 at 03:35:27PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
> >But I would be very hesitant to drop the notion of maintaining the FAQ
> >as a group collaboration project.  There's nothing wrong with the FAQ
> >wizard except that the password (Spam) should be made publicly known...
> 
> Why multiply the number of mechanisms required to maintain things?  We
> already use CVS for other documentation; why not use it for the FAQ as 
> well?  

That would limit the updaters to just those with CVS access. As Guido just
pointed out, Bjorn made a bunch of updates. And he didn't need CVS to do
that...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim.one@home.com  Tue Jan  2 03:44:05 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 1 Jan 2001 22:44:05 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101194935.19672@falcon.inetnebr.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEKIGAA.tim.one@home.com>

[Jeff Epler]
> I'd like to speak up about this patch I've submitted on sourceforge.

I'm not sure that's allowed <wink>.

> ...
> The point under debate, as I see it, is the utility of speeding
> up the "benchmarks" of folks who compare the speed of Python and
> another language doing a very simple loop over the lines in a file.

If that were true, I couldn't care less.

> Since this advantage disappears once real work is being done on
> the file, ...

I agree that's true, but submit it's rarely relevant. *Most* file-crunching
apps are dominated by I/O time, which is why this is so visible to so many;
e.g., chewing over massive log files looking for patterns appears to be the
growth industry of the 21st century <wink>.  Even in Lutz's report (see
reference from earlier mail), where the task to be solved was far from
trivial, input time exceeded processing time across all languages (with some
oddball exceptions, when the coder neglected to use a hash table to store
info).  That's thoroughly typical of real file-crunching applications, in my
experience:  Perl has a killer speed advantage in the single most
time-consuming portion of the app, and due to one implementation trick.
Take that advantage away, and Python holds its own in this domain.

Coincidentally, I got pvt email from a newbie today, reading in part:

> If Perl wasn't so gosh darn good and fast at text scrubbing, it
> wouldn't really be a consideration, its syntax is so clunky and
> hard to learn by comparison to both Python and Ruby.

This is just depressing, because I can predict every step of this dance.

> ...
> Happy new year, all.

And to you!  Just make sure it's a fast new year <wink>.




From moshez@zadka.site.co.il  Tue Jan  2 15:24:40 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  2 Jan 2001 17:24:40 +0200 (IST)
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>

On Mon, 1 Jan 2001, Thomas Wouters <thomas@xs4all.net> wrote:

> As for speed (which stays a secondary or tertiary consideration at best) do
> we really need the xreadlines method to accomplish that ? Couldn't fileinput
> get almost the same performance using readlines() with a sizehint ? I

<aol>me too</aol>
Adding xreadlines() to the interface would break half a dozen file-objects all
around the world (just the standard library has StringIO, cStringIO,
GzipFile and probably some others I can't remember)

Adding .readlines(sizehint) to fileinput, and adding a function
to create something similar to fileinput from a file object (as opposed
to a file name) would help everyone, and doesn't seem too hard.
Is there a gotcha I'm just not seeing?

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Tue Jan  2 08:06:32 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 03:06:32 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com>

[Thomas Wouters]
> ...
> As for speed (which stays a secondary or tertiary consideration
> at best) do we really need the xreadlines method to accomplish
> that ?  Couldn't fileinput get almost the same performance using
> readlines() with a sizehint ?

There was a long email discussion among Jeff, Paul Prescod, Neel
Krishnaswami, and Alex Martelli about this.  I started getting copied on it
somewhere midstream, but didn't have time to follow it then (like I do now
<wink>).

About two weeks ago Neel summarized all the approaches then under
discussion:

"""
[Neel Krishnaswami]

...

Quick performance summary of the current solutions:

Slowest: for line in fileinput.input('foo'):     # Time 100
       : while 1: line = file.readline()         # Time 75
       : for line in LinesOf(open('foo')):       # Time 25
Fastest: for line in file.readlines():           # Time 10
         while 1: lines = file.readlines(hint)   # Time 10
         for line in xreadlines(file):           # Time 10

The difference in speed between the slowest and fastest is about
a factor of 10.

LinesOf is Alex's Python wrapper class that takes a file and
uses readlines() with a size-hint to present a sequence interface.
It's around half as fast as the fastest idioms, and 3-4 times
faster than while 1:. Jeff's xreadlines is essentially the same
thing in C, and is indistinguishable in performance from the
other fast idioms.

...

"""

On his box, line-at-a-time is >7x slower than the fastest Python methods,
which latter are usually close (depending on the platform) to Perl
line-at-a-time speeds.  A factor of 7 is too large for most working
programmers to ignore in the interest of somebody else's notion of
theoretical purity <wink>.  Seriously, speed is not a secondary
consideration to me when the gap is this gross, and in an area so visible
and common.

Alex's LinesOf appears a good predictor for how adding
fileinput.readlines(hint) would perform, since it appears to *be* that
(except off on its own).  Then it buys a factor of 3 over line-at-a-time on
Neel's box but leaves a factor of 2.5 on the table.  The cause of the latter
appears mostly to be the overhead of getting a Python method call into the
equation for each line returned.
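
To make that overhead concrete, here is a guess at the general shape of such a
wrapper, written for present-day Python 3.  It is not Alex's actual LinesOf
code, which isn't shown in this thread, and the name ChunkedLines is invented.
It relies on the old sequence protocol, where a "for" loop calls __getitem__
with 0, 1, 2, ... until IndexError is raised.

```python
import io


class ChunkedLines:
    """Present a file as a read-once sequence of lines via readlines(hint)."""

    def __init__(self, f, sizehint=65536):
        self.f = f
        self.sizehint = sizehint
        self.buf = []      # lines from the most recent readlines() call
        self.start = 0     # overall index of buf[0]

    def __getitem__(self, i):
        # Only forward access is supported, which is all a "for" loop needs.
        if i < self.start:
            raise IndexError("ChunkedLines cannot go backwards")
        while i >= self.start + len(self.buf):
            # The current chunk is used up: fetch the next one.
            self.start += len(self.buf)
            self.buf = self.f.readlines(self.sizehint)
            if not self.buf:
                raise IndexError("end of file")
        return self.buf[i - self.start]


if __name__ == "__main__":
    for line in ChunkedLines(io.StringIO("one\ntwo\nthree\n"), sizehint=4):
        print(line.rstrip("\n"))
```

Every line handed to the loop costs one Python-level call to __getitem__,
which is the per-line cost described above.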

Note that Jeff added .xreadlines() as a file object method at Neel's urging.
The way he started this is shown on the last line:  a function.  If we threw
out the fileinput and file method aspects, and just added a new module
xreadlines with a function xreadlines, then what?  I bet it would become as
popular as the string module, and for good reason:  it's a specific approach
that works, to a specific and common problem.

> ...
> And in the case of simple (x)range()es, I have yet to see a case
> where a 'real' list had significantly better performance than
> a generator.)

It varies by platform, but I don't think I've heard of variations larger
than 20% in either direction.  20% is nothing, though; in *this* case we're
talking order of magnitude.  That's go/nogo territory.

> ...
> Gelukkig-Nieuwjaar-iedereen-ly y'rs

I understand people are passionate when reality clashes with the dream of a
wart-free language, but that's no reason to swear at me <wink>.

wishing-you-a-happy-new-year-like-a-civilized-man-ly y'rs  - tim



From paulp@ActiveState.com  Tue Jan  2 10:00:46 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:00:46 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <3A51A6CE.3B15371D@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> But is everyone's first thought to time the speed of Python vs. Perl?
> Why does it hurt so much that this is a bit slow?

I want to interject here that I asked Jeff to submit this patch because
I don't see it as "a little bit slow." When someone transliterates a
program from one scripting language to another and gets a program that
is two to five times slower, that is a big deal!

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Fileinput's primary problem is that IIRC, it is even slower than doing
readline yourself!

> > reading-text-files-is-very-common-ly y'rs  - tim
> 
> So is worrying about performance without a good reason...

I don't understand what constitutes good reason. We're talking about a
relatively minor change that will speed up thousands of programs, answer
a frequently asked question from comp.lang.python, obliterate an obscure
idiom and reduce the number of requests for a Python syntax change
(assignment expression) all in one bold sweep. It seemed to me as if it
was a "pure win."

 Paul Prescod


From paulp@ActiveState.com  Tue Jan  2 10:06:24 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:06:24 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::
 xrange : range
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com> <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>
Message-ID: <3A51A820.50365F02@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> 
> Adding .readlines(sizehint) to fileinput, and adding a function
> to create something similar to fileinput from a file object (as opposed
> to a file name) would help everyone, and doesn't seem too hard.
> Is there a gotcha I'm just not seeing?

Fileinput is inherently slow because there are too many layers of Python
code. I started to consider ways of inverting the logic so that it only
called into Python when it needed to switch files but it would have been
a much larger patch than Jeff's and I thought that a conservative
approach was important.

Fileinput should someday be optimized but we can easily get a
low-hanging fruit improvement with Jeff's patch.

 Paul Prescod


From guido@digicool.com  Tue Jan  2 14:56:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 09:56:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 03:06:32 EST."
 <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com>
Message-ID: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>

Tim's almost as good at convincing me as he is at channeling me!  The
timings he showed almost convinced me that fileinput is hopeless and
xreadlines should be added.  But then I wrote a little timer of my
own...

I am including the timer program below my signature.  The test input
was the current access_log of dinsdale.python.org, which has about 119
Mbytes and 1M lines (as counted by the test program).

I measure about a factor of 3 between readlines with a sizehint (of 1
MB) and fileinput; a change to fileinput that uses readlines with a
sizehint and in-lines the common case in __getitem__ (as suggested by
Moshe), didn't make a difference.

Output (the first time is realtime seconds, the second CPU seconds):

total 119808333 chars and 1009350 lines
count_chars_lines     7.944  7.890
readlines_sizehint    5.375  5.320
using_fileinput      15.861 15.740
while_readline        8.648  8.570

This was on a 600 MHz Pentium-III Linux box (RH 6.2).

Note that count_chars_lines and readlines_sizehint use the same
algorithm -- the difference is that readlines_sizehint uses 'pass' as
the inner loop body, while count_chars_lines adds two counters.

Given that very light per-line processing (counting lines and
characters) already increases the time considerably, I'm not sure I
buy the arguments that the I/O overhead is always considerable.  The
fact that my change to fileinput.py didn't make a difference suggests
that its lack of speed is purely caused by the Python code.

Now what to do?  I still don't like xreadlines very much, but I do see
that it can save some time.  But my test doesn't confirm Neel's times
as posted by Tim:

> Slowest: for line in fileinput.input('foo'):     # Time 100
>        : while 1: line = file.readline()         # Time 75
>        : for line in LinesOf(open('foo')):       # Time 25
> Fastest: for line in file.readlines():           # Time 10
>          while 1: lines = file.readlines(hint)   # Time 10
>          for line in xreadlines(file):           # Time 10

I only see a factor of 3 between fastest and slowest, and
readline is only about 60% slower than readlines_sizehint.
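
The "factor of 3" and "about 60%" figures can be checked against the
real-time column of the table in this message.  A small sketch in present-day
Python 3; the numbers are copied from the table, nothing is re-measured:

```python
# Real-time seconds copied from the output table in this message.
timings = {
    "count_chars_lines": 7.944,
    "readlines_sizehint": 5.375,
    "using_fileinput": 15.861,
    "while_readline": 8.648,
}

fastest = min(timings.values())

# Ratio of each method's time to the fastest method's time.
ratios = {name: secs / fastest for name, secs in timings.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print("%-20s %.2fx" % (name, ratio))
```

The ratio for using_fileinput comes out just under 3, and while_readline at
roughly 1.6, i.e. about 60% slower than readlines_sizehint.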

--Guido van Rossum (home page: http://www.python.org/~guido/)

import time, fileinput, sys

def timer(func, *args):
    t0 = time.time()
    c0 = time.clock()
    func(*args)
    t1 = time.time()
    c1 = time.clock()
    print "%-20s %6.3f %6.3f" % (func.__name__, t1-t0, c1-c0)

def count_chars_lines(fn, bs=1024*1024):
    nl = 0
    nc = 0
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            nl += 1
            nc += len(line)
    f.close()
    print "total", nc, "chars and", nl, "lines"

def readlines_sizehint(fn, bs=1024*1024):
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            pass
    f.close()

def using_fileinput(fn):
    f = fileinput.FileInput(fn)
    for line in f:
        pass
    f.close()

def while_readline(fn):
    f = open(fn, "r")
    while 1:
        line = f.readline()
        if not line:
            break
        pass
    f.close()

fn = "/home/guido/access_log"
if sys.argv[1:]:
    fn = sys.argv[1]
timer(count_chars_lines, fn)
timer(readlines_sizehint, fn, 1024*1024)
timer(using_fileinput, fn)
timer(while_readline, fn)


From guido@digicool.com  Tue Jan  2 15:07:06 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:07:06 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Mon, 01 Jan 2001 15:27:37 EST."
 <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com>
Message-ID: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>

> As a compromise, given that we're not going to take the time to be precise
> (well, I'm sure not ...):
> 
>     The optional \keyword{else} clause is executed if and
>     when control flows off the end of the \keyword{try}
>     clause.\footnote{In Python 2.0, control "flows off the
>     end" except in case of exception, or executing a
>     \keyword{return}, \keyword{continue} or \keyword{break}
>     statement.}
>     Exceptions in the \keyword{else} clause are not handled by
>     the preceding \keyword{except} clauses.
> 
> Now it's all of imprecise, almost precise, specific to Python 2.0, and
> robust against any future changes <wink>.

Sounds good to me.  The reference to 2.0 could be changed to
"Currently".

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan  2 15:20:11 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:20:11 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: Your message of "Thu, 28 Dec 2000 18:25:28 EST."
 <20001228182528.A10743@thyrsus.com>
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>
 <20001228182528.A10743@thyrsus.com>
Message-ID: <200101021520.KAA13222@cj20424-a.reston1.va.home.com>

> What does being in the Python core mean?  There are two potential definitions:
> 
> 1. Documentation says it's available on all platforms.
> 
> 2. Documentation restricts it to one of the three platform groups 
>    (Unix/Windows/Mac) but implies that it will be available on any
>    OS in that group.  
> 
> I think the second one is closer to what application programmers
> thinking about which batteries are included expect.  But I could be
> persuaded otherwise by a good argument.

Actually, when *I* have used the term "core" I've typically thought of
this as referring to anything that's in the standard source
distribution, whether or not it is built on all platforms.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@arctrix.com  Tue Jan  2 08:42:30 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 00:42:30 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 02, 2001 at 09:56:40AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <20010102004230.A29700@glacier.fnational.com>

On Tue, Jan 02, 2001 at 09:56:40AM -0500, Guido van Rossum wrote:
> Now what to do?  I still don't like xreadlines very much, but I do see
> that it can save some time.  But my test doesn't confirm Neel's times
> as posted by Tim:
> 
> > Slowest: for line in fileinput.input('foo'):     # Time 100
> >        : while 1: line = file.readline()         # Time 75
> >        : for line in LinesOf(open('foo')):       # Time 25
> > Fastest: for line in file.readlines():           # Time 10
> >          while 1: lines = file.readlines(hint)   # Time 10
> >          for line in xreadlines(file):           # Time 10
> 
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

Could it be that you're using the CVS version of Python which
includes Andrew's cool glibc getline enhancement?

  Neil


From guido@digicool.com  Tue Jan  2 15:40:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:40:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 00:42:30 PST."
 <20010102004230.A29700@glacier.fnational.com>
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
 <20010102004230.A29700@glacier.fnational.com>
Message-ID: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>

[me]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

Bingo!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Jan  2 16:34:31 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 11:34:31 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFJIGAA.tim.one@home.com>

>>     The optional \keyword{else} clause is executed if and
>>     when control flows off the end of the \keyword{try}
>>     clause.\footnote{In Python 2.0, control "flows off the
>>     end" except in case of exception, or executing a
>>     \keyword{return}, \keyword{continue} or \keyword{break}
>>     statement.}
>>     Exceptions in the \keyword{else} clause are not handled by
>>     the preceding \keyword{except} clauses.

[Guido]
> Sounds good to me.  The reference to 2.0 could be changed to
> "Currently".

Cool.  See

http://sourceforge.net/bugs/?group_id=5470&func=detailbug&bug_id=127098
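
A small sketch of the behaviour that wording describes, in present-day
Python 3.  The function names are invented for illustration.

```python
def classify(value):
    """Show which clause runs for a given input."""
    try:
        number = int(value)
    except ValueError:
        return "except"             # the try clause raised; else is skipped
    else:
        return "else %d" % number   # control flowed off the end of try


def return_skips_else():
    """A return inside try does not flow off the end, so else never runs."""
    ran = []

    def inner():
        try:
            return "from try"
        except ValueError:
            ran.append("except")
        else:
            ran.append("else")

    inner()
    return ran


def else_not_covered():
    """An exception raised in else is not handled by the preceding except."""
    try:
        try:
            pass
        except ValueError:
            return "inner except"
        else:
            raise ValueError("raised inside else")
    except ValueError:
        return "outer except"
```

In the last function the ValueError escapes the inner statement and is caught
only by the enclosing one.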



From tim.one@home.com  Tue Jan  2 20:48:08 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:08 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>

test_compare is broken because the expected-output file has bizarre stuff in
it like:

    cmp(2, [1]) = -108
    cmp(2, (2,)) = -116
    cmp(2, None) = -78

What's up with that?

I'll leave test_minidom to someone who thinks they know what it's doing.

Both failures are very recent.



From tim.one@home.com  Tue Jan  2 20:48:09 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>

[Guido]
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

[Guido]
> Bingo!

It's a good thing I haven't yet had time to try any speed tests myself,
since I don't have a glibc-enabled platform so Guido and I may have been
tempted to disagree about numbers in public <wink>.

I checked out the source for glibc's getline.  It's pulling the same trick
Perl uses, copying directly from the stdio buffer when it can, instead of
(like Python, and like almost all vendor fgets implementations) doing
getc-in-a-loop.  The difference is that Perl can't do that without breaking
into the FILE* representation in platform-dependent ways.  It's a shame that
almost all vendors missed that fgets was defined as a primitive by the C
committee precisely so that vendors *could* pull this speed trick under the
covers.  It's also a shame that Perl did it for them <wink>.
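
The difference between the two strategies can be mimicked in Python, purely
as an analogy: the real trick lives in C inside stdio, and nothing below is
taken from glibc, Perl, or Python's own source.  The names readline_by_char
and BufferedLineReader are invented.

```python
import io


def readline_by_char(stream):
    """Assemble one line a byte at a time, in the spirit of a getc() loop."""
    pieces = []
    while True:
        ch = stream.read(1)        # one call per byte
        if not ch:                 # end of input
            break
        pieces.append(ch)
        if ch == b"\n":
            break
    return b"".join(pieces)


class BufferedLineReader:
    """Search an already-filled buffer for the newline and slice the line out."""

    def __init__(self, stream, bufsize=8192):
        self.stream = stream
        self.bufsize = bufsize
        self.buf = b""

    def readline(self):
        while True:
            end = self.buf.find(b"\n")      # one scan over the buffer
            if end >= 0:
                line = self.buf[:end + 1]
                self.buf = self.buf[end + 1:]
                return line
            more = self.stream.read(self.bufsize)
            if not more:                    # end of input: return what is left
                line, self.buf = self.buf, b""
                return line
            self.buf += more
```

Both return the same lines; the second does its per-byte work inside a single
library call instead of one call per byte.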



From barry@digicool.com  Tue Jan  2 21:56:10 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 2 Jan 2001 16:56:10 -0500
Subject: [Python-Dev] testing, please ignore
Message-ID: <14930.20090.283107.799626@anthem.wooz.org>

Sorry folks, just making sure things are working again.

you-really-didn't-want-email-this-millennium-didja?-ly y'rs,
-Barry



From guido@python.org  Tue Jan  2 20:59:22 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 15:59:22 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 14:59:24 EST."
 <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com>
Message-ID: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>

> [Guido]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.
> 
> [Neil]
> > Could it be that you're using the CVS version of Python which
> > includes Andrew's cool glibc getline enhancement?
> 
> [Guido]
> > Bingo!
> 
> It's a good thing I haven't yet had time to try any speed tests myself,
> since I don't have a glibc-enabled platform so Guido and I may have been
> tempted to disagree about numbers in public <wink>.
> 
> I checked out the source for glibc's getline.  It's pulling the same trick
> Perl uses, copying directly from the stdio buffer when it can, instead of
> (like Python, and like almost all vendor fgets implementations) doing
> getc-in-a-loop.  The difference is that Perl can't do that without breaking
> into the FILE* representation in platform-dependent ways.  It's a shame that
> almost all vendors missed that fgets was defined as a primitive by the C
> committee precisely so that vendors *could* pull this speed trick under the
> covers.  It's also a shame that Perl did it for them <wink>.

Quite apart from whether we should enable xreadlines(), could you look
into doing a similar thing for MSVC stdio?  For most Unix platforms, a
cop-out answer is "use glibc" -- but for Windows it may pay to do our
own hack.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jan  2 21:06:05 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 2 Jan 2001 16:06:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:09PM -0500
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
Message-ID: <20010102160605.A5211@kronos.cnri.reston.va.us>

On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
>into the FILE* representation in platform-dependent ways.  It's a shame that
>almost all vendors missed that fgets was defined as a primitive by the C
>committee precisely so that vendors *could* pull this speed trick under the
>covers.  It's also a shame that Perl did it for them <wink>.

So, should Python be changed to use fgets(), available on all ANSI C
platforms, rather than the glibc-specific getline()?  That would be
more complicated than the brain-dead easy course of using getline(),
which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
complicated logic.

When this was discussed in comp.lang.python, someone also mentioned
getc_unlocked(), which saves the overhead of locking the stream every
time, but that didn't seem a fruitful avenue for exploration.

--amk



From tim.one@home.com  Tue Jan  2 22:00:37 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:00:37 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>

[Guido]
> Quite apart from whether we should enable xreadlines(), could you look
> into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> cop-out answer is "use glibc" -- but for Windows it may pay to do our
> own hack.

There's no question about whether it would pay on Windows, because it pays
big for Perl on Windows.  The question is about cost.  There's no way to
*do* it short of the way Perl does it, which is to write a large pile of
Windows-specific code (roughly the same size and complexity as the glibc
getline implementation -- check it out, it's not trivial, and glibc exploits
compiler inlining to make it bearable) relying on reverse-engineered
accidents of how MS happens to use all the fields from this undocumented
struct (from MS's stdio.h):

struct _iobuf {
        char *_ptr;
        int   _cnt;
        char *_base;
        int   _flag;
        int   _file;
        int   _charbuf;
        int   _bufsiz;
        char *_tmpfname;
        };
typedef struct _iobuf FILE;

in their stdio implementation.  Else it won't play correctly with MS's
stdio.  That's A Project.  Last year I tried extracting the relevant code
from Perl, but, as is usual, gave up after unraveling the third (whatever)
layer of mystery macros with no end in sight.  I bet it would take me a
week.  Is it worth that much to you and DC?  Since the real Windows experts
are hanging out at ActiveState, I bet one of them will volunteer to do it
tonight <wink>.



From tim.one@home.com  Tue Jan  2 22:17:14 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:17:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010102160605.A5211@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGNIGAA.tim.one@home.com>

[Tim]
> It's a shame that almost all vendors missed that fgets was defined
> as a primitive by the C committee precisely so that vendors *could*
> pull this speed trick under the covers.  It's also a shame that Perl
> did it for them <wink>.

[Andrew Kuchling]
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

The thrust of my original comment above is that fgets is almost never faster
than what Python is doing now, because vendors overwhelmingly do *not*
exploit the opportunity the std gave them.  So, no, switching to fgets()
wouldn't help.

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

Well, getc_unlocked isn't std (not even in C99).  Mentioning it did inspire
me to discover, however, that while the MS fgets() is the typical "getc in a
loop" thing, at least it locks/unlocks the stream once each at function
entry/exit, and uses a special MS flavor of getc ("_getc_lk") inside the
loop.  However, that this helps is an illusion, because the body of their
_getc_lk macro is identical to the body of their getc macro.  Smells like a
bug, or an unfinished project.



From paulp@ActiveState.com  Tue Jan  2 22:40:39 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:40:39 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>
Message-ID: <3A5258E7.D52CA2C@ActiveState.com>

Tim Peters wrote:
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code 

> ... Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Mark is busy tonight and the Perl guys are still recovering from
implementing it the first time. :)

 Paul


From guido@python.org  Tue Jan  2 22:46:00 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:46:00 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:06:05 EST."
 <20010102160605.A5211@kronos.cnri.reston.va.us>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us>
Message-ID: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>

> On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
> >into the FILE* representation in platform-dependent ways.  It's a shame that
> >almost all vendors missed that fgets was defined as a primitive by the C
> >committee precisely so that vendors *could* pull this speed trick under the
> >covers.  It's also a shame that Perl did it for them <wink>.
> 
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

You mean get_line(), which indeed has a complicated API and
corresponding logic: the argument may be a max length, or 0 to
indicate arbitrary length, or negative to indicate raw_input()
semantics. :-(

Unfortunately we can't use fgets(), even if it were faster than
getline(), because it doesn't tell how many characters it read.  On
files containing null bytes, readline() is supposed to treat these
like any other character; if your input is "abc\0def\nxyz\n", the
first readline() call should return "abc\0def\n".  But with fgets(),
you're left to look in the returned buffer for a null byte, and
there's no way (in general) to distinguish this result from an input
file that only consisted of the three characters "abc".  getline()
doesn't seem to have this problem, since its size is also an output
parameter.
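
A minimal sketch of the ambiguity described above, written in modern
Python rather than C.  The helpers fgets_like and getline_like are
hypothetical stand-ins that only imitate what a C caller can observe:
fgets() hands back a NUL-terminated buffer with no length, while
getline() also reports how many characters it read.

```python
import io

def fgets_like(stream):
    """Imitate what a C caller sees after fgets(): bytes up to the first NUL."""
    line = stream.readline()
    # A C caller can only scan the buffer for the terminating NUL, so an
    # embedded NUL byte in the data looks the same as the end of the line.
    return line.split(b"\0", 1)[0]

def getline_like(stream):
    """Imitate getline(): the line plus an explicit character count."""
    line = stream.readline()
    return line, len(line)

with_nul = io.BytesIO(b"abc\0def\nxyz\n")
only_abc = io.BytesIO(b"abc")

# Both inputs look identical through the fgets-style view ...
print(fgets_like(with_nul) == fgets_like(only_abc))

with_nul.seek(0)
only_abc.seek(0)
# ... but the explicit length tells them apart.
print(getline_like(with_nul))
print(getline_like(only_abc))
```

The point is only that a length returned alongside the buffer removes
the need to guess where the line ended.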

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

I've never heard of getc_unlocked; it's not in the (old) C standard.
If it's also a glibc thing, I doubt that using it would be faster than
getline().  If it's a new C standard (C9x) thing, we'll have to wait.

Fred reminded me that for e.g. Solaris, while everybody probably
compiles with GCC, that doesn't mean they are using glibc, so
in practice getline() will only help on Linux.

I'm slowly warming up to xreadlines(), although we must be careful to
consider the consequences (do other file-like objects need to support
it too?).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Jan  2 22:46:18 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:46:18 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::  xrange : range
In-Reply-To: <3A5258E7.D52CA2C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHBIGAA.tim.one@home.com>

[Tim]
> ... Since the real Windows experts are hanging out at ActiveState,
> I bet one of them will volunteer to do it tonight <wink>.

[Paul Prescod]
> Mark is busy tonight and the Perl guys are still recovering from
> implementing it the first time. :)

I'm delighted, then, that you have nothing better to do than tease the
decent, hard-working folks on Python-Dev!  I'll be up until about 4am --
feel free to submit your patch anytime before then.

in-a-pinch-i'll-even-accept-it-tomorrow-ly y'rs  - tim



From guido@python.org  Tue Jan  2 22:53:14 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:53:14 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 17:00:37 EST."
 <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>
Message-ID: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>

> [Guido]
> > Quite apart from whether we should enable xreadlines(), could you look
> > into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> > cop-out answer is "use glibc" -- but for Windows it may pay to do our
> > own hack.
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code (roughly the same size and complexity as the glibc
> getline implementation -- check it out, it's not trivial, and glibc exploits
> compiler inlining to make it bearable) relying on reverse-engineered
> accidents of how MS happens to use all the fields from this undocumented
> struct (from MS's stdio.h):
> 
> struct _iobuf {
>         char *_ptr;
>         int   _cnt;
>         char *_base;
>         int   _flag;
>         int   _file;
>         int   _charbuf;
>         int   _bufsiz;
>         char *_tmpfname;
>         };
> typedef struct _iobuf FILE;
> 
> in their stdio implementation.  Else it won't play correctly with MS's
> stdio.  That's A Project.  Last year I tried extracting the relevant code
> from Perl, but, as is usual, gave up after unraveling the third (whatever)
> layer of mystery macros with no end in sight.  I bet it would take me a
> week.  Is it worth that much to you and DC?  Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Yeah.  That's too much.  Too bad.  I'm not holding my breath for
ActiveState though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Tue Jan  2 22:52:58 2001
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 16:52:58 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us>
 <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <14930.23498.53540.401218@beluga.mojam.com>

    Guido> I'm slowly warming up to xreadlines(), ...

I haven't followed this thread closely, and my brain is a bit frazzled at
the moment, but is there some fundamental reason that the file object's
readlines method can't be made lazy, perhaps only when given a sizehint?

Skip


From paulp@ActiveState.com  Tue Jan  2 22:59:47 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:59:47 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us>
 <200101022246.RAA16384@cj20424-a.reston1.va.home.com> <14930.23498.53540.401218@beluga.mojam.com>
Message-ID: <3A525D63.17ABCC87@ActiveState.com>

Skip Montanaro wrote:
> 
>     Guido> I'm slowly warming up to xreadlines(), ...
> 
> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

I suggested this at one point but it was pointed out that there is
probably a lot of code that works with the resulting list *as a list*
i.e. as a random-access, writable sequence object. I really wasn't
thrilled with xreadlines at first either...it's the least of all
possible evils (including the status quo).

 Paul


From nas@arctrix.com  Tue Jan  2 16:09:15 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:09:15 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:08PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>
Message-ID: <20010102080915.A30892@glacier.fnational.com>

On Tue, Jan 02, 2001 at 03:48:08PM -0500, Tim Peters wrote:
> test_compare is broken because the expected-output file has bizarre stuff in
> it like:
> 
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
> 
> What's up with that?

My fault.  I only ran regrtest.py and not "make test".  I'm not
sure why you say bizarre stuff though.  Do you object to testing
that 2 is less than None (something that is not part of the
language spec) or do you think that the results from cmp() should
be clamped between -1 and 1?

  Neil


From guido@python.org  Tue Jan  2 23:06:16 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 18:06:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:52:58 CST."
 <14930.23498.53540.401218@beluga.mojam.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com> <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
 <14930.23498.53540.401218@beluga.mojam.com>
Message-ID: <200101022306.SAA16684@cj20424-a.reston1.va.home.com>

> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

Yes -- readlines() is documented to return a list, and some people do
things to it that require it to be a real list (e.g. sort or reverse
it or modify it in place or concatenate it with other lists).
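
A rough illustration, in modern Python rather than the 2.0-era
interpreter under discussion, of why a lazy result is not a drop-in
replacement.  lazy_lines is a hypothetical stand-in for a lazy
readlines(); it is not an existing API.

```python
import io

text = "pear\napple\nfig\n"

lines = io.StringIO(text).readlines()   # a real list
lines.sort()                            # in-place sort works
lines = lines + ["kiwi\n"]              # so does concatenation with a list

def lazy_lines(stream):
    """Hypothetical lazy replacement: yields one line at a time."""
    for line in stream:
        yield line

lazy = lazy_lines(io.StringIO(text))
# The lazy object can be iterated, but it has no list methods.
print(hasattr(lazy, "sort"), hasattr(lazy, "reverse"))
```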

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Jan  2 23:19:14 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 18:19:14 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102080915.A30892@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>

[Tim]
> test_compare is broken because the expected-output file has
> bizarre stuff in it like:
>
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
>
> What's up with that?

[Neil Schemenauer]
> My fault.  I only ran regrtest.py and not "make test".

Neil, my platform doesn't even *have* a "make":  are you saying the test
passes for you when you run regrtest.py?  That's what I did.

> I'm not sure why you say bizarre stuff though.  Do you object to
> testing that 2 is less than None (something that is not part of the
> language spec)

Only in part.  Lang Ref 2.1.3 (Comparisons) says you can compare them, and
guarantees they won't compare equal, but doesn't define it beyond that.  If
Python actually says "less", fine, we can test for that, although to
minimize maintenance down the road it would be better to test for no more
than we expect Python to guarantee across releases and implementations
(suppose Jython says 2 is greater than None:  that's fine too, and it would
be better if the test suite didn't say Jython was broken).

> or do you think that the results from cmp() should be clamped
> between -1 and 1?

Not that either <wink>; cmp() isn't documented that way.

They're "bizarre" simply because they're not what Python returns!

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Dec 17 2000, 01:39:08) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> cmp(2, [1])
-1
>>> cmp(2, (2,))
-1
>>> cmp(2, None)
-1
>>>

The expected-output file is supposed to match what Python actually does.  I
have no idea where things like "-108" came from.  So things like -108 look
bizarre to me.  So long as cmp(2, [1]) returns -1 in reality, an
expected-output file that claims it returns -108 will never work no matter
how you run the tests.

One of us is missing something obvious here <wink>.



From paulp@ActiveState.com  Tue Jan  2 23:26:39 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 15:26:39 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <3A5263AF.CE6C8C81@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> I'm slowly warming up to xreadlines(), although we must be careful to
> consider the consequences (do other file-like objects need to support
> it too?).

The implementation is such that it is pretty easy to add the method to
other file-like objects. It is also easy to use the xreadlines module to
get the same behavior for objects that do not have the method. 
Essentially, file.xreadlines is implemented like this:

def xreadlines(self):
    import xreadlines
    return xreadlines.xreadlines(self)

Any object can add the method similarly.
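
The delegation pattern can be sketched in modern Python as follows.
This is not the patch itself: the module-level xreadlines function here
is a hypothetical generator-based stand-in, and UpperFile is a toy
file-like class invented for the example.

```python
import io

def xreadlines(fileobj):
    """Hypothetical stand-in for the module-level helper: lazily yield
    lines from any object that has a readline() method."""
    while True:
        line = fileobj.readline()
        if not line:
            break
        yield line

class UpperFile:
    """Toy file-like object that upper-cases whatever it reads."""

    def __init__(self, stream):
        self._stream = stream

    def readline(self):
        return self._stream.readline().upper()

    def xreadlines(self):
        # Same delegation as in the message above: hand self to the helper.
        return xreadlines(self)

f = UpperFile(io.StringIO("spam\neggs\n"))
for line in f.xreadlines():
    print(line, end="")
```

Objects without the method can be passed to the helper directly, since
it relies on nothing but readline().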

 Paul Prescod


From nas@arctrix.com  Tue Jan  2 16:51:48 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:51:48 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 06:19:14PM -0500
References: <20010102080915.A30892@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>
Message-ID: <20010102085148.A30986@glacier.fnational.com>

On Tue, Jan 02, 2001 at 06:19:14PM -0500, Tim Peters wrote:
> Neil, my platform doesn't even *have* a "make":  are you saying the test
> passes for you when you run regrtest.py?

Yes.  Isn't checking in code without running regrtest a capital
offence? :)

> Lang Ref 2.1.3 (Comparisons) says you can compare them, and
> guarantees they won't compare equal, but doesn't define it beyond that.

Okay, I'll use == rather than cmp().  When I was working on the coercion
patch I found cmp() useful.  I guess it shouldn't be in the standard
test suite, especially since Jython may implement things differently.

[Neil]
> or, do you think that the results from cmp() should be clamped
> between -1 and 1?

[Tim]
> Not that either <wink>; cmp() isn't documented that way.
> 
> They're "bizarre" simply because they're not what Python returns!

They do on my box:

    Python 2.0 (#19, Nov 21 2000, 18:13:04) 
    [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> cmp(1, None)
    -78

I guess MS uses a different strcmp than GNU.  Do you mind trying the
attached C code?  I get "-78" as output.  I should have thought a little
more before checking in the patch.  -78 is quite obviously a
machine/library dependent thing.

[Tim again]
> One of us is missing something obvious here <wink>.

I don't know about that.  The implementation of coercion and comparison
is not simple.  I've been studying it for some time now and I obviously
still don't know what the hell is going on.

AFAICT, the problem is that instances without a comparison method can
compare larger or smaller than numbers depending on where in memory the
objects are stored.

  Neil


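#include <stdio.h>
#include <string.h>

int main(void)
{
    /* Only the sign of strcmp's result is portable; the magnitude
       depends on the C library. */
    printf("%d\n", strcmp("", "None"));
    return 0;
}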


From tim.one@home.com  Wed Jan  3 00:30:26 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 19:30:26 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102085148.A30986@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>

[Neil]
> They do on my box:
>
>     Python 2.0 (#19, Nov 21 2000, 18:13:04)
>     [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> cmp(1, None)
>     -78

Well, who cares about your silly box <wink>?  Messier than I thought!  Yes,
Windows strcmp is always in {-1, 0, 1}.  Rather than run tests, here's the
tail end of MS's strcmp.c:

        if ( ret < 0 )
                ret = -1 ;
        else if ( ret > 0 )
                ret = 1 ;

        return( ret );

Wasted cycles and stupid formatting <wink>.
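
A small sketch, in Python, of where numbers like -78 can come from.  It
assumes a strcmp that returns the difference between the first
mismatching bytes, which is consistent with the values reported in this
thread but is an assumption about any particular C library; the C
standard only defines the sign of the result.  strcmp_diff and
strcmp_clamped are hypothetical names.

```python
def strcmp_diff(a, b):
    """strcmp that returns the difference of the first mismatching bytes."""
    # Append the NUL terminator so a shorter string compares as C would.
    a = a.encode("ascii") + b"\0"
    b = b.encode("ascii") + b"\0"
    for x, y in zip(a, b):
        if x != y:
            return x - y
    return 0

def strcmp_clamped(a, b):
    """strcmp whose result is forced into {-1, 0, 1}, as in the MS code above."""
    d = strcmp_diff(a, b)
    return (d > 0) - (d < 0)

# Numbers are given the type name "" when compared with non-numbers.
for name in ("list", "tuple", "None"):
    print(name, strcmp_diff("", name), strcmp_clamped("", name))
```

With the empty string on the left, the difference is just the negated
code of the first letter of the other type name: 'l' is 108, 't' is 116
and 'N' is 78.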

> ...
> AFAICT, the problem is that instances without a comparison method can
> compare larger or smaller than numbers depending on where in memory
> the objects are stored.

If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
cause that, or was it always this way?  I was able to provoke this badness:

>>> j < c < i
1
>>> j < i
0
>>>

i.e. it violates transitivity, and that's never supposed to happen in the
absence of user-supplied __cmp__.  Here c is an instance of "class C: pass",
and i and j are ints.

>>> type(i), type(j), type(c)
(<type 'int'>, <type 'int'>, <type 'instance'>)
>>> i, j, c
(999999, 1000000, <__main__.C instance at 00791B7C>)
>>> id(i), id(j), id(c)
(7941572, 7744676, 7936892)
>>>

Guido thought he fixed this kind of stuff once (and I believed him <wink>)
by treating all numbers as if they had type name "" (i.e., yes, an empty
string) when compared to non-numbers.  Then the usual "mixed-type
comparisons in the absence of __cmp__ compare via type name string" rule
ensured that numbers would always compare "less than" instances of any other
type.  That's the intent of the tail end:

		else if (vtp->tp_as_number != NULL)
			vname = "";
		else if (wtp->tp_as_number != NULL)
			wname = "";
		/* Numerical types compare smaller than all other types */
		return strcmp(vname, wname);

of PyObject_Compare.  So, in the example above, we *should* have

    i < c == 1
    j < c == 1
    j < c < i == 0

Unfortunately, we actually have

    i < c == 0

in that example.  We're apparently not getting to the "number hack" code
because c is an instance, and I'll confess up front that my eyes always
glazed over long before I got to PyInstance_HalfBinOp <0.half wink>.
Whatever, there's at least one bug somewhere in that path!   We should have
n < i == 1 for any numeric type n and any non-numeric type i (in the absence
of user-defined __cmp__).




From skip@mojam.com (Skip Montanaro)  Wed Jan  3 01:27:03 2001
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 19:27:03 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A525D63.17ABCC87@ActiveState.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us>
 <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
 <14930.23498.53540.401218@beluga.mojam.com>
 <3A525D63.17ABCC87@ActiveState.com>
Message-ID: <14930.32743.525564.69044@beluga.mojam.com>

    Paul> I suggested this at one point but it was pointed out that there is
    Paul> probably a lot of code that works with the resulting list *as a
    Paul> list*

How about this idea?  What if readlines() was allowed to return a lazy
evaluator if a sizehint > 0 was given?  I only saw one example outside of
test cases in the current CVS tree where readlines(sizehint) was used
(Tools/idle/GrepDialog.py), and it used it as expected:

    while 1:
      block = f.readlines(sizehint)
      if not block:
        break
      for line in block:
        more stuff

My suspicion is that most uses of sizehint will be like this.  It hasn't
been around all that long in Python-years (since 1.5a2), so there's probably
not tons of code to break (I agree the semantics would change), and the
majority of code that uses it probably looks like the above, which is almost
safe (if it returned "" instead of an empty evaluator when nothing was left
to read it would be safe).  The advantage would be that the above could
become the more obvious

    for line in f.readlines(sizehint):
      more stuff

and the change to file reading code that is "too slow" becomes much simpler.
(Of course, xreadlines() has that advantage as well.)
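
For comparison, the same single-loop shape is available without changing
what readlines() returns, by wrapping the block loop in a helper.  This
is a sketch in modern Python (it uses a generator), and lazy_readlines
is a hypothetical name, not a proposed API.

```python
import io

def lazy_readlines(f, sizehint):
    """Hypothetical helper: yield lines one at a time while reading them
    in blocks of roughly sizehint characters.  sizehint should be > 0."""
    while True:
        block = f.readlines(sizehint)
        if not block:
            break
        for line in block:
            yield line

f = io.StringIO("one\ntwo\nthree\n")
for line in lazy_readlines(f, 8):
    print(line, end="")
```

The caller gets the simple "for line in ..." loop, and readlines(sizehint)
keeps returning a real list.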

I scanned my own code quickly.  I found about 10 uses with sizehint and 300
without.

I presume we are talking about 2.1 here.  In any case, it seems to me that
in Py3k readlines should be lazy.

Skip

P.S.  Why did FileInput class never grow a readlines method?


From nas@arctrix.com  Tue Jan  2 19:38:53 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:38:53 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 07:30:26PM -0500
References: <20010102085148.A30986@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>
Message-ID: <20010102113853.A31341@glacier.fnational.com>

On Tue, Jan 02, 2001 at 07:30:26PM -0500, Tim Peters wrote:
> > AFAICT, the problem is that instances without a comparison method can
> > compare larger or smaller than numbers depending on where in memory
> > the objects are stored.
> 
> If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
> cause that, or was it always this way?

To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
I'm ready to check in my coercion overhaul patch, assuming no
vetoes from the list.  It should fix this bug (and introduce a
whole slew of new ones :).

Guido suggested that I remove the "number types compare smaller
than other types" behavior.  What's your take on that?  The
current patch on SF always uses the type names.  It should be
easy to implement the old behavior though.

  Neil


From nas@arctrix.com  Tue Jan  2 19:48:09 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:48:09 -0800
Subject: [Python-Dev] Applying the PEP 208 (coercion overhaul) patch
Message-ID: <20010102114809.B31341@glacier.fnational.com>

I'm almost ready to apply SF patch #102652.  Guido has given the
okay assuming there are no objections from the rest of
python-dev.  The patch is large and modifies some complicated
parts of the interpreter.  I expect there will be some bugs.  If
you would like me to wait, speak now.

Guido has sent me some comments on the patch today which I plan
to review and address tonight.  I will probably apply the patch
tomorrow evening.

  Neil


From tim.one@home.com  Wed Jan  3 03:05:59 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 22:05:59 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102113853.A31341@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHNIGAA.tim.one@home.com>

[Neil Schemenauer, on a violation of transitivity j < c < i but not j < i]

> To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
> is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
> I'm ready to check in my coercion overhaul patch, assuming no
> vetoes from the list.  It should fix this bug (and introduce a
> whole slew of new ones :).

Sounds good to me!

> Guido suggested that I remove the "number types compare smaller
> than other types" behavior.  What's your take on that?  The
> current patch on SF always uses the type names.  It should be
> easy to implement the old behavior though.

It doesn't matter that they're specifically smaller, it matters that they
can't violate transitivity.  "numbers compare smaller" was introduced
deliberately (by Guido) because, e.g., before that we had

    99 < [99] < 99L

despite that 99 == 99L, because

   "int" < "list" < "long int"

Even stranger, we had

    100 < [99] < 0L < 100

and

    100 < [] < -101L < -100


Making numbers compare smaller than other types is one way to ensure stuff
like that can't happen; I can't think of a simpler way (although making them
compare larger than other types would be equally simple, as would making
them compare as if their type name were "Neil" <wink>).



From paulp@ActiveState.com  Wed Jan  3 03:34:59 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 19:34:59 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
 <20010102160605.A5211@kronos.cnri.reston.va.us>
 <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
 <14930.23498.53540.401218@beluga.mojam.com>
 <3A525D63.17ABCC87@ActiveState.com> <14930.32743.525564.69044@beluga.mojam.com>
Message-ID: <3A529DE3.D93C3916@ActiveState.com>

Skip Montanaro wrote:
> 
>...
> 
> I presume we are talking about 2.1 here.  In any case, it seems to me that
> in Py3k readlines should be lazy.

I agree, but I'm ambivalent about your suggestion for polymorphic return
values from readlines(). Yet another option is a "lazy=1" option.

 Paul Prescod


From tim.one@home.com  Wed Jan  3 04:33:29 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 2 Jan 2001 23:33:29 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>

[Guido, writes a timing program]

[Jeff, if you weren't copied on all this stuff, you can play catch-up
 by reading the archives, at
    http://mail.python.org/pipermail/python-dev/
]

> ...
> I am including the timer program below my signature.  The test input
> was the current access_log of dinsdale.python.org, which has about 119
> Mbytes and 1M lines (as counted by the test program).

For a contrast, I cobbled together a large test file out of various chunks
of C source, .py source, HTML source, and email archives.  I was shooting
for the same size you used (~119Mb), but ended up with more than 3x as many
lines.

> I measure about a factor of 2 between readlines with a sizehint (of 1
> MB) and fileinput;

Factor of 7 here (Jeff, NeilS eventually figured out that Guido was using a
CVS version of Python that has AndrewK's glibc getline patch, a zippier
line-input routine than Python 2.0 has; but it only applies to platforms
using glibc).

> ...
> Output (the first time is realtime seconds, the second CPU seconds):
>
> total 119808333 chars and 1009350 lines
> count_chars_lines     7.944  7.890
> readlines_sizehint    5.375  5.320
> using_fileinput      15.861 15.740
> while_readline        8.648  8.570
>
> This was on a 600 MHz Pentium-III Linux box (RH 6.2).

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

866 MHz P3 Win98SE, current CVS Python.  I have no handy explanation for why
clock() and time() differ on my box (Win98 has no notions of "user time" or
"CPU time" distinct from clock time).

> Note that count_chars_lines and readlines_sizehint use the same
> algorithm -- the difference is that readlines_sizehint uses 'pass' as
> the inner loop body, while count_chars_lines adds two counters.
>
> Given that very light per-line processing (counting lines and
> characters) already increases the time considerably, I'm not sure I
> buy the arguments that the I/O overhead is always considerable.

I disagree that this is "very light processing", although I agree it's hard
to think of lighter processing <wink>:  it's a few Python statements per
line, which I'd say is pretty *typical* processing.  Read a line, run a
string find or regexp search on it, test the result, sometimes fiddle the
line accordingly and sometimes not.  File-crunching apps generally aren't
rocket science!  For example, I changed count_chars_lines to tally the
number of lines containing the string "Guido" instead, and the runtime went
up by just 0.8 seconds (BTW, it found 13808 of them <wink>):  if you're
thinking in C terms, millions of failing searches for "Guido" may seem like
more work, but the number of Python stmts executed usually counts more than
what the stmts do at the C level.
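
Guido's timer program is not reproduced in this message, so the following is
only a guess at the shape of the two loops being compared: both walk the file
via readlines(sizehint) and differ only in the per-line body.  The function
names, the 1 MB default hint and the use of the "in" operator for the
substring test are assumptions made for this sketch.

```python
import io


def count_chars_lines(f, hint=1024 * 1024):
    """Tally characters and lines, reading in chunks of whole lines."""
    nchars = nlines = 0
    while True:
        lines = f.readlines(hint)     # roughly `hint` bytes of whole lines
        if not lines:
            break
        for line in lines:
            nchars += len(line)
            nlines += 1
    return nchars, nlines


def count_lines_containing(f, needle, hint=1024 * 1024):
    """Same outer loop, but the body is one substring test per line."""
    hits = 0
    while True:
        lines = f.readlines(hint)
        if not lines:
            break
        for line in lines:
            if needle in line:
                hits += 1
    return hits


if __name__ == "__main__":
    text = "Guido\nx\nhi Guido\n"
    print(count_chars_lines(io.StringIO(text)))
    print(count_lines_containing(io.StringIO(text), "Guido"))
```

Either body is a couple of Python statements per line, which is the point:
the statement count per line is about the same in both.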

> ...
> Now what to do?  I still don't like xreadlines very much, but I do
> see that it can save some time.  But my test doesn't confirm Neel's
> times as posted by Tim:
>
>> Slowest: for line in fileinput.input('foo'):     # Time 100
>>        : while 1: line = file.readline()         # Time 75
>>        : for line in LinesOf(open('foo')):       # Time 25
>> Fastest: for line in file.readlines():           # Time 10
>>          while 1: lines = file.readlines(hint)   # Time 10
>>          for line in xreadlines(file):           # Time 10
>
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

I don't know what Neel used for an input file, or which platform he used
either.  And this is bound to vary a lot across platforms.  As above, I saw
a factor of 7 between fastest and slowest and a factor of 3 between readline
and readlines_sizehint.
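
The factors quoted on both sides can be rechecked from the realtime columns
of the two tables above.  This is plain arithmetic on the posted numbers; the
dictionary layout and the helper name are just for this sketch.

```python
# Realtime seconds copied from the two tables in this thread.
GUIDO = {"readlines_sizehint": 5.375,
         "using_fileinput": 15.861,
         "while_readline": 8.648}
TIM = {"readlines_sizehint": 9.390,
       "using_fileinput": 66.130,
       "while_readline": 30.380}


def ratios(times):
    """Return (slowest/fastest, readline/readlines_sizehint)."""
    base = times["readlines_sizehint"]
    return times["using_fileinput"] / base, times["while_readline"] / base


if __name__ == "__main__":
    for who, times in (("Guido", GUIDO), ("Tim", TIM)):
        spread, readline = ratios(times)
        print("%-6s fileinput/sizehint %.2f   readline/sizehint %.2f"
              % (who, spread, readline))
```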

BTW, on my platform the Perl script (using a recent ActiveState Windows
Perl)

open(FILE, "ga.txt");
while (<FILE>) {
    1;
}

ran in about 6 seconds (I never figured out how to get Perl to compute usable
timings itself) -- substantially faster than even readlines_sizehint! -- and
changing the body to

$nc = $nl = 0;
while (<FILE>) {
    ++$nl;
    $nc += length;
}
print "$nc $nl\n";

boosted that to about 8 seconds.  So Perl has gotten zippier too over the
years.



From tim.one@home.com  Wed Jan  3 09:32:55 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 3 Jan 2001 04:32:55 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com>

[Guido & Tim, wonder about faking getline-like functionality for Windows]

The attached is kinda baffling.  The std tests pass with it, and it changes my test
timings from:

count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

to:

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

Big win?  You bet.  But ...

The baffling parts:

1. That Perl still takes only 6 seconds in line-at-a-time mode.

2. I originally wrote a getline workalike, instead of building directly into a PyString
buffer.  That made my test run *slower*, and I'm talking factor of 2, not a yawn.  To
judge from my usually silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
mallocs required may have triggered the horrid Win9x malloc-thrashing problem I wrote
about while I was still at Dragon.  Consider that another vote for Vlad's PyMalloc --
we've got no handle on x-platform dynamic memory behavior now.  Python's destiny is to
replace both the platform OS and libc anyway <0.9 wink>.

The scary parts:

+ As the "XXX" comments indicate, this is full of little insecurities.

+ Another one I just thought of:  if the user's last operations on the fp were two or more
consecutive ungetc calls, all bets are off.  But then MS doesn't define what happens then
either.

+ This is much less ambitious than I recall Perl's code being:  it doesn't try to guess
anything about the file, and effectively captures only what would happen if you could
unroll the guts of a getc-in-a-loop and optimize the snot out of it.  The good news is
that this means it's much easier to maintain (it touches only two of the MS FILE* fields,
and in ways that are pretty obviously correct).  The bad news is that this seems also
pretty clearly all there *is* to be gotten out of breaking into the FILE* abstraction for
the particular test case I'm using; and increasing TUNEME doesn't save any time at all:
the sucker is flying at full speed already.

+ It (line-at-a-time) drops to a little under 13 seconds if I comment out the thread
macros.

+ I haven't looked at Perl's implementation in a year, and they must have dreamt up
another trick since then.  That's a "scary part" indeed to anyone who has ever looked at
Perl's implementation.

retreating-into-a-fetal-position-ly y'rs  - tim


Anyone wants to play, the sandbox is fileobject.c.  Do two things:

1. Insert this new chunk somewhere above get_line:

#ifdef MS_WIN32
static PyObject*
win32_getline(FILE *fp)
{
	/* XXX ignores thread safety -- but so does MS's getc macro! */
	PyObject* v;
	char* pBuf;	/* next free slot in v's buffer */
	/* MS's internals are declared in terms of ints, but it's a sure bet
	 * that won't last forever -- use size_t now & live w/ the casting;
	 * ditto for Python's routines
	 */
	size_t total_buf_size = 100;
	size_t free_buf_size = total_buf_size;
#define TUNEME 1000	/* how much to boost the string buffer when exhausted */

	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
	if (v == NULL)
		return NULL;
	pBuf = BUF(v);
	Py_BEGIN_ALLOW_THREADS
	for (;;) {
		int ch;	/* int, not char: getc() returns int, so EOF stays distinct from byte 0xff */
		size_t ms_cnt;	/* FILE->_cnt shadow */
		char* ms_ptr;	/* FILE->_ptr shadow */
		size_t max_to_copy, i;
		/* stdio buffer empty or in unknown state; rather
		 * than try to simulate every quirk of MS's internals,
		 * let the MS macros deal with it.
		 */
		/* XXX we also wind up here when we simply run out of string
		 * XXX buffer space, but I'm not sure I care:  making this a
		 * XXX double-nested loop doesn't seem worth it
		 */
		ch = getc(fp);
		if (ch == EOF)
			break;
		/* make sure we've got some breathing room */
		if (free_buf_size < 100) {
			size_t currentoffset = pBuf - BUF(v);
			total_buf_size += TUNEME;  /* XXX check for overflow */
			Py_BLOCK_THREADS
			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
				return NULL;
			Py_UNBLOCK_THREADS
			pBuf = BUF(v) + currentoffset;
			free_buf_size = TUNEME;
		}
		/* ch wasn't EOF, so store it */
		*pBuf++ = ch;
		--free_buf_size;
		if (ch == '\n') {
			break;
		}
		ms_cnt = (size_t)fp->_cnt;
		if (!ms_cnt) {
			/* XXX this is a slow way to read one character at
			 * XXX a time if, e.g., the stream is unbuffered
			 */
			continue;
		}
		/* payback!  now we don't have to check for buffer overflows or
		 * EOF inside the loop, nor does the macro _filbuf() branch force
		 *  _ptr and _cnt in and out of memory on each iteration
		 */
		ms_ptr = fp->_ptr;
		assert(ms_cnt > 0);
		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;
		do {
			/* XXX unclear to me why MS's getc macro does "& 0xff" */
			*pBuf++ = ch = *ms_ptr++ & 0xff;
		} while (--i && ch != '\n');
		/* update the shadows & counters */
		fp->_ptr = ms_ptr;
		free_buf_size -= max_to_copy - i;
		fp->_cnt = ms_cnt - (max_to_copy - i);
		if (ch == '\n')
			break;
	}
	Py_END_ALLOW_THREADS
	_PyString_Resize(&v, pBuf - BUF(v));
	return v;
}
#endif

2. Within get_line, add this before the #endif (this is the getline #if block):

#elif defined(MS_WIN32)
	if (n == 0) {
		return win32_getline(fp);
	}



From ping@lfw.org  Wed Jan  3 11:40:47 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 3 Jan 2001 05:40:47 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <14840.19556.127151.457533@anthem.concentric.net>
Message-ID: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>

Uh... hi.  <sheepish look>

I know i've all but dropped out of existence for a long time, what with
my simultaneous first stints as a grad student, a teaching assistant, and
a house cook (!) and all, but i didn't want to let this work go to waste.

Now that the holidays are here i can *finally* try to get some work done!

So, i've updated inspect.py in response to Barry's comments, and below is
my reply to this old thread.  I also wrote some regression tests.

I tried to submit inspect.py to SourceForge, but i got:

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is
    too long.  Maximum length is 16382

Does anyone know what's going on with that?


Anyway, the latest module and regression tests are available at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

for your perusal.




On Thu, 26 Oct 2000 barry@wooz.org wrote:
> Some thoughts after an initial scan of inspect.py:
> 
> - The doc strings for the is*() functions aren't accurate.
>   E.g. ismodule() says that it asks whether "the object is a module
>   with the __file__ special attribute", but that isn't really what it
>   tests!  Guido points out that builtin modules don't currently have
>   __file__ and besides, you're really testing that the type of the
>   object is ModuleType.

Perhaps a different wording would be better, but i should at least
clarify the intention: i wrote them that way because it seemed that
the current objects export an unofficial "interface" by means of the
special attributes they provide.  The purpose of the "is*()" functions
is to determine whether an object meets one of these interfaces.

A complete interface would provide (1) a type-checker, (2) a constructor,
and (3) the methods.  As for (2), we don't normally allow construction of
these things (except for wizards using the newmodule).  As for (3), i
suppose that one could further encapsulate these interfaces by providing
spelled-out methods like "def getcode(f): return f.func_code", but it
didn't seem worth the trouble.  So that left just (1), and i had the
other parts in mind while trying to describe (1).

The type-checkers aren't of much use unless they accurately reflect
the availability of the special attributes.  Do you see what i'm trying
to do?  Maybe you can suggest a better way of doing it... anyway, i've
tried to compromise in the docstrings as submitted.
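
To make the two readings concrete, here is a small sketch contrasting a check
on the object's type with a check on one of its special attributes.  The
function names are made up for this example and are not the ones in
inspect.py.

```python
import types


def ismodule_by_type(obj):
    """What the code tests: the object's type is ModuleType."""
    return isinstance(obj, types.ModuleType)


def ismodule_by_attr(obj):
    """What the old docstring described: a __file__ attribute is present."""
    return hasattr(obj, "__file__")


if __name__ == "__main__":
    bare = types.ModuleType("demo")   # a module object with no __file__
    print(ismodule_by_type(bare), ismodule_by_attr(bare))
```

The two checks disagree on a module object that has no __file__, which is
the case Guido raised for builtin modules.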

> - Don't make the predicate in getmembers() default to "lambda x: 1"
>   Instead make the default None, and skip the predicate test if it is
>   None.

Okay, fine.
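
A minimal sketch of the suggested default, assuming getmembers() simply walks
dir() and getattr(); this is a stand-in written for this example, not the
code in inspect.py.

```python
def getmembers(obj, predicate=None):
    """Return (name, value) pairs, filtered by predicate when one is given."""
    results = []
    for name in dir(obj):
        value = getattr(obj, name)
        # Skip the test entirely when no predicate was supplied.
        if predicate is None or predicate(value):
            results.append((name, value))
    return results


class Point:
    x = 1

    def norm(self):
        return abs(self.x)


if __name__ == "__main__":
    print([name for name, _ in getmembers(Point(), callable)
           if not name.startswith("__")])
```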

> - getdoc()'s docstring should describe the margin munging it does.

Okay, done.

> - findsource() seems off-by one, e.g.
> 
>    >>> x = inspect.findsource(inspect.findsource)
>    >>> x[1]
>    138
> 
>    but the function really starts on line 139.

138 was the intended result here.  Indeed the function starts
on line 139 if you start counting from 1.  The reason it returns
138 is that it's the index you would use for the array of lines
(thus x[0][x[1]] or file.readlines()[138] is the first line of
the function).
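
The difference between the two conventions, on a toy source string.  The
helper find_def is hypothetical and much cruder than findsource(); it only
shows that the returned value indexes the list of lines, and that the
editor-style line number is one more.

```python
def find_def(lines, name):
    """Return the 0-based index of the line that starts 'def name('."""
    prefix = "def %s(" % name
    for index, line in enumerate(lines):
        if line.lstrip().startswith(prefix):
            return index
    raise ValueError("no definition of %r found" % name)


SOURCE = "import os\n\ndef spam():\n    pass\n"

if __name__ == "__main__":
    lines = SOURCE.splitlines(True)
    index = find_def(lines, "spam")
    print("index", index, "line number", index + 1)
    print(repr(lines[index]))
```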

Which way makes more sense?  Should it be changed?

> - I notice that currentframe() still uses the try/except trick to get
>   the frame object.  It's much more efficient to provide a C
>   trampoline for getting that information.

Sure, if there's a faster way, that's fine.  It just wasn't
something i expected to be used really often, and i wanted to
write the module in pure Python so it could be easily maintained.

I added a line to clobber the pure-Python currentframe() with
sys._getframe() if it exists.

> - If this were included in the library, we might want to 2.0-ify it.

It currently doesn't rely on any 2.0 features, and it would be
kind of nice to have it still work with 1.5 (especially if it is
part of a drop-in documentation tool, as it is now, since it goes
with htmldoc).


-- ?!ng

"Computers are useless.  They can only give you answers."
    -- Pablo Picasso




From guido@python.org  Wed Jan  3 12:06:33 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:06:33 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>

Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
widespread that is -- do Linux developers pay attention to this
standard at all?  According to the webpage it's (c) 1997.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 03 Jan 2001 10:58:44 +0200
From:    Erno Kuusela <erno@iki.fi>
To:      guido@python.org
Subject: getc_unlocked note

hello,

i was reading the python-dev archives and saw that someone had noticed
my getline/getc_unlocked post from the newsgroup. a correction to the
python-dev thread: getc_unlocked and friends are in fact standard (not c99
though since c99 doesn't specify threads); they are part of the single
unix specification.

link:
http://www.opennc.org/onlinepubs/007908799/xsh/getc_unlocked.html

   -- erno

------- End of Forwarded Message



From guido@python.org  Wed Jan  3 12:37:11 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:37:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 04:32:55 EST."
 <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com>
Message-ID: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>

> 1. That Perl still takes only 6 seconds in line-at-a-time mode.

Are you sure Perl still uses stdio at all?

If so, does it open the file in binary or in text mode?  Based on the
APIs in MS's libc, I presume that the crlf->lf translation is not done
by stdio proper but by the Unix I/O emulation just underneath it
(open() has an O_BINARY option flag, so read() probably does the
translation).  That comes down to copying most bytes an extra time.

(To test this hypothesis, you could try to open the test file with
mode "rb" and see if it makes a difference.)
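
One way to run that experiment from Python, as a sketch: time the same
readline() loop with the file opened in text and in binary mode.  The helper
names are made up for this example.  Note that on a current Python, text mode
also decodes bytes to str, so the difference measured is not purely the
crlf->lf translation.

```python
import os
import tempfile
import time


def time_readline_loop(path, mode):
    """Return (seconds, nlines) for a plain readline() loop in `mode`."""
    nlines = 0
    start = time.perf_counter()
    with open(path, mode) as f:
        while True:
            line = f.readline()
            if not line:
                break
            nlines += 1
    return time.perf_counter() - start, nlines


def compare_modes(path):
    """Run the loop once in text mode and once in binary mode."""
    return {mode: time_readline_loop(path, mode) for mode in ("r", "rb")}


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"some text\r\n" * 100000)
        for mode, (secs, n) in compare_modes(path).items():
            print("%-2s %8.3f s  %d lines" % (mode, secs, n))
    finally:
        os.unlink(path)
```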

> 2. I originally wrote a getline workalike, instead of building
> directly into a PyString buffer.  That made my test run *slower*,
> and I'm talking factor of 2, not a yawn.  To judge from my usually
> silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
> mallocs required may have triggered the horrid Win9x
> malloc-thrashing problem I wrote about while I was still at Dragon.
> Consider that another vote for Vlad's PyMalloc -- we've got no
> handle on x-platform dynamic memory behavior now.  Python's destiny
> is to replace both the platform OS and libc anyway <0.9 wink>.
>
> The scary parts:
>
> + As the "XXX" comments indicate, this is full of little
> insecurities.

My biggest worry: thread-safety.  There must be a way to lock the file
(you indicated that fgets() uses it).

> + Another one I just thought of: if the user's last operations on
> the fp were two or more consecutive ungetc calls, all bets are off.
> But then MS doesn't define what happens then either.

Python doesn't have an interface to ungetc(), and I believe the stdio
standard says you can only call ungetc() once consecutively.  Assuming
other C code linked with Python obeys this rule (a pretty safe
assumption), we should be fine.  And if the assumption is violated, I
presume it's really that C code's fault -- plus, code that only
uses getc() would be screwed just as badly.

> + This is much less ambitious than I recall Perl's code being: it
> doesn't try to guess anything about the file, and effectively
> captures only what would happen if you could unroll the guts of a
> getc-in-a-loop and optimize the snot out of it.  The good news is
> that this means it's much easier to maintain (it touches only two of
> the MS FILE* fields, and in ways that are pretty obviously correct).
> The bad news is that this seems also pretty clearly all there *is*
> to be gotten out of breaking into the FILE* abstraction for the
> particular test case I'm using; and increasing TUNEME doesn't save
> any time at all: the sucker is flying at full speed already.

You probably don't have many lines longer than 1000 characters.

> + It (line-at-a-time) drops to a little under 13 seconds if I
> comment out the thread macros.

If you mean the Py_BLOCK_THREADS around the resize, that can be safely
dropped.  (If/when we introduce Vladimir's malloc, we'll have to
decide whether it is threadsafe by itself or whether it requires the
global interpreter lock.  I vote to make it threadsafe by itself.)

> + I haven't looked at Perl's implementation in a year, and they must
> have dreamt up another trick since then.  That's a "scary part"
> indeed to anyone who has ever looked at Perl's implementation.
>
> retreating-into-a-fetal-position-ly y'rs - tim
> 
> 
> Anyone wants to play, the sandbox is fileobject.c.  Do two things:
> insert this new chunk somewhere above get_line:
> 
> #ifdef MS_WIN32
> static PyObject*
> win32_getline(FILE *fp)
> {
> 	/* XXX ignores thread safety -- but so does MS's getc macro! */
> 	PyObject* v;
> 	char* pBuf;	/* next free slot in v's buffer */
> 	/* MS's internals are declared in terms of ints, but it's a sure bet
> 	 * that won't last forever -- use size_t now & live w/ the casting;
> 	 * ditto for Python's routines
> 	 */
> 	size_t total_buf_size = 100;
> 	size_t free_buf_size = total_buf_size;
> #define TUNEME 1000	/* how much to boost the string buffer when exhausted */
> 
> 	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
> 	if (v == NULL)
> 		return NULL;
> 	pBuf = BUF(v);
> 	Py_BEGIN_ALLOW_THREADS
> 	for (;;) {
> 		int ch;	/* int, not char: getc() returns int, so EOF stays distinct from byte 0xff */
> 		size_t ms_cnt;	/* FILE->_cnt shadow */
> 		char* ms_ptr;	/* FILE->_ptr shadow */
> 		size_t max_to_copy, i;
> 		/* stdio buffer empty or in unknown state; rather
> 		 * than try to simulate every quirk of MS's internals,
> 		 * let the MS macros deal with it.
> 		 */
> 		/* XXX we also wind up here when we simply run out of string
> 		 * XXX buffer space, but I'm not sure I care:  making this a
> 		 * XXX double-nested loop doesn't seem worth it
> 		 */
> 		ch = getc(fp);
> 		if (ch == EOF)
> 			break;
> 		/* make sure we've got some breathing room */
> 		if (free_buf_size < 100) {
> 			size_t currentoffset = pBuf - BUF(v);
> 			total_buf_size += TUNEME;  /* XXX check for overflow */
> 			Py_BLOCK_THREADS
> 			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
> 				return NULL;
> 			Py_UNBLOCK_THREADS
> 			pBuf = BUF(v) + currentoffset;
> 			free_buf_size = TUNEME;
> 		}
> 		/* ch wasn't EOF, so store it */
> 		*pBuf++ = ch;
> 		--free_buf_size;
> 		if (ch == '\n') {
> 			break;
> 		}
> 		ms_cnt = (size_t)fp->_cnt;
> 		if (!ms_cnt) {
> 			/* XXX this is a slow way to read one character at
> 			 * XXX a time if, e.g., the stream is unbuffered
> 			 */
> 			continue;
> 		}
> 		/* payback!  now we don't have to check for buffer overflows or
> 		 * EOF inside the loop, nor does the macro _filbuf() branch force
> 		 *  _ptr and _cnt in and out of memory on each iteration
> 		 */
> 		ms_ptr = fp->_ptr;
> 		assert(ms_cnt > 0);
> 		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;

Doesn't it make more sense to delay the resize until this point?  I
don't know how much the character copying accounts for, but I could
imagine a strategy based on memchr() and memcpy() that first searches
for a \n, and if found, allocates to the right size before copying.
Typically, the buffer contains many lines, so this could be optimized
into requiring a single exactly-sized malloc() call in the common case
(where the buffer doesn't wrap).  But possibly scanning the buffer for
\n and then copying the bytes separately, even with memchr() and
memcpy(), slows things down too much for this to be faster.
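
The search-then-copy strategy can be modelled at the Python level, as a
sketch only: bytes.find() plays the role of memchr(), and taking a slice
plays the role of one exactly-sized allocation plus memcpy().  The class name
and the tiny buffer size are assumptions for the demo; the real thing would
work on the stdio buffer in C.

```python
import io


class BufferedLineReader:
    """Toy model: scan the buffer for a newline, then copy exactly once."""

    def __init__(self, raw, bufsize=4096):
        self.raw = raw
        self.bufsize = bufsize
        self.buf = b""
        self.pos = 0

    def readline(self):
        pieces = []
        while True:
            if self.pos >= len(self.buf):
                self.buf = self.raw.read(self.bufsize)   # refill
                self.pos = 0
                if not self.buf:
                    break                                # EOF
            nl = self.buf.find(b"\n", self.pos)          # memchr() analogue
            if nl >= 0:
                # Common case: the whole line is in the buffer.
                pieces.append(self.buf[self.pos:nl + 1])
                self.pos = nl + 1
                break
            # The line continues past the end of the buffer.
            pieces.append(self.buf[self.pos:])
            self.pos = len(self.buf)
        return b"".join(pieces)


if __name__ == "__main__":
    reader = BufferedLineReader(io.BytesIO(b"ab\ncdef\ng"), bufsize=4)
    print(reader.readline(), reader.readline(), reader.readline())
```

When the newline is found in the current buffer there is a single copy of the
right size; only lines that straddle a refill need joining.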

> 		do {
> 			/* XXX unclear to me why MS's getc macro does "& 0xff" */
> 			*pBuf++ = ch = *ms_ptr++ & 0xff;

I know why.  getchar() returns an int in the range [-1, 255].  If
chars are signed the &0xff is needed else you would get a return in
the range [-128, 127] and -1 would be ambiguous (EOF==-1).  Not sure
if they *are* signed on any MS platform -- if they aren't, whoever
coded this wasn't thinking -- on the other hand the compiler probably
optimizes it out.  But here since you're copying to another character,
it's pointless.
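
The ambiguity can be shown with a few lines of arithmetic, using the struct
module to reinterpret a byte as a signed char.  The helper name is made up,
and EOF is taken to be -1 as in the paragraph above.

```python
import struct

EOF = -1


def as_signed_char(byte):
    """Reinterpret an integer 0..255 the way a signed 8-bit char holds it."""
    return struct.unpack("b", bytes([byte]))[0]


if __name__ == "__main__":
    signed = as_signed_char(0xFF)
    print("byte 0xff as signed char:", signed)     # collides with EOF
    print("after & 0xff:", signed & 0xFF)          # back in 0..255
```

Without the mask a data byte of 0xff would read back equal to EOF; with it,
every data byte lands in 0..255 and only a real EOF is negative.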

> 		} while (--i && ch != '\n');
> 		/* update the shadows & counters */
> 		fp->_ptr = ms_ptr;
> 		free_buf_size -= max_to_copy - i;
> 		fp->_cnt = ms_cnt - (max_to_copy - i);
> 		if (ch == '\n')
> 			break;
> 	}
> 	Py_END_ALLOW_THREADS
> 	_PyString_Resize(&v, pBuf - BUF(v));
> 	return v;
> }
> #endif
> 
> 2. Within get_line, add this before the #endif (this is the getline #if block):
> 
> #elif defined(MS_WIN32)
> 	if (n == 0) {
> 		return win32_getline(fp);
> 	}

Note that get_line() with negative n could be implemented as
get_line(0) with some post processing.  This should be done completely
separately, in PyFile_GetLine.  The negative n case is only used by
raw_input() -- it means strip the \n and raise EOFError for EOF, and I
expect that this is rarely if ever used in a speed-conscious
situation.
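
The post-processing described above amounts to very little code.  Here is a
Python-level sketch of it, with f.readline() standing in for get_line(f, 0);
the function name is hypothetical.

```python
import io


def get_line_stripped(f):
    """Read one line; strip the trailing newline; raise EOFError at EOF."""
    line = f.readline()               # stands in for get_line(f, 0)
    if not line:
        raise EOFError("EOF when reading a line")
    if line.endswith("\n"):
        line = line[:-1]
    return line


if __name__ == "__main__":
    f = io.StringIO("hello\nlast")
    print(get_line_stripped(f), get_line_stripped(f))
```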

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jan  3 14:56:31 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 09:56:31 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 07:06:33 EST."
 <200101031206.HAA19182@cj20424-a.reston1.va.home.com>
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>
Message-ID: <200101031456.JAA19990@cj20424-a.reston1.va.home.com>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?  According to the webpage it's (c) 1997.

Erno Kuusela gave me some more info about this; glibc supports it.

I did a quick test which suggests that it is a lot faster than regular
getc() -- on a small test file it's actually faster than GNU
getline(), even with the proper flockfile() / funlockfile() calls.
(The test file was 6Mb -- 10 copies of /etc/termcap, which has short
lines -- avg 43 chars.)

This together with Tim's Win32-specific hacks might be the best we
can do for get_line().  However, raw xreadlines is still almost twice
as fast, so it's still under consideration.

Maybe MS supports a similar unlocked getc macro, and a separate
primitive to lock/unlock a file?  That would allow more unified code.

(Quick research shows that it exists, but only in internal form.  We
could probably call _lock_file() and _unlock_file(), and define our
own getc_lk(), protected by the proper set of macros.  This could all
be presented by config.h as flockfile(), funlockfile(), and
getc_unlocked() macros.)
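
The idea behind the flockfile() / getc_unlocked() / funlockfile() trio is to
take the stream lock once per line instead of once per character.  A
Python-level analogue, purely to show the shape of it: the class and method
names are invented for this sketch, and a threading.RLock stands in for the
per-FILE lock.

```python
import io
import threading


class LockedStream:
    """Toy stream with a per-stream lock, modelled on stdio's FILE lock."""

    def __init__(self, raw):
        self.raw = raw
        self.lock = threading.RLock()

    def getc(self):
        """Like getc(): lock, read one character, unlock."""
        with self.lock:
            return self.getc_unlocked()

    def getc_unlocked(self):
        """Like getc_unlocked(): the caller must already hold the lock."""
        return self.raw.read(1)

    def readline(self):
        chars = []
        with self.lock:               # flockfile() ... funlockfile()
            while True:
                ch = self.getc_unlocked()
                if not ch:            # EOF
                    break
                chars.append(ch)
                if ch == "\n":
                    break
        return "".join(chars)


if __name__ == "__main__":
    stream = LockedStream(io.StringIO("ab\nc"))
    print(repr(stream.readline()), repr(stream.readline()))
```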

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Wed Jan  3 15:27:09 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:27:09 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:06:33AM -0500
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>
Message-ID: <20010103102709.A19451@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 07:06:33AM -0500, Guido van Rossum wrote:
>Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
>widespread that is -- do Linux developers pay attention to this
>standard at all?  According to the webpage it's (c) 1997.

It seems to be in glibc 2.1, but I don't know how much it would help,
and the added complexity of having to lock the file separately worries
me, perhaps due to a superstitious fear of angering the Thread Gods.

--amk


From akuchlin@mems-exchange.org  Wed Jan  3 15:44:57 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:44:57 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Wed, Jan 03, 2001 at 04:35:10PM +0100
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>
Message-ID: <20010103104457.A19493@kronos.cnri.reston.va.us>

[Cc'ing to python-dev].  

On Wed, Jan 03, 2001 at 04:35:10PM +0100, Thomas Heller wrote:
>You didn't expect this script run under windows?
>(It does not run)

It shouldn't matter, I think, since the makesetup stuff doesn't run on
Windows either; presumably the compiled-in modules are specified by an
MSVC project file, or something similar.  Can anyone confirm that I
don't care if setup.py works on Windows?  (Well, I *know* for a fact I
don't care; but should I? :) )

--amk



From guido@python.org  Wed Jan  3 15:49:43 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 10:49:43 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: Your message of "Wed, 03 Jan 2001 10:44:57 EST."
 <20010103104457.A19493@kronos.cnri.reston.va.us>
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>
 <20010103104457.A19493@kronos.cnri.reston.va.us>
Message-ID: <200101031549.KAA20188@cj20424-a.reston1.va.home.com>

> It shouldn't matter, I think, since the makesetup stuff doesn't run on
> Windows either; presumably the compiled-in modules are specified by an
> MSVC project file, or something similar.  Can anyone confirm that I
> don't care if setup.py works on Windows?  (Well, I *know* for a fact I
> don't care; but should I? :) )

Personally, I don't think it's worth making setup.py work for
Windows.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Wed Jan  3 20:04:07 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 15:04:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Wed, Jan 03, 2001 at 08:47:30AM -0800
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>
Message-ID: <20010103150407.D20301@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 08:47:30AM -0800, GvR wrote:
>Summary: speed up readline() using getc_unlocked()

So what does the performance of this version look like?

--amk


From guido@python.org  Wed Jan  3 20:25:53 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:25:53 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:04:07 EST."
 <20010103150407.D20301@kronos.cnri.reston.va.us>
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>
 <20010103150407.D20301@kronos.cnri.reston.va.us>
Message-ID: <200101032025.PAA27457@cj20424-a.reston1.va.home.com>

> >Summary: speed up readline() using getc_unlocked()
> 
> So what does the performance of this version look like?

Very slightly faster than the GNU getline() version.  Without GNU
getline, the old code was about 3.5 times slower.

Here are the current times on a 6 Mb file (fileinput.py has my
sourceforge speedup patch too):

$ ./python ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.943  0.930
readlines_sizehint    0.544  0.540
using_fileinput       2.089  2.090
while_readline        0.956  0.960

For comparison, here's what Python 1.5.2 does with the same test
(which should be pretty close to what the released Python 2.0 does; I
don't have a copy of that handy).

$ python1.5 ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.836  0.820
readlines_sizehint    0.523  0.520
using_fileinput       5.739  5.740
while_readline        3.670  3.670

I don't know why count_chars_lines slowed down proportionally more than
readlines_sizehint.  (The += operator didn't make a difference either
way.)
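
For reference, the per-test ratios between the two runs above (Python 1.5.2
versus the patched build), computed from the realtime columns.  This is plain
arithmetic on the posted numbers; the dictionary names are just for this
sketch.

```python
# Realtime seconds from the two runs on ~/termcapx10 shown above.
PY152 = {"count_chars_lines": 0.836, "readlines_sizehint": 0.523,
         "using_fileinput": 5.739, "while_readline": 3.670}
PATCHED = {"count_chars_lines": 0.943, "readlines_sizehint": 0.544,
           "using_fileinput": 2.089, "while_readline": 0.956}


def speedup(test):
    """Old time divided by new time; above 1.0 means the new build is faster."""
    return PY152[test] / PATCHED[test]


if __name__ == "__main__":
    for test in PY152:
        print("%-20s %.2fx" % (test, speedup(test)))
```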

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jan  3 20:45:38 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:45:38 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:25:53 EST."
 <200101032025.PAA27457@cj20424-a.reston1.va.home.com>
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us>
 <200101032025.PAA27457@cj20424-a.reston1.va.home.com>
Message-ID: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>

I should add that the patches are on SourceForge:

fileinput.py:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103081&group_id=5470

fileobject.c:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103082&group_id=5470

I'm ready to check these in, but I'm waiting 24 hours in case there's
something I've missed.  (I haven't actually tested these on any other
platform besides Linux.)

Jeff Epler's xreadlines patch is here:
http://sourceforge.net/patch/?func=detailpatch&patch_id=102915&group_id=5470

Note that Jeff's patch includes a patch to fileinput.py that does the
same thing as mine but using his xreadlines module instead of directly
using readlines(sizehint) as does mine.  I like my approach better,
mostly because it reduces dependencies.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Wed Jan  3 21:25:30 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 16:25:30 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 03:45:38PM -0500
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>
Message-ID: <20010103162530.A20433@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 03:45:38PM -0500, Guido van Rossum wrote:
>I'm ready to check these in, but I'm waiting 24 hours in case there's
>something I've missed.  (I haven't actually tested these on any other
>platform besides Linux.)

On Solaris 2.6, the configure script doesn't detect that
getc_unlocked() & friends are supported; details available from the
patch.  After editing config.h manually to enable them, the results are:

Before getc_unlocked patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.892  0.730
readlines_sizehint    0.329  0.300
using_fileinput       4.612  4.470
while_readline        2.739  2.670

After patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.698  0.680
readlines_sizehint    0.273  0.270
using_fileinput       2.707  2.700
while_readline        0.778  0.780

With a patched version of fileinput.py:
using_fileinput       1.675  1.680

--amk


From guido@python.org  Wed Jan  3 21:36:07 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 16:36:07 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 16:25:30 EST."
 <20010103162530.A20433@kronos.cnri.reston.va.us>
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>
 <20010103162530.A20433@kronos.cnri.reston.va.us>
Message-ID: <200101032136.QAA07752@cj20424-a.reston1.va.home.com>

> On Solaris 2.6, the configure script doesn't detect that
> getc_unlocked() & friends are supported; details available from the
> patch.

(Fixed now, see the new patch.)

> After editing config.h manually to enable them, the results are:
> 
> Before getc_unlocked patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.892  0.730
> readlines_sizehint    0.329  0.300
> using_fileinput       4.612  4.470
> while_readline        2.739  2.670
> 
> After patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.698  0.680
> readlines_sizehint    0.273  0.270
> using_fileinput       2.707  2.700
> while_readline        0.778  0.780
> 
> With a patched version of fileinput.py:
> using_fileinput       1.675  1.680

Thanks!  The bottom line seems to be that your basic readline loop is
still 3x as slow as the fastest way -- so there's still a lot to say
for xreadlines...
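
The "3x" figure can be checked against the Solaris timings quoted above.
The arithmetic below uses only the first timing column; the variable
names are just for this example.

```python
# First timing column from the Solaris 2.6 results quoted above (seconds).
before = {"readlines_sizehint": 0.329, "while_readline": 2.739}
after = {"readlines_sizehint": 0.273, "while_readline": 0.778}

# How much the patch sped up the plain readline() loop.
speedup = before["while_readline"] / after["while_readline"]

# How far the readline() loop trails readlines(sizehint), before and after.
gap_before = before["while_readline"] / before["readlines_sizehint"]
gap_after = after["while_readline"] / after["readlines_sizehint"]

print(f"readline loop speedup from patch: {speedup:.2f}x")
print(f"gap to readlines(sizehint): {gap_before:.2f}x before, "
      f"{gap_after:.2f}x after")
```

So the patch makes the readline loop roughly 3.5 times faster on that
box, and the loop still takes a little under 3 times as long as
readlines(sizehint).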

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Jan  3 21:42:48 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jan 2001 22:42:48 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib codecs.py,1.13,1.14
References: <E14DvT9-00079N-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A539CD8.367361B8@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv26608/Lib
> 
> Modified Files:
>         codecs.py
> Log Message:
> ...
> 
> This patch closes the bugs #116285 and #119960.

I was too fast... the subject line of #119960 was misleading.
It is still open.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Wed Jan  3 23:13:15 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 3 Jan 2001 18:13:15 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

Pretty sure, but there are so many layers of macros the code is
undecipherable, and I can't step thru macros in the debugger either (that's
assuming I wanted to devote N hours to building Perl from source too --
which I don't).  Perl also makes heavy use of macroizing std library names,
so e.g. when I see "fopen" (which I do!), that doesn't mean I'm getting the
fopen I'm thinking of.  But the MSVC config files define all sorts of macros
to get at the MS stdio _cnt and _ptr (and most other) FILE* fields, and the
version of fopen in the Win32 stuff appears to defer to the platform fopen
(after doing Perlish stuff, like if someone passed "/dev/null" as the file
name, Perl changes it to "NUL").

This is what it's like:  the first line of Perl's win32_fopen is this:

    dTHXo;

That's conditionally defined in perl.h, either as

#define dTHXo			dTHXoa(PERL_GET_THX)

or, if pTHXo is not defined, as

#  define dTHXo		dTHX

dTHX in turn is #defined in 4 different places across 3 different files in 2
different directories.  I'll skip those.  OTOH, dTHXoa is easy!  It's only
defined once:

#define dTHXoa(a)		pTHXo = a

Ah, *that* clears it up <wink>.  Etc.  20 years ago I may have thought this
was fun.  I thought debugging large systems of m4 macros was fun then, and
I'm not sure this is either better or worse than that -- well, it's worse,
because I understood m4's implementation.


> If so, does it open the file in binary or in text mode?

Sorry, but I really don't know and it's a pit to pursue.  If it's not native
text mode, they do a good job of faking it (e.g., Ctrl-Z acts like an EOF
when reading a text file from Perl on Windows -- not something even Larry
would be likely to do on his own <wink>).

> Based on the APIs in MS's libc, I presume that the crlf->lf
> translation is not done by stdio proper but by the Unix I/O
> emulation just underneath it (open() has an O_BINARY option
> flag, so read() probably does the translation).

Yes; and late in the last release cycle, import.c's open_exclusive had a
Windows bug related to this (fdopen() used "wb", but the earlier open()
didn't use O_BINARY, and fdopen *acted* like it had used "w").  Also, the MS
setmode() function works on file handles, not streams.

> That comes down to copying most bytes an extra time.

Understood.  But the CRLF are stored physically on disk, so unless the disk
controller is converting them, *someone's* software (whether MS's or Perl's)
is doing it.  By the time Perl is doing its fast line-input stuff, and doing
what sure looks like a straight copy out of an IO buffer, it's clear from
the code that CRLF has already been translated to LF.

> (To test this hypothesis, you could try to open the test file
> with mode "rb" and see if it makes a difference.)

In Python, that saved about 10% (but got the wrong answers <wink>).  In
Perl, about 15-20%.  But I don't think that tells us who's doing the
translation.  Assuming that the translation takes about the same total time
for each, it makes sense that the percentage would be higher for Perl (since
its total runtime is lower:  same-sized slice of a smaller pie).
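
A small sketch of why "rb" gives different answers on a CRLF file.  It
uses modern Python 3, where passing newline=None to open() requests
universal-newline translation on every platform; the 2001 behaviour
came from the C library's text mode instead, so treat this as an
analogy.  The file contents and the helper name are invented for the
example.

```python
import os
import tempfile

# Write a small file with CRLF line endings, as a Windows text file has.
data = b"first line\r\nsecond line\r\n"
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(data)


def count_chars_lines(path, mode):
    """Return (chars, lines) for the file opened in the given mode."""
    chars = lines = 0
    # newline=None translates "\r\n" to "\n" in text mode; neither it
    # nor an encoding may be passed in binary mode.
    kwargs = {} if "b" in mode else {"newline": None, "encoding": "ascii"}
    with open(path, mode, **kwargs) as f:
        for line in f:
            chars += len(line)
            lines += 1
    return chars, lines


text_counts = count_chars_lines(path, "r")     # CRLF collapsed to LF
binary_counts = count_chars_lines(path, "rb")  # CR bytes still present
os.remove(path)
```

The line counts agree but the character counts do not, because binary
mode skips the translation work and leaves each CR in the data.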

> My biggest worry: thread-safety.  There must be a way to lock
> the file (you indicated that fgets() uses it).

Yes, via the unadvertised _lock_str and _unlock_str macros defined in MS
mtdll.h, which is not on the include path:

/*
 * This is an internal C runtime header file. It is used when building
 * the C runtimes only. It is not to be used as a public header file.
 */

The routines and macros it calls are also unadvertised.  After an hour of
thrashing I wasn't able to successfully link any code trying to call these
routines.  Doesn't mean it's impossible, does mean they're internal to MS
libc and aren't meant to be called by anything else.  That's why it's called
"cheating" <wink>.  Perl appears to ignore the whole issue (but Perl's
thread story is muddy at best).

[... ungetc ...]

Not worried here either.

> ...
> You probably don't have many lines longer than 1000 characters.

None, in fact.

>> + It (line-at-a-time) drops to a little under 13 seconds if I
>> comment out the thread macros.

> If you mean the Py_BLOCK_THREADS around the resize, that can be safely
> dropped.

I meant *all* thread-related macros -- was just trying to get a feel for how
much that fiddling cost (it's an expense Perl doesn't seem to have -- yet).
Was measurable but not substantial.  WRT the resize, there's now a "fast
path" that avoids it.

> (If/when we introduce Vladimir's malloc, we'll have to decide whether
> it is threadsafe by itself or whether it requires the global
> interpreter lock.  I vote to make it threadsafe by itself.)

As feared, this thread is going to consume my life <0.5 wink>.

> ...
> Doesn't it make more sense to delay the resize until this point?  I
> don't know how much the character copying accounts for, but I could
> imagine a strategy based on memchr() and memcpy() that first searches
> for a \n, and if found, allocates to the right size before copying.
> Typically, the buffer contains many lines, so this could be optimized
> into requiring a single exactly-sized malloc() call in the common case
> (where the buffer doesn't wrap).  But possibly scanning the buffer for
> \n and then copying the bytes separately, even with memchr() and
> memcpy(), slows things down too much for this to be faster.

Turns out that Perl does very much what I was doing; the Perl code is
actually more burdensome, because its routine is trying to deal not only
with \n-termination, but also arbitrary-string termination (Perl's Awk-like
input record separator), and "paragraph mode", and fixed-size reads, and
some other stuff I can't figure out from the macro names.  In all cases with
a terminator, though, it's doing the same business of both copying and
testing in a very tight inner loop.  It doesn't appear to make any serious
attempts to avoid resizing the buffer.  But, Perl has its own malloc
routines, and I'm guessing they're highly tuned for this stuff.

Since we're stuck with the MS malloc -- and Win9x's in particular seems
lame -- adding this near the start of my stuff did yield a nice speedup:

	if (fp->_cnt > 0 &&
	    (pBuf = (char *)memchr(fp->_ptr, '\n', fp->_cnt)) != NULL) {
	    	/* it's all in the buffer so don't bother releasing the
	    	 * global lock
	    	 */
		total_buf_size = pBuf - fp->_ptr + 1;
		v = PyString_FromStringAndSize(fp->_ptr,
			                       (int)total_buf_size);
		if (v != NULL) {
			pBuf = BUF(v) + total_buf_size;
			fp->_cnt -= total_buf_size;
			fp->_ptr += total_buf_size;
		}
		goto done;
	}

So that builds the result string directly from the stdio buffer when it can.
Times dropped from (before this particular small hack)

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

to

count_chars_lines    14.780 14.784
readlines_sizehint    9.550  9.514
using_fileinput      43.560 43.584
while_readline       10.600 10.578

Since I have no long lines in this test data, and the stdio buffer typically
contains thousands of chars, most calls should be satisfied by the fast
path.  Compared to the previous code, the fast path (1) avoids global lock
fiddling (but that didn't account for much in a distinct test); (2) crawls
over the buffer twice instead of once; and, (3) avoids one (shrinking!)
realloc.  So crawling over the buffer an extra time costs nothing compared
to the cost of a resize; and that's likely just more evidence that
malloc/realloc suck on this platform.
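
The shape of that fast path can be imitated in Python.  This is a toy
model rather than the fileobject.c code: the class name LineReader, its
attributes and the 4096-byte buffer size are all invented for the
sketch, with buf and pos standing in for the stdio buffer and fp->_ptr.

```python
import io


class LineReader:
    """Toy line reader over a binary stream (illustrative only)."""

    def __init__(self, raw, bufsize=4096):
        self.raw = raw
        self.bufsize = bufsize
        self.buf = b""      # plays the role of the stdio buffer
        self.pos = 0        # plays the role of fp->_ptr
        self.fast_hits = 0  # how often the fast path was taken

    def readline(self):
        # Fast path: the whole line is already buffered, so locate the
        # newline first and build the result with one exactly-sized copy.
        end = self.buf.find(b"\n", self.pos)
        if end != -1:
            line = self.buf[self.pos:end + 1]
            self.pos = end + 1
            self.fast_hits += 1
            return line
        # Slow path: collect pieces across buffer refills until a
        # newline or end of file turns up.
        pieces = [self.buf[self.pos:]]
        while True:
            self.buf = self.raw.read(self.bufsize)
            self.pos = 0
            if not self.buf:  # end of file
                break
            end = self.buf.find(b"\n")
            if end != -1:
                pieces.append(self.buf[:end + 1])
                self.pos = end + 1
                break
            pieces.append(self.buf)
        return b"".join(pieces)


reader = LineReader(io.BytesIO(b"short line\n" * 1000))
lines = []
while True:
    line = reader.readline()
    if not line:
        break
    lines.append(line)
```

With short lines and a buffer of a few thousand bytes, nearly every
call is satisfied by the fast path; only lines that straddle a refill
take the slow one.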

CAUTION:  no file locking is going on now (because I haven't found a way to
do it).  My previous claim that the MS getc macro did no locking was wrong,
as I discovered by stepping thru the generated machine code.  stdio.h
#defines getc without locking, but in _MT mode it later gets #undef'ed and
turned into a function call.

>> /* XXX unclear to me why MS's getc macro does "& 0xff" */
>>			*pBuf++ = ch = *ms_ptr++ & 0xff;

> I know why.  getchar() returns an int in the range [-1, 255].  If
> chars are signed the &0xff is needed else you would get a return in
> the range [-128, 127] and -1 would be ambiguous (EOF==-1).

Bingo -- MS chars are signed.
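
The ambiguity can be shown with Python's struct module standing in for
a C compiler whose plain char is signed.  The byte value 0xff is chosen
for the example because it is the one that collides with EOF.

```python
import struct

EOF = -1  # the value C's getc() returns at end of file

raw = b"\xff"  # a legal data byte, e.g. y-with-diaeresis in Latin-1

# Interpret the byte as a signed char, as a signed-char compiler would.
(as_signed,) = struct.unpack("b", raw)

# Masking with 0xff maps it back into the range 0..255.
masked = as_signed & 0xFF
```

Without the mask the data byte is indistinguishable from EOF; with it
the byte reads as 255.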

> ...
> But here since you're copying to another character, it's pointless.

Yup!  Gone.

> ....
> Note that get_line() with negative n could be implemented as
> get_line(0) with some post processing.

Andrew's glibc getline code appears to have wanted to do that, but looks to
me like it's unreachable (unless I'm hallucinating, the "n < 0" test after
return from glibc getline can't succeed, because the enclosing block is
guarded by an "n==0" test).

> This should be done completely separately, in PyFile_GetLine.

I assume you have an editor <wink>.

> The negative n case is only used by raw_input() -- it means strip
> the \n and raise EOFError for EOF, and I expect that this is rarely
> if ever used in a speed-conscious situation.

I've never seen raw_input used except when stdin and stdout were connected
to a tty.  When I tried raw_input from a DOS box under the debugger, it
never called get_line.  Something trickier is going on there; I suspect it's
actually calling fgets (eventually) instead in that case.

more-mysteries-than-i-really-need-ly y'rs  - tim



From jeremy@alum.mit.edu  Thu Jan  4 00:06:58 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 3 Jan 2001 19:06:58 -0500 (EST)
Subject: [Python-Dev] Mailman problems?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
Message-ID: <14931.48802.273143.209933@localhost.localdomain>

Tim & Barry,

It looks like there is some problem with Mailman that is garbling
messages to python-dev.  It may only affect lines that begin with a
tab; not sure.

Your most recent message came through with the following line

>    dTHXo;

(This was not the only example.)

I think this was supposed to be a line of C code, but whatever
meaningful contents it had were rendered as gobbledygook.

Jeremy


    
    


From loewis@informatik.hu-berlin.de  Thu Jan  4 00:13:16 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 4 Jan 2001 01:13:16 +0100 (MET)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?

Ulrich Drepper, who is in charge of glibc, is always interested in
following Single Unix to the letter; getc_unlocked is supported
at least since glibc 2.0.

http://www.sun.com/smcc/solaris-migration/docs/courses/threadsHTML/adv.html

claims that getc_unlocked is already in POSIX.1c; Solaris apparently
supports it at least since Solaris 2.4.

Irix has it since 6.5, Tru64 at least since 4.0d (probably much
longer); HPUX since 11.0, AIX at least since 4.3.

Of the BSDs, only OpenBSD appears to support it; it knows that it is
in ANSI 1003.1 since 1996-07-12.

SCO OpenServer doesn't support it.

Regards,
Martin


From fredrik@effbot.org  Thu Jan  4 00:20:41 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 01:20:41 +0100
Subject: [Python-Dev] Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com><LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com> <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <011901c075e4$2ce96360$e46940d5@hagrid>

> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
> 
> >    dTHXo;
> 
> (This was not the only example.)
> 
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

also looks like Mailman removed all smileys from
Jeremy's post ;-)

</F>



From thomas@xs4all.net  Thu Jan  4 00:27:54 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 01:27:54 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Thu, Jan 04, 2001 at 01:13:16AM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>
Message-ID: <20010104012753.D2467@xs4all.nl>

On Thu, Jan 04, 2001 at 01:13:16AM +0100, Martin von Loewis wrote:

> Of the BSDs, only OpenBSD appears to support it; it knows that it is
> in ANSI 1003.1 since 1996-07-12.

BSDI supports getc_unlocked() at least since BSDI 3.1. I don't have any
older boxes to check, but the manpage for getc and all its friends carries
the timestamp 'June 4, 1993', which implies it could have been available a
lot longer. (Note that BSD was once known to *define* the standard ;-)

I concur that FreeBSD does not currently support getc_unlocked, but since
BSDI and FreeBSD are merging, I suspect it will, soonish.

In other words: use it! :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From barry@wooz.org  Thu Jan  4 02:59:01 2001
From: barry@wooz.org (Barry A. Warsaw)
Date: Wed, 3 Jan 2001 21:59:01 -0500
Subject: [Python-Dev] Re: Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
 <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <14931.59125.391596.730296@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

    JH> It looks like there is some problem with Mailman that is
    JH> garbling messages to python-dev.  It may only affect lines
    JH> that begin with a tab; not sure.

    JH> Your most recent message came through with the following line

    >> dTHXo;

    JH> (This was not the only example.)

    JH> I think this was supposed to be a line of C code, but whatever
    JH> meaningful contents it had were rendered as gobbledygook.

Oh shoot, my bad.  I dropped in an experimental Perl filter module in
the delivery pipeline.  It's been so long since I hacked Perl, I think
I meant to write $%_-> when I really wrote %$_->

-Barry



From tim.one@home.com  Thu Jan  4 04:26:51 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 3 Jan 2001 23:26:51 -0500
Subject: [Python-Dev] RE: Mailman problems?
In-Reply-To: <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>

[Jeremy]
> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
>
>>    dTHXo;
>
> (This was not the only example.)
>
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

I have no idea where that "o" came from!  It was supposed to be "o".  Barry,
fix it!

BTW, the second line of Perl implementation functions is usually a lot less
mysterious than the first.  If anyone wants the joy of reverse-engineering
Perl's supernaturally fast input, it's function Perl_sv_gets in file sv.c.
sv.c?  Yes!  The destination of a one-line input is a Scalar Value, hence,
sv.  I expect there's a similar method behind all of this stuff, but I never
stumbled into the key.  To get you started, here's the first line of
Perl_sv_gets:

    dTHR;

The line you're looking for is 119 lines down from that:

	    if ((*bp++ = *ptr++) == rslast)  /* really   |  dust */

the-comment-makes-more-sense-in-context<wink>-ly y'rs  - tim



From thomas@xs4all.net  Thu Jan  4 06:51:17 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 07:51:17 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040037.TAA08699@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:37:22PM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>
Message-ID: <20010104075116.J402@xs4all.nl>

On Wed, Jan 03, 2001 at 07:37:22PM -0500, Guido van Rossum wrote:
> > In other words: use it! :)
> 
> Mind doing a few platform tests on the (new version of the) patch?

Well, only a bit :) It's annoying that BSDI doesn't come with autoconf, but
I managed to use all my early-morning wit (it's 6:30AM <wink>) to work
around it. I've tested it on BSDI 4.1 and FreeBSD 4.2-RELEASE.

> I already know that it works on Red Hat Linux 6.2 (my box) and Solaris
> 2.6 (Andrew's box).  I would be delighted to know that it works on at
> least one other platform that has getc_unlocked() and one platform
> that doesn't have it!

Sorry, I have to disappoint you. FreeBSD does have getc_unlocked, they
just didn't document it. Hurrah for autoconf ;P Anyway, it worked like a
charm on BSDI:

(Python 2.0)
total 1794310 chars and 37660 lines
count_chars_lines     0.310  0.300
readlines_sizehint    0.150  0.150
using_fileinput       2.013  2.017
while_readline        1.006  1.000

(CVS Python + getc_unlocked)
daemon2:~/python/python/dist/src > ./python test.py termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.354  0.350
readlines_sizehint    0.182  0.183
using_fileinput       1.594  1.583
while_readline        0.363  0.367

But something weird is going on on FreeBSD:

(Standard CVS Python)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.265  0.266
readlines_sizehint    0.148  0.148
using_fileinput       0.943  0.938
while_readline        0.214  0.219

(CVS+getc_unlocked)
> ./python-getc-unlocked  ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.266  0.266
readlines_sizehint    0.151  0.141
using_fileinput       1.066  1.078
while_readline        0.283  0.281

This was sufficiently unexpected that I looked a bit further. The FreeBSD
Python was compiled without editing Modules/Setup, so it was statically
linked, no readline etc, but *with* threads (which are on by default, and
functional on both FreeBSD and BSDI 4.1.) Here's the timings after I enabled
just '*shared*':

(CVS + *shared*)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.276  0.273
readlines_sizehint    0.150  0.156
using_fileinput       0.902  0.898
while_readline        0.206  0.203

(This was not a fluke, I repeated it several times, getting hardly any
variation.) Enabling readline and cursesmodule had no additional effect.
Adding *shared* to the getc_unlocked tree saw roughly the same improvement,
but was still slower than without getc_unlocked.

(CVS + *shared* + getc_unlocked)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.272  0.273
readlines_sizehint    0.149  0.148
using_fileinput       1.031  1.031
while_readline        0.267  0.266

Increasing the size of the testfile didn't change anything, other than the
absolute numbers. I browsed stdio.h, where both getc() and getc_unlocked()
are defined as macros. getc_unlocked is defined as:

#define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))
#define getc_unlocked(fp)       __sgetc(fp)

and getc either as

#define getc(fp)        getc_unlocked(fp)
(without threads) or

static __inline int                     \
__getc_locked(FILE *_fp)                \
{                                       \
        extern int __isthreaded;        \
        int _ret;                       \
        if (__isthreaded)               \
                _FLOCKFILE(_fp);        \
        _ret = getc_unlocked(_fp);      \
        if (__isthreaded)               \
                funlockfile(_fp);       \
        return (_ret);                  \
}
#define getc(fp)        __getc_locked(fp)

_FLOCKFILE(x) is defined as flockfile(x), so that isn't the difference. The
speed difference has to be in the quick-and-easy test for whether the
locking is even necessary. Starting a thread on 'time.sleep(900)' in test.py
shows these numbers:

(standard CVS python)
> ./python-shared-std ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.433  0.445
readlines_sizehint    0.204  0.188
using_fileinput       1.595  1.594
while_readline        0.456  0.453

(getc_unlocked)
> ./python-getc-unlocked-shared ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.441  0.453
readlines_sizehint    0.206  0.195
using_fileinput       1.677  1.688
while_readline        0.509  0.508

So... using getc_unlocked manually for performance reasons isn't a cardinal
sin on FreeBSD only if you are really using threads :-)

Lets-outsmart-the-OS-scheduler-next!-ly y'rs
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Thu Jan  4 07:57:26 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 08:57:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test/output test_coercion,1.2,1.3
In-Reply-To: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>; from nascheme@users.sourceforge.net on Wed, Jan 03, 2001 at 05:36:27PM -0800
References: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010104085726.E2467@xs4all.nl>

On Wed, Jan 03, 2001 at 05:36:27PM -0800, Neil Schemenauer wrote:
> Update of /cvsroot/python/python/dist/src/Lib/test/output
> In directory usw-pr-cvs1:/tmp/cvs-serv21710/Lib/test/output
> 
> Modified Files:
> 	test_coercion 
> Log Message:
> Sequence repeat works now for in-place multiply with an integer type
> as the left operand.  I don't know if this is a feature or a bug.

> ! 2 *= [1] => [1, 1]

It's a feature.

x = 2 * [1]

works, so

x = 2
x *= [1]

does, too. Obviously, '2 *= [1]' shouldn't, but I'm assuming you don't
actually execute that (it should give a SyntaxError)
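
The three cases can be checked directly.  This uses a modern Python 3
interpreter, which may differ in detail from the CVS tree under
discussion; the variable names are just for the example.

```python
# Ordinary multiplication with the integer on the left.
x = 2 * [1]

# In-place multiplication where the left operand is a name bound to an
# int.  Ints define no __imul__, so this falls back to ordinary
# multiplication and rebinds y to the resulting list.
y = 2
y *= [1]

# A literal as the assignment target is rejected at compile time.
try:
    compile("2 *= [1]", "<example>", "exec")
    literal_target_ok = True
except SyntaxError:
    literal_target_ok = False
```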

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fredrik@effbot.org  Thu Jan  4 09:32:55 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 10:32:55 +0100
Subject: [Python-Dev] RE: Mailman problems?
References: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>
Message-ID: <00a701c07631$531983b0$e46940d5@hagrid>

tim wrote:
> I have no idea where that "o" came from!  It was supposed to be "o".
> Barry, fix it!

no need.  from the perlguts man page:

    "You can ignore [pad]THX[xo] when browsing the Perl
    headers/sources."

in-my-dictionary-perl's-an-american-physicist-ly yrs /F



From mal@lemburg.com  Thu Jan  4 10:02:35 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 11:02:35 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A544A3B.32B86792@lemburg.com>

Neil Schemenauer wrote:
> 
> Update of /cvsroot/python/python/dist/src/Include
> In directory usw-pr-cvs1:/tmp/cvs-serv21006/Include
> 
> Modified Files:
>         classobject.h
> Log Message:
> Remove PyInstance_*BinOp functions.
> 
> Index: classobject.h
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Include/classobject.h,v
> retrieving revision 2.33
> retrieving revision 2.34
> diff -C2 -r2.33 -r2.34
> *** classobject.h       2000/09/01 23:29:26     2.33
> --- classobject.h       2001/01/04 01:30:34     2.34
> ***************
> *** 60,71 ****
>   extern DL_IMPORT(int) PyClass_IsSubclass(PyObject *, PyObject *);
> 
> - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> -                                                 char *, char *,
> -                                                 PyObject * (*)(PyObject *,
> -                                                                PyObject *));
> -
> - extern DL_IMPORT(int)
> - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> -                       PyObject * (*)(PyObject *, PyObject *), int);

Wouldn't it be safer to provide emulation APIs for these ? There
might be code out there using these APIs.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@python.org  Thu Jan  4 14:06:53 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:06:53 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include classobject.h,2.33,2.34
In-Reply-To: Your message of "Thu, 04 Jan 2001 11:02:35 +0100."
 <3A544A3B.32B86792@lemburg.com>
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>
 <3A544A3B.32B86792@lemburg.com>
Message-ID: <200101041406.JAA11926@cj20424-a.reston1.va.home.com>

> > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > -                                                 char *, char *,
> > -                                                 PyObject * (*)(PyObject *,
> > -                                                                PyObject *));
> > -
> > - extern DL_IMPORT(int)
> > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > -                       PyObject * (*)(PyObject *, PyObject *), int);
> 
> Wouldn't it be safer to provide emulation APIs for these ? There
> might be code out there using these APIs.

No.  These were never intended to be part of the API (and it was a
mistake that they used DL_IMPORT()).  They had to be extern because
they were defined in one file and used in another.  I'm glad they're
gone.  They are so obscure that I'd be *very* surprised if anybody was
using them, and even more if they even *wanted* emulation under the
new scheme -- I'd expect them to eagerly convert their code to using
new-style numbers right away.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Jan  4 14:16:39 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:16:39 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 07:51:17 +0100."
 <20010104075116.J402@xs4all.nl>
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>
 <20010104075116.J402@xs4all.nl>
Message-ID: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>

[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

Thomas, I really don't understand it.  The getc() source code you
showed calls getc_unlocked().  So how can it be faster?  The answer
must be somewhere else...  Cache line conflicts, the rewriting of the
loop that I did, a compiler bug, the inlining, who knows.  Can you
compare the generated assembly code?  On other platforms,
getc_unlocked() typically speeds the readline() test case up by a
significant factor (as in your BSDI numbers, where it's almost 3x
faster).

Could it be that you're mistaken and that somehow getc_unlocked() is
*not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
is so different that the optimizer might have done something different
to it.  (Check config.h.  When all else fails, I put an #error in the
#ifdef branch that I expect not to be taken.)

Could it be that somehow getc_unlocked() is later defined to be the
same as getc(), so choosing it just adds the overhead of calling
f[un]lockfile() for each line?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jan  4 14:59:05 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 15:59:05 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 09:16:39AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>
Message-ID: <20010104155904.L402@xs4all.nl>

On Thu, Jan 04, 2001 at 09:16:39AM -0500, Guido van Rossum wrote:
> [Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

> Thomas, I really don't understand it.  The getc() source code you
> showed calls getc_unlocked().  So how can it be faster?  The answer
> must be somewhere else...  Cache line conflicts, the rewriting of the
> loop that I did, a compiler bug, the inlining, who knows.  Can you
> compare the generated assembly code?  On other platforms,
> getc_unlocked() typically speeds the readline() test case up by a
> significant factor (as in your BSDI numbers, where it's almost 3x
> faster).

Nono, reread my message, and your code. getc() isn't faster than
getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
etc.) Significantly so when there is only one thread running (which is still
the common case, for most systems, and FreeBSD's libc has easy inside
knowledge about) and marginally so when there is at least one other thread.
The small advantage in the multi-threaded case can be explained by the
rest of the changes. 

You see, I was comparing a patched tree versus a non-patched tree, not a
getc_unlocked() enabled one versus a disabled one, so I was measuring the
speed difference of the *patch*, not of the use of getc_unlocked() vs
getc(). Here is the speed difference of just the use of getc() vs
getc_unlocked() (same tree, hand-edited config.h) in a non-threaded
environment:

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.149  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.148  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211


As you see, no significant difference. Here is the difference in a threaded
environment (a second thread that does just 'time.sleep(900)'):

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.422
readlines_sizehint    0.200  0.211
using_fileinput       1.604  1.594
while_readline        0.465  0.461

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.430
readlines_sizehint    0.201  0.203
using_fileinput       1.600  1.602
while_readline        0.463  0.461

... where I have to note that the getc-disabled version's 'using_fileinput'
time fluctuates a lot more, mostly upwards, in the threaded environment. (I
see it jump to 1.609, 1.617 cputime, every few runs.) Still not a terribly
significant difference, but a hint that we, too, can use inside knowledge ;)

> Could it be that you're mistaken and that somehow getc_unlocked() is
> *not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
> is so different that the optimizer might have done something different
> to it.  (Check config.h.  When all else fails, I put an #error in the
> #ifdef branch that I expect not to be taken.)

Yah, #error is great for debugging, I use it a lot ;) But I'm sure of this.
FreeBSD's getc() is just craftily optimized. Note that if we can get
get_line using getc_unlocked() to run as fast as get_line using getc() on
FreeBSD, it should also benefit other platforms, because the only speed to
be had is in our own code :) Not that I'm saying it can be improved, just
that it apparently got slower, because of this patch. I can't be much help
doing any performance tuning, though, I've about used up my lunchhour and
I'm working late tonight ;P

Good-thing-my-boss-can't-tell-the-difference-between-Apache-and-Python-src-ly
	y'rs, 
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Thu Jan  4 15:27:28 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 10:27:28 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 15:59:05 +0100."
 <20010104155904.L402@xs4all.nl>
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>
 <20010104155904.L402@xs4all.nl>
Message-ID: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>

[Me & Thomas in violent agreement that there's something weird about
the speed of getc_unlocked() vs. getc() on FreeBSD.]

I just realized what's the probable cause.  Read your timing post
again:

# BSDI:
# 
# (Python 2.0)
# while_readline        1.006  1.000
# 
# (CVS Python + getc_unlocked)
# while_readline        0.363  0.367

# FreeBSD:
# 
# (Standard CVS Python)
# while_readline        0.214  0.219
# 
# (CVS+getc_unlocked)
# while_readline        0.283  0.281

Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
getline()!  So on FreeBSD, for this test case, GNU getline() is faster
than getc_unlocked().

So the question is, should I leave the GNU getline() code in?  I'm
inclined against it -- it's not that much faster, and on other
platforms getc_unlocked() is faster.  Given that getc_unlocked() is a
standard (of some sort) and GNU getline() is, well, just that, I'd say
let's stick with getc_unlocked().

(Unfortunately, from a phone conversation I had last night with Tim,
there's not much hope of doing something on Windows -- and that platform
sorely needs it!  The hacks that Tim reported earlier are definitely
not thread-safe.  While it's easy to come up with getc_unlocked() for
Windows, the locking operations used internally there by the /MT code
are not exported from MSVCRT.DLL, and that's crucial.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jan  4 15:31:39 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:31:39 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 10:27:28AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <20010104163139.M402@xs4all.nl>

On Thu, Jan 04, 2001 at 10:27:28AM -0500, Guido van Rossum wrote:
> [Me & Thomas in violent agreement that there's something weird about
> the speed of getc_unlocked() vs. getc() on FreeBSD.]

> I just realized what's the probable cause.  Read your timing post
> again:

> Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
> getline()!

Sorry, no go. You need two things to use getline(): getline() itself, and a
GNU libc. FreeBSD has neither. (And autoconf agrees with me.) If you *really
really* want me to, I can compile 2.0-standard on FreeBSD and show you. But
I'd rather not :)

Now go back and read my other mail about why FreeBSD is faster :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From akuchlin@mems-exchange.org  Thu Jan  4 15:43:15 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 10:43:15 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104155904.L402@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 03:59:05PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>
Message-ID: <20010104104315.C23803@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:
>getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
>the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
>etc.) Significantly so when there is only one thread running (which is still

So it looks like the ALLOW_THREADS should be moved out of the for
loop.  This produced no measurable performance difference on Solaris;
I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
unusually slow thread operation?

--amk


From thomas@xs4all.net  Thu Jan  4 15:59:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:59:25 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104104315.C23803@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 04, 2001 at 10:43:15AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us>
Message-ID: <20010104165925.G2467@xs4all.nl>

On Thu, Jan 04, 2001 at 10:43:15AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:

> >getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
> >the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
> >etc.) Significantly so when there is only one thread running (which is still

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

Note that I was just guessing there. I did a quick scan of the function, and
noticed that the ALLOW_THREADS statements had moved into the outer loop. I
didn't even contemplate whether that made a difference, so don't trust that
judgement.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From akuchlin@mems-exchange.org  Thu Jan  4 16:10:29 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 11:10:29 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104165925.G2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 04:59:25PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us> <20010104165925.G2467@xs4all.nl>
Message-ID: <20010104111029.A28510@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 04:59:25PM +0100, Thomas Wouters wrote:
>Note that I was just guessing there. I did a quick scan of the function, and
>noticed that the ALLOW_THREADS statements had moved into the outer loop. I
>didn't even contemplate whether that made a difference, so don't trust that
>judgement.

According to your benchmark, the performance of the threaded version
was the same whether or not getc_unlocked() was used, so it's not
that flockfile() is really slow.  I can't believe the compiler
optimized the old, ungainly loop better than the newer, tighter loop.
That leaves the ALLOW_THREADS as the most reasonable culprit.

--amk



From akuchlin@mems-exchange.org  Thu Jan  4 17:10:11 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 04 Jan 2001 12:10:11 -0500
Subject: [Python-Dev] SGI's Digital Media SDK
Message-ID: <E14EDtz-0007fS-00@kronos.cnri.reston.va.us>

SGI just made a source release of their digital media SDK for IRIX and
Linux at http://oss.sgi.com/projects/dmsdk/ .  According to the FAQ,
this is derived from previous SGI libraries, "including the Video
Library (VL), the Audio Library (AL), Digital Media Image Convertor
(DMIC), Digital Media Audio Convertor (DMAC), and the Compression
Library (CL)."  Interested parties may want to look into this, because
Python still has the al, cd, cl, and sv modules; maybe they'd work
with the new software with a reasonable amount of fixing, and at least
now there's a reasonable chance that non-IRIX platforms will be
supported.

--amk



From guido@python.org  Thu Jan  4 19:07:13 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 14:07:13 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 10:43:15 EST."
 <20010104104315.C23803@kronos.cnri.reston.va.us>
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>
 <20010104104315.C23803@kronos.cnri.reston.va.us>
Message-ID: <200101041907.OAA12573@cj20424-a.reston1.va.home.com>

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

I kind of doubt that it's Py_ALLOW_THREADS -- it's in the outer loop,
which typically only gets executed once.  It only goes around a second
time when the line is longer than the initial buffer.  We could tweak
the initial buffer size (currently 100, with increments of 1000).
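
A small Python model of the buffer growth described above.  It is
only a sketch: outer_loop_passes is a hypothetical helper, not code
from get_line, and it assumes the buffer starts at 100 bytes and grows
by 1000 bytes on each extra trip around the outer loop.

```python
def outer_loop_passes(line_length, initial=100, increment=1000):
    """Count trips around the outer loop for a line of the given length.

    line_length includes the trailing newline.  One pass is always
    needed; each further pass stands for one buffer enlargement.
    """
    capacity = initial
    passes = 1
    while line_length > capacity:
        capacity += increment
        passes += 1
    return passes
```

Under these assumptions any line of up to 100 characters costs a
single pass, and a 101-character line costs two.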

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal@lemburg.com  Thu Jan  4 19:32:15 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 20:32:15 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>
 <3A544A3B.32B86792@lemburg.com> <200101041406.JAA11926@cj20424-a.reston1.va.home.com>
Message-ID: <3A54CFBF.CDD2138B@lemburg.com>

Guido van Rossum wrote:
> 
> > > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > > -                                                 char *, char *,
> > > -                                                 PyObject * (*)(PyObject *,
> > > -                                                                PyObject *));
> > > -
> > > - extern DL_IMPORT(int)
> > > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > > -                       PyObject * (*)(PyObject *, PyObject *), int);
> >
> > Wouldn't it be safer to provide emulation APIs for these ? There
> > might be code out there using these APIs.
> 
> No.  These were never intended to be part of the API (and it was a
> mistake that they used DL_IMPORT()).  They had to be extern because
> they were defined in one file and used in another.  I'm glad they're
> gone.  They are so obscure that I'd be *very* surprised if anybody was
> using them, and even more if they even *wanted* emulation under the
> new scheme -- I'd expect them to eagerly convert their code to using
> new-style numbers right away.

I'll see whether I can get mxDateTime working with the new
scheme later this year -- it would be really great to do away
with the coercion hack I was using until now :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Fri Jan  5 06:04:56 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 5 Jan 2001 01:04:56 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENHIGAA.tim.one@home.com>

[Guido van Rossum]
> ...
> (Unfortunately, from a phone conversation I had last night with
> Tim, there's not much hope of doing something there -- and that
> platform [Win32] sorely needs it!  The hacks that Tim reported
> earlier are definitely not thread-safe.  While it's easy to come
> up with getc_unlocked() for Windows, the locking operations used
> internally there by the /MT code are not exported from MSVCRT.DLL,
> and that's crucial.)

The short course is that I still haven't found a workable way to lock
streams on Windows:  they do have a complete set of stream-locking functions
and macros, but there's no way short of deep magic I can find to get at them
("deep magic" == resort to assembler and patch in function addresses).

The only file-locking functions advertised in the C and platform SDK
libraries are trivial variants of Python's msvcrt.locking, but that has to
do with locking specific file byte-position ranges across processes, not
ensuring the integrity of runtime stream structures across threads.

Perl appears to ignore the issue of thread safety here (on Windows and
everywhere else).

Revealing experiment!

1. I threw away my changes and rebuilt from current CVS.

2. I made one change, expanding the getc() call in get_line to what MSVC
*would* expand it to if we weren't building in thread mode:

    if ((c = (--fp->_cnt >= 0 ?
              0xff & *fp->_ptr++ :
              _filbuf(fp))) == EOF) {

That alone reduced the runtime of my "while 1: readline" test case from over
30 seconds to 12.8.  What I did before went beyond that, by also (in effect)
unrolling the loop and optimizing it.  That bought an additional ~2 seconds.

So compared to Perl's 6 seconds, it looks like we're paying (on Win98SE)
approximately:

   17 seconds for compiling with _MT (threadsafe libc)
    6 seconds to do the work <wink>
    5 seconds for "other stuff", best guess mostly a poor
          platform malloc/realloc
    2 seconds for not optimizing the loop
   --
   30 total

Unfortunately, the smoking gun is the only one whose firing pin we can't
file down on this platform.
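
The arithmetic behind that breakdown, as a short Python sketch.  The
figures are the approximate ones from the table above; the dictionary
keys and function names are made up for illustration, and the "floor"
assumes that only the _MT cost is truly unavoidable.

```python
# Approximate seconds from the table above (Win98SE, "while 1: readline").
COSTS = {
    "threadsafe libc (_MT)": 17,
    "doing the work": 6,
    "other stuff (malloc/realloc guess)": 5,
    "unoptimized loop": 2,
}
PERL_SECONDS = 6


def total_seconds(costs=COSTS):
    """Sum of all the estimated costs."""
    return sum(costs.values())


def floor_seconds(costs=COSTS):
    """Best case if everything except the _MT cost could be removed."""
    return costs["threadsafe libc (_MT)"] + costs["doing the work"]


def floor_ratio(costs=COSTS, perl=PERL_SECONDS):
    """How much slower than Perl the best case would still be."""
    return floor_seconds(costs) / perl
```

The parts sum to 30 seconds, and the best case is 23 seconds against
Perl's 6.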

so-the-good-news-is-that-it's-impossible-for-perl-not-to-be-at-
    least-twice-as-fast<wink>-ly y'rs  - tim



From guido@python.org  Fri Jan  5 15:29:05 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 10:29:05 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
Message-ID: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>

We had our first PythonLabs meeting of the year yesterday, and we went
over the 2.1 release schedule.  The release schedule is posted in PEP
226: http://python.sourceforge.net/peps/pep-0226.html

We found that the schedule previously posted there was a bit too
aggressive, given our goals for this release, so we have adjusted the
dates somewhat.  We have also decided on a date for the first alpha
release (previously unmentioned in the PEP).  So, here are the
relevant dates:

    19-Jan-2001: First 2.1 alpha release
    23-Feb-2001: First 2.1 beta release
    01-Apr-2001: 2.1 final release

We're already in PEP freeze mode -- no more PEPs will be considered
for inclusion in 2.1.  Below is a list of the PEPs that we are
currently considering, with some comments.  But first some general
remarks:

- The alpha release cycle is for testing of tentative features.  Alpha
  releases contain working code that we want to see widely tested;
  however, it's possible that a feature present in an alpha release is
  changed or even retracted in a later release.

- Beta releases represent a feature freeze -- after the first beta
  release, we will resign ourselves to fixing bugs.  Once beta 1 is
  released, no new features will be introduced, and no features will
  be withdrawn.

The alpha cycle is especially important for features (such as nested
scopes) that (may) introduce backwards incompatibilities.  There may
be more than one alpha release depending on feedback on the alpha 1
release.  (But having too many alpha releases is not good -- people
won't bother downloading.)

Thus, we can only introduce a new feature in beta 1 if we're very sure
that it is mature enough to stay without interface changes.  The final
decision on all PEPs under consideration has to be made before the
beta 1 release.

The beta cycle is important to ensure stability of the final release.

Specific PEPs under consideration:

 I    42  pep-0042.txt  Small Feature Requests                 Hylton

	  Actually, most of these won't be fulfilled in 2.1.

 SD  205  pep-0205.txt  Weak References                        Drake

	  Fred is still working on this.  I hope Tim can assist.  But
	  we may have to postpone this.

 S   207  pep-0207.txt  Rich Comparisons                   Lemburg, van Rossum

	  I'm pretty sure that this is a piece of cake now that the
	  coercion patches are checked in.

 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer

	  All checked in.  Great work, Neil!

 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka

	  Moshe, this was accepted ages ago.  Would you mind
	  submitting a patch to SourceForge?  If you don't champion
	  this (and nobody else does), we may have to postpone it
	  still.

 S   222  pep-0222.txt  Web Library Enhancements               Kuchling

	  This is really up to Andrew.  It seems he plans to create
	  new modules, so he won't be introducing incompatibilities in
	  existing APIs.

 S   227  pep-0227.txt  Statically Nested Scopes               Hylton

	  Jeremy is still working on a proper implementation, which he
	  hopes to have ready in time for the first alpha release
	  date.

 S   229  pep-0229.txt  Using Distutils to Build Python        Kuchling

	  I just moved this from pie-in-the-sky to active.  Andrew has
	  a working prototype, it just doesn't work 100% yet, so I'm
	  very hopeful.

 S   230  pep-0230.txt  Warning Framework                      van Rossum

	  All done.

 S   232  pep-0232.txt  Function Attributes                    Warsaw

	  Still waiting for Barry to implement this, but it's pretty
	  straightforward.

 S   233  pep-0233.txt  Python Online Help                     Prescod

	  Paul, what's up with this?  Tim & I recommended to do
	  something simple and working, and then you disappeared from
	  the face of the earth.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Fri Jan  5 15:28:16 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 Jan 2001 10:28:16 -0500 (EST)
Subject: [Python-Dev] new "theme" on SourceForge!
Message-ID: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>

  While "theme-ability" is becoming very popular for desktop software
(think about the latest Gnome and KDE systems for Unix, and some of
the multimedia applications for Windows, and the newest MacOS
desktops), it can be a huge drain on Web sites; too many graphics is a
pain, and too many tables just makes it worse.
  SourceForge had definitely fallen prey to the overly-fancy themes,
and all of us developers paid the price with slow rendering.
  But they've fixed that!
  The SF crew has announced a new "theme" called "Ultra Light" which
is optimized for slow connections.  What that really means is less
embedded graphics and fewer nested tables, so rendering is *much*
faster.
  To try the new theme, go to the "Change My Theme" link near the top
of the left-hand navigation area.  Use the form to select "Ultra
Light"; you can preview the theme first if you want.
  Guido also thinks it's cool that the bug & patch report pages are
printable with this theme.  (Sheesh... managers! ;)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Fri Jan  5 17:46:16 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 5 Jan 2001 12:46:16 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Lib fileinput.py,1.5,1.6
In-Reply-To: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>

[Guido]
> Modified Files:
> 	fileinput.py
> Log Message:
> Speed it up by using readlines(sizehint).  It's still slower than
> other ways of reading input. :-(

On my box, it's now head-to-head with (maybe even a little quicker than) the
while 1: line-at-a-time way:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.450  9.459
using_fileinput      29.880 29.884
while_readline       30.480 30.506

(stock CVS Python under Win98SE)

So that's a huge improvement!
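
For readers who haven't met readlines(sizehint), here is a minimal
sketch of the chunked-reading technique the checkin message names.
It is not the actual fileinput.py change; iter_lines_in_chunks and
its default sizehint are assumptions made for illustration.

```python
import io


def iter_lines_in_chunks(stream, sizehint=8192):
    """Yield lines one at a time while reading them in batches.

    readlines(sizehint) returns whole lines totalling roughly sizehint
    characters, so the per-call overhead is paid once per batch rather
    than once per line.
    """
    while True:
        lines = stream.readlines(sizehint)
        if not lines:
            break
        yield from lines
```

The caller still sees one line at a time, but most of the work happens
inside a single readlines() call per batch.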

the-two-people-using-fileinput-should-be-delighted<wink>-ly y'rs  - tim



From skip@mojam.com (Skip Montanaro)  Fri Jan  5 19:05:14 2001
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 13:05:14 -0600 (CST)
Subject: [Python-Dev] fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
 <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
Message-ID: <14934.6890.160122.384692@beluga.mojam.com>

    Tim> the-two-people-using-fileinput-should-be-delighted<wink>-ly

What do you think contributes to fileinput's relative disfavor?  This whole
thread on Python's file reading performance was started by the eternal whine
"why is Python so much slower than Perl?" which really means why is

   line = f.readline()
   while line:
      process(line)
      line = f.readline()

so much slower than whatever that thing is in Perl that everybody uses as
the be-all-end-all performance benchmark (something with <> in it).

Given that fileinput is supposed to make the I/O loop in Python more
familiar to those people wandering over from Perl (at least in part), you'd
think that people would naturally gravitate to it.  Would it benefit from
some exposure in the Python tutorial?  Is it fast enough now to warrant the
extra exposure?
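
To make the comparison concrete, here is a sketch of the two loop
styles side by side, written for a current Python.  The helper names
(count_with_readline, count_with_fileinput, demo) and the temporary
sample file are assumptions for illustration only; both loops simply
count lines.

```python
import fileinput
import os
import tempfile


def count_with_readline(path):
    """The explicit readline() loop."""
    count = 0
    with open(path) as f:
        line = f.readline()
        while line:
            count += 1
            line = f.readline()
    return count


def count_with_fileinput(path):
    """The same loop written with fileinput."""
    count = 0
    with fileinput.input(files=[path]) as lines:
        for _ in lines:
            count += 1
    return count


def demo(line_count=10):
    """Write a throwaway file, count its lines both ways, clean up."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("spam\n" * line_count)
        path = f.name
    try:
        return count_with_readline(path), count_with_fileinput(path)
    finally:
        os.remove(path)
```

Both functions return the same count; the difference under discussion
is purely how long each takes on a large file.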

just-whining-out-loud-ly y'rs

Skip


From tim.one@home.com  Fri Jan  5 19:11:00 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 5 Jan 2001 14:11:00 -0500
Subject: [Python-Dev] new "theme" on SourceForge!
In-Reply-To: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPJIGAA.tim.one@home.com>

[Fred L. Drake, Jr.]

Who would have guessed that the "L." stands for Light?

> ...
> The SF crew has announced a new "theme" called "Ultra Light" which
> is optimized for slow connections.

Indeed, I think I can cancel my cable modem now and go back to a 28.8 phone
modem.

liking-it!-ly y'rs  - tim



From jeremy@alum.mit.edu  Fri Jan  5 19:14:49 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 14:14:49 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
Message-ID: <14934.7465.360749.199433@localhost.localdomain>

There was a brief discussion of unit testing last millennium, which did
not reach any conclusions.  I'd like to restart the discussion and set
some specific goals.  The action item is a unit testing bake-off, held
next week, to choose a tool.

The primary goal is to choose a unit testing framework for the
regression test suite.  Tests written with this framework would
eventually replace the current regrtest.py framework, based on
comparing test output to expected output.

For the 2.1 release, the goal would be to choose a test framework to
include in the standard distribution and use it to write some or all
of the new tests.  We would need to integrate it in some way with
regrtest.py, so that a single command can be used to run all the
tests.

In the long run, we can migrate existing tests to use the new system.
The new system can help us address some other goals:

    - running an entire test suite to completion instead of stopping
      on the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.

Does anyone disagree with the goal?

Three tools have been proposed: PyUnit, Quixote unittest, and doctest.

doctest has been championed by Peter Funk, who wants a few new
features, but Tim, its author, isn't pushing it as a tool for writing
stand-alone tests.  I think the best way to use doctest is for module
writers to consider it when writing a new module.  If doctest is used
from the start for a module, we could integrate it with the regression
test.  It seems quite useful for what it is intended for, but is not a
general solution.
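
For anyone who hasn't seen doctest, a minimal sketch of the idea:
examples in a docstring are run and their output compared with what
the docstring shows.  total_chars and check_docstring are made-up
names.  The usual entry point is doctest.testmod(); the parser and
runner are used directly here only to keep the example self-contained.

```python
import doctest


def total_chars(lines):
    """Return the combined length of all strings in lines.

    >>> total_chars(["spam", "eggs"])
    8
    >>> total_chars([])
    0
    """
    return sum(len(line) for line in lines)


def check_docstring(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

This fits the module-writer use described above, since the tests live
next to the code they document.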

That leaves PyUnit and Quixote's unittest.  The two tools are fairly
similar, but differ on a number of non-trivial details.  Quixote also
integrates code coverage, which is quite handy.  If we don't adopt its
unittest, we should add code coverage to PyUnit.

Is anyone else interested in the choice between the two?  If so, I
suggest you try writing some tests with each tool and reporting back
with your feedback.  I propose leaving one week for such a bake-off and
making a decision next Friday.

Jeremy


From fredrik@effbot.org  Fri Jan  5 19:55:18 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 5 Jan 2001 20:55:18 +0100
Subject: [Python-Dev] unit testing bake-off
References: <14934.7465.360749.199433@localhost.localdomain>
Message-ID: <004c01c07751$6eed84d0$e46940d5@hagrid>

Jeremy Hylton wrote:
> Is anyone else interested in the choice between the two?

yes.  I suggest adding doctest.py plus one unit test implementation.

> If so, I suggest you try writing some tests with each tool and
> reporting back with your feedback.

we've recently migrated from a 30-minute reimplementation of Kent
Beck's original framework to one of the frameworks you mention.  with
that background, the choice was easy.  let me know when it's time to
vote...

</F>



From guido@python.org  Fri Jan  5 19:55:33 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 14:55:33 -0500
Subject: [Python-Dev] fileinput.py
In-Reply-To: Your message of "Fri, 05 Jan 2001 13:05:14 CST."
 <14934.6890.160122.384692@beluga.mojam.com>
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net> <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
 <14934.6890.160122.384692@beluga.mojam.com>
Message-ID: <200101051955.OAA20190@cj20424-a.reston1.va.home.com>

> What do you think contributes to fileinput's relative disfavor?

In my view, fileinput is one of those unfortunate features that exist
solely to shut up a particular kind of criticism.  Without fileinput,
Perl zealots would have an easy argument for a "trivial reject" of
even considering Python.  Now, when somebody claims the superiority of
Perl's "loop involving a <> thingie", you can point to fileinput to
prevent them from scoring a point.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Jan  5 20:01:13 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:01:13 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 20:55:18 +0100."
 <004c01c07751$6eed84d0$e46940d5@hagrid>
References: <14934.7465.360749.199433@localhost.localdomain>
 <004c01c07751$6eed84d0$e46940d5@hagrid>
Message-ID: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>

> yes.  I suggest adding doctest.py plus one unit test implementation.

I second this vote for doctest (in addition to a unittest thing).  I
propose that Tim checks in his latest version of doctest.  It should
go under Lib, not under Lib/test, I think.  (Certainly that's how Tim
has been proposing its use.)

It requires LaTeX docs, but since it's got a great docstring, that
should be easy.

> > If so, I suggest you try writing some tests with each tool and
> > reporting back with your feedback.
> 
> we've recently migrated from a 30-minute reimplementation of Kent
> Beck's original framework to one of the frameworks you mention.  with
> that background, the choice was easy.  let me know when it's time to
> vote...

Which framework are you now using?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Jan  5 20:14:41 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:14:41 -0500
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>

Please have a look at this SF patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470

This implements control over which names defined in a module are
externally visible: if there's a variable __exports__ in the module,
it is a list of identifiers, and any access from outside the module to
names not in the list is disallowed.  This affects access using the
getattr and setattr protocols (which raise AttributeError for
disallowed names), as well as "from M import v" (which raises
ImportError).

I like it.  This has been asked for many times.  Does anybody see a
reason why this should *not* be added?

Tim remarked that introducing this will prompt demands for a similar
feature on classes and instances, where it will be hard to implement
without causing a bit of a slowdown.  It causes a slight slowdown (an
extra dictionary lookup for each use of "M.v") even when it is not
used, but for accessing module variables that's acceptable.  I'm not
so sure about instance variable references.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Fri Jan  5 20:19:55 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 15:19:55 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
 <004c01c07751$6eed84d0$e46940d5@hagrid>
 <200101052001.PAA20238@cj20424-a.reston1.va.home.com>
Message-ID: <14934.11371.879059.610988@localhost.localdomain>

If anyone is interested in experimenting with a test suite, here is a
summary of the code coverage for the current regression test suite as
run on my Linux box.  Pick a module with low code coverage and your
experiment can also improve the regression test suite.

Jeremy

 67.42%    798  Modules/arraymodule.c
 74.39%    773  Modules/audioop.c
 81.84%    380  Modules/binascii.c
 62.36%    449  Modules/bsddbmodule.c
 78.29%    152  Modules/cmathmodule.c
 67.89%    246  Modules/_codecsmodule.c
 47.41%   2647  Modules/cPickle.c
 87.50%      8  Modules/cryptmodule.c
 64.34%    272  Modules/cStringIO.c
  0.00%   1351  Modules/_cursesmodule.c
  0.00%    202  Modules/_curses_panel.c
 99.28%    139  Modules/errnomodule.c
 30.71%    127  Modules/fcntlmodule.c
 81.90%    315  Modules/gcmodule.c
  0.00%      4  Modules/getbuildinfo.c
 47.29%    277  Modules/getpath.c
 72.22%     54  Modules/grpmodule.c
 79.95%    419  Modules/imageop.c
  0.00%     11  Modules/../Include/cStringIO.h
 13.25%    234  Modules/linuxaudiodev.c
 14.80%    223  Modules/_localemodule.c
 30.66%    137  Modules/main.c
 73.20%     97  Modules/mathmodule.c
 98.39%    124  Modules/md5c.c
 69.70%     66  Modules/md5module.c
 48.62%    362  Modules/mmapmodule.c
 66.22%     74  Modules/newmodule.c
 84.91%     53  Modules/operator.c
 50.57%   1236  Modules/parsermodule.c
  0.00%    350  Modules/pcremodule.c
 28.88%   1077  Modules/posixmodule.c
 82.05%     39  Modules/pwdmodule.c
 77.96%    431  Modules/pyexpat.c
  0.00%   1876  Modules/pypcre.c
 50.00%      2  Modules/python.c
  0.00%    189  Modules/readline.c
 78.35%    425  Modules/regexmodule.c
 72.93%    931  Modules/regexpr.c
  0.00%     81  Modules/resource.c
 76.98%    443  Modules/rgbimgmodule.c
 82.70%    289  Modules/rotormodule.c
 82.47%    291  Modules/selectmodule.c
 85.10%    208  Modules/shamodule.c
 81.52%    276  Modules/signalmodule.c
 51.18%    678  Modules/socketmodule.c
 78.64%   1105  Modules/_sre.c
 69.67%    689  Modules/stropmodule.c
 80.49%    656  Modules/structmodule.c
  4.88%    123  Modules/termios.c
 60.71%    140  Modules/threadmodule.c
 68.78%    205  Modules/timemodule.c
 76.92%     65  Modules/ucnhash.c
 87.50%     16  Modules/unicodedatabase.c
 65.83%    120  Modules/unicodedata.c
 68.81%    420  Modules/zlibmodule.c
 64.68%   1005  Objects/abstract.c
 18.77%    261  Objects/bufferobject.c
 68.77%   1204  Objects/classobject.c
 27.59%     58  Objects/cobject.c
 59.41%    271  Objects/complexobject.c
 78.32%    678  Objects/dictobject.c
 52.14%    723  Objects/fileobject.c
 80.43%    368  Objects/floatobject.c
 84.86%    185  Objects/frameobject.c
 60.40%    149  Objects/funcobject.c
 78.68%    455  Objects/intobject.c
 77.66%    779  Objects/listobject.c
 81.17%   1142  Objects/longobject.c
 50.68%    148  Objects/methodobject.c
 58.82%    136  Objects/moduleobject.c
 76.50%    549  Objects/object.c
 15.24%    105  Objects/rangeobject.c
 41.03%     78  Objects/sliceobject.c
 76.63%   1797  Objects/stringobject.c
 77.00%    287  Objects/tupleobject.c
 22.22%     18  Objects/typeobject.c
 84.26%    108  Objects/unicodectype.c
 66.61%   2743  Objects/unicodeobject.c
 90.79%     76  Parser/acceler.c
  0.00%     28  Parser/bitset.c
  0.00%     67  Parser/firstsets.c
 18.18%     22  Parser/grammar1.c
  0.00%    139  Parser/grammar.c
  0.00%     30  Parser/intrcheck.c
  0.00%     38  Parser/listnode.c
  0.00%      2  Parser/metagrammar.c
  0.00%     63  Parser/myreadline.c
 90.70%     43  Parser/node.c
 82.26%    124  Parser/parser.c
 79.38%     97  Parser/parsetok.c
  0.00%    366  Parser/pgen.c
  0.00%     85  Parser/pgenmain.c
  0.00%     60  Parser/printgrammar.c
 76.70%    588  Parser/tokenizer.c
 62.31%   1231  Python/bltinmodule.c
 76.55%   2021  Python/ceval.c
 64.78%    230  Python/codecs.c
 73.85%   2367  Python/compile.c
 76.67%     30  Python/dynload_shlib.c
 75.75%    301  Python/errors.c
 65.59%    401  Python/exceptions.c
  0.00%     31  Python/frozenmain.c
 56.83%    776  Python/getargs.c
100.00%      2  Python/getcompiler.c
100.00%      2  Python/getcopyright.c
 80.00%      5  Python/getmtime.c
 15.62%     32  Python/getopt.c
100.00%      2  Python/getplatform.c
100.00%      4  Python/getversion.c
 61.78%   1167  Python/import.c
 66.67%     42  Python/importdl.c
 51.35%    483  Python/marshal.c
 60.58%    274  Python/modsupport.c
 88.73%     71  Python/mystrtoul.c
  0.00%      2  Python/pyfpe.c
 91.15%    113  Python/pystate.c
 37.80%    635  Python/pythonrun.c
  0.00%      5  Python/sigcheck.c
 12.67%    150  Python/structmember.c
 53.87%    323  Python/sysmodule.c
100.00%      5  Python/thread.c
 53.47%    144  Python/thread_pthread.h
 21.74%    138  Python/traceback.c
 58.65%  48417  TOTAL
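
For picking a target, a small sketch that sorts rows like the ones above by
coverage.  It assumes the three-column layout shown (percentage, a size count,
path); the function name least_covered and the min_size cutoff are
hypothetical, and the sample holds only a few rows copied from the table.

```python
def least_covered(report, count=3, min_size=100):
    """Return the `count` lowest-coverage rows whose size is >= min_size."""
    rows = []
    for line in report.splitlines():
        fields = line.split()
        # Skip blank lines, malformed lines and the summary row.
        if len(fields) != 3 or fields[2] == "TOTAL":
            continue
        percent = float(fields[0].rstrip("%"))
        rows.append((percent, int(fields[1]), fields[2]))
    rows = [row for row in rows if row[1] >= min_size]
    rows.sort()  # lowest percentage first
    return rows[:count]


# A few rows copied from the table above.
sample = """\
 47.41%   2647  Modules/cPickle.c
 99.28%    139  Modules/errnomodule.c
 28.88%   1077  Modules/posixmodule.c
 58.65%  48417  TOTAL
"""

worst = least_covered(sample, count=2)
```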


From tim.one@home.com  Fri Jan  5 20:46:10 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 5 Jan 2001 15:46:10 -0500
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <14934.6890.160122.384692@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>

[Skip Montanaro]
> What do you think contributes to fileinput's relative disfavor?

Only half jokingly, because I never use it <wink>, and I don't think Fredrik
or Alex Martelli do either.  That means it rarely gets mentioned by the
c.l.py reply bots.  Plus it's not *used* anywhere in the Python
distribution, so nobody stumbles into it that way either.  Plus the docs
require more than one line to explain what it does, and get bogged down
describing the Awk-like (Perl took this from Awk) convolutions before the
simplest (one explicitly named file) case.  It *is* regularly mentioned in
the eternal "while 1:" debate, but that's it.

> This whole thread on Python's file reading performance was started
> by the eternal whine "why is Python so much slower than Perl?"

No, it started with Guido's objections to Jeff's xreadlines patch.  I
dragged Perl into it -- because, like it or not, that was the right thing to
do <wink>.

> which really means why is
>
>    line = f.readline()
>    while line:
>       process(line)
>
> so much slower than whatever that thing is in Perl that everybody
> uses as the be-all-end-all performance benchmark (something with
> <> in it).

"<FILE>" is simply Perl's way of spelling Python's FILE.readline() (and
FILE.readlines(), when <FILE> appears in an array context; and FILE.read()
when Perl's Awkish "record separator" is disabled; and ...).  "<>" without
an explicit filehandle does all the inherited-from-Awk magic with argv, else
that stuff doesn't come into play.   "<>" (without a filehandle) seems
rarely used in Perl practice, though, *except* in support of

your_shell_prompt> some_perl_script < some_file

That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
bet *most* Perl programmers don't even know "<>" is more general than that.

> Given that fileinput is supposed to make the I/O loop in Python more
> familiar to those people wandering over from Perl (at least in part),
> you'd think that people would naturally gravitate to it.

I guess you didn't actually read the timing results <wink>.  Really, it's
been an outrageously slow way to do input.  That's better now, and I'm much
more likely now than I used to be to use

    for line in fileinput.input('file'):

instead of

    f = open('file')
    while 1:
        line = f.readline()
        if not line:
            break

The relative attraction of the former is obvious if it's reasonably quick.
I don't really have any use for the Awk complications (note that I'm running
on Windows, though, and the shells here don't expand wildcards -- the Awk
gimmicks are much more useful on Unix systems).
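
The two spellings can be put side by side in a minimal sketch, written for a
current Python 3 interpreter.  The function names and the scratch file are
hypothetical; only fileinput.input() and readline() come from the discussion.

```python
import fileinput
import os
import tempfile


def count_with_fileinput(path):
    """Count lines using the simple one-named-file form of fileinput."""
    total = 0
    try:
        for line in fileinput.input(path):
            total += 1
    finally:
        fileinput.close()  # reset the module-level state for later callers
    return total


def count_with_readline(path):
    """Count lines using the explicit readline() loop."""
    total = 0
    f = open(path)
    while 1:
        line = f.readline()
        if not line:
            break
        total += 1
    f.close()
    return total


# Hypothetical scratch file so the sketch is self-contained.
handle, scratch = tempfile.mkstemp(text=True)
with os.fdopen(handle, "w") as out:
    out.write("one\ntwo\nthree\n")

fileinput_total = count_with_fileinput(scratch)
readline_total = count_with_readline(scratch)
os.remove(scratch)
```

Both functions visit the same lines; the difference under discussion is only
how much has to be typed and how fast each runs.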

> Would it benefit from some exposure in the Python tutorial?

Heh -- that's a tough one.  The *simplest* case is the only one deserving of
promotion.  But in that case, Jeff's xreadlines is about as convenient and
much quicker.  I bet we'll all be afraid to change the tutorial to mention
either <0.9 wink>.

> Is it fast enough now to warrant the extra exposure?

Don't know.  It's the same speed as "while 1:" on *my* box now, but still 3x
slower than the double-loop method.

> just-whining-out-loud-ly y'rs

so-do-*you*-want-to-use-it-now?-ly y'rs  - tim



From thomas@xs4all.net  Fri Jan  5 21:19:42 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 5 Jan 2001 22:19:42 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>; from tim.one@home.com on Fri, Jan 05, 2001 at 03:46:10PM -0500
References: <14934.6890.160122.384692@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>
Message-ID: <20010105221942.J2467@xs4all.nl>

On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:

> "<>" (without a filehandle) seems
> rarely used in Perl practice, though, *except* in support of
> 
> your_shell_prompt> some_perl_script < some_file
> 
> That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> bet *most* Perl programmers don't even know "<>" is more general than that.

Well, I can't say anything about *most* Perl programmers, but all Perl
programmers I know (including me) know damned well what <> does, and use it
frequently. And in all the ways: no arguments meaning <STDIN>, a list of
files meaning open those files one at a time, using - to include stdin in
that list, accessing the filename and linenumber, etc. None of them can be
called newbies, though.

But then, I like using Python's fileinput, too, so maybe I'm just weird :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From ping@lfw.org  Fri Jan  5 22:01:53 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Fri, 5 Jan 2001 16:01:53 -0600 (CST)
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <20010105221942.J2467@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>

On Fri, 5 Jan 2001, Thomas Wouters wrote:

> On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:
> > That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> > bet *most* Perl programmers don't even know "<>" is more general than that.
> 
> Well, I can't say anything about *most* Perl programmers, but all Perl
> programmers I know (including me) know damned well what <> does, and use it
> frequently. And in all the ways: no arguments meaning <STDIN>, a list of
> files meaning open those files one at a time, using - to include stdin in
> that list, accessing the filename and linenumber, etc.

I was just about to chime in and say the same thing.  I don't even
program in Perl any more, and I still remember all the ways that <> works.

For text-processing scripts, it's unbeatable.  It does pretty much
exactly everything you want, and the idiom

    while (<>) {
        ...
    }

is simple, quickly learned, frequently used, and instantly recognizable.

    import sys
    if len(sys.argv) > 1:
        file = open(sys.argv[1])
    else:
        file = sys.stdin
    while 1:
        line = file.readline()
        if not line:
            break
        ...

is much more complex, harder to explain, harder to learn, and runs slower.

I have two separate suggestions:

    1.  Include 'sys' in builtins.  It's silly to have to 'import sys'
        just to be able to see sys.argv and sys.stdin.

    2.  Put fileinput.input() in sys.

With both, the while (<>) idiom becomes:

    for line in sys.input():
        ...


-- ?!ng

"This code is better than any code that doesn't work has any right to be."
    -- Roger Gregory, on Xanadu



From skip@mojam.com (Skip Montanaro)  Fri Jan  5 22:19:36 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 5 Jan 2001 16:19:36 -0600 (CST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.11371.879059.610988@localhost.localdomain>
References: <14934.7465.360749.199433@localhost.localdomain>
 <004c01c07751$6eed84d0$e46940d5@hagrid>
 <200101052001.PAA20238@cj20424-a.reston1.va.home.com>
 <14934.11371.879059.610988@localhost.localdomain>
Message-ID: <14934.18552.749081.871226@beluga.mojam.com>

    Jeremy> If anyone is interested in experimenting with a test suite, here
    Jeremy> is a summary of the code coverage for the current regression
    Jeremy> test suite as run on my Linux box.

Speaking of which, I am still running my nightly code coverage thing (still
with warts) whose results are available at

    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Does anyone care?  Should I turn it off?

Skip


From thomas@xs4all.net  Fri Jan  5 23:18:58 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 00:18:58 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>; from ping@lfw.org on Fri, Jan 05, 2001 at 04:01:53PM -0600
References: <20010105221942.J2467@xs4all.nl> <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>
Message-ID: <20010106001858.B402@xs4all.nl>

On Fri, Jan 05, 2001 at 04:01:53PM -0600, Ka-Ping Yee wrote:

>     while (<>) {
>         ...
>     }

> is simple, quickly learned, frequently used, and instantly recognizable.

>     import sys
>     if len(sys.argv) > 1:
>         file = open(sys.argv[1])
>     else:
>         file = sys.stdin
>     while 1:
>         line = file.readline()
>         if not line:
>             break
>         ...

... Except that it can take more than one filename, and will do the one
after another, and that it takes "-" as a filename for stdin. Doing it in a
script is not dead simple, unless you open up all files at once (which can
be harmful, and Perl, for one, doesn't do) or you do most of the work
fileinput does. That is why I use fileinput (and while-diamond) -- I might
not need it now, but when I do need it, it already works :)
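
For comparison, a minimal sketch of what doing that work by hand might look
like: several file names handled one after another, "-" standing for stdin,
and only one file open at a time.  The generator name diamond and its stdin
parameter are hypothetical, this is not how fileinput itself is implemented,
and the code targets a current Python 3 interpreter.

```python
import io
import os
import sys
import tempfile


def diamond(names, stdin=None):
    """Yield lines from each named file in turn; "-" or no names means stdin."""
    if stdin is None:
        stdin = sys.stdin
    for name in names or ["-"]:
        if name == "-":
            for line in stdin:
                yield line
        else:
            # Open one file at a time and close it before the next one.
            with open(name) as handle:
                for line in handle:
                    yield line


# Hypothetical scratch file plus a fake stdin for the demonstration.
fd, scratch = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("a\nb\n")

merged = list(diamond([scratch, "-"], stdin=io.StringIO("c\n")))
os.remove(scratch)
```

Even this leaves out the filename and line-number bookkeeping mentioned
earlier in the thread.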

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From moshez@zadka.site.co.il  Sat Jan  6 11:00:33 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sat,  6 Jan 2001 13:00:33 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>

On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido@python.org> wrote:

> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Ummmmm.....why do we want this? What's wrong with the current
suggestion of using "_"? __exports__ feels somehow wrong to
me. None of the rest of Python has any access control, and
I really like that. A big -1 from me, for what it's worth.

> I like it.

I'm surprised. Why do you like that?

>  This has been asked for many times.  

So has adding curly-braces as control structure, with all due respect.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From billtut@microsoft.com  Sat Jan  6 03:43:06 2001
From: billtut@microsoft.com (Bill Tutt)
Date: Fri, 5 Jan 2001 19:43:06 -0800
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB8637E@red-msg-07.redmond.corp.microsoft.com>

I think I'm with Moshe on this one: what's wrong with just using underscores
(__) to play the hiding game?

Here's my silly language suggestion for this week:

with self:
  .bar = foo
  bar.blah = .fubar
  .bar = .bar + 1
  # etc....

Bill


From skip@mojam.com (Skip Montanaro)  Sat Jan  6 04:15:12 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Fri, 5 Jan 2001 22:15:12 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.39888.908416.983794@beluga.mojam.com>

    > On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido@python.org> wrote:
    > Please have a look at this SF patch:
    > 
    > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
    > 
    > This implements control over which names defined in a module are
    > externally visible: if there's a variable __exports__ in the module,
    > it is a list of identifiers, and any access from outside the module to
    > names not in the list is disallowed.  This affects access using the
    > getattr and setattr protocols (which raise AttributeError for
    > disallowed names), as well as "from M import v" (which raises
    > ImportError).

I have to agree with Moshe.  If __exports__ is implemented for modules we'll
have multiple, different access control mechanisms for different things,
some of which thoughtful programmers would be able to get around, some of
which they wouldn't.  Here are the ways I'm aware of to control attribute
visibility (there may be others - I don't usually delve too deeply into this
stuff):

  * preface module globals with "_": This just prevents those globals from
    being added to the current namespace when a programmer executes "from
    module import *".  Programmers can work around this by attribute access
    through the module object or by explicitly importing it: "from module
    import _foo" works, yes?

  * preface class or instance attributes with "__":  This just mangles the
    name by prefacing the visible name with _<classname>.  The programmer
    can still access it by knowing the simple name mangling rule.

In both cases the programmer can still get at the attribute value when
necessary.
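
Both conventions can be checked with a short sketch, written for a current
Python 3 interpreter.  The module name hidden_demo and the class Account are
hypothetical and exist only for the demonstration.

```python
import sys
import types

# Hypothetical in-memory module with one public and one "_" global.
mod = types.ModuleType("hidden_demo")
mod.visible = 1
mod._hidden = 2
sys.modules["hidden_demo"] = mod

star = {}
exec("from hidden_demo import *", star)            # skips _hidden
explicit = {}
exec("from hidden_demo import _hidden", explicit)  # still allowed


class Account:
    def __init__(self):
        self.__balance = 10  # stored on the instance as _Account__balance


acct = Account()
mangled_value = acct._Account__balance  # reachable via the mangling rule
try:
    acct.__balance  # no mangling outside the class body, so this fails
except AttributeError:
    plain_lookup_failed = True
else:
    plain_lookup_failed = False
```

So an explicit "from module import _foo" does work, and the mangled name is
reachable by anyone who knows the rule.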

If you were to add some sort of access control to module globals, I would
have thought it would have been along the same lines as the existing
mechanisms in place to "hide" class/instance attributes.  Would it be
possible (or desirable) to add the name mangling restriction to module
globals as an alternative to this more restrictive implementation?  What
about the chances that class/instance attribute hiding will get more
restrictive in the future?  Finally, are the motivations for wanting to
restrict access to module globals and class/instance attributes that much
different from one another that they call for fundamentally different
mechanisms?

Skip


From barry@digicool.com  Sat Jan  6 05:15:20 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 00:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.43496.322436.612746@anthem.wooz.org>

I'm -0 on this, largely for the reasons already brought up: if modules
grow __exports__ then there will be pressure to add it to classes, and
modules already have a limited version of access control through
leading underscore names.

I might be more positive on the addition if __exports__ were added to
classes, because at least there'd be a consistently stronger fence
added to name access rules that prevented even consenting adults from
fiddling with the naughty bits.

-Barry



From nas@arctrix.com  Fri Jan  5 23:20:58 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 5 Jan 2001 15:20:58 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14934.43496.322436.612746@anthem.wooz.org>; from barry@digicool.com on Sat, Jan 06, 2001 at 12:15:20AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <14934.43496.322436.612746@anthem.wooz.org>
Message-ID: <20010105152058.A6016@glacier.fnational.com>

On Sat, Jan 06, 2001 at 12:15:20AM -0500, Barry A. Warsaw wrote:
> I might be more positive on the addition if __exports__ were added to
> classes, because at least there'd be a consistently stronger fence
> added to name access rules that prevented even consenting adults from
> fiddling with the naughty bits.

I think you, Skip and Moshe are missing a big advantage of having
the __exports__ mechanism.  It should allow some attribute access
inside of modules to become faster (like LOAD_FAST for locals).
I think that optimization could be implemented without too much
difficulty.  I've never channeled Guido before so I could be off
the mark.  If the only advantage is encapsulation then I'm -0.

  Neil


From barry@digicool.com  Sat Jan  6 07:09:31 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 02:09:31 -0500
Subject: [Python-Dev] PEP 232 update and patch
Message-ID: <14934.50347.851118.581484@anthem.wooz.org>

--qCCjmbam8k
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit


I've updated PEP 232, function attributes, and uploaded a patch to SF.
I couldn't coax cvs diff into including the new files
Lib/test/test_funcattrs.py and Lib/test/output/test_funcattrs so I'll
attach them below.

PEP 232:
   http://python.sourceforge.net/peps/pep-0232.html

SF patch #103123:
   http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

Enjoy,
-Barry


--qCCjmbam8k
Content-Type: text/plain
Content-Description: regrtest for function attributes
Content-Disposition: inline;
	filename="test_funcattrs.py"
Content-Transfer-Encoding: 7bit

from test_support import verbose, TestFailed

class F:
    def a(self):
        pass

def b():
    'my docstring'
    pass

# setting attributes on functions
try:
    b.publish
except AttributeError:
    pass
else:
    raise TestFailed, 'expected AttributeError'

b.publish = 1
if b.publish <> 1:
    raise TestFailed, 'function attribute not set to expected value'

docstring = 'its docstring'
b.__doc__ = docstring
if b.__doc__ <> docstring:
    raise TestFailed, 'problem with setting __doc__ attribute'

if 'publish' not in dir(b):
    raise TestFailed, 'attribute not in dir()'

f1 = F()
f2 = F()

try:
    F.a.publish
except AttributeError:
    pass
else:
    raise TestFailed, 'expected AttributeError'

try:
    f1.a.publish
except AttributeError:
    pass
else:
    raise TestFailed, 'expected AttributeError'


F.a.publish = 1
if F.a.publish <> 1:
    raise TestFailed, 'unbound method attribute not set to expected value'

if f1.a.publish <> 1:
    raise TestFailed, 'bound method attribute access did not work'

if f2.a.publish <> 1:
    raise TestFailed, 'bound method attribute access did not work'

if 'publish' not in dir(F.a):
    raise TestFailed, 'attribute not in dir()'

try:
    f1.a.publish = 0
except TypeError:
    pass
else:
    raise TestFailed, 'expected TypeError'

F.a.myclass = F
f1.a.myclass
f2.a.myclass
f1.a.myclass
F.a.myclass

if f1.a.myclass is not f2.a.myclass or \
   f1.a.myclass is not F.a.myclass:
    raise TestFailed, 'attributes were not the same'

# try setting __dict__
try:
    F.a.__dict__ = (1, 2, 3)
except TypeError:
    pass
else:
    raise TestFailed, 'expected TypeError'

F.a.__dict__ = {'one': 11, 'two': 22, 'three': 33}
if f1.a.two <> 22:
    raise TestFailed, 'setting __dict__'

from UserDict import UserDict
d = UserDict({'four': 44, 'five': 55})

try:
    F.a.__dict__ = d
except TypeError:
    pass
else:
    raise TestFailed

if not (f2.a.one == f1.a.one == F.a.one == 11):
    raise TestFailed

--qCCjmbam8k
Content-Type: text/plain
Content-Description: output of regrtest for function attributes
Content-Disposition: inline;
	filename="test_funcattrs"
Content-Transfer-Encoding: 7bit

test_funcattrs

--qCCjmbam8k--



From martin@loewis.home.cs.tu-berlin.de  Sat Jan  6 10:06:49 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 11:06:49 +0100
Subject: [Python-Dev] PEP 208 comment
Message-ID: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>

I just studied PEP 208 for the first time. Overall, it seems all
natural and nice, but there is one aspect I'd like to see changed:
the naming of the type flag.

Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
program should be called "new". The flag will still be there five
years from now, but it won't be new anymore. Also, while the flag
indicates that style of the numbers is new, it does not say what it
does. So I propose to rename it; if nobody finds a better name, I
propose to call it Py_TPFLAGS_UNCOERCED.

Regards,
Martin


From thomas@xs4all.net  Sat Jan  6 12:52:19 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 13:52:19 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 11:06:49AM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <20010106135219.L2467@xs4all.nl>

On Sat, Jan 06, 2001 at 11:06:49AM +0100, Martin v. Loewis wrote:

> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that the style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Wrong name. The TPFLAGs only indicate whether a struct is large enough to
contain a particular member, not whether that member is going to contain or
do anything. 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate
to me.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From martin@loewis.home.cs.tu-berlin.de  Sat Jan  6 13:36:39 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 14:36:39 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <20010106135219.L2467@xs4all.nl> (message from Thomas Wouters on
 Sat, 6 Jan 2001 13:52:19 +0100)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl>
Message-ID: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>

> Wrong name. The TPFLAGs only indicate whether a struct is large enough to
> contain a particular member, not whether that member is going to contain or
> do anything. 

That may have been the original intention; *this* specific flag is not
of that kind. Please look at abstract.c:binary_op1, which has

	if (v->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(v)) {
		slot = NB_BINOP(v->ob_type->tp_as_number, op_slot);
		if (*slot) {
			x = (*slot)(v, w);
			if (x != Py_NotImplemented) {
				return x;
			}
			Py_DECREF(x); /* can't do it */
		}
		if (v->ob_type == w->ob_type) {
			goto binop_error;
		}
	}

Here, no additional member was added: there always was tp_as_number,
and that also supported all possible op_slot values. What is new here
is that the slot may be called even if v and w have different types;
that was not allowed before the PEP 208 changes. Yet it tests for
NEW_STYLE_NUMBER(v), which is

PyType_HasFeature((o)->ob_type, Py_TPFLAGS_NEWSTYLENUMBER)

So the presence of this flag is indeed a promise that a specific
member will do something that it normally wouldn't do.
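
Seen from the Python side, the same idea of a slot being handed operands of
a different type and declining with NotImplemented can be sketched as below.
This is only an analogy in current Python 3, not the C code in abstract.c;
the class Meters is hypothetical.

```python
class Meters:
    """Toy number whose __add__ may be called with a non-Meters operand."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Decline, so the interpreter can try the other operand or fail.
        return NotImplemented

    # Addition is symmetric here, so the reflected case reuses __add__.
    __radd__ = __add__


left_mixed = (Meters(1) + 2).value
right_mixed = (2 + Meters(1)).value
```

Adding a Meters to an unrelated type such as a string ends in TypeError,
because both operands decline.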

> 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate to
> me.

Well, all numbers still have coercion - it just may not be used if the
flag is present. It's not a matter of having or not having something
(well, only the "new style" numbers may have nb_cmp, but calling it
Py_TPFLAGS_HAS_NB_CMP would be beside the point, IMO).

Anyway, I don't want to defend my version too much - I just want to
request that the current name is changed to *something* more
descriptive.

Regards,
Martin


From skip@mojam.com (Skip Montanaro)  Sat Jan  6 14:40:30 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:40:30 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010105152058.A6016@glacier.fnational.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
 <14934.43496.322436.612746@anthem.wooz.org>
 <20010105152058.A6016@glacier.fnational.com>
Message-ID: <14935.11870.360839.235102@beluga.mojam.com>

    Neil> I think you, Skip and Moshe are missing a big advantage of having
    Neil> the __exports__ mechanism.  It should allow some attribute access
    Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
    Neil> think that optimization could be implemented without too much
    Neil> difficulty.

True enough, that hadn't occurred to me.  Knowing that now, I still don't
think consistency of the interface should suffer as a result of
under-the-covers performance gains.

Skip



From skip@mojam.com (Skip Montanaro)  Sat Jan  6 14:42:25 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:42:25 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
In-Reply-To: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
Message-ID: <14935.11985.972526.108391@beluga.mojam.com>

Oooo...  I tried to check out Barry's function attribute patch at

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

and got

    Fatal error: Call to a member function on a non-object in
    /usr/local/htdocs/alexandria/www/patch/index.php on line 55

in response.  Any idea whazzup?

Skip


From akuchlin@mems-exchange.org  Sat Jan  6 14:47:59 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Sat, 6 Jan 2001 09:47:59 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.18552.749081.871226@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 05, 2001 at 04:19:36PM -0600
References: <14934.7465.360749.199433@localhost.localdomain> <004c01c07751$6eed84d0$e46940d5@hagrid> <200101052001.PAA20238@cj20424-a.reston1.va.home.com> <14934.11371.879059.610988@localhost.localdomain> <14934.18552.749081.871226@beluga.mojam.com>
Message-ID: <20010106094759.A13723@newcnri.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 04:19:36PM -0600, Skip Montanaro wrote:
>Speaking of which, I am still running my nightly code coverage thing (still
>with warts) whose results are available at
>    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Add a link to it from the Python development pages on SourceForge; I
suspect much of the problem is that people don't remember the URL for
it, and don't want to dig through the archives to find it.

--amk



From mal@lemburg.com  Sat Jan  6 15:15:27 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 06 Jan 2001 16:15:27 +0100
Subject: [Python-Dev] PEP 208 comment
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <3A57368F.FC01F78@lemburg.com>

"Martin v. Loewis" wrote:
> 
> I just studied PEP 208 for the first time. Overall, it seems all
> natural and nice, but there is one aspect I'd like to see changed:
> the naming of the type flag.
> 
> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that the style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Given that the design could well be applied to other slots as
well, I think you've got a point there. The idea behind the
flag was to signal that slots will no longer make object type
assumptions which they could previously. Right now, only numeric
types support this feature. In the future I could imagine
strings and other types involving coercion would also want
to use the feature.

Given this design idea, how about calling the flag
Py_TPFLAGS_CHECKTYPES ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From skip@mojam.com (Skip Montanaro)  Sat Jan  6 15:35:20 2001
From: skip@mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 09:35:20 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
Message-ID: <14935.15160.130742.390323@beluga.mojam.com>

You know, I thought of something (which was probably already obvious to the
rest of you) while perusing Barry's patch.  Attaching function attributes to
unbound methods could really function like C++ static data members.  You'd
have to write accessor functions to make setting the attributes look clean,
but that wouldn't be all bad.  Precisely because you couldn't modify them
through the bound method, there'd be no chance you could make the mistake of
modifying them that way and having them transmogrify into instance
attributes.

Here's a quick example:

    class C:
        def __init__(self):
            self.just_resting()
        __init__.howmany = 0

        def __del__(self):
            self.hes_dead()

        def hes_dead(self):
            C.__init__.howmany -= 1

        def just_resting(self):
            C.__init__.howmany += 1

        def howmany(self):
            return C.__init__.howmany

    def howmany():
        return C.__init__.howmany

    c = C()
    print c.howmany()
    d = C()
    print d.howmany()
    del c
    print d.howmany()

After applying Barry's patch, if I execute this script from the command line
it displays

    1
    2
    1

as one would expect, but then catches an attribute error during cleanup:

    Exception exceptions.AttributeError: "'None' object has no attribute
    '__init__'" in <method C.__del__ of C instance at 0x80ffc14> ignored

If I add "del d" to the end of the script the exception disappears.  I
suspect there is a cleanup order problem of some sort.  It seems like C is
getting reclaimed before d (not possible), or that d's __class__ attribute
is set to None before its __del__ method is called.  Is this a known problem
or something introduced by Barry's patch?

Skip



From barry@digicool.com  Sat Jan  6 16:09:47 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 11:09:47 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
 <14935.11985.972526.108391@beluga.mojam.com>
Message-ID: <14935.17227.634808.132783@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@mojam.com> writes:

    SM> and got

    |     Fatal error: Call to a member function on a non-object in
    |     /usr/local/htdocs/alexandria/www/patch/index.php on line 55

    SM> in response.  Any idea whazzup?

I got a similar error on SF when I tried to find my patch on the
patches page.  I still think the patch manager just gives you no way
to see all the patches when there's more than what fits on one page.
The error dropped a cookie in my lap that logged me out too.

After I logged in again, it all seemed to work.

-Barry



From martin@loewis.home.cs.tu-berlin.de  Sat Jan  6 15:20:51 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 16:20:51 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <3A57368F.FC01F78@lemburg.com> (mal@lemburg.com)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <3A57368F.FC01F78@lemburg.com>
Message-ID: <200101061520.f06FKpu03218@mira.informatik.hu-berlin.de>

> Given this design idea, how about calling the flag
> Py_TPFLAGS_CHECKTYPES ?!

Sounds good to me.

Martin


From thomas@xs4all.net  Sat Jan  6 16:47:24 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 17:47:24 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 02:36:39PM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl> <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>
Message-ID: <20010106174724.M2467@xs4all.nl>

On Sat, Jan 06, 2001 at 02:36:39PM +0100, Martin v. Loewis wrote:

> That may have been the original intention; *this* specific flag is not
> of that kind. Please look at abstract.c:binary_op1, which has

You're right, I stand corrected, I retract my proposal :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Sat Jan  6 22:05:23 2001
From: guido@python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:05:23 -0500
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: Your message of "Sat, 06 Jan 2001 09:35:20 CST."
 <14935.15160.130742.390323@beluga.mojam.com>
References: <14935.15160.130742.390323@beluga.mojam.com>
Message-ID: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>

> You know, I thought of something (which was probably already obvious to the
> rest of you) while perusing Barry's patch.  Attaching function attributes to
> unbound methods could really function like C++ static data members.  You'd
> have to write accessor functions to make setting the attributes look clean,
> but that wouldn't be all bad.  Precisely because you couldn't modify them
> through the bound method, there'd be no chance you could make the mistake of
> modifying them that way and having them transmogrify into instance
> attributes.
> 
> Here's a quick example:
> 
>     class C:
>         def __init__(self):
>             self.just_resting()
>         __init__.howmany = 0
> 
>         def __del__(self):
>             self.hes_dead()
> 
>         def hes_dead(self):
>             C.__init__.howmany -= 1
> 
>         def just_resting(self):
>             C.__init__.howmany += 1
> 
>         def howmany(self):
>             return C.__init__.howmany
> 
>     def howmany():
>         return C.__init__.howmany
> 
>     c = C()
>     print c.howmany()
>     d = C()
>     print d.howmany()
>     del c
>     print d.howmany()

Skip, I don't find this better than the existing solution, which uses
C._howmany instead of C.__init__.howmany.

True, you can access it as self._howmany and if you assign to
self._howmany you'd transform it into an instance attribute -- but
that falls in the "then don't do that" category.
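
A minimal sketch of the point above, written for present-day Python 3
rather than the interpreter under discussion.  The method name
bump_via_self is invented for illustration.  It shows the class-attribute
counter, and how assigning through self quietly creates an instance
attribute that shadows it.

```python
class C:
    _howmany = 0              # shared counter, stored on the class

    def __init__(self):
        C._howmany += 1       # always update through the class

    def bump_via_self(self):
        # "Then don't do that": this reads C._howmany, adds one, and
        # stores the result on the instance, shadowing the class attribute.
        self._howmany += 1


c = C()
d = C()
c.bump_via_self()
print(C._howmany)                                    # class counter
print(c._howmany)                                    # c's private copy
print("_howmany" in vars(c), "_howmany" in vars(d))  # who owns a copy?
```

After the call, only c carries its own _howmany; the class counter and d
are unaffected, which is the transformation into an instance attribute
mentioned above.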

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Sat Jan  6 22:14:44 2001
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:14:44 -0500
Subject: [Python-Dev] Rehabilitating fgets
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>

[Guido]
> ...
> Unfortunately we can't use fgets(), even if it were faster than
> getline(), because it doesn't tell how many characters it read.

Let's think about that a little harder, because it appears to be our only
hope on Windows (the MS fgets isn't optimized like the Perl inner loop, but
it does lock/unlock the stream only at routine entry/exit, and uses a hidden
non-locking (== much faster) variant of getc in the guts -- we've seen that
the "locking" part of MS getc accounts for 17 of 30 seconds in my test
case).

> On files containing null bytes, readline() is supposed to treat
> these like any other character;

fgets does too (at least it does on Windows, and I believe that's std
behavior).  The problem is that it also makes up a null byte on its own.

> If your input is "abc\0def\nxyz\n", the first readline() call
> should return "abc\0def\n".

Yes.

> But with fgets(), you're left to look in the returned buffer for
> a null byte,

Also yes.  But suppose I search "from the right", and ensure the buffer is
free of null bytes before the fgets.  For your input file above, fgets
overwrites the initial 9 bytes of the buffer (assuming the buffer is at
least 9 bytes long ...) with

    "abc\0def\n\0"

and there's no problem if I search from the right.

> and there's no way (in general) to distinguish this result from
> an input file that only consisted of the three characters "abc".

As above, I'm not convinced of that.  The input file "abc" would overwrite
the first four bytes of the buffer with

    "abc\0"

and leave the tail end alone (well, the MS fgets leaves the tail alone,
although I'm not sure ANSI C guarantees that).

Of course I've *read* any number of Unix(tm) FAQs that also claim it's
impossible, but I never believed them either <wink>.

This extra buffer fiddling is surely an expense I don't want to pay, but the
timing evidence on Windows so far says that I can probably search and/or
copy the whole buffer 100 times and still be faster than enduring the
threadsafe getc.

Am I missing something obvious?




From guido@python.org  Sat Jan  6 22:33:00 2001
From: guido@python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:33:00 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: Your message of "Sat, 06 Jan 2001 17:14:44 EST."
 <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>
Message-ID: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>

[Tim suggests to use fgets(), preparing the buffer with non-null
bytes, and searching for a null byte from the right.]

If this is really sufficiently fast, I'd say, go for it.  Looks
bullet-proof as long as the source code to MSVCRT doesn't change. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim_one@email.msn.com  Sat Jan  6 22:34:42 2001
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:34:42 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>

[Tim, pondering]
> ... But suppose I search "from the right", and ensure the buffer is
> free of null bytes before the fgets.

Even better, suppose I ensure the buffer is free of both null bytes and
newlines before the fgets; then if I search from the *left* for a newline
and find one, it must be that fgets found a line and it ends right there,
and this should usually obtain.  There's no need to search from the right
unless I don't find a newline ...
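
The search strategy can be tried out without a C compiler.  What follows
is only a simulation in present-day Python 3, not the fileobject.c code:
fake_fgets is a hypothetical stand-in that imitates fgets() as described
in this thread (store the data, append a null byte, leave the rest of the
buffer alone), and FILLER, read_chunk and the buffer size are likewise
made up for the illustration.  It assumes a buffer of at least two bytes.

```python
import io

FILLER = 0x01  # any byte that is neither NUL (0x00) nor newline (0x0A)


def fake_fgets(buf, stream):
    """Imitate C fgets(): store at most len(buf)-1 bytes, stopping after a
    newline, then append a NUL.  Bytes past the NUL are left untouched.
    Returns False (standing in for NULL) if EOF comes before any byte."""
    n = 0
    while n < len(buf) - 1:
        ch = stream.read(1)
        if not ch:
            break
        buf[n] = ch[0]
        n += 1
        if ch == b"\n":
            break
    if n == 0:
        return False
    buf[n] = 0
    return True


def read_chunk(stream, bufsize=16):
    """Return exactly the bytes one fgets() call delivered, or None at EOF."""
    buf = bytearray([FILLER]) * bufsize      # no NULs, no newlines
    if not fake_fgets(buf, stream):
        return None
    end = buf.find(b"\n")                    # search from the left
    if end >= 0:
        return bytes(buf[:end + 1])          # a newline must end the data
    end = buf.rfind(b"\0")                   # otherwise search from the right
    return bytes(buf[:end])                  # rightmost NUL is the terminator


stream = io.BytesIO(b"abc\0def\nxyz\n")
print(read_chunk(stream))
print(read_chunk(stream))
print(read_chunk(stream))
```

The embedded null byte in the first line survives because the newline is
found first; the search from the right is needed only for a chunk that
ends at EOF or at the end of the buffer.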




From skip@mojam.com (Skip Montanaro)  Sun Jan  7 01:15:08 2001
From: skip@mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 19:15:08 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>
References: <14935.15160.130742.390323@beluga.mojam.com>
 <200101062205.RAA23603@cj20424-a.reston1.va.home.com>
Message-ID: <14935.49948.574427.668588@beluga.mojam.com>

    Skip> Attaching function attributes to unbound methods could really
    Skip> function like C++ static data members....

    Guido> Skip, I don't find this better than the existing solution, which
    Guido> uses C._howmany instead of C.__init__.howmany.

It was more a "hey, I never thought of it quite that way" than a "hey, I
think this would be a great new idiom".  In fact, I believe the more
important part of my note was the bit about the attribute error on exit.

I'm sure function attributes will attract their fair share of abuse. ;-)

Skip




From tim_one@email.msn.com  Sun Jan  7 03:16:31 2001
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:16:31 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
Message-ID: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>

I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_builtin fails because raw_input() isn't stripping a trailing newline.
I've got my own code in this area that *may* be to blame, but I don't see
how it could be.  I note that fileobject.c's new function get_line_raw has
the comment

/* Internal routine to get a line for raw_input():
   strip trailing '\n', raise EOFError if EOF reached immediately
*/

but the code doesn't look for a trailing newline (let alone strip one).




From tim_one@email.msn.com  Sun Jan  7 03:33:02 2001
From: tim_one@email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>

> [Tim suggests to use fgets(), preparing the buffer with non-null
> bytes, and searching for a null byte from the right.]

[Guido]
> If this is really sufficiently fast, I'd say, go for it.  Looks
> bullet-proof as long as the source code to MSVCRT doesn't change. :-)

Surprise?  Despite all the memsets, memchrs (looking for a newline), and
one-at-a-time backward searches (looking for a null byte), it's a huge win
on Windows:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.550  9.578
using_fileinput      28.790 28.781
while_readline       13.120 13.134

The last one was 30.5 seconds before the fgets hackery.

I'll check it in tomorrow after sleeping on it (there's a large pile of
messy endcases (not only does fgets() invent a null byte, it can't tell you
whether it stopped reading due to EOF, so maybe the last line in the file
ends with 10000 null bytes + no newline + exactly lines up with a buffer
boundary -- etc); test_builtin is failing in a closely related area but
nobody would have checked in code that failed a std test <wink>; and it's
been a frustrating day all around).

i-want-my-cable-modem-back-now-ly y'rs  - tim




From esr@thyrsus.com  Sun Jan  7 04:01:25 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sat, 6 Jan 2001 23:01:25 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>; from tim_one@email.msn.com on Sat, Jan 06, 2001 at 10:33:02PM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>
Message-ID: <20010106230125.A29058@thyrsus.com>

Tim Peters <tim_one@email.msn.com>:
> > [Tim suggests to use fgets(), preparing the buffer with non-null
> > bytes, and searching for a null byte from the right.]

No, I haven't forgotten about the curses autoconfig stuff.  But...

This mess reminds me.  For some work I'm doing right now, it would be
very useful if there were a way to query the end-of-file status of a
file descriptor without actually doing a read.

I don't see this ability anywhere in the 2.0 API.  Questions:

1. Am I missing something obvious?

2. If the answer to 1 is that I am not, in fact, being a dumbass, what
   is the right way to support this?  The obvious alternatives are an 
   eof member (analogous to the existing `closed' member), or an eof()
   method.  I favor the latter.

3. If we agree on a design, I'm willing to implement this at least for
   Unix.  Should be a small project.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The direct use of physical force is so poor a solution to the problem of
limited resources that it is commonly employed only by small children and
great nations.
	-- David Friedman


From skip@mojam.com (Skip Montanaro)  Sun Jan  7 04:05:22 2001
From: skip@mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 22:05:22 -0600 (CST)
Subject: [Python-Dev] readline module seems crippled - am I missing something?
Message-ID: <14935.60162.726131.593211@beluga.mojam.com>

For a more-or-less throwaway script I'm working on I need a little input
function similar to Emacs's read-from-minibuffer, which accepts both a
prompt and an initial string for the input buffer.  Seems like I ought to be
able to whip something up using readline, but it's not happening.  GNU
readline's docs aren't the greatest, but I thought this simple script would
work:

    import readline
    readline.insert_text("default")
    x = raw_input("?")
    print x

I expected to see an editable "default" displayed after the prompt and have
x default to "default" if I just hit the return key.  I see nothing
displayed after the question mark, and x is the empty string if I just hit
return.  

This does print "default":

    readline.insert_text("default")
    x = readline.get_line_buffer()
    print x

so I know that insert_text and get_line_buffer seem to be working as
intended.  Looking at call_readline in Modules/readline.c I see nothing that
would disrupt the line buffer before the call to readline().

Am I missing something totally obvious about how GNU readline works or the
conditions under which readline is used (only at the interactive prompt?) or
is some required bit of GNU readline not exposed through Python's readline
module?

Skip


From tim.one@home.com  Sun Jan  7 10:09:02 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 7 Jan 2001 05:09:02 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010106230125.A29058@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> For some work I'm doing right now, it would be very useful if
> there were a way to query the end-of-file status of a file
> descriptor without actually doing a read.
>
> I don't see this ability anywhere in the 2.0 API.

When someone says "API", I think "C API".  In that case you can use
feof(stream) directly, or whatever the heck your platform supports for
handles (_eof(handle) on Windows, which I know is an OS you're secretly
longing to master <wink>).

I don't believe there's a way to find out from Python short of trying to
read, though.  Well, I suppose you could try to compare f.tell() to the
size, if you knew that f.tell() and "the size" made sense for f ...
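
A sketch of that comparison in present-day Python 3.  The helper name
at_eof is hypothetical, and the result is only meaningful under the
caveat just stated: a regular, seekable file opened in binary mode.  It
is also only a snapshot, since another writer can extend the file
afterwards.

```python
import os
import tempfile


def at_eof(f):
    """Return True if f's position is at or past the file's current size.

    Only meaningful for regular, seekable files opened in binary mode;
    pipes, sockets and terminals have no usable size.
    """
    return f.tell() >= os.fstat(f.fileno()).st_size


with tempfile.TemporaryFile() as f:
    f.write(b"hello")
    f.flush()
    f.seek(0)
    print(at_eof(f))    # position 0 of 5 bytes
    f.read()
    print(at_eof(f))    # position 5 of 5 bytes
```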

> 1. Am I missing something obvious?

I don't know!  I never asked Guido about this, and given that he's not on
vacation now I'm not allowed to channel him.  I would hazard a guess,
though, that he thinks "you do or don't get something back when you read" is
clearer than "you may or may not get something back when you read,
regardless of which answer I give you in response to .eof() -- depending".
The latter is particularly muddy in a threaded environment, even for plain
old disk files.

> 2. If the answer to 1 is that I am not, in fact, being a dumbass,
>    what is the right way to support this?  The obvious alternatives
>    are an eof member (analogous to the existing `closed' member), or
>    an eof() method.  I favor the latter.
>
> 3. If we agree on a design, I'm willing to implement this at least
>    for Unix.  Should be a small project.

I agree an .eof() method would be better than a data member.  Note that
whenever Python internals hit stream EOF today, they call clearerr(), so
simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
make sure that feof() would never be useful <0.8 wink>.

one-of-life's-little-mysteries-ly y'rs  - tim



From gstein@lyra.org  Sun Jan  7 10:46:54 2001
From: gstein@lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 02:46:54 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.96,2.97
In-Reply-To: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Jan 05, 2001 at 06:43:07AM -0800
References: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010107024654.W17220@lyra.org>

On Fri, Jan 05, 2001 at 06:43:07AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv3183
> 
> Modified Files:
> 	fileobject.c 
> Log Message:
> Restructured get_line() for clarity and speed.
> 
> - The raw_input() functionality is moved to a separate function.
> 
> - Drop GNU getline() in favor of getc_unlocked(), which exists on more
>   platforms (and is even a tad faster on my system).

The "configure" tests for getline() can be punted if we won't use it any
more...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Sun Jan  7 12:27:57 2001
From: gstein@lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 04:27:57 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 03:14:41PM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010107042757.X17220@lyra.org>

It feels wrong. Whatever happened to the "we're all adults here" mantra?

Besides people asking for it, what is a good reason *for* it to be added?

Cheers,
-g

On Fri, Jan 05, 2001 at 03:14:41PM -0500, Guido van Rossum wrote:
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).
> 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://www.python.org/mailman/listinfo/python-dev
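
For readers who want the quoted proposal in concrete terms, here is a
rough emulation in present-day Python 3.  It is not the SourceForge
patch: ExportingModule and _hidden are hypothetical names, letting
double-underscore names through is an assumption of this sketch, and the
"from M import v" / ImportError half of the proposal is not modelled.

```python
import types


def _hidden(namespace, name):
    """True if the module has an __exports__ list that omits name."""
    exports = namespace.get("__exports__")
    dunder = name.startswith("__") and name.endswith("__")
    return exports is not None and not dunder and name not in exports


class ExportingModule(types.ModuleType):
    """Hypothetical module type that honours an __exports__ list."""

    def __getattribute__(self, name):
        namespace = super().__getattribute__("__dict__")
        if _hidden(namespace, name):
            raise AttributeError(f"{name!r} is not exported")
        return super().__getattribute__(name)

    def __setattr__(self, name, value):
        if _hidden(super().__getattribute__("__dict__"), name):
            raise AttributeError(f"{name!r} is not exported")
        super().__setattr__(name, value)


mod = ExportingModule("demo")
source = (
    "__exports__ = ['area']\n"
    "_pi = 3\n"
    "def area(r):\n"
    "    return _pi * r * r\n"
)
exec(source, vars(mod))       # populate the module's own namespace

print(mod.area(2))            # exported; uses _pi from inside the module
try:
    mod._pi
except AttributeError as exc:
    print("blocked:", exc)
```

Code inside the module still reaches _pi because global lookups go
straight to the module dictionary.  Every outside access, exported or
not, pays for the extra __exports__ lookup, which is the slowdown the
quoted message mentions.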

-- 
Greg Stein, http://www.lyra.org/


From guido@python.org  Sun Jan  7 16:52:11 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 11:52:11 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sat, 06 Jan 2001 23:01:25 EST."
 <20010106230125.A29058@thyrsus.com>
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>
 <20010106230125.A29058@thyrsus.com>
Message-ID: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>

> This mess reminds me.  For some work I'm doing right now, it would be
> very useful if there were a way to query the end-of-file status of a
> file descriptor without actually doing a read.

I hope you really mean file object (== wrapper around stdio FILE
object).  A file descriptor (small little integer in Unix) doesn't
have a way to find this out.

Even for file objects, it is typically only known that there's an EOF
condition after a lowest-level read operation returned 0 bytes.  So in
effect you must still do a read in order to determine EOF status.

I just ran a small test program, and fread() appears to set the eof
status when it returns a short count.  Normally, Python's read() uses
fread() so this might be useful.  However after a readline(), you
can't know the eof status (unless the last line of the file doesn't
end in a newline).

> I don't see this ability anywhere in the 2.0 API.  Questions:
> 
> 1. Am I missing something obvious?
> 
> 2. If the answer to 1 is that I am not, in fact, being a dumbass, what
>    is the right way to support this?  The obvious alternatives are an 
>    eof member (analogous to the existing `closed' member), or an eof()
>    method.  I favor the latter.
> 
> 3. If we agree on a design, I'm willing to implement this at least for
>    Unix.  Should be a small project.

Before adding an eof() method, can you explain what your program is
trying to do?  Is it reading from a pipe or socket?  Then select() or
poll() might be useful.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Sun Jan  7 18:30:32 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:30:32 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 05:09:02AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>
Message-ID: <20010107133032.F4586@thyrsus.com>

Tim Peters <tim.one@home.com>:
> I agree an .eof() method would be better than a data member.  Note that
> whenever Python internals hit stream EOF today, they call clearerr(), so
> simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> make sure that feof() would never be useful <0.8 wink>.

That's inconvenient, but only means the internal Python state flag
that feof() would inspect would have to be checked after each read.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"...The Bill of Rights is a literal and absolute document. The First
Amendment doesn't say you have a right to speak out unless the
government has a 'compelling interest' in censoring the Internet. The
Second Amendment doesn't say you have the right to keep and bear arms
until some madman plants a bomb. The Fourth Amendment doesn't say you
have the right to be secure from search and seizure unless some FBI
agent thinks you fit the profile of a terrorist. The government has no
right to interfere with any of these freedoms under any circumstances."
	-- Harry Browne, 1996 USA presidential candidate, Libertarian Party


From esr@thyrsus.com  Sun Jan  7 18:45:41 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:45:41 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 11:52:11AM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com> <20010106230125.A29058@thyrsus.com> <200101071652.LAA31411@cj20424-a.reston1.va.home.com>
Message-ID: <20010107134541.G4586@thyrsus.com>

Guido van Rossum <guido@python.org>:
> > This mess reminds me.  For some work I'm doing right now, it would be
> > very useful if there were a way to query the end-of-file status of a
> > file descriptor without actually doing a read.
> 
> I hope you really mean file object (== wrapper around stdio FILE
> object).  A file descriptor (small little integer in Unix) doesn't
> have a way to find this out.

You're right, my bad.
 
> Even for file objects, it is typically only known that there's an EOF
> condition after a lowest-level read operation returned 0 bytes.  So in
> effect you must still do a read in order to determine EOF status.
> 
> I just ran a small test program, and fread() appears to set the eof
> status when it returns a short count.  Normally, Python's read() uses
> fread() so this might be useful.  However after a readline(), you
> can't know the eof status (unless the last line of the file doesn't
> end in a newline).

I considered trying a zero-length read() in Python, but this strikes me 
as inelegant even if it would work.

> Before adding an eof() method, can you explain what your program is
> trying to do?  Is it reading from a pipe or socket?  Then select() or
> poll() might be useful.

Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
it's a situation where a markup file can contain sections in two different
languages.  The design requires the first interpreter to exit on seeing
either EOF or a marker that says "switching to second language".  For
reasons too complicated to explain, it would be best if the parser for
the first language didn't simply call the second parser.

The logic I wanted to write amounts to:

while 1:
    line = fp.readline()
    if not line or line == "history":
        break
    interpret_in_language_1(line)

if not fp.feof():
    while 1:
        line = fp.readline()
        if not line:
            break
        interpret_in_language_2(line)

I just tested the zero-length-read method.  That worked.  I guess I'll
use it.
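
The same control flow can also be had without any EOF query, by
remembering why the first loop stopped.  This is only a sketch in
present-day Python 3: interpret_sections and its callback parameters are
hypothetical names, and it assumes the "history" marker sits on a line of
its own.

```python
import io


def interpret_sections(fp, first, second, marker="history"):
    """Feed lines to first() until the marker line, then the rest to second()."""
    saw_marker = False
    while True:
        line = fp.readline()
        if not line:                      # readline() returns "" only at EOF
            break
        if line.rstrip("\n") == marker:   # compare without the line ending
            saw_marker = True
            break
        first(line)

    if saw_marker:                        # stands in for "not fp.feof()"
        while True:
            line = fp.readline()
            if not line:
                break
            second(line)


lang1, lang2 = [], []
interpret_sections(io.StringIO("a\nb\nhistory\nc\n"), lang1.append, lang2.append)
print(lang1, lang2)
```

The flag records which of the two exit conditions fired, which is the
only thing the feof() call in the loop above was meant to recover.
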
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy


From martin@loewis.home.cs.tu-berlin.de  Sun Jan  7 18:45:15 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 7 Jan 2001 19:45:15 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>

Authors of extension packages often find the need to auto-import some
of their modules. This is often needed for registration, e.g. a codec
author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
may need to register a search function with codecs.register. This is
currently only possible by writing into sitecustomize.py, which must
be done by the system administrator manually.

To enhance the service of site.py, I've written the patch

http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470

which treats lines in .pth files that start with "import" as
statements and executes them, instead of appending these lines to
sys.path.
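
A minimal sketch of the behaviour described above, not the patch
itself: the name process_pth_lines and its parameters are hypothetical,
and skipping blank lines and '#' comments is an assumption.  Lines that
start with "import" are executed; all other lines are appended to the
path list.

```python
import sys


def process_pth_lines(lines, path=None, namespace=None):
    """Hypothetical helper: apply the lines of one .pth file."""
    if path is None:
        path = sys.path
    if namespace is None:
        namespace = {}
    for line in lines:
        line = line.rstrip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "import\t")):
            # Treat the line as a statement and execute it.
            exec(line, namespace)
        else:
            # Anything else is taken to be a directory for the path.
            path.append(line)
    return path


# Illustrative input only; the directory name is made up.
demo_namespace = {}
demo_path = process_pth_lines(
    ["# registered by a codec package\n", "import json\n", "/opt/demo/lib\n"],
    path=[],
    namespace=demo_namespace,
)
```

A codec package could then ship a one-line .pth file whose import
triggers its own registration, with no manual edit to sitecustomize.py.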

The patch is relatively small, but since it is an extension: Do I need
to write a PEP for it?

Regards,
Martin


From tismer@tismer.com  Sun Jan  7 18:05:21 2001
From: tismer@tismer.com (Christian Tismer)
Date: Sun, 07 Jan 2001 20:05:21 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
 <14934.43496.322436.612746@anthem.wooz.org>
 <20010105152058.A6016@glacier.fnational.com> <14935.11870.360839.235102@beluga.mojam.com>
Message-ID: <3A58AFE1.3AB619BD@tismer.com>


Skip Montanaro wrote:
> 
>     Neil> I think you, Skip and Moshe are missing a big advantage of having
>     Neil> the __exports__ mechanism.  It should allow some attribute access
>     Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
>     Neil> think that optimization could be implemented without too much
>     Neil> difficulty.
> 
> True enough, that hadn't occurred to me.  Knowing that now, I still don't
> think consistency of the interface should suffer as a result of
> under-the-covers performance gains.

Ok, vice versa:
Given that we can support access control via __exports__
for modules, classes and instances as well, *and* if we
can think up a scheme that allows a LOAD_FAST like speedup
for all of these cases at the same time,
then I would say +1, otherwise -0, half-hearted solution.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@python.org  Sun Jan  7 21:13:01 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:13:01 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 13:30:32 EST."
 <20010107133032.F4586@thyrsus.com>
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>
 <20010107133032.F4586@thyrsus.com>
Message-ID: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>

> Tim Peters <tim.one@home.com>:
> > I agree an .eof() method would be better than a data member.  Note that
> > whenever Python internals hit stream EOF today, they call clearerr(), so
> > simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> > make sure that feof() would never be useful <0.8 wink>.
> 
[ESR]
> That's inconvenient, but only means the internal Python state flag
> that feof() would inspect would have to be checked after each read.

This was done because some platforms set feof() when there's still a
possibility to read more (e.g. after an interactive user typed ^D),
while others don't.  It's inconvenient to get an endless stream of
EOFs from stdin when a user typed ^D to one particular prompt, so I
decided to clear the EOF status.

[ESR in a later message]
> I considered trying a zero-length read() in Python, but this strikes me 
> as inelegant even if it would work.

I doubt that a zero-length read conveys any information.  It should
return "" whether or not there is more to read!  Plus, look at the
implementation of readline() (file_readline() in
Objects/fileobject.c): it shortcuts the n == 0 case and returns an
empty string without touching the file.

[me]
> > Before adding an eof() method, can you explain what your program is
> > trying to do?  Is it reading from a pipe or socket?  Then select() or
> > poll() might be useful.

[ESR again]
> Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
> it's a situation where a markup file can contain sections in two different
> languages.  The design requires the first interpreter to exit on seeing
> either EOF or a marker that says "switching to second language".  For
> reasons too complicated to explain, it would be best if the parser for
> the first language didn't simply call the second parser.
> 
> The logic I wanted to write amounts to:
> 
> while 1:
>     line = fp.readline()
>     if not line or line == "history":
>         break
>     interpret_in_language_1(line)
> 
> if not fp.feof():
>     while 1:
>         line = fp.readline()
>         if not line:
>             break
>         interpret_in_language_2(line)
> 
> I just tested the zero-length-read method.  That worked.  I guess I'll
> use it.

Bizarre (given what I know about zero-length read).  But in the above
code, you can replace "if not fp.feof()" with "if line".  In other
words, you just have to carry the state over within your program.

So, I see no reason why the logic in your program couldn't take care
of this, which in general is a better way to solve a problem than
changing the language.

Also note that in Python it's no sin to attempt to read a line even
when the file is already at EOF -- you will simply get an empty line
again.
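
A minimal sketch of carrying the state over in the program, assuming
the switch marker is a line of its own.  The function name
interpret_sections and the marker default "history\n" are hypothetical
(readline() keeps the trailing newline, hence the "\n"), and the two
lists stand in for the two interpreters.

```python
import io


def interpret_sections(fp, marker="history\n"):
    """Split a stream into the lines for language 1 and language 2."""
    first, second = [], []
    while True:
        line = fp.readline()
        if not line or line == marker:
            break
        first.append(line)          # stand-in for the first interpreter

    # 'line' is "" only at EOF, so a non-empty value means the loop
    # stopped on the marker and there may be a second section.
    if line:
        while True:
            line = fp.readline()
            if not line:
                break
            second.append(line)     # stand-in for the second interpreter
    return first, second


both = interpret_sections(io.StringIO("a\nb\nhistory\nc\n"))
only_first = interpret_sections(io.StringIO("a\n"))
```

The last value of line is the only state needed; no question has to be
put to the file object.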

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Sun Jan  7 21:29:46 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Sun, 7 Jan 2001 22:29:46 +0100
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>              <20010107133032.F4586@thyrsus.com>  <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <035901c078f0$f6180f70$e46940d5@hagrid>

Guido van Rossum wrote:
> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.

and if that's too hard, just hide the state in
a class:

class FileWrapper:

    def __init__(self, file):
        self.__file = file
        self.__line = None

    def __more(self):
        # try reading another line
        if not self.__line:
            self.__line = self.__file.readline()

    def eof(self):
        self.__more()
        return not self.__line

    def readline(self):
        self.__more()
        line = self.__line
        self.__line = None
        return line

file = open("myfile.txt")

file = FileWrapper(file)

while not file.eof():
    print repr(file.readline())

</F>



From guido@python.org  Sun Jan  7 21:32:26 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:32:26 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
In-Reply-To: Your message of "Sat, 06 Jan 2001 22:16:31 EST."
 <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>
Message-ID: <200101072132.QAA32627@cj20424-a.reston1.va.home.com>

> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.
> 
> test_builtin fails because raw_input() isn't stripping a trailing newline.
> I've got my own code in this area that *may* be to blame, but I don't see
> how it could be.  I note that fileobject.c's new function get_line_raw has
> the comment
> 
> /* Internal routine to get a line for raw_input():
>    strip trailing '\n', raise EOFError if EOF reached immediately
> */
> 
> but the code doesn't look for a trailing newline (let alone strip one).

My bad.  Try the latest CVS now.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Sun Jan  7 22:15:27 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 17:15:27 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 04:13:01PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <20010107171527.A5093@thyrsus.com>

Guido van Rossum <guido@python.org>:
> [ESR in a later message]
> > I considered trying a zero-length read() in Python, but this strikes me 
> > as inelegant even if it would work.
> 
> I doubt that a zero-length read conveys any information.  It should
> return "" whether or not there is more to read!

Duh.  Of course it would.  

You know, I've always been half-consciously dissatisfied with Python's
use of "" as an EOF marker, and now I know why.  It's precisely
because there's no way to distinguish these cases.  I think a zero-length
read ought to return "" and a read on EOF ought to return None.

> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.
> 
> So, I see no reason why the logic in your program couldn't take care
> of this, which in general is a better way to solve a problem than
> changing the language.

OK, two objections, one practical and one (more important) esthetic:

Practical: I guess I oversimplified the code for expository purposes.
What's actually going on is that I have two parser classes both based
on shlex -- they do character-at-a-time input and don't actually
*have* accessible line buffers.

Esthetic: Yes, I can have the first parser set a flag, or return some
EOF token.  But this seems deeply wrong to me, because EOFness is not
a property of the parser but of the underlying stream object.  It
seems to me that my program ought to be able to ask the stream object
whether it's at EOF rather than carrying its own flag for that state.

In Python as it is, there's no clean way to do this.  I'd have to do a
nonzero-length read to test it (I failed to check the right alternate
case before when I tried zero-length).  That's really broken.  What if
neither the underlying stream nor the parser supports pushback?

Do you see now why I think this is a more general issue?

Now, another and more general way to handle this would be to make an
equivalent of the old FIONREAD ioctl part of Python's standard set of
file object methods -- a way to ask "how many bytes are ready to be
read in this stream?"

Trivial to make it work for plain files, of course.  Harder to make it   
work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
results of fstat()'s st_size field (corrected for the current seek address
if you're reading a plain file) would be a good start.
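
A rough sketch of the plain-file case only, assuming a file opened in
binary mode: the file size from os.fstat() minus the current seek
address.  The names bytes_remaining and at_eof are hypothetical, and
nothing here is meaningful for pipes, fifos, sockets or terminals.

```python
import os
import tempfile


def bytes_remaining(fp):
    """Bytes between the current position and the end of a plain file."""
    size = os.fstat(fp.fileno()).st_size
    return max(size - fp.tell(), 0)


def at_eof(fp):
    """True when a plain file has nothing left to read."""
    return bytes_remaining(fp) == 0


# Small demonstration on a temporary file.
with tempfile.TemporaryFile() as fp:
    fp.write(b"hello\nworld\n")
    fp.flush()
    fp.seek(0)
    before = bytes_remaining(fp)    # nothing consumed yet
    fp.readline()
    middle = bytes_remaining(fp)    # one line consumed
    fp.read()
    after = at_eof(fp)              # everything consumed
```

This answers the question without reading any data, so no pushback is
needed, but only because a plain file has a known size.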
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.


From tismer@tismer.com  Sun Jan  7 22:37:55 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 00:37:55 +0200
Subject: [Python-Dev] ANN: Stackless Python 2.0
Message-ID: <3A58EFC3.5A722FF0@tismer.com>

Dear community,

I'm happy to announce that

		Stackless Python 2.0

is finally ready and available for download.

Stackless Python for Python 1.5.2+ also got some minor
enhancements. Both versions are available as Win32
installer files here:

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc15-win32.exe

Speed: Stackless Python for Python 2.0 is again a bit faster
than the original. This time even better: About 9-10 percent.
I have to say that optimization was much harder this time.
My speed patches are now done by a Python script, which will
make maintenance and diff reading much easier in the future.

There is now also a bit of example code available, like
the uthread9.py Microthreads module from Will Ware, Just van Rossum,
and Mike Fletcher.

Source code and an update to the website will become available in
the next few days.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From mal@lemburg.com  Mon Jan  8 00:26:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 01:26:00 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin
 test_charmapcodec test_pow
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>
Message-ID: <3A590918.E90031AA@lemburg.com>

Tim Peters wrote:
> 
> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_charmapcodec is my fault... I should run the tests in a
clean room environment before checkin: my PYTHONPATH picked up
some other file that it was not supposed to.

I'll fix it next week.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Mon Jan  8 04:13:26 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 7 Jan 2001 23:13:26 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>

The "Win32" readline() hack is now checked in, but there's really nothing
Win32-specific about it anymore.  It makes one mild assumption about what
the C std doesn't clearly address but may have intended:  that in case of a
non-NULL return, fgets doesn't overwrite any of the buffer positions beyond
the terminating null byte (the std is clear that it doesn't overwrite
anything at all in case of a NULL-because-EOF return, but I can't say
whether they're pointing that out as a consequence, or pointing that out as
an exception).

I'm curious about how it performs (relative to the getc_unlocked hack) on
other platforms.  If you'd like to try that, just recompile fileobject.c
with

    USE_MS_GETLINE_HACK

#define'd.  It should *work* on any platform with fgets() meeting the
assumption.  The new test_bufio.py std test gives it a pretty good
correctness workout, if you're worried about that.



From esr@snark.thyrsus.com  Mon Jan  8 04:16:53 2001
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 23:16:53 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
Message-ID: <200101080416.f084GrM10912@snark.thyrsus.com>

Setting things up so curses is autoconfigured into the default build
if your system has it in the expected places turned out to be dead
easy.  Some clever person (the BDFL himself?) wrote the build process
so that there is *already* a Setup.config.in that gets configure
expansions done on it, with the generated Setup.config used when
makesetup does its magic.

As a bonus, I've also added autoconfiguration for readline.  A small
detail, but one which I suspect many people building their own Pythons
frequently trip over.

The technique generalizes easily.  The archetype for a facility for
autoconfiguring libfoo with a Python extension foo.c if it's present
has just two steps:

Add this to Modules/Setup.config.in:

@USE_FOO_MODULE@foo foo.c -lfoo

Add this to configure.in:

# This is used to generate Setup.config
AC_SUBST(USE_FOO_MODULE)
AC_CHECK_LIB(foo, random_foo_function, 
	[USE_FOO_MODULE=""],
	[USE_FOO_MODULE="#"])

(Apologies for the lack of description with the patch.  I tripped over
a SourceForge interface bug.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The possession of arms by the people is the ultimate warrant
that government governs only with the consent of the governed.
        -- Jeff Snyder


From tim.one@home.com  Mon Jan  8 05:34:20 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 00:34:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <3A590918.E90031AA@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>

An update:  test_builtin works again (thanks, Guido!), and test_charmapcodec
will "next week" (thanks, MAL!).

Still unknown (to me):  is the test_pow failure unique to Windows?  One
response from a Unix(tm) geek would settle that.



From nas@arctrix.com  Sun Jan  7 22:59:49 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 14:59:49 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 12:34:20AM -0500
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>
Message-ID: <20010107145949.A14166@glacier.fnational.com>

On Mon, Jan 08, 2001 at 12:34:20AM -0500, Tim Peters wrote:
> Still unknown (to me):  is the test_pow failure unique to Windows?  One
> response from a Unix(tm) geek would settle that.

It works fine for me on Linux.  I thought I tested on Windows
before checking in the coerce patch.  I'll try again.

  Neil


From nas@arctrix.com  Sun Jan  7 23:29:14 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:29:14 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107145949.A14166@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 02:59:49PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com>
Message-ID: <20010107152914.A14228@glacier.fnational.com>

On Sun, Jan 07, 2001 at 02:59:49PM -0800, Neil Schemenauer wrote:
> It works fine for me on Linux.  I thought I tested on Windows
> before checking in the coerce patch.  I'll try again.

Weird. rt.bat does not run the test_pow script.  If I run
"regrtest test_pow" then the test fails.  It could be a problem
with line endings (I copied the source from a Unix CVS checkout).

Anyhow, I found the bug.  I don't know how test_pow was passing
under Linux.  Time to reboot again.

  Neil


From tim.one@home.com  Mon Jan  8 06:39:20 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 01:39:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>

[NeilS]
> Weird. rt.bat does not run the test_pow script.

Works for me, else I never would have noticed <wink>.  Also works for me in
single-test mode:

C:\Code\python\dist\src\PCbuild>rt test_pow

C:\Code\python\dist\src\PCbuild>python ../lib/test/regrtest.py test_pow
test_pow
The actual stdout doesn't match the expected stdout.
This much did match (between asterisk lines):
**********************************************************************
test_pow
Testing integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing long integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing floating point mode...
    Testing 3-argument pow() function...
The number in both columns should match.
3 3
-5 -5
-1 -1
5 5
-3 -3
-7 -7

3L 3L
-5L -5L
-1L -1L
5L 5L
-3L -3L
-7L -7L

3.0 3.0
-5.0 -5.0
-1.0 -1.0
-7.0 -7.0

**********************************************************************
Then ...
We expected (repr): ''
But instead we got: 'Float mismatch:'
test test_pow failed -- Writing: 'Float mismatch:', expected: ''
1 test failed: test_pow

C:\Code\python\dist\src\PCbuild>

That may point to the problem, too:  the canned output file is truncated?

> If I run "regrtest test_pow" then the test fails.  It could be a
> problem with line endings (I copied the source from a Unix CVS
> checkout).

Don't understand; e.g., "copied" what, from where to where?  I'm not sure I
gave you write access to my box, and hacking into Windows machines is uncool
because it's not challenging <wink>.

> Anyhow, I found the bug.  I don't know how test_pow was passing
> under Linux.  Time to reboot again.

Cool!  BTW, Windows solves the "don't reboot enough" problem for you via
automation, sometimes on an hourly basis.

Thanks for sharing the brain cells, Neil!



From thomas@xs4all.net  Mon Jan  8 06:44:11 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 07:44:11 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101080416.f084GrM10912@snark.thyrsus.com>; from esr@snark.thyrsus.com on Sun, Jan 07, 2001 at 11:16:53PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com>
Message-ID: <20010108074411.N2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> Setting things up so curses is autoconfigured into the default build
> if your system has it in the expected places turned out to be dead
> easy.  Some clever person (the BDFL himself?) wrote the build process
> so that there is *already* a Setup.config.in that gets configure
> expansions done on it, with the generated Setup.config used when
> makesetup does its magic.

Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
auto-detect bsddb. However, I still think it should be a separate
'configure', in the Modules directory. Especially now that Andrew is
practically checking in the distutils setup ;) The main configure can make
an educated guess whether Python and distutils are available, and call
configure with some passed-through options if not. It does depend on what
the distutils setup does, though, and I'll shamefully admit that I haven't
looked at that ;P

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From nas@arctrix.com  Sun Jan  7 23:51:16 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:51:16 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 01:39:20AM -0500
References: <20010107152914.A14228@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>
Message-ID: <20010107155116.A14312@glacier.fnational.com>

On Mon, Jan 08, 2001 at 01:39:20AM -0500, Tim Peters wrote:
> [NeilS]
> > If I run "regrtest test_pow" then the test fails.  It could be a
> > problem with line endings (I copied the source from a Unix CVS
> > checkout).
> 
> Don't understand; e.g., "copied" what, from where to where?

I should have been clearer.  I mean the problem with rt.bat not
running test_pow.  I copied the CVS source from my Linux ext2
filesystem to a VFAT filesystem.  I was too lazy to fix the line
endings.

  Neil


From nas@arctrix.com  Sun Jan  7 23:52:38 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:52:38 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 03:29:14PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com> <20010107152914.A14228@glacier.fnational.com>
Message-ID: <20010107155238.A14291@glacier.fnational.com>

On Sun, Jan 07, 2001 at 03:29:14PM -0800, Neil Schemenauer wrote:
> I don't know how test_pow was passing under Linux.

Under Linux with the buggy float_pow:

    >>> pow(10.0, 0, 10)
    nan
    >>> pow(10.0, 0, 10) == 1
    1
    >>> pow(10.0, 0, 10) == 0
    1

Under Windows NAN obviously behaves differently.

  floating-point-is-fun-ly y'rs Neil


From esr@thyrsus.com  Mon Jan  8 06:49:45 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 01:49:45 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108074411.N2467@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 07:44:11AM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>
Message-ID: <20010108014945.A19516@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> > Setting things up so curses is autoconfigured into the default build
> > if your system has it in the expected places turned out to be dead
> > easy.  Some clever person (the BDFL himself?) wrote the build process
> > so that there is *already* a Setup.config.in that gets configure
> > expansions done on it, with the generated Setup.config used when
> > makesetup does its magic.
> 
> Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
> auto-detect bsddb. However, I still think it should be a separate
> 'configure', in the Modules directory.

You may be right.  Still, this patch solves the immediate problem in a
reasonably clean way, and I urge that it should go in.  We can do a
more complete reorganization of the build process later.  (I'll help with
that; I'm pretty expert with autoconf and friends.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"As to the species of exercise, I advise the gun. While this gives
[only] moderate exercise to the body, it gives boldness, enterprise,
and independence to the mind.  Games played with the ball and others
of that nature, are too violent for the body and stamp no character on
the mind. Let your gun, therefore, be the constant companion to your
walks."
        -- Thomas Jefferson, writing to his teenaged nephew.


From tim.one@home.com  Mon Jan  8 07:05:46 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:05:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>

Well, I like __exports__ (but not some details of the patch, for which see
my SF comments).  Guido is aware of the optimization possibilities, but
that's not what's driving it.  I don't know why he likes it; I like it
because the only normal use for a module is to do module.attr, or "from
module import attr", and dir(module) very often exposes stuff today that the
module author had no intention of exporting.  For example, if I do

    import os
    dir(os)

under CVS Python today, on my box I see that os exports "i".  It's bound to
_exit.  That's baffling, and is purely an accident of how module os.py
initialization works when you're running on Windows.

Couple that with the fact that I've hardly ever seen (or bothered to write) a module
docstring spelling out everything a module *intends* to export, and an
__exports__ line near the top (when present) would also automagically give a
solid answer to that question.

modules aren't classes or instances, and in normal practice modules
accumulate all sorts of accidental attrs (due to careless (== normal)
imports, and module init code).  It doesn't make any *sense* that os exports
"sys" either, or that random exports "cos", or that cgi exports "string", or
... this inelegance is ubiquitous.
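
A small sketch of the accidental-export problem.  It does not use the
proposed __exports__; it swaps in __all__, a different mechanism that
only limits what "from module import *" copies.  The module names
leaky_demo and tidy_demo and the helper functions are made up for the
illustration.

```python
import sys
import types


def make_module(name, source):
    """Build a throwaway module from source text and register it."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod


def star_import(name):
    """Return the sorted names that 'from <name> import *' would bind."""
    namespace = {}
    exec("from %s import *" % name, namespace)
    namespace.pop("__builtins__", None)   # added by exec, not by the import
    return sorted(namespace)


# This module imports sys for its own use and leaks it by accident.
make_module("leaky_demo", "import sys\ndef useful():\n    return 42\n")

# This module states what it intends to export.
make_module(
    "tidy_demo",
    "import sys\n__all__ = ['useful']\ndef useful():\n    return 42\n",
)

leaky_names = star_import("leaky_demo")
tidy_names = star_import("tidy_demo")
```

Note that tidy_demo.sys is still reachable as an attribute; only the
star-import is narrowed, which is weaker than the access control
discussed for __exports__.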

In a world with an __exports__ that gets used, though, I do wonder whether
people will or won't export their test() functions.  I really like that they
do now.

or-maybe-it's-just-that-i-like-modules-that-*have*-a-
    test-function<wink>-ly y'rs  - tim




From gstein@lyra.org  Mon Jan  8 07:25:32 2001
From: gstein@lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 23:25:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:05:46AM -0500
References: <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010107232532.V17220@lyra.org>

On Mon, Jan 08, 2001 at 02:05:46AM -0500, Tim Peters wrote:
>...
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

Simple question: so what?

"Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim.one@home.com  Mon Jan  8 07:29:39 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:29:39 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107155238.A14291@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>

[Neil Schemenauer]
> Under Linux with the buggy float_pow:
>
>     >>> pow(10.0, 0, 10)
>     nan
>     >>> pow(10.0, 0, 10) == 1
>     1
>     >>> pow(10.0, 0, 10) == 0
>     1
>
> Under Windows NAN obviously behaves differently.

Comparisons with NaN are a platform-dependent accident, partly because some
C compilers generate nonsense code, partly because Python isn't coded to
cater to NaN's peculiarities either.  The behavior under Windows is
(accidentally) better in these cases today (NaN should never compare equal
to anything -- not even to itself -- and, curiously, MSVC's codegen mistakes
cancel out Python's mistakes in this case!).
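
For comparison, a short check of the rule "NaN never compares equal,
not even to itself", assuming a current CPython 3 on IEEE 754 hardware
rather than the 2.0 interpreter discussed here.  float("inf") is used
because that spelling is portable; the variable names are illustrative.

```python
import math

inf = float("inf")
nan = inf - inf                     # inf - inf has no defined value

is_nan = math.isnan(nan)
equal_to_itself = (nan == nan)      # float comparison follows 754 here
unequal_to_itself = (nan != nan)

# Containers check identity before equality, so the very same NaN
# object is still "found" in a list that holds it.
found_in_list = nan in [nan]
```

The last line shows an identity shortcut still at work: it is applied
by container lookups, not by == on the floats themselves.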

Thank you for fixing the bug.  Only test_charmapcodec is failing for me now,
and MAL knows the cause and cure.

nothing-can-stop-the-alpha-now-ly y'rs  - tim



From thomas@xs4all.net  Mon Jan  8 07:42:30 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 08:42:30 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:29:39AM -0500
References: <20010107155238.A14291@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>
Message-ID: <20010108084230.O2467@xs4all.nl>

On Mon, Jan 08, 2001 at 02:29:39AM -0500, Tim Peters wrote:

> (NaN should never compare equal to anything -- not even to itself

You know that's impossible, in Python, right ? (Due to the shortcut taken by
'==', based on object identity.) Is that going to be 'fixed', too ? :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From ping@lfw.org  Mon Jan  8 07:51:11 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 7 Jan 2001 23:51:11 -0800 (PST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>

Hi again.

Sorry to bother you if you're busy -- i haven't seen any responses
about inspect.py for a few days and wanted to know what your
reactions were.  The module and test suite are still at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

The only change since my announcement last Wednesday is that
getframe() has been renamed to getframeinfo().

Thanks,


-- ?!ng

"Old code doesn't die -- it just smells that way."
    -- Bill Frantz



From tim.one@home.com  Mon Jan  8 08:17:57 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:57 -0500
Subject: NaN nonsense (was RE: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow)
In-Reply-To: <20010108084230.O2467@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEPIHAA.tim.one@home.com>

>> (NaN should never compare equal to anything -- not even to itself

[Thomas Wouters]
> You know that's impossible, in Python, right ? (Due to the
> shortcut taken by '==', based on object identity.)

Surely you jest:  I probably knew that while you were still nursing <wink>.

OTOH, Python on WinTel comes remarkably close (by accident):

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Jan  5 2001, 00:33:19) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> inf = 1e300**2
>>> inf
1.#INF
>>> nan = inf - inf
>>> nan
-1.#IND
>>> nan2 = nan * 1.0
>>> nan2
-1.#IND
>>> nan == nan2
0
>>>

> Is that going to be 'fixed', too ? :)

Not if I can help it.  I'd be in favor of adding an fcmp function that needs
to be called explicitly when you want the full complexity of 754
comparisons.  Count them all up, and there are 32 distinct 754 binary float
comparison operators!  The 754 std says 26 (from memory, may be 2 more or
less) of those have to be supplied, but-- since 754 is not a language
std --says nothing about how they're to be spelled.
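
To make the idea concrete, here is a minimal sketch in present-day Python (not the 2.0 build shown above). The name fcmp comes from the paragraph above, but its signature and return values here are assumptions; it distinguishes only four outcomes, nowhere near the full set of 754 predicates.

```python
import math

nan = float("nan")
nan2 = nan * 1.0  # arithmetic on a NaN yields a NaN

# In current CPython, float == follows 754 even for the same object ...
equal_to_other = (nan == nan2)
equal_to_self = (nan == nan)
# ... but containers test identity first, so the shortcut survives there.
found_in_list = (nan in [nan])


def fcmp(a, b):
    """Hypothetical explicit comparison: 'lt', 'eq', 'gt' or 'unordered'."""
    if math.isnan(a) or math.isnan(b):
        return "unordered"
    if a < b:
        return "lt"
    if a > b:
        return "gt"
    return "eq"
```

Keeping the unordered case in an explicit function leaves the ordinary operators simple, which is the trade-off being argued for.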

OTOH, C99 resolutely tries to map that into C, and 754 True Believers will
use that as a club.

On the third hand, as Tom MacDonald posted here earlier (he was X3J11
chair), he's not sure anyone will ever implement C99 in whole.  The
complexities of full 754 support are a large part of why he worries about
that.

too-much-too-late-ly y'rs  - tim



From tim.one@home.com  Mon Jan  8 08:17:59 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:59 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>

[Greg Stein]
> Simple question: so what?
>
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Couldn't care less about the module author.  It's the module user who has to
sort this stuff out.  "Don't use 'import *'" is good advice but not followed
either, and after I do

from MyPackage import sys  # intentionally exports its own sys
from GregSnort import *    # accidentally exports some other sys

madness ensues.  Like I said, it's inelegant, and at best.

Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
a break.
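
The clash can be reproduced in a few lines. This is a sketch in present-day Python; the module names my_package, greg_snort and greg_tidy are made up, and __all__ (which current Python honors for "import *") stands in for __exports__, whose exact semantics are not pinned down in this thread.

```python
import sys
import types


def make_module(name, source):
    """Build a throwaway module from source text and register it for import."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    sys.modules[name] = mod
    return mod


make_module("my_package", "sys = 'my own sys object'\n")
make_module("greg_snort", "import sys\ndef snort():\n    return 'snort'\n")
make_module(
    "greg_tidy",
    "import sys\n__all__ = ['snort']\ndef snort():\n    return 'snort'\n",
)

# Without an explicit export list, the star import drags in greg_snort.sys.
ns_clobbered = {}
exec("from my_package import sys\nfrom greg_snort import *", ns_clobbered)

# With an explicit export list, only the listed names arrive.
ns_clean = {}
exec("from my_package import sys\nfrom greg_tidy import *", ns_clean)
```

After this runs, ns_clobbered["sys"] is the real sys module, while ns_clean["sys"] is still the string from my_package.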



From gstein@lyra.org  Mon Jan  8 08:26:03 2001
From: gstein@lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 00:26:03 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:17:59AM -0500
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
Message-ID: <20010108002603.X17220@lyra.org>

On Mon, Jan 08, 2001 at 03:17:59AM -0500, Tim Peters wrote:
> [Greg Stein]
> > Simple question: so what?
> >
> > "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*
> 
> Couldn't care less about the module author.  It's the module user who has to
> sort this stuff out.  "Don't use 'import *'" is good advice but not followed
> either, and after I do
> 
> from MyPackage import sys  # intentionally exports its own sys
> from GregSnort import *    # accidentally exports some other sys
> 
> madness ensues.  Like I said, it's inelegant, and at best.
> 
> Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
> module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
> a break.

hehe... adding __exports__ to your module is fine. Adding more crud to
Python, in opposition to the "we're all adults" motto, doesn't seem Right.

Somebody wants to use "from foo import *" on a module not designed for it?
Too bad for them. If you're suggesting __exports__ is to patch over problems
caused by "from foo import *", then I think you're barking up the wrong tree
:-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From moshez@zadka.site.co.il  Mon Jan  8 16:50:57 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon,  8 Jan 2001 18:50:57 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>

[Tim Peters]
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

[Greg Stein]
> Simple question: so what?
> 
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Let me "me too" here:
Put another way, what Greg said is just a rephrase of "don't use from
foo import * unless foo's docos say it's OK". Add to that the simple
access control of a leading underscore, and I don't see any place
which needs it.

Something better to do would be to use
import foo as _foo

in some standard library modules, and minimize using from foo import bar
in them. Since everyone knows that a leading underscore means "implementation
detail - ignore at your convenience, use at your peril", this would keep
the "we're all adults" philosophy of Python, with all the advantages
*I* see in __exports__.
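
A small sketch of the convention, in present-day Python; the module name circle_demo and its contents are invented for illustration.

```python
import types

# Source of a hypothetical module that hides its helper import.
source = (
    "import math as _math\n"
    "def area(r):\n"
    "    return _math.pi * r * r\n"
)

mod = types.ModuleType("circle_demo")
exec(source, mod.__dict__)

# Names a reader (or a star import without __all__) would treat as public.
public = sorted(n for n in vars(mod) if not n.startswith("_"))
```

Only area survives the underscore filter; _math stays reachable as mod._math for anyone who insists.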

One more point against __exports__, which I hoped I would not have to
make (but when I'm up against the timbot *and* Guido, I need to pull
out the heavy artillery): it would *totally* stop any hope in the
future of module level __getattr__ (or at least complicate the semantics).
I think Alex M. is thinking of a PEP, but he's taking his time, since
no PEPs can be considered until 2.1 is out.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Mon Jan  8 08:49:58 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:49:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108002603.X17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFBIHAA.tim.one@home.com>

[Greg Stein]
> hehe... adding __exports__ to your module is fine. Adding more
> crud to Python, in opposition to the "we're all adults" motto,
> doesn't seem Right.

My idea of what's Right is copied from my boss <wink>.

> Somebody wants to use "from foo import *" on a module not designed
> for it?  Too bad for them.

How is someone supposed to know whether a module "was designed" for import*?
Even Tkinter (which just about everyone does "import *" on) also exports
sys, and everything from the "types" module, by accident too.

> If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the
> wrong tree
> :-)

Indeed.  But I'm suggesting that the problems that *can* arise from
"import*" illustrate the fundamental silliness of exporting things by
accident.  It's come up much more often for me when I'm looking over
someone's shoulder, teaching them how to use dir() in an interactive shell
to answer their own damn questions <0.5 wink>.  It's usually the case that
dir(M) shows them something that isn't documented, and over time I am *not*
pleased that "oh, I guess the 'string' in there is just crap" is how they
learn to view it.

I can live without __exports__; but I'd prefer not to, because I would
always use it if it were there.

if-i'd-both-use-it-and-heartily-recommend-it-it's-hard-to-
    oppose-it-ly y'rs  - tim



From m.favas@per.dem.csiro.au  Mon Jan  8 11:48:40 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Mon, 08 Jan 2001 19:48:40 +0800
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
Message-ID: <3A59A918.E0D02E0D@per.dem.csiro.au>

I last successfully downloaded from CVS, compiled, linked and tested on
Dec. 22 last year. For the last week or so, the current CVS
_cursesmodule.c gives a bunch of compiler warning messages of the form:

cc: Warning: ./_cursesmodule.c, line 619: In this statement,
"derwin(...)" of type "int", is being converted to "pointer to struct
_win_st". (cvtdiftypes)
  win = derwin(self->win,nlines,ncols,begin_y,begin_x);
--^
cc: Warning: ./_cursesmodule.c, line 1259: In this statement,
"subpad(...)" of type "int", is being converted to "pointer to struct
_win_st". (cvtdiftypes)
    win = subpad(self->win, nlines, ncols, begin_y, begin_x);
----^
cc: Warning: ./_cursesmodule.c, line 1488: In this statement,
"termname(...)" of type "int", is being converted to "pointer to const
char". (cvtdiftypes)
NoArgReturnStringFunction(termname)
^
(more elided)

and

cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg1" is
fetched but not initialized.  And there may be other such fetches of this
variable that have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg2" is
fetched but not initialized.  And there may be other such fetches of this
variable that have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
(more elided)

and at link time, fails with:

ld:
Unresolved:
getbegyx
getmaxyx
getparyx


I've held off bothering anyone about this, but it begins to look as
though no-one else has noticed... My platform? Tru64 Unix, V4.0F (aka
OSF1). The recent pow() bug hit this platform, too. Happy to do any
testing...



-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From guido@python.org  Mon Jan  8 14:27:50 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:27:50 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 01:49:45 EST."
 <20010108014945.A19516@thyrsus.com>
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>
 <20010108014945.A19516@thyrsus.com>
Message-ID: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>

> You may be right.  Still, this patch solves the immediate problem in a
> reasonably clean way, and I urge that it should go in.  We can do a
> more complete reorganization of the build process later.  (I'll help with
> that; I'm pretty expert with autoconf and friends.)

I expect Andrew's code to go in before 2.1 is released.  So I don't
see a reason why we should hurry and check in a stop-gap measure.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jan  8 14:33:09 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:33:09 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 00:26:03 PST."
 <20010108002603.X17220@lyra.org>
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
 <20010108002603.X17220@lyra.org>
Message-ID: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>

> hehe... adding __exports__ to your module is fine. Adding more crud to
> Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> 
> Somebody wants to use "from foo import *" on a module not designed for it?
> Too bad for them. If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the wrong tree
> :-)

You haven't been answering many newbie questions lately, have you? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jan  8 15:06:28 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:06:28 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 17:15:27 EST."
 <20010107171527.A5093@thyrsus.com>
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
 <20010107171527.A5093@thyrsus.com>
Message-ID: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>

> > So, I see no reason why the logic in your program couldn't take care
> > of this, which in general is a preferred way to solve a problem than
> > to change the language.
> 
> OK, two objections, one practical and one (more important) esthetic:
> 
> Practical: I guess I oversimplified the code for expository purposes.
> What's actually going on is that I have two parser classes both based
> on shlex -- they do character-at-a-time input and don't actually
> *have* accessible line buffers.

And what's wrong with always starting the second parser?  If the
stream was at EOF it will simply process zero lines.  Or does your
parser have a problem with empty input?

> Esthetic: Yes, I can have the first parser set a flag, or return some
> EOF token.  But this seems deeply wrong to me, because EOFness is not
> a property of the parser but of the underlying stream object.  It
> seems to me that my program ought to be able to ask the stream object
> whether it's at EOF rather than carrying its own flag for that state.

Eric, before we go further, can you give an exact definition of
EOFness to me?

> In Python as it is, there's no clean way to do this.  I'd have to do a
> nonzero-length read to test it (I failed to check the right alternate
> case before when I tried zero-length).  That's really broken.  What if
> neither the underlying stream nor the parser supports pushback?
> 
> Do you see now why I think this is a more general issue?

No.  What's wrong with just setting the parser loose on the input and
letting it deal with EOF?  In your example, apparently a line
containing the word "history" signals that the rest of the file must
be parsed by the second parser.  What if "history" is the last line of
the file?  The eof() test can't tell you *that*!
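
A rough sketch of that approach in present-day Python, using two shlex instances on one stream as a stand-in for the parsers under discussion (the real ones are not shown in this thread). It assumes tokens are separated by whitespace; a punctuation character right after "history" would be pushed back into the first parser and lost to the second.

```python
import io
import shlex


def split_at_history(text):
    """Tokenize with one parser up to 'history', then hand the stream to a second."""
    stream = io.StringIO(text)

    head = []
    for token in shlex.shlex(stream):
        if token == "history":
            break
        head.append(token)

    # No EOF test: the second parser just runs and yields nothing at EOF.
    tail = list(shlex.shlex(stream))
    return head, tail


normal = split_at_history("alpha beta\nhistory\ngamma delta\n")
history_last = split_at_history("alpha\nhistory")
empty = split_at_history("")
```

The three cases (history in the middle, history as the last line, and empty input) all go through the same code path.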

> Now, another and more general way to handle this would be to make an
> equivalent of the old FIONCLEX ioctl part of Python's standard set of 
> file object methods -- a way to ask "how many bytes are ready to be
> read in this stream?"

There's no portable way to do that.

> Trivial to make it work for plain files, of course.  Harder to make it   
> work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
> results of the fstat.size field (corrected for the current seek address
> if you're reading a plain file) would be a good start.

This seems totally the wrong level to solve your problem.
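
For plain files only, the fstat-based idea quoted above amounts to something like the sketch below (present-day Python; the helper names bytes_remaining and demo are made up). It says nothing useful about pipes, sockets or terminals, which is exactly where the portability objection bites.

```python
import os
import tempfile


def bytes_remaining(f):
    """Bytes between the current position and the end of a regular binary file."""
    # st_size is the file's total size; tell() is the logical read position.
    return os.fstat(f.fileno()).st_size - f.tell()


def demo():
    """Write ten bytes, read four back, and report what is left."""
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()  # make sure fstat sees all ten bytes
        f.seek(0)
        f.read(4)
        return bytes_remaining(f)


left = demo()
```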

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Mon Jan  8 23:13:21 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 01:13:21 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
 <20010108002603.X17220@lyra.org>
Message-ID: <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido@python.org> wrote:
> > hehe... adding __exports__ to your module is fine. Adding more crud to
> > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > 
> > Somebody wants to use "from foo import *" on a module not designed for it?
> > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > caused by "from foo import *", then I think you're barking up the wrong tree
> > :-)
> 
> You haven't been answering many newbie questions lately, have you? :-)

Well, I have. 
And frankly, I think having "from foo import *" issue a warning at 2.1
is a *much* better solution.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From guido@python.org  Mon Jan  8 15:15:20 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Tue, 09 Jan 2001 01:13:21 +0200."
 <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>
 <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>

[Greg]
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > > 
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)

[Guido]
> > You haven't been answering many newbie questions lately, have you? :-)

[Moshe]
> Well, I have. 
> And frankly, I think having "from foo import *" issue a warning at 2.1
> is a *much* better solution.

(1) For what problem?

(2) Under exactly what circumstances do you want from foo import *
    to issue a warning?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jan  8 15:26:21 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:26:21 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>
Message-ID: <3A59DC1D.29DE500B@lemburg.com>

"Martin v. Loewis" wrote:
> 
> Authors of extension packages often find the need to auto-import some
> of their modules. This is often needed for registration, e.g. a codec
> author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
> may need to register a search function with codecs.register. This is
> currently only possible by writing into sitecustomize.py, which must
> be done by the system administrator manually.
> 
> To enhance the service of site.py, I've written the patch
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470
> 
> which treats lines in PTH files which start with "import" as
> statements and executes them, instead of appending these lines to
> sys.path.
> 
> The patch is relatively small, but since it is an extension: Do I need
> to write a PEP for it?

Just curious: wouldn't this introduce a /tmp-style problem to
Python ?

The scenario is quite simple: a Python script runs under root.
The script could pick up a lingering .pth file (e.g. from /tmp
or one of its subdirs -- distutils does this !) and then executes
arbitrary code as *root*.
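
To see why the source of a .pth file matters, here is a sketch of the line-by-line decision the patch describes, written in present-day Python. The function name classify_pth_lines is made up, and it deliberately only sorts the lines instead of executing anything; the handling of blank lines and "#" comments is an assumption, not taken from the patch.

```python
def classify_pth_lines(text):
    """Split .pth content into (path entries, import statements to execute)."""
    paths, statements = [], []
    for line in text.splitlines():
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue  # assumed: blanks and comments are ignored
        if line.startswith(("import ", "import\t")):
            statements.append(line)  # would be run as code
        else:
            paths.append(line)  # would be appended to sys.path
    return paths, statements


sample = "# demo\nJapaneseCodecs\nimport japanese; japanese.register()\n"
paths, statements = classify_pth_lines(sample)
```

Anything that lands in the second list runs with the privileges of whoever started the interpreter, so a stray file in a scanned directory is enough.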

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim@interet.com  Mon Jan  8 15:43:05 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 10:43:05 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <3A59E009.96922CA5@interet.com>

There are a number of problems which frequently recur on c.l.p
that can serve as a source of Python improvement ideas.
On December 30, 2000 gerson.kurz@t-online.de (Gerson Kurz) writes:

   If I embed Python in a Win32 console application (using
   Demo\embed.c), everything works fine. If I take the very same piece of
   code and put it in a Win32 Windows application (not MFC, just a plain
   WinMain()) I see no output (and more importantly so, no errors),
   because the application does not have a stdout/stderr set up.

This is well known.  Windows developers must replace sys.stdout and
sys.stderr with alternative mechanisms.  Unfortunately this solution
does not completely work because errors can occur before sys.stdout
is replaced.  I propose patching pythonw.exe (WinMain.c) and adding
a new module to fix this so it Just Works.  The patch is completely
Windows specific.  I am not sure if this constitutes a PEP, but would
like everyone's feedback anyway.

Design Requirements

1) "pythonw.exe myfile.py" will give the usual error message if
   myfile.py does not exist.

2) "pythonw.exe myfile.py" will give the usual traceback for a
   syntax error in myfile.py.

3) pythonw.exe will provide a useful C-language stdout/stderr so
   the user does not have to replace sys.stdout/err herself.

4) None of the above will interfere with the user's replacement
   of sys.stdout/err for her own purposes.

Description of Patch

A new module winstdoutmodule.c (138 lines) is included in Windows
builds. It contains a C entry point PyWin_StdoutReplace() which
creates a valid C stdout/err, and code to display output
in a popup dialog box.  There is a Python entry point
winstdout.print() to display output, but it is only used
for special purposes, and the typical user will never import
winstdout.

The file WinMain.c calls PyWin_StdoutReplace() before it
calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This
is meant to display startup error messages.  Normally,
any available output is displayed when the system is idle.

Technical Details

Some experimentation (as opposed to documentation) shows that
Win32 programs have a valid FILE * stdout, but fileno(stdout)
gives INVALID_HANDLE_VALUE; the FILE * has an invalid OS file
object.  It is tempting to hack the FILE structure directly.
But it is more prudent to use the only documented way to
replace stdout, namely the standard call "freopen()" (also
available on Unix).  The design uses this call to open a
temporary file to append stdout and stderr output.  To
display output, the file is checked when the system is
idle, and MessageBox() is called with the file contents if any.
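
The same capture-then-display idea can be sketched at the Python level. This is only an analogue in present-day Python, not the C patch: it swaps sys.stdout/sys.stderr rather than the C-level FILE pointers, and returning a string takes the place of the MessageBox() call. The names run_capturing and noisy are made up.

```python
import contextlib
import tempfile
import traceback


def run_capturing(func):
    """Run func with stdout/stderr appended to a temp file; return what was written."""
    with tempfile.TemporaryFile(mode="w+") as log:
        with contextlib.redirect_stdout(log), contextlib.redirect_stderr(log):
            try:
                func()
            except Exception:
                # print_exc() looks up sys.stderr at call time, so this is captured too.
                traceback.print_exc()
        log.seek(0)
        return log.read()


def noisy():
    print("hello from a windowless app")
    raise ValueError("boom")


captured = run_capturing(noisy)
```

A GUI front end would then show the captured text in a dialog whenever it is non-empty.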

Status

After a few false starts, I now have working code.

Is this a good idea?  If so, is the implementation optimal
(comments from MarkH especially welcome)?

JimA


From mal@lemburg.com  Mon Jan  8 15:52:32 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:52:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
 <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59E240.7F77790E@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido@python.org> wrote:
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > >
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)
> >
> > You haven't been answering many newbie questions lately, have you? :-)
> 
> Well, I have.
> And frankly, I think having "from foo import *" issue a warning at 2.1
> is a *much* better solution.

Why raise a warning ? "from xyz import *" is still very useful in
interactive sessions and also has some merits when it comes to
importing all subpackages of a package (well, at least those listed
in __all__).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From barry@digicool.com  Mon Jan  8 15:54:10 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 8 Jan 2001 10:54:10 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
 <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
 <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <14937.58018.792925.31985@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

    MZ> it would *totally* stop any hope in the future of module level
    MZ> __getattr__ (or at least complicate the semantics).  I think
    MZ> Alex M. is thinking of a PEP, but he's taking his time, since
    MZ> no PEPs can be considered until 2.1 is out.

Given the current discussion, I'm now -1 on __exports__ unless a PEP
is written.  I think enough issues and interactions have been brought
up that a PEP is warranted first.

-Barry



From moshez@zadka.site.co.il  Tue Jan  9 00:03:00 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 02:03:00 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>
 <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 10:15:20 -0500, Guido van Rossum <guido@python.org> wrote:

> (1) For what problem?

Users seeing things they didn't expect in their modules.

> (2) Under exactly what circumstances do you want from foo import *
>     to issue a warning?

All.
If you want to be less extreme, don't warn if the module defines
a __from_star_ok__

But in any case, I'm done with this thread. We probably won't
manage to convince each other.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From guido@python.org  Mon Jan  8 16:04:58 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 11:04:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 10:54:10 EST."
 <14937.58018.792925.31985@anthem.wooz.org>
References: <20010107232532.V17220@lyra.org> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
 <14937.58018.792925.31985@anthem.wooz.org>
Message-ID: <200101081604.LAA04464@cj20424-a.reston1.va.home.com>

> Given the current discussion, I'm now -1 on __exports__ unless a PEP
> is written.  I think enough issues and interactions have been brought
> up that a PEP is warranted first.

I have to agree.  I am no longer championing this patch.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Mon Jan  8 16:27:17 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 8 Jan 2001 10:27:17 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
 <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
Message-ID: <14937.60005.951163.80255@beluga.mojam.com>

    Ping> Sorry to bother you if you're busy -- i haven't seen any responses
    Ping> about inspect.py for a few days and wanted to know what your
    Ping> reactions were.

Fiddling code bits is not the sort of stuff I do very often, but every time
I do I wind up having to reacquaint myself with all sorts of object details
that slip out of my brain shortly after the latest need is gone.  Having a
module that hides the details seems like a good idea to me.

+1.  I vote it go into 2.1 assuming a bit for the library reference can be
written in time.

Skip


From akuchlin@mems-exchange.org  Mon Jan  8 16:31:09 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:31:09 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108113109.C7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
>I expect Andrew's code to go in before 2.1 is released.  So I don't
>see a reason why we should hurry and check in a stop-gap measure.

But it might not; the final version might be unacceptable or run into
some intractable problem.  Assuming the patch is correct (I haven't
looked at it), why not check it in?  The work has already been done to
write it, after all.

--amk



From akuchlin@mems-exchange.org  Mon Jan  8 16:41:10 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:41:10 -0500
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
In-Reply-To: <3A59A918.E0D02E0D@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Mon, Jan 08, 2001 at 07:48:40PM +0800
References: <3A59A918.E0D02E0D@per.dem.csiro.au>
Message-ID: <20010108114110.D7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 07:48:40PM +0800, Mark Favas wrote:
>I last successfully downloaded from CVS, compiled, linked and tested on
>Dec. 22 last year. For the last week or so, the current CVS
>_cursesmodule.c gives a bunch of compiler warning messages of the form:

Hmm... on Dec. 22 there was a sizable change to export a C API from
the module; since then there's only been one minor change.  Perhaps
the last version you compiled successfully was from before I checked
in those changes.  In any case, I'll look into it as soon as my Compaq
test drive account is usable and I have access to a Tru64 4.0
machine again.  Thanks for the report!

Once the PEP 229 changes go in, many more modules will be tried on
many more platforms.  It might be worth considering setting up a
Tinderbox for Python, or at least doing a systematic test on several
platforms before releases.

--amk



From paulp@ActiveState.com  Mon Jan  8 16:46:47 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:46:47 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <3A59EEF7.BB4118BD@ActiveState.com>

Tim Peters wrote:
> 
> ... It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

I agree strongly. I think that Python people are careless about what
their module dictionaries look like. My two main annoyances are modules
that export other modules randomly and modules that export huge whacks of
constants.

> Indeed.  But I'm suggesting that the problems that *can* arise from
> "import*" illustrate the fundamental silliness of exporting things by
> accident.  It's come up much more often for me when I'm looking over
> someone's shoulder, teaching them how to use dir() in an interactive shell
> to answer their own damn questions <0.5 wink>.  It's usually the case that
> dir(M) shows them something that isn't documented, and over time I am *not*
> pleased that "oh, I guess the 'string' in there is just crap" is how they
> learn to view it.

Screw dir()! Let's talk about important stuff: Komodo. And Idle. And
WingIDE. And PythonWorks and PythonWin. :)

How are class browsers and "intellisense prompters" supposed to know
that it "makes sense" to prompt the user with os.path but not
CGIHTTPServer.os.path?
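
One way a tool could make that call, sketched in present-day Python: treat a module-level export list as the author's statement of intent and flag any other module objects found in the namespace. The helper name stray_modules is made up, and __all__ is used here because current modules carry it; it is a stand-in for whatever __exports__ would have provided.

```python
import inspect
import os


def stray_modules(mod):
    """Public module-valued attributes of mod that its export list does not mention."""
    declared = set(getattr(mod, "__all__", ()))
    return sorted(
        name
        for name, value in vars(mod).items()
        if inspect.ismodule(value)
        and not name.startswith("_")
        and name not in declared
    )


strays = stray_modules(os)
```

For os this reports sys as an accident while leaving os.path alone, since path is in the export list.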

Overall, I think Tim is right. We are all adults here and part of being
adults is keeping your privates private and your nose clean.

 Paul Prescod


From paulp@ActiveState.com  Mon Jan  8 16:47:39 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:47:39 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59EF2B.792801E5@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> Let me "me to" here:
> Put another way, what Greg said is just a rephrase of "don't use from
> foo import * unless foo's docos say it's OK". 

That's not the issue. It's not about keeping people out of your module.
In fact I would propose that mod.__dict__ should be as loose as ever.

It's a user interface issue. If we encourage people to learn about
modules in interactive environments like the prompt using dir(), class
browsers and IDEs then we need to create modules that are friendly for
those users. I think that the current situation is pretty bad that way.
Why does CGIHTTPServer export BaseHTTPServer? And why is
CGIHTTPServer.CGIHTTPServer a class but CGIHTTPServer.BaseHTTPServer is
a module?

We go to great lengths to make the syntax newbie friendly. I think that
we should make similar efforts toward a cleanly reflective class library.

> Add to that the simple
> access control of a leading underscore, and I don't see any place
> which needs it.
> 
> Something better to do would be to use
> import foo as _foo

It's pretty clear that nobody does this now and nobody is going to start
doing it in the near future. It's too invasive and it makes the code too
ugly. Why obfuscate thousands of lines of code when a simple feature can
mitigate that?

>...
> One more point against __exports__, which I hoped I would not have to
> make (but when I'm up against the timbot *and* Guido, I need to pull
> out the heavy artillery): it would *totally* stop any hope in the
> future of module level __getattr__ (or at least complicate the semantics).
> I think Alex M. is thinking of a PEP, but he's taking his time, since
> no PEPs can be considered until 2.1 is out.

__exports__ would merely be considered an implementation detail of the
"default __getattr__". Custom __getattr__'s could decide whether to
respect it or not. It doesn't complicate anything much.

 Paul Prescod


From nas@arctrix.com  Mon Jan  8 09:54:55 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 01:54:55 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>; from jim@interet.com on Mon, Jan 08, 2001 at 10:43:05AM -0500
References: <3A59E009.96922CA5@interet.com>
Message-ID: <20010108015455.A15138@glacier.fnational.com>

On Mon, Jan 08, 2001 at 10:43:05AM -0500, James C. Ahlstrom wrote:
> Is this a good idea?  If so, is the implementation optimal
> (comments from MarkH especially welcome)?

The general idea sounds good to me.  Having tracebacks go nowhere
when running pythonw is un-Python-like.  I don't know enough
about MFC, etc. to comment on the specifics of your patch.

  Neil


From akuchlin@mems-exchange.org  Mon Jan  8 16:49:13 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:49:13 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EEF7.BB4118BD@ActiveState.com>; from paulp@ActiveState.com on Mon, Jan 08, 2001 at 08:46:47AM -0800
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com>
Message-ID: <20010108114913.E7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 08:46:47AM -0800, Paul Prescod wrote:
>How are class browsers and "intellisense prompters" supposed to know
>that it "makes sense" to prompt the user with os.path but not
>CGIHTTPServer.os.path. 

Could we then simply adopt __exports__ as a convention for such
browsers, but with no changes to core Python to support it?  Browsers
would then follow the algorithm "Use __exports__ if present, dir() if
not."  

--amk


From paulp@ActiveState.com  Mon Jan  8 16:51:26 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:51:26 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A59F00E.53A0A32A@ActiveState.com>

Tim Peters wrote:
> 
> ....
> 
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

If you can create a sample program that demonstrates the unsafety I'll
anonymously submit it as a bug on our internal system and ensure that
the next version of Perl is as slow as Python. :)

Seriously: If someone comes at me with
Perl-IO-is-way-faster-than-Python-IO, I'd like to know what concretely
they've given up in order to achieve that performance. And even just
for my own interest I'd like to understand the cost/benefit of
stream thread safety. For instance would it make sense to just write
a thread-safe wrapper for streams used from multiple threads?
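
A minimal sketch of the wrapper idea in the last sentence, written in
modern Python: every call on the wrapped stream takes a single lock.  The
class name LockedStream is hypothetical, and the sketch says nothing about
how stdio or Perl handle locking internally.

```python
import io
import threading


class LockedStream:
    """Serialize access to a file-like object shared between threads."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def readline(self):
        # Only one thread at a time may touch the underlying stream.
        with self._lock:
            return self._stream.readline()

    def write(self, data):
        with self._lock:
            return self._stream.write(data)


shared = LockedStream(io.StringIO())
shared.write("one line\n")
```

Streams used from a single thread would skip the wrapper and pay no
locking cost, which is the trade-off the question is asking about.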

 Paul Prescod


From paulp@ActiveState.com  Mon Jan  8 17:01:49 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 09:01:49 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <3A59F27D.C27B8CD0@ActiveState.com>

Andrew Kuchling wrote:
> 
> ...
> 
> Could we then simply adopt __exports__ as a convention for such
> browsers, but with no changes to core Python to support it?  Browsers
> would then follow the algorithm "Use __exports__ if present, dir() if
> not."

dir() is one of the "interactive tools" I'd like to work better in the
presence of __exports__. On the other hand, dir() works pretty poorly
for object instances today so maybe we need something new anyhow. 
Perhaps attrs()? 

If there were an "attrs()" and it basically returned __exports__ if it
existed and dir() if it didn't, then I would buy it. Graphical apps
would just build on attrs().
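
The proposal above can be sketched in a few lines of modern Python.  Both
attrs() and __exports__ are names from this thread, not existing Python
features, so treat this as an illustration of the fallback rule only.

```python
import types


def attrs(obj):
    """Names a browser should show: __exports__ if present, else dir()."""
    exports = getattr(obj, "__exports__", None)
    if exports is not None:
        return sorted(exports)
    return dir(obj)


# A stand-in module that picked up "os" as a side effect of importing it.
demo = types.ModuleType("demo")
demo.os = object()
demo.handler = object()

without_exports = attrs(demo)       # falls back to dir(), includes "os"
demo.__exports__ = ["handler"]
with_exports = attrs(demo)          # only the declared name
```

A class browser built on such a function would need no change to core
Python, which matches Andrew's suggestion of treating it as a convention.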

 Paul


From MarkH@ActiveState.com  Mon Jan  8 17:04:31 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 09:04:31 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>

> Is this a good idea?  If so, is the implementation optimal

I'm really on the fence here.  Note however that your solution does not solve
the original problem.  Eg, your example is:

> On December 30, 2000 gerson.kurz@t-online.de (Gerson Kurz) writes:
>
>    If I embedd Python in a Win32 console application (using
>    Demo\embed.c), everything works fine. If I take the very same piece

But your solution involves:

> The file WinMain.c calls PyWin_StdoutReplace() before it
> calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This

Note that the original problem was _embedding_ Python - thus, you need to
patch _their_ WinMain to make it work for them - something you can't do.

Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I
am not convinced they would - it is almost certain they will still need to
redirect output to somewhere useful, so why bother redirecting it
temporarily just to redirect it for real immediately after?

Finally, I am slightly concerned about the possibility of "hanging" certain
programs. For example, I believe that DCOM will often invoke a COM server in
a different "desktop" than the user (this is also true for Services, but
Python services don't use pythonw.exe).  Thus, a Python program may end up
hanging with a dialog box, but in the context where no user is able to see
it.  However, this could be addressed by adding a command-line option to
prevent this new behaviour kicking in.

I would prefer to see a decent API for extracting error and traceback
information from Python.  On the other hand, I _do_ see the problem for
"newbies" trying to use pythonw.exe.

So - I guess I am saying that I don't see this as optimal, and it doesn't
solve the original problem you pointed at - but in the interests of making
pythonw.exe seem "less broken" for newbies, I could live with this as long
as I could prevent it when necessary.
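
As a script-level illustration of extracting error and traceback
information (not the C API discussed above), a program run without a
console can catch its own exceptions and write the formatted traceback
somewhere visible.  The helper name run_logged and the log path are
hypothetical.

```python
import os
import tempfile
import traceback


def run_logged(func, log_path):
    """Call func(); on failure append the traceback to log_path."""
    try:
        func()
    except Exception:
        with open(log_path, "a") as log:
            log.write(traceback.format_exc())
        return False
    return True


log_path = os.path.join(tempfile.mkdtemp(), "errors.log")
ok = run_logged(lambda: 1 / 0, log_path)   # ok is False, traceback is logged
```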

Another option would be to use the Win32 Console APIs, and simply attempt to
create a console for the error message.  Eg, maybe PyErr_Print() could be
changed to check for the existence of a console, and if not found, create
it.  However, the problem with this approach is that the error message will
often be printed just as the process is terminating - meaning you will see a
new console with the error message for about 0.025 of a second before it
vanishes due to process termination.  Any sort of "press any key to
terminate" option then leaves us in the same position - if no user can see
the message, the process appears hung.

Mark.



From Andreas Jung <andreas@andreas-jung.com>  Mon Jan  8 17:06:16 2001
From: Andreas Jung <andreas@andreas-jung.com> (Andreas Jung)
Date: Mon, 8 Jan 2001 18:06:16 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A58EFC3.5A722FF0@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 12:37:55AM +0200
References: <3A58EFC3.5A722FF0@tismer.com>
Message-ID: <20010108180616.A18993@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> Dear community,
> 
> I'm happy to announce that
> 
> 		Stackless Python 2.0
> 
> is finally ready and available for download.
> 
> Stackless Python for Python 1.5.2+ also got some minor
> enhancements. Both versions are available as Win32
> installer files here:

Are there patches available against the standard Python 2.0 
source code tree ?

Andreas 


From tismer@tismer.com  Mon Jan  8 16:15:55 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 18:15:55 +0200
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de>
Message-ID: <3A59E7BB.6908B7E2@tismer.com>


Andreas Jung wrote:
> 
> On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> > Dear community,
> >
> > I'm happy to announce that
> >
> >               Stackless Python 2.0
> >
> > is finally ready and available for download.
> >
> > Stackless Python for Python 1.5.2+ also got some minor
> > enhancements. Both versions are available as Win32
> > installer files here:
> 
> Are there patches available against the standard Python 2.0
> source code tree ?

I had no time yet to put the source trees on the web.
Should happen in one or two days.
Then I will probably not provide patches, hoping that
some other Unix people will catch up and provide that
part. This worked the same for the 1.5.2 version.

The 2.0 port consists of 10 or so files, which can be used
as direct replacements for the same files in the 2.0 distro.
I think on Unix this is the right way to go.
For me it is simpler to have my own little tree, since I'm
working with Windows, and I just have to modify my VC++
project file.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From moshez@zadka.site.co.il  Tue Jan  9 01:30:09 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 03:30:09 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
References: <3A59F27D.C27B8CD0@ActiveState.com>, <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <20010109013009.37D6DA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:01:49 -0800, Paul Prescod <paulp@ActiveState.com> wrote:

> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 
> Perhaps attrs()? 
> 
> If there were an "attrs()" and it basically returned __exports__ if it
> existed and dir() if it didn't, then I would buy it. Graphical apps
> would just build on attrs().

Even better, __exports__ could be what was imported in 
from foo import *.
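
Modern Python's __all__ list behaves the way this suggestion describes:
when a module defines it, "from module import *" binds only the listed
names.  A small sketch, assuming a throwaway in-memory module called
foo_demo (a hypothetical name):

```python
import sys
import types

source = '''
import math
__all__ = ["area"]

def area(r):
    return math.pi * r * r

def helper():
    return "not exported"
'''

# Build the module in memory and register it so it can be imported.
mod = types.ModuleType("foo_demo")
exec(source, mod.__dict__)
sys.modules["foo_demo"] = mod

namespace = {}
exec("from foo_demo import *", namespace)
imported = sorted(n for n in namespace if not n.startswith("__"))
print(imported)  # ['area']
```

Note that mod.__dict__ still holds math and helper, so nothing is hidden
from code that asks for those names explicitly.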
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From Andreas Jung <andreas@andreas-jung.com>  Mon Jan  8 17:25:36 2001
From: Andreas Jung <andreas@andreas-jung.com> (Andreas Jung)
Date: Mon, 8 Jan 2001 18:25:36 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A59E7BB.6908B7E2@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 06:15:55PM +0200
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de> <3A59E7BB.6908B7E2@tismer.com>
Message-ID: <20010108182536.A20361@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 06:15:55PM +0200, Christian Tismer wrote:
> 
> The 2.0 port consists of 10 or so files, which can be used
> as direct replacements for the same files in the 2.0 distro.
> I think on Unix this is the right way to go.
> For me it is simpler to have my own little tree, since I'm
> working with Windows, and I just have to modify my VC++
> project file.

I would prefer a tar.gz archive that contains just the modified files.
With this approach it is easily possible to extract the archive inside
the Python source tree.

Andreas


From loewis@informatik.hu-berlin.de  Mon Jan  8 17:51:28 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 18:51:28 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>

> Just curious: wouldn't this introduce a /tmp-style problem to
> Python ?

I tried, but I could not produce such a problem.

> The scenario is quite simple: a Python script runs under root.
> The script could pick up a lingering .pth file (e.g. from /tmp
> or one of its subdirs -- distutils does this !) and then executes
> arbitrary code as *root*.

No, Python looks only in a few places for .pth files:
{<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}

so it won't pick up pth files in /tmp.
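
The brace pattern above expands to a short, fixed list of directories.  A
sketch of that expansion, using a hypothetical helper name and example
prefixes; it mirrors the pattern quoted in this message, not necessarily
what any particular site.py does:

```python
def pth_search_dirs(prefix, exec_prefix, version):
    """Expand {prefix,exec_prefix}{,/lib/pythonX.Y/site-packages,/lib/site-python}."""
    suffixes = [
        "",
        "/lib/python" + version + "/site-packages",
        "/lib/site-python",
    ]
    dirs = []
    for base in (prefix, exec_prefix):
        for suffix in suffixes:
            candidate = base + suffix
            if candidate not in dirs:   # prefix and exec_prefix often coincide
                dirs.append(candidate)
    return dirs


for directory in pth_search_dirs("/usr/local", "/usr/local", "2.0"):
    print(directory)
```

Every entry is derived from the installation prefixes, which is why a
stray file under /tmp is never considered.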

Regards,
Martin


From esr@thyrsus.com  Mon Jan  8 18:01:37 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 13:01:37 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 10:06:28AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>
Message-ID: <20010108130137.E22834@thyrsus.com>

Guido van Rossum <guido@python.org>:
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

A file is at EOF when attempts to read more data from it will fail
returning no data.

> What's wrong with just setting the parser loose on the input and
> letting it deal with EOF?

Nothing wrong in theory, but it's a problem in practice.  I don't want
to import the second parser unless it's actually needed, because it's much
larger than the first one.

>                                In your example, apparently a line
> containing the word "history" signals that the rest of the file must
> be parsed by the second parser.  What if "history" is the last line of
> the file?  The eof() test can't tell you *that*!

Right.  That case never happens.  I mean it *really* never happens :-).

What we're talking about is a game system.  The first parser recognizes
a spec language for describing games of a particular class (variants of
Diplomacy, if that's meaningful to you).  The system keeps logfiles which
consist of a section in the game description language, optionally
followed by the token "history" and an order log.

The parser for the order log language is a *lot* larger than the one
for the description language.  This is why I said I don't want the
first parser to just call the second.  I want to test for EOF to
know whether I have to import the second parser at all!

Here's the beginning of my problem: the first parser can't export a line
buffer, because it doesn't *have* a line buffer.  It's a subclass of
shlex and does single-character reads.

There are two ways I can cope with this.  One is to do a (nonzero)
length read after the first parser exits; the other is to have the
first parser set a state flag controlling whether the second parser
loads.

This is where it bites that I can't test for EOF with a read(0). The
second shlex parser only has token-level pushback!  If I do a
nonzero-length read and I get data, I'm screwed.  On the other hand
(as I said before) setting a lexer state flag seems wrong, because
EOFness is a property of the underlying stream rather than the parser.
I'd be duplicating state that exists in the stdio stream structure
anyway; it ought to be accessible.

> > Now, another and more general way to handle this would be to make an
> > equivalent of the old FIONREAD ioctl part of Python's standard set of
> > file object methods -- a way to ask "how many bytes are ready to be
> > read in this stream?"
> 
> There's no portable way to do that.

Actually, fstat(2) is portable enough to support a very useful
approximation of FIONREAD.  I know, because I tried it.

Last night I coded up a "waiting" method for file objects that calls
fstat(2) on the associated file descriptor.  For a plain file, it
then subtracts the result of ftell() from the fstat size field and
returns that -- for other files, it simply returns the size field.
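
The method described above can be approximated in modern Python with
os.fstat().  This is a sketch of the description, not the submitted patch;
the function name waiting comes from the message, everything else is an
assumption, and only the plain-file case is exercised here.

```python
import os
import stat
import tempfile


def waiting(f):
    """Approximate number of bytes still readable from file object f."""
    st = os.fstat(f.fileno())
    if stat.S_ISREG(st.st_mode):
        # Plain file: size on disk minus the current read position.
        return st.st_size - f.tell()
    # FIFOs, sockets, etc.: report whatever the size field holds.
    return st.st_size


with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()
    f.seek(0)
    f.read(4)
    remaining = waiting(f)
print(remaining)  # 6
```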

I then tested this on plain files, FIFOs, and sockets under Linux. It
turns out fstat(2) gives useful information in all three cases (a
count of characters waiting in the buffer in the latter two).  I expected
this; it should be true under all current Unixes.

fstat(2) does not give useful size-field results for Linux block
devices.  I didn't test the character (terminal) devices.  (I
documented my results in Python's Doc/lib/stat.tex, in a patch I have
already submitted to SourceForge.)

I would be quite surprised if the plain-file case didn't work on Mac
and Windows.  I would be a little surprised if the socket case failed,
because all three probably inherited fstat(2) from the ancestral BSD
TCP/IP stack.

Just having the plain-file case work would, IMHO, be justification
enough for this method.  If it turns out to be portable across Mac and
Windows sockets as well, *huge* win.  Could this be tested by someone
with access to Windows and Mac systems?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

An armed society is a polite society.  Manners are good when one 
may have to back up his acts with his life.
        -- Robert A. Heinlein, "Beyond This Horizon", 1942



From mal@lemburg.com  Mon Jan  8 18:10:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:10:50 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <3A5A02AA.675A35D1@lemburg.com>

Martin von Loewis wrote:
> 
> > Just curious: wouldn't this introduce a /tmp-style problem to
> > Python ?
> 
> I tried, but I could not produce such a problem.
> 
> > The scenario is quite simple: a Python script runs under root.
> > The script could pick up a lingering .pth file (e.g. from /tmp
> > or one of its subdirs -- distutils does this !) and then executes
> > arbitrary code as *root*.
> 
> No, Python looks only in a few places for .pth files:
> {<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}
> 
> so it won't pick up pth files in /tmp.

Hmm, but what if the Python script picks up a site.py which is
different from the standard one distributed with Python ?

The code adding (and with the patch: executing) the .pth files
is defined in site.py and it is rather easy to override this
file by adding a modified site.py file to the current working dir...
a potential security hole in its own right, I guess :(

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@python.org  Mon Jan  8 18:30:34 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:30:34 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Mon, 08 Jan 2001 13:01:37 EST."
 <20010108130137.E22834@thyrsus.com>
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>
 <20010108130137.E22834@thyrsus.com>
Message-ID: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>

Eric, take a hint.  You're not going to get your eof() method no
matter what arguments you bring up.  But I'll explain it to you again
anyway... :-)

> Guido van Rossum <guido@python.org>:
> > Eric, before we go further, can you give an exact definition of
> > EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

I was afraid you would say this.  That's not a condition that's easy
to calculate without doing I/O, *and* that's not the condition that
you are interested in for your problem.  According to your definition,
f.eof() should be true in this example:

    f = open("/etc/passwd")
    f.seek(0, 2)                 # Seek to end of file
    print f.eof()                # What will this print???
    print `f.readline()`         # Will print ''

But getting the right result here requires a lot of knowledge about
how the file is implemented!  While you've explained how this can be
implemented on Unix, it can't be implemented with just the tools that
stdio gives us.  Going beyond stdio in order to implement a feature is
a grave decision.  After all, Python is portable to many
less-than-mainstream operating systems (VxWorks, OS/9, VMS...).  Now,
if this was just a speed hack (like xreadlines) I could accept having
some platform-dependent code, if at least there was a portable way to
do it that was just a bit slower.  But here you can't convince me that
this can be done in a portable way, and I don't want to force porters
to figure out how to do this for their platform before their port can
work.  I also don't want to make f.eof() a non-portable feature: *if*
it is provided, it's too important for that.

Note that stdio's feof() doesn't have this definition!  It is set when
the last *read* (or getc(), etc.) stumbled upon an EOF condition.
That's also of limited value; it's mostly defined so you can
distinguish between errors and EOF when you get a short read.  The
stdio feof() flag would be false in the above example.

> > What's wrong with just setting the parser loose on the input and
> > letting it deal with EOF?
> 
> Nothing wrong in theory, but it's a problem in practice.  I don't want
> to import the second parser unless it's actually needed, because it's much
> larger than the first one.

So be practical and let the first parser set a global flag that tells
you whether it's necessary to load the second one.

> >                                In your example, apparently a line
> > containing the word "history" signals that the rest of the file must
> > be parsed by the second parser.  What if "history" is the last line of
> > the file?  The eof() test can't tell you *that*!
> 
> Right.  That case never happens.  I mean it *really* never happens :-).
> 
> What we're talking about is a game system.  The first parser recognizes
> a spec language for describing games of a particular class (variants of
> Diplomacy, if that's meaningful to you).  The system keeps logfiles which
> consist of a section in the game description language, optionally
> followed by the token "history" and an order log.
> 
> The parser for the order log language is a *lot* larger than the one
> for the description language.  This is why I said I don't want the
> first parser to just call the second.  I want to test for EOF to
> know whether I have to import the second parser at all!
> 
> Here's the beginning of my problem: the first parser can't export a line
> buffer, because it doesn't *have* a line buffer.  It's a subclass of
> shlex and does single-character reads.
> 
> There are two ways I can cope with this.  One is to do a (nonzero)
> length read after the first parser exits; the other is to have the
> first parser set a state flag controlling whether the second parser
> loads.

Do the latter.  Nothing wrong with it that I can see.
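
A minimal sketch of the state-flag approach in modern Python, assuming the
first parser is a shlex subclass as described.  The class name, the
read_description helper and the sample input are all hypothetical:

```python
import io
import shlex


class DescriptionLexer(shlex.shlex):
    """Stand-in for the small first parser; records seeing "history"."""

    def __init__(self, stream):
        super().__init__(stream)
        self.saw_history = False

    def get_token(self):
        token = super().get_token()
        if token == "history":
            self.saw_history = True
        return token


def read_description(stream):
    """Return (tokens before "history", whether the second parser is needed)."""
    lexer = DescriptionLexer(stream)
    tokens = []
    while True:
        token = lexer.get_token()
        if token == lexer.eof or lexer.saw_history:
            break
        tokens.append(token)
    return tokens, lexer.saw_history


log = io.StringIO("name standard\nhistory\nF LON - NTH\n")
tokens, need_second_parser = read_description(log)
if need_second_parser:
    pass  # import the large order-log parser only at this point
```

The flag records what the lexer has already seen, so no extra read and no
pushback of raw characters is required.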

> This is where it bites that I can't test for EOF with a read(0).

And can you tell me a system where you *can* test for EOF with a
read(0)?  I've never heard of such a thing.  The Unix read() system
call has the same properties as Python's f.read().  I'm pretty sure
that fread() with a zero count also doesn't give you the information
you're after.
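
A quick way to see this in modern Python: a zero-length read returns an
empty result whether or not more data follows, so it cannot distinguish
EOF.  The temporary file is only scaffolding for the demonstration.

```python
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"data")
    f.seek(0)
    at_start = f.read(0)    # data is available, yet the result is empty
    f.seek(0, 2)            # seek to end of file
    at_end = f.read(0)      # also empty: same answer as above
    probe = f.read(1)       # only a real read reveals there is nothing left

print(at_start == at_end)   # True
```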

> The
> second shlex parser only has token-level pushback!  If I do a
> nonzero-length read and I get data, I'm screwed.  On the other hand
> (as I said before) setting a lexer state flag seems wrong, because
> EOFness is a property of the underlying stream rather than the parser.
> I'd be duplicating state that exists in the stdio stream structure
> anyway; it ought to be accessible.

Bullshit.  The EOFness that you're after (according to your own
definition) is not the same as the EOFness of the stdio stream.  The
EOFness in the stdio stream could help you, but Python resets it -- so
that making it available wouldn't be as easy as you claim.  Anyway,
you seem to have a sufficiently vague idea of what "EOFness" means
that I don't think providing access to whatever low-level EOFness
condition might exist would do you much good.

> > > Now, another and more general way to handle this would be to make an
> > > equivalent of the old FIONREAD ioctl part of Python's standard set of
> > > file object methods -- a way to ask "how many bytes are ready to be
> > > read in this stream?"
> > 
> > There's no portable way to do that.
> 
> Actually, fstat(2) is portable enough to support a very useful
> approximation of FIONREAD.  I know, because I tried it.
> 
> Last night I coded up a "waiting" method for file objects that calls
> fstat(2) on the associated file descriptor.  For a plain file, it
> then subtracts the result of ftell() from the fstat size field and
> returns that -- for other files, it simply returns the size field.
> 
> I then tested this on plain files, FIFOs, and sockets under Linux. It
> turns out fstat(2) gives useful information in all three cases (a
> count of characters waiting in the buffer in the latter two).  I expected
> this; it should be true under all current Unixes.
> 
> fstat(2) does not give useful size-field results for Linux block
> devices.  I didn't test the character (terminal) devices.  (I
> documented my results in Python's Doc/lib/stat.tex, in a patch I have
> already submitted to SourceForge.)
> 
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.  I would be a little surprised if the socket case failed,
> because all three probably inherited fstat(2) from the ancestral BSD
> TCP/IP stack.
> 
> Just having the plain-file case work would, IMHO, be justification
> enough for this method.  If it turns out to be portable across Mac and
> Windows sockets as well, *huge* win.  Could this be tested by someone
> with access to Windows and Mac systems?

I don't see the huge win.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jan  8 18:33:26 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:33:26 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:10:50 +0100."
 <3A5A02AA.675A35D1@lemburg.com>
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
 <3A5A02AA.675A35D1@lemburg.com>
Message-ID: <200101081833.NAA05325@cj20424-a.reston1.va.home.com>

Discussions based on Python running as root and picking up untrusted
code from $PYTHONPATH are pointless.  Of course this is a security
hole.  If root runs *any* Python script in a way that could pick up
even a single untrusted module, there's a security hole.  site.py or
*.pth files are just a special case of this, so I don't see why this
is used as an example.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Mon Jan  8 18:48:40 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 13:48:40 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGFIHAA.tim.one@home.com>

[Moshe]
> Something better to do would be to use
> import foo as _foo

[Paul]
> It's pretty clear that nobody does this now and nobody is going
> to start doing it in the near future. It's too invasive and it
> makes the code too ugly.

Actually, this function is one of my std utilities:

def _pvt_import(globs, modname, *items):
    """globs, modname, *items -> import into globs with leading "_".

    If *items is empty, set globs["_" + modname] to module modname.
    If *items is not empty, import each item similarly but don't
    import the module into globs.
    Leave names that already begin with an underscore as-is.

    # import math as _math
    >>> _pvt_import(globals(), "math")
    >>> round(_math.pi, 0)
    3.0

    # import math.sin as _sin and math.floor as _floor
    >>> _pvt_import(globals(), "math", "sin", "floor")
    >>> _floor(3.14)
    3.0
    """

    mod = __import__(modname, globals())
    if items:
        for name in items:
            xname = name
            if xname[0] != "_":
                xname = "_" + xname
            globs[xname] = getattr(mod, name)
    else:
        xname = modname
        if xname[0] != "_":
            xname = "_" + xname
        globs[xname] = mod

Note that it begins with an underscore because it's *meant* to be exported
<0.5 wink>.  That is, the module importing this does

    from utils import _pvt_import

because they don't already have _pvt_import to automate adding the
underscore, and without the underscore almost everyone would accidentally
export "pvt_import" in turn.  IOW,

    import M
    from N import M

not only import M, by default they usually export it too, but the latter is
rarely *intended*.  So, over the years, I've gone thru several phases of
naming objects I *intend* to export with a leading underscore.  That's the
only way to prevent later imports from exporting by accident.  I don't
believe I've distributed any code using _pvt_import, though, because it
fights against the language and expectations.  Metaprogramming against the
grain should be a private sin <0.9 wink>.
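
The accidental-export effect is easy to reproduce in modern Python.  The
sketch below builds two throwaway in-memory modules (hypothetical names)
that differ only in how they import math, then lists their public names:

```python
import types


def public_names(source, name):
    """Execute source as a module and return its non-underscore names."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    return [n for n in dir(mod) if not n.startswith("_")]


plain = "import math\n\ndef area(r):\n    return math.pi * r * r\n"
private = "import math as _math\n\ndef area(r):\n    return _math.pi * r * r\n"

print(public_names(plain, "shapes"))      # ['area', 'math']
print(public_names(private, "shapes2"))   # ['area']
```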

_metaprogramming-ly y'rs  - tim



From mal@lemburg.com  Mon Jan  8 18:40:37 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:40:37 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
 <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A09A5.D0DC33A1@lemburg.com>

Guido van Rossum wrote:
> 
> Discussions based on Python running as root and picking up untrusted
> code from $PYTHONPATH are pointless.  Of course this is a security
> hole.  If root runs *any* Python script in a way that could pick up
> even a single untrusted module, there's a security hole.  site.py or
> *.pth files are just a special case of this, so I don't see why this
> is used as an example.

Agreed; see my reply to Martin.

Still, wouldn't it be wise to add some logic to Python to prevent
importing untrusted modules, e.g. by making sys.path read-only and
disabling the import hook usage using a command line ? 

This would at least prevent the most obvious attacks. I wonder how
RedHat works around these problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim@interet.com  Mon Jan  8 19:16:45 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 14:16:45 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>
Message-ID: <3A5A121D.FDD8C2C1@interet.com>

Mark Hammond wrote:

> Note that the original problem was _embedding_ Python - thus, you need to
> patch _their_ WinMain to make it work for them - something you can't do.

Correct, if they don't use pythonw.exe, but use a different
main program, the new stdout will not be installed.  But then
they must have their own main.c, and they can add the C call.
 
> Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I

Yes, the symbol PyWin_StdoutReplace() is public, and they
can call it.

> am not convinced they would - it is almost certain they will still need to
> redirect output to somewhere useful, so why bother redirecting it
> temporarily just to redirect it for real immediately after?

Redirecting it temporarily is valuable, because if the sys.stdout
replacement occurs in (for example) myprog.py, then "pythonw.exe
myprog.py" will fail to produce any error messages for a syntax error
in myprog.py.

Also, I was hoping further sys.stdout redirection would be unnecessary.
 
> Finally, I am slightly concerned about the possibility of "hanging" certain
> programs. For example, I believe that DCOM will often invoke a COM server in
> a different "desktop" than the user (this is also true for Services, but
> Python services don't use pythonw.exe).  Thus, a Python program may end up
> hanging with a dialog box, but in the context where no user is able to see
> it.  However, this could be addressed by adding a command-line option to
> prevent this new behaviour kicking in.

Limiting the code to pythonw.exe instead of trying to install
it in python20.dll was supposed to prevent damage to the use
of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
I am assuming there is a screen.  The dialog box is started with
MessageBox() and a window handle of GetForegroundWindow().  So
there doesn't need to be an application window.  I have tested it
with GUI programs, and it also works when run from a console.

Having said that, you may be right that there is some way to
hang on a dialog box which can not be seen.  It depends on what
MessageBox() and GetForegroundWindow() actually do.  If it seems
that this patch has merit, I would be grateful if you would review
the code to look for issues of this type.
 
> I would prefer to see a decent API for extracting error and traceback
> information from Python.  On the other hand, I _do_ see the problem for
> "newbies" trying to use pythonw.exe.

There could be an API added to the winstdout module such as
  msg = winstdout.GetMessageText()
which would return saved text, control its display etc.
But then the problem remains of actually displaying the messages
especially in the context of tracebacks and errors.  And it is
probably easier to redirect sys.stdout so it does what you want
rather than use the API.

I do not view winstdout as a "newbie" feature, but rather a
generally useful C-language addition to Python.

> So - I guess I am saying that I don't see this as optimal, and it doesn't
> solve the original problem you pointed at - but in the interests of making
> pythonw.exe seem "less broken" for newbies, I could live with this as long
> as I could prevent it when necessary.

I guess I am saying, perhaps incorrectly, that the mechanism provided
will make further redirection of sys.stdout unnecessary 99% of the
time.  Experimentation shows that Python composes tracebacks and
error messages a line or partial line at a time.  That is, you can
not display each call to printf(), but must wait until the system is
idle to be sure that multiple calls to printf() are complete.  So this
forces you to use the idle processing loop, not rocket science but
at least inconvenient.  And the only source of stdout/err is tracebacks,
error messages and the "print" statement.  What would you do with
these in a Windows program except display an "OK" dialog box?
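
The buffering idea in the paragraph above can be sketched at the Python level. This is not the winstdout C code from the patch. The class name BufferedMessages, its on_idle method and the show callback are assumptions made for the illustration, and a plain list stands in for the "OK" dialog box.

```python
import sys


class BufferedMessages:
    """Collect partial writes and hand them over as one message when idle."""

    def __init__(self, show):
        self._parts = []
        self._show = show  # callable that displays one complete message

    def write(self, text):
        # Tracebacks arrive a line or partial line at a time, so just queue.
        self._parts.append(text)
        return len(text)

    def flush(self):
        pass

    def on_idle(self):
        # Called from the (hypothetical) idle loop: show everything at once.
        if self._parts:
            message = "".join(self._parts)
            self._parts = []
            self._show(message)


shown = []  # stand-in for the dialog box
collector = BufferedMessages(shown.append)

saved_stdout = sys.stdout
sys.stdout = collector
try:
    print("Traceback (most recent call last):")
    print("  File", end=" ")
    print('"myprog.py", line 1')
finally:
    sys.stdout = saved_stdout

collector.on_idle()  # three print calls end up as a single message
```

Displaying on each write would have produced several dialogs for one traceback, which is the inconvenience that forces the idle-loop approach.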

If someone out there knows of a different example of sys.stdout
redirection in use in the real world, it would be helpful if
they would describe it.  Maybe it could be incorporated.

> Another option would be to use the Win32 Console APIs, and simply attempt to
> create a console for the error message.  Eg, maybe PyErr_Print() could be
> changed to check for the existence of a console, and if not found, create
> it.  However, the problem with this approach is that the error message will
> often be printed just as the process is terminating - meaning you will see a
> new console with the error message for about 0.025 of a second before it
> vanishes due to process termination.  Any sort of "press any key to
> terminate" option then leaves us in the same position - if no user can see
> the message, the process appears hung.

Yes, this is a problem with the console API approach.  Another is that
popping up a black console for output instead of the usual "OK"
dialog box is unnatural, and will force the user to replace sys.stdout.
I was hoping this C stdout will make this unnecessary.

JimA


From esr@thyrsus.com  Mon Jan  8 19:17:50 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 14:17:50 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 01:30:34PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com> <20010108130137.E22834@thyrsus.com> <200101081830.NAA05301@cj20424-a.reston1.va.home.com>
Message-ID: <20010108141750.C23214@thyrsus.com>

Guido van Rossum <guido@python.org>:
> [Eric]
> > A file is at EOF when attempts to read more data from it will fail
> > returning no data.
> 
> I was afraid you would say this.  That's not a condition that's easy
> to calculate without doing I/O, *and* that's not the condition that
> you are interested in for your problem.  According to your definition,
> f.eof() should be true in this example:
> 
>     f = open("/etc/passwd")
>     f.seek(0, 2)                 # Seek to end of file
>     print f.eof()                # What will this print???
>     print `f.readline()`         # Will print ''

I agree that after f.seek(0, 2) f is in an end-of-file condition.  But
I think it's precisely the definition that would be useful for my
problem.  Contrary to what you say, I think my definition of EOF is
quite sharp -- a sequential read would return no data.

Better to think of what I need as an "is there data waiting?" query.
I should have framed it that way, rather than about EOFness, from the
beginning.

> But getting the right result here requires a lot of knowledge about
> how the file is implemented!  While you've explained how this can be
> implemented on Unix, it can't be implemented with just the tools that
> stdio gives us.

Granted.  However, it looks possible that "is there data waiting"
*can* be portably implemented with the help of fstat(2), which by
precedent is also part of Python's toolkit.
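
For the plain-file case, the fstat idea might look like the sketch below. It is written for present-day Python 3, the function name data_waiting is hypothetical, and it assumes a regular file opened in binary mode whose size nobody else is changing. It says nothing about sockets, FIFOs or text-mode files.

```python
import os
import tempfile


def data_waiting(f):
    """Return True if unread bytes remain in a plain binary-mode file.

    Compares the stream position with the size reported by fstat().
    """
    return f.tell() < os.fstat(f.fileno()).st_size


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "guts")
    with open(path, "wb") as out:
        out.write(b"abc")

    with open(path, "rb") as f:
        at_start = data_waiting(f)        # three bytes still unread
        f.read()
        after_read = data_waiting(f)      # everything consumed
        f.seek(0, 2)                      # seek to end of file
        after_seek_end = data_waiting(f)
        f.seek(0)
        after_rewind = data_waiting(f)
```

Unlike a stdio end-of-file flag, this query answers correctly right after a seek to the end, because it never depends on a previous read having failed.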

> I also don't want to make f.eof() a non-portable feature: *if*
> it is provided, it's too important for that.

Agreed.

> Note that stdio's feof() doesn't have this definition!  It is set when
> the last *read* (or getc(), etc.) stumbled upon an EOF condition.
> That's also of limited value; it's mostly defined so you can
> distinguish between errors and EOF when you get a short read.  The
> stdio feof() flag would be false in the above example.

OK.  You're right about that.  I should have thought more clearly about
the difference between the state of stdio and the state of the underlying
file or device.  Access to stdio state won't do by itself.

> > This is where it bites that I can't test for EOF with a read(0).
> 
> And can you tell me a system where you *can* test for EOF with a
> read(0)?  I've never heard of such a thing.  The Unix read() system
> call has the same properties as Python's f.read().  I'm pretty sure
> that fread() with a zero count also doesn't give you the information
> you're after.

I'd have to test -- but what Unix read(2) does in this case isn't
really my point.  My real point is that I can't probe for whether
there's data waiting to be read in what seems like the obvious way.  I
expect Python to compensate for the deficiencies of the underlying C,
not reflect them.

> > Just having the plain-file case work would, IMHO, be justification
> > enough for this method.  If it turns out to be portable across Mac and
> > Windows sockets as well, *huge* win.  Could this be tested by someone
> > with access to Windows and Mac systems?
> 
> I don't see the huge win.

Try "polling after a non-blocking open".  A lower-overhead and more 
natural way to do it than with a poller object.  (This is on my mind 
because I used a poller object to query FIFOs just last week.)

The game system I'm working on, BTW, has another point of interest for
this list.  It is a rather large and complex suite of C programs that
makes heavy use of dynamic-memory allocation; I am translating to
Python partly in order to avoid chronic misallocation problems (leaks
and wild pointers) and partly because the thing needed to be rewritten
anyway to eliminate global state so I can embed it in a multithreaded
server.

Side-by-side comparison of the original C and its translation should
be quite an interesting educational experience once it's done.  That
just might be my next year's paper.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is the assumption of this book that a work of art is a gift, not a
commodity.  Or, to state the modern case with more precision, that works of
art exist simultaneously in two "economies," a market economy and a gift
economy.  Only one of these is essential, however: a work of art can survive
without the market, but where there is no gift there is no art.
	-- Lewis Hyde, The Gift: Imagination and the Erotic Life of Property


From guido@python.org  Mon Jan  8 19:36:02 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 14:36:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:40:37 +0100."
 <3A5A09A5.D0DC33A1@lemburg.com>
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>
 <3A5A09A5.D0DC33A1@lemburg.com>
Message-ID: <200101081936.OAA05440@cj20424-a.reston1.va.home.com>

> Still, wouldn't it be wise to add some logic to Python to prevent
> importing untrusted modules, e.g. by making sys.path read-only and
> disabling the import hook usage using a command line ? 
> 
> This would at least prevent the most obvious attacks. I wonder how
> RedHat works around these problems.

I don't understand what kind of attacks you are thinking of.  What
would making sys.path read-only prevent?  You seem to be thinking that
some malicious piece of code could try to subvert you by setting
sys.path.  But what you forget is that if this piece of code cannot be
trusted with sys.path, it should not be trusted to run at all!
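
A small sketch of why a read-only sys.path would not help: code that is already running can load a module from any file without touching sys.path. It uses importlib from present-day Python 3, which did not exist when this was written. The function name load_from_path and the file untrusted_demo.py are hypothetical.

```python
import importlib.util
import os
import sys
import tempfile


def load_from_path(name, path):
    """Load a module straight from a file path, bypassing sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "untrusted_demo.py")
    with open(path, "w") as out:
        out.write("VALUE = 42\n")

    path_before = list(sys.path)
    module = load_from_path("untrusted_demo", path)
    path_unchanged = (sys.path == path_before)
    loaded_value = module.VALUE
```

The module body ran and sys.path was never modified, so freezing sys.path would not have stopped it.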

--Guido van Rossum (home page: http://www.python.org/~guido/)



From loewis@informatik.hu-berlin.de  Mon Jan  8 19:45:44 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 20:45:44 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com> (mal@lemburg.com)
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com>
Message-ID: <200101081945.UAA12178@pandora.informatik.hu-berlin.de>

> The code adding (and with the patch: executing) the .pth files
> is defined in site.py and it is rather easy to override this
> file by adding a modified site.py file to the current working dir...
> a potential security hole in its own right, I guess :(

Indeed - independent of my patch changing the other site.py :-)

Regards,
Martin


From skip@mojam.com (Skip Montanaro)  Mon Jan  8 19:49:22 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 8 Jan 2001 13:49:22 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
References: <20010107232532.V17220@lyra.org>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
 <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
 <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
 <3A59EF2B.792801E5@ActiveState.com>
Message-ID: <14938.6594.44596.509259@beluga.mojam.com>

    Paul> It's not about keeping people out of your module.  In fact I would
    Paul> propose that mod.__dict__ should be as loose as ever.

Okay, how about this as a compromise first step?  Allow programmers to put
__exports__ lists in their modules but don't do anything with them *except*
modify dir() to respect that if it exists?  That would pretty up dir()
output for newbies, almost certainly not break anything, improve the
internal documentation of the modules that use __exports__, and still allow
us to move in a more restrictive direction at a later time if we so choose.
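
As a rough sketch of that first step, a dir()-like helper could consult __exports__ when it is present and otherwise fall back to the usual listing. The helper name exports_aware_dir is hypothetical, the real proposal would change the built-in dir() itself, and the code is written for present-day Python 3.

```python
import types


def exports_aware_dir(module):
    """List a module's names, preferring its __exports__ list if it has one."""
    exports = getattr(module, "__exports__", None)
    if exports is None:
        return dir(module)  # no __exports__: behave exactly like dir()
    return sorted(exports)


demo = types.ModuleType("demo")
demo.spam = 1
demo._helper = 2
demo.__exports__ = ["spam"]

plain = types.ModuleType("plain")
plain.eggs = 3

listed = exports_aware_dir(demo)     # only the declared export
fallback = exports_aware_dir(plain)  # same as dir(plain)
still_reachable = demo._helper       # the module dict stays as loose as ever
```

Nothing is hidden from attribute access here. Only the listing changes, which is what keeps the step unlikely to break existing code.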

Skip


From moshez@zadka.site.co.il  Tue Jan  9 04:04:23 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:04:23 +0200 (IST)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com>
References: <3A5A02AA.675A35D1@lemburg.com>, <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <20010109040423.68AA4A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 19:10:50 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:

> Hmm, but what if the Python script picks up a site.py which is
> different from the standard one distributed with Python ?

Then the site.py can do whatever it wants.
No need to go through PTHs
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Mon Jan  8 19:59:48 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 14:59:48 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010108130137.E22834@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGHIHAA.tim.one@home.com>

Quickie:

[Guido]
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

To be very clear about this, that's not what C's feof() means:  in general,
the end-of-file indicator in std C stream input is set only *after* you've
attempted a read that "didn't work".  For example,

#include <stdio.h>

int
main(void)
{
	FILE* fp = fopen("guts", "wb");
	fputs("abc", fp);
	fclose(fp);
	fp = fopen("guts", "rb");
	for (;;) {
		int c;
		c = getc(fp);
		printf("getc returned %c (%d)\n", c, c);
		printf("At EOF after getc? %d\n", feof(fp));
		if (c == EOF)
			break;
	}
}

Unless your C is broken, feof() will return 0 after getc() returns 'a', and
again after 'b', and again after 'c'.  It's not until getc() returns EOF
that feof() first returns a non-zero result.

Then add these two lines after the "for":

	fseek(fp, 0L, SEEK_END);
	printf("after seeking to the end, feof() says %d\n", feof(fp));

Unless your fseek() is non-std, that clears the end-of-file indicator, and
regardless of to where you seek.  So the std behavior throughout libc is
much like Python's behavior:  there's nothing that can tell you whether
you're at the end of the file, in general, short of trying to read and
failing to get something back.

In your case you seem to *know* that you have a "plain old file", meaning
that its size is well-defined and that ftell() makes sense for it.  You also
seem to know that you don't have to worry about anyone else, e.g., appending
to it (or in any other way changing its size, or changing your stream's file
position), while you're mucking with it.  So why not just do f.tell() and
compare that to the size yourself?  This sounds easy for you to do, but in
this particular case you enjoy the benefits of a world of assumptions that
aren't true in general.

> ...
> This is where it bites that I can't test for EOF with a read(0).

You can't in std C using an fread of 0 bytes either -- that has no effect on
the end-of-file indicator.  Add

		if (c == 'c') {
			char buf[100];
			size_t i = fread(buf, 1, 0, fp);
			printf("after fread of 0 bytes, feof() says %d\n",
			       feof(fp));
		}

before the "(c == EOF)" test above to try that on your platform.
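
The same two experiments carry over to Python file objects. The sketch below is present-day Python 3 and uses io.BytesIO as a stand-in for a real file, which is an assumption made so that the example is self-contained. Before each one-byte read it also tries a zero-byte read.

```python
import io

stream = io.BytesIO(b"abc")

reads = []
while True:
    zero = stream.read(0)   # a zero-byte read: always empty, tells us nothing
    chunk = stream.read(1)  # only a real read can come back empty at the end
    reads.append((zero, chunk))
    if not chunk:
        break

# reads == [(b'', b'a'), (b'', b'b'), (b'', b'c'), (b'', b'')]
```

The zero-byte read returns the same empty result before 'a' as it does at the very end, so it cannot distinguish the two states. The end of the data shows up only when the fourth one-byte read comes back empty.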

> ...
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.

Don't know about Mac.  On Windows everything is grossly complicated because
of line-end translations in text mode.  Like the C std says, the only
*portable* thing you can do with an ftell() result for a text file is feed
it back unaltered to fseek().  It so happens that on Windows, using MS's
libc, if f.readline() returns "abc\n" for the first line of a native text
file, f.tell() returns 5, reflecting the actual byte offset in the file
(including the \r that .readline() doesn't show you).  So you *can* get away
with comparing f.tell() to the file's size on Windows too (using the MS C
compiler; don't know about others).

the-operational-defn-of-eof-is-the-only-portable-defn-
    there-is-ly y'rs  - tim



From moshez@zadka.site.co.il  Tue Jan  9 04:08:29 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:08:29 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
References: <14938.6594.44596.509259@beluga.mojam.com>, <20010107232532.V17220@lyra.org>
 <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
 <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
 <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
 <3A59EF2B.792801E5@ActiveState.com>
Message-ID: <20010109040829.BDB66A82D@darjeeling.zadka.site.co.il>

[Paul Prescod] 
> It's not about keeping people out of your module.  In fact I would
> propose that mod.__dict__ should be as loose as ever.

[Skip Montanaro]
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?  That would pretty up dir()
> output for newbies, almost certainly not break anything, improve the
> internal documentation of the modules that use __exports__, and still allow
> us to move in a more restrictive direction at a later time if we so choose.

I'm +1 on that personally. 
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From mal@lemburg.com  Mon Jan  8 20:38:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 21:38:00 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>
 <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A2528.C289BE1D@lemburg.com>

Guido van Rossum wrote:
> 
> > Still, wouldn't it be wise to add some logic to Python to prevent
> > importing untrusted modules, e.g. by making sys.path read-only and
> > disabling the import hook usage using a command line ?
> >
> > This would at least prevent the most obvious attacks. I wonder how
> > RedHat works around these problems.
> 
> I don't understand what kind of attacks you are thinking of.  What
> would making sys.path read-only prevent?  You seem to be thinking that
> some malicious piece of code could try to subvert you by setting
> sys.path.  But what you forget is that if this piece of code cannot be
> trusted with sys.path, it should not be trusted to run at all!

I was thinking of an attack where knowledge of common temporary
execution locations is used to trick Python into executing
untrusted code -- the untrusted code would only have to be
copied to the known temporary execution directory and then
gets executed by Python next time the program using the temporary
location is invoked.

But you're right: this is possible whether or not sys.path is
writeable.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Mon Jan  8 20:45:57 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 21:45:57 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108214557.H402@xs4all.nl>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
> > You may be right.  Still, this patch solves the immediate problem in a
> > reasonably clean way, and I urge that it should go in.  We can do a
> > more complete reorganization of the build process later.  (I'll help with
> > that; I'm pretty expert with autoconf and friends.)

> I expect Andrew's code to go in before 2.1 is released.  So I don't
> see a reason why we should hurry and check in a stop-gap measure.

Oh, we're gonna distribute binaries of Python 2.0/1.5.2-with-distutils for
every known platform that can run configure ? :) I still think there are
more than enough platforms without Python to warrant using autoconf for
configuring modules. The module list and their demands are stable enough to
make maintenance a fair breeze, IMHO.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From akuchlin@mems-exchange.org  Mon Jan  8 21:57:58 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:57:58 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108214557.H402@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 09:45:57PM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl>
Message-ID: <20010108165758.B9260@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:45:57PM +0100, Thomas Wouters wrote:
>every known platform that can run configure ? :) I still think there are
>more than enough platforms without Python to warrant using autoconf for
>configuring modules. The module list and their demands are stable enough to
>make maintenance a fair breeze, IMHO.

Umm... the proposed PEP 229 patch would compile a Python binary with
sre, posix, and strop statically linked; this minimal Python is then
used to run the setup.py script.  You shouldn't require a preinstalled
Python, though the current version of the patch doesn't meet this
requirement yet.

--amk



From tim.one@home.com  Mon Jan  8 20:59:40 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 15:59:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A59F00E.53A0A32A@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>

[Tim]
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

[Paul Prescod]
> If you can create a sample program that demonstrates the unsafety
> I'll anonymously submit it as a bug on our internal system

I don't want to spend time on that, as I *assume* it's already well-known
within the Perl thread community.  Besides, the last version of Perl I got
from ActiveState <wink> complains:

     No threads in this perl at temp.pl line 14

if I try to use Perl threads.  That's:

> \perl\bin\perl -v

This is perl, v5.6.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2000, Larry Wall

Binary build 620 provided by ActiveState Tool Corp.
http://www.ActiveState.com
Built 18:31:05 Oct 31 2000

...

If I can repair that by downloading a more recent release, let me know.

> and ensure that the next version of Perl is as slow as Python. :)

I don't want to slow them down!  To the contrary, now I've got a solid
reason for why I keep using Perl for simple high-volume text-crunching jobs
<wink>.

> Seriously: If someone comes at me with Perl-IO-is-way-faster-than-
> Python-IO, I'd like to know what concretely they've given up in order
> to achieve that performance.

My line-at-a-time test case used (rounding to nearest whole integers) 30
seconds in Python and 6 in Perl.  The result of testing many changes to
Python's implementation was that the excess 24 seconds broke down like so:

    17   spent inside internal MS threadsafe getc() lock/unlock
             routines
     5   uncertain, but evidence suggests much of it due to MS
             malloc/realloc (Perl does its own memory mgmt)
     2   for not copying directly out of the platform FILE*
             implementation struct in a highly optimized loop (like
             Perl does)

My last checkin to fileobject.c reclaimed 17 seconds on Win98SE while
remaining threadsafe, via a combination of locking per line instead of per
character, and invoking realloc much less often (only for lines exceeding
200 chars).  (BTW, I'm still curious to know how that compares to the
getc_unlocked hack on a platform other than Windows!)
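
The effect of locking per line instead of per character can be illustrated by counting lock acquisitions. This is only a Python-level sketch of the idea, not the fileobject.c change. CountingLock and both reader functions are hypothetical names, and io.StringIO stands in for a FILE*.

```python
import io
import threading


class CountingLock:
    """A lock that records how many times it has been acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc_info):
        self._lock.release()
        return False


def read_lines_lock_per_char(stream, lock):
    """Take the lock around every single character, like a locked getc()."""
    lines, current = [], []
    while True:
        with lock:
            ch = stream.read(1)
        if not ch:
            break
        current.append(ch)
        if ch == "\n":
            lines.append("".join(current))
            current = []
    if current:
        lines.append("".join(current))
    return lines


def read_lines_lock_per_line(stream, lock):
    """Take the lock once around each whole line."""
    lines = []
    while True:
        with lock:
            line = stream.readline()
        if not line:
            break
        lines.append(line)
    return lines


text = "ab\ncd\n"
per_char_lock = CountingLock()
per_line_lock = CountingLock()
lines_a = read_lines_lock_per_char(io.StringIO(text), per_char_lock)
lines_b = read_lines_lock_per_line(io.StringIO(text), per_line_lock)
# Same lines either way: 7 acquisitions per character versus 3 per line.
```

The saving grows with line length, since the per-character count tracks the number of bytes while the per-line count tracks the number of lines.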

> And even just for my own interest I'd like to understand the cost/
> benefit of stream thread safety.

If you're not *using* threads, or not using them to muck with the same
stream at the same time, the ratio is infinite.  And that's usually the
case.

> For instance would it make sense to just write a thread-safe
> wrapper for streams used from multiple threads?

Alas, on Windows you can't pick and choose:  you get the threadsafe libc, or
you don't.  So long as anyone may want to use threads for any reason
whatsoever, we must link with threadsafe libraries.  But, as above, on
Windows we're not paying much for that anymore in this case (unless maybe
the threadsafe MS malloc family is also outrageously slower than its
careless counterpart ...).  It does prevent me from pursuing the "optimized
inner loop" business, because MS doesn't expose its locking primitives (so I
can't do in C everything I would need to do to optimize the inner loop while
remaining threadsafe).

there-are-damn-few-pieces-of-libc-we-wouldn't-be-better-off-
    writing-ourselves-but-then-we'd-have-a-much-harder-time-
    playing-with-others'-code-ly y'rs  - tim



From akuchlin@mems-exchange.org  Mon Jan  8 21:15:34 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:15:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108161534.A2392@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
>200 chars).  (BTW, I'm still curious to know how that compares to the
>getc_unlocked hack on a platform other than Windows!)

On Solaris and Linux, the results seemed to be lost in the noise.
Repeated runs of filetest.py were sometimes faster than without
USE_MS_GETLINE_HACK, so the variation is probably large enough to
swamp any difference between the two.  (Assuming I enabled the getline
hack correctly of course; someone please replicate...)

--amk

Linux: w/o USE_MS_GETLINE_HACK
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.186  0.190
readlines_sizehint    0.108  0.110
using_fileinput       0.447  0.450
while_readline        0.184  0.180

Linux w/ USE_MS_GETLINE_HACK:
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.178  0.180
readlines_sizehint    0.108  0.110
using_fileinput       0.434  0.430
while_readline        0.183  0.190

Solaris w/o USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.640  0.630
readlines_sizehint    0.278  0.280
using_fileinput       1.874  1.820
while_readline        0.839  0.840

Solaris w/ USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.569  0.570
readlines_sizehint    0.275  0.280
using_fileinput       1.902  1.900
while_readline        0.769  0.770


From gstein@lyra.org  Mon Jan  8 21:29:40 2001
From: gstein@lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 13:29:40 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:15:34PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com> <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <20010108132940.G4141@lyra.org>

On Mon, Jan 08, 2001 at 04:15:34PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> >200 chars).  (BTW, I'm still curious to know how that compares to the
> >getc_unlocked hack on a platform other than Windows!)
> 
> On Solaris and Linux, the results seemed to be lost in the noise.

Your times are so small... I'd suggest doing a few iterations within
filetest.py so your margin of error isn't so noticeable.
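
One way to do that, sketched in present-day Python 3: run each test several times in a loop and keep the best per-call average. filetest.py is not shown in this thread, so the helper name best_of and the stand-in workload are assumptions.

```python
import time


def best_of(func, repeats=5, loops=20):
    """Return the best average time per call over several repeated runs."""
    best = None
    for _run in range(repeats):
        start = time.perf_counter()
        for _call in range(loops):
            func()
        per_call = (time.perf_counter() - start) / loops
        if best is None or per_call < best:
            best = per_call
    return best


def workload():
    # Stand-in for one of the filetest.py functions, e.g. while_readline.
    return sum(range(1000))


seconds_per_call = best_of(workload)
```

Averaging over the inner loop stretches the measured interval well past the timer's resolution, and taking the minimum over the outer runs discards runs disturbed by other activity on the machine.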

Cheers,
-g

>...
> Linux: w/o USE_MS_GETLINE_HACK
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.186  0.190
> readlines_sizehint    0.108  0.110
> using_fileinput       0.447  0.450
> while_readline        0.184  0.180
> 
> Linux w/ USE_MS_GETLINE_HACK:
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.178  0.180
> readlines_sizehint    0.108  0.110
> using_fileinput       0.434  0.430
> while_readline        0.183  0.190
> 
> Solaris w/o USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.640  0.630
> readlines_sizehint    0.278  0.280
> using_fileinput       1.874  1.820
> while_readline        0.839  0.840
> 
> Solaris w/ USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.569  0.570
> readlines_sizehint    0.275  0.280
> using_fileinput       1.902  1.900
> while_readline        0.769  0.770

-- 
Greg Stein, http://www.lyra.org/


From thomas@xs4all.net  Mon Jan  8 21:59:17 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 22:59:17 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108165758.B9260@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:57:58PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl> <20010108165758.B9260@kronos.cnri.reston.va.us>
Message-ID: <20010108225916.P2467@xs4all.nl>

On Mon, Jan 08, 2001 at 04:57:58PM -0500, Andrew Kuchling wrote:

> Umm... the proposed PEP 229 patch would compile a Python binary with
> sre, posix, and strop statically linked; this minimal Python is then
> used to run the setup.py script.  You shouldn't require a preinstalled
> Python, though the current version of the patch doesn't meet this
> requirement yet.

Apologies. I should've bothered to read the PEP first, but I haven't found
the time yet :P I retract all my comments on the subject until I do.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Mon Jan  8 22:08:50 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 23:08:50 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Tue, Jan 09, 2001 at 02:03:00AM +0200
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> <200101081515.KAA03474@cj20424-a.reston1.va.home.com> <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
Message-ID: <20010108230850.Q2467@xs4all.nl>

On Tue, Jan 09, 2001 at 02:03:00AM +0200, Moshe Zadka wrote:

> > (2) Under exactly what circumstances do you want from foo import *
> >     issue a warning?

> All.
> If you want to be less extreme, don't warn if the module defines
> a __from_star_ok__

We already have a perfectly acceptable way of turning off warnings in
particular circumstances. I'm +1 on warning against using 'from spam import
*' by the way, though it would be even better (+2!) if there was a 'import *
considered harmful' page/chapter in the documentation somewhere, so we could
point to it.
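
[ One concrete case such a page could cite, sketched for a modern
  Python 3: 'from os import *' rebinds the name open to os.open, which
  has a different signature from the builtin. The namespace dict below
  only stands in for a module's globals. ]

```python
import os

# Run the star-import inside a scratch namespace instead of our own globals.
namespace = {}
exec("from os import *", namespace)

# After the import, 'open' in that namespace is os.open (path, flags, ...),
# not the builtin open (path, mode, ...), so existing open(name) calls break.
shadowed = namespace["open"] is os.open
print(shadowed)
```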

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Mon Jan  8 22:23:02 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 17:23:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 21:38:00 +0100."
 <3A5A2528.C289BE1D@lemburg.com>
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>
 <3A5A2528.C289BE1D@lemburg.com>
Message-ID: <200101082223.RAA05858@cj20424-a.reston1.va.home.com>

> I was thinking an attack where knowledge of common temporary
> execution locations is used to trick Python into executing
> untrusted code -- the untrusted code would only have to be
> copied to the known temporary execution directory and then
> gets executed by Python next time the program using the temporary
> location is invoked.

When does Python execute code from a predictable common temporary
location?  When is that likely to be used from a Python script running
as root?

Note that if you use tempfile.TemporaryFile(), you can create a
temporary file that's not subvertible.
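
A minimal sketch of that, assuming a modern Python 3; scratch_roundtrip
is a hypothetical helper name used only for illustration:

```python
import tempfile

def scratch_roundtrip(payload):
    """Write bytes to an anonymous temporary file and read them back."""
    # TemporaryFile chooses the name itself and creates the file securely;
    # on Unix the directory entry is removed right away (or never created),
    # so there is no predictable path for another user to tamper with.
    with tempfile.TemporaryFile() as tmp:
        tmp.write(payload)
        tmp.seek(0)
        return tmp.read()

if __name__ == "__main__":
    print(scratch_roundtrip(b"spam"))
```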

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Mon Jan  8 22:35:17 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Jan 2001 17:35:17 -0500 (EST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108230850.Q2467@xs4all.nl>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
 <200101081433.JAA03185@cj20424-a.reston1.va.home.com>
 <20010107232532.V17220@lyra.org>
 <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
 <20010108002603.X17220@lyra.org>
 <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
 <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
 <20010108230850.Q2467@xs4all.nl>
Message-ID: <14938.16549.944123.917467@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > *' by the way, though it would be even better (+2!) if there was a 'import *
 > considered harmful' page/chapter in the documentation somewhere, so we could
 > point to it.

  Care to write it?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From MarkH@ActiveState.com  Mon Jan  8 23:00:01 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 15:00:01 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A5A05DA.86B3EB86@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>

> Limiting the code to pythonw.exe instead of trying to install
> it in python20.dll was supposed to prevent damage to the use
> of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
> I am assuming there is a screen.

Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
console window.  pythonw is used in this case.  COM uses pythonw.exe in just
this way, and when executed by DCOM, it will be executed in a context where
the user can not see any such dialog.

However, I would be happy to ensure the correct command-line is used to
prevent this behaviour in this case.

Indeed, in _every_ case I use pythonw.exe I would disable this - but I
accept that other users have simpler requirements.

> Having said that, you may be right that there is some way to
> hang on a dialog box which can not be seen.  It depends on what
> MessageBox() and GetForegroundWindow() actually do.  If it seems
> that this patch has merit, I would be grateful if you would review
> the code to look for issues of this type.

There will be no issues in the code - it is just that Win2k will execute in
a different "workspace" (I think that is the term).  This is identical to
the problem of a service attempting to display a messagebox - the code is
perfect and works perfectly - just in a context where no one can see it, or
dismiss it.

> > I would prefer to see a decent API for extracting error and traceback
> > information from Python.  On the other hand, I _do_ see the problem for
> > "newbies" trying to use pythonw.exe.
>
> There could be an API added to the winstdout module such as
>   msg = winstdout.GetMessageText()
> which would return saved text, control its display etc.

I was thinking more of a "Py_GetTraceback()", which would return a complete
exception string.

Thus, embedders could write code similar to:

  whatever = Py_BuildValue(...);
  ret = PyObject_Call(foo, whatever);
  ...
  if (!ok) {
    char *text = Py_GetTraceback();
    MsgBox(text);
  }

Thus, with only a small amount of work, they have _complete_ control over
the output.  However, I agree this doesn't really solve pythonw.exe's
problems.
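
[ Py_GetTraceback() above is a proposed name, not an existing C API.
  At the Python level the same idea can be sketched today with the
  traceback module, assuming a modern Python 3; call_and_capture is a
  hypothetical helper name. ]

```python
import traceback

def call_and_capture(func, *args):
    """Call func(*args).

    Returns (result, None) on success, or (None, text) on failure, where
    text is the complete formatted traceback as a single string that the
    caller can log or display however it likes.
    """
    try:
        return func(*args), None
    except Exception:
        return None, traceback.format_exc()

if __name__ == "__main__":
    result, text = call_and_capture(lambda: 1 / 0)
    print(text)
```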

> I do not view winstdout as a "newbie" feature, but rather a
> generally useful C-language addition to Python.

Hrm.  I don't believe a commercial app, for example, would find this
suitable - they would roll their own solution.

Hence I see this purely for newbie users.  Advanced users have complete
control now - a simple try/except block around their main code, and you are
pretty good.  A builtin module for displaying a messagebox is as robust as
an experienced user needs to emulate this, IMO.

> I guess I am saying, perhaps incorrectly, that the mechanism provided
> will make further redirection of sys.stdout unnecessary 99% of the
> time.

Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
said, I see this as a feature so the newbie will not believe pythonw.exe is
broken.  Advanced users can already do similar things themselves.

> Experimentation shows that Python composes tracebacks and
> error messages a line or partial line at a time.  That is, you can
> not display each call to printf(), but must wait until the system is
> idle to be sure that multiple calls to printf() are complete.  So this
> forces you to use the idle processing loop, not rocket science but
> at least inconvenient.

What "idle processing loop"?

> And the only source of stdout/err is tracebacks,
> error messages and the "print" statement.  What would you do with
> these in a Windows program except display an "OK" dialog box?

Log the error to a file, and display a "friendly" dialog - possibly offering
to automatically submit a support request/bug report.
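
[ A rough sketch of that pattern, assuming a modern Python 3 where
  sys.excepthook is available. make_logging_hook, log_path and notify
  are hypothetical names; notify stands in for whatever shows the
  friendly dialog. ]

```python
import sys
import traceback

def make_logging_hook(log_path, notify):
    """Build an excepthook that logs the traceback and tells the user little."""
    def hook(exc_type, exc_value, exc_tb):
        # The full traceback goes to the log file for the developers...
        with open(log_path, "a") as log:
            traceback.print_exception(exc_type, exc_value, exc_tb, file=log)
        # ...while the user only sees a short, non-technical message.
        notify("Something went wrong. Details were saved to " + log_path)
    return hook

if __name__ == "__main__":
    sys.excepthook = make_logging_hook("app-errors.log", print)
    raise RuntimeError("demo failure")
```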

The casual user is going to be _very_ scared by a Python traceback.  This is
a sin of a similar magnitude to those crappy applications with unhandled VB
exceptions.

IMO, nothing looks more unprofessional than an app that displays an internal
VB error message.  Python is no different IMO.  For real applications, there
is a good chance that the majority of your users have never heard of Python.

Thus, I don't believe your solution suitable for the real, professional,
commercial user.  However, I agree that your solution does not prevent this
user doing the "right thing"...

But all this does keep me believing this is a "newbie" helper.

>
> If someone out there knows of a different example of sys.stdout
> redirection in use in the real world, it would be helpful if
> they would describe it.  Maybe it could be incorporated.

Sure.  Komodo logs to a file and shows a friendly dialog (sometimes ;-).

Pythonwin actually attempts a few things first - eg, not every exception
Pythonwin causes at startup should be logged.

Python services write unhandled errors to the event log.

I don't believe I have worked on 2 projects with the same requirement
here!!!

Mark.



From nas@arctrix.com  Mon Jan  8 16:22:10 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 08:22:10 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108082210.A16149@glacier.fnational.com>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> My line-at-a-time test case used (rounding to nearest whole integers) 30
> seconds in Python and 6 in Perl.  The result of testing many changes to
> Python's implementation was that the excess 24 seconds broke down like so:
> 
>     17   spent inside internal MS threadsafe getc() lock/unlock
>              routines
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)
>      2   for not copying directly out of the platform FILE*
>              implementation struct in a highly optimized loop (like
>              Perl does)

Have you tried pymalloc?  

  Neil


From billtut@microsoft.com  Tue Jan  9 00:38:14 2001
From: billtut@microsoft.com (Bill Tutt)
Date: Mon, 8 Jan 2001 16:38:14 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <58C671173DB6174A93E9ED88DCB0883D0A6202@red-msg-07.redmond.corp.microsoft.com>

> From: 	Mark Hammond [mailto:MarkH@ActiveState.com] 

> There will be no issues in the code - it is just that Win2k will execute in
> a different "workspace" (I think that is the term).  This is identical to
> the problem of a service attempting to display a messagebox - the code is
> perfect and works perfectly - just in a context where no one can see it, or
> dismiss it.


The term Mark is looking for here is "window station", and it's an NT thing,
not just a Win2k thing. Window stations have been around for ages.

Bill


From ping@lfw.org  Tue Jan  9 01:51:15 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 17:51:15 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Skip Montanaro wrote:
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?

I'd say: Just have dir() and import * pay attention to __exports__.
Don't mess with getattr or __dict__.
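
A rough sketch of that behavior, written for a modern Python 3.
__exports__ is only the proposal under discussion here, and
visible_names is a hypothetical helper, not an existing builtin:

```python
import types

def visible_names(module):
    """Names that dir() or 'import *' might show if they honored __exports__."""
    exports = getattr(module, "__exports__", None)
    if exports is not None:
        return sorted(exports)
    # No __exports__ list: fall back to every name without a leading underscore.
    return sorted(name for name in vars(module) if not name.startswith("_"))

if __name__ == "__main__":
    demo = types.ModuleType("demo")
    demo.spam, demo.eggs, demo._private = 1, 2, 3
    print(visible_names(demo))
    demo.__exports__ = ["spam"]
    print(visible_names(demo))
```

Attribute access and the module's __dict__ are left alone; only the
listing is filtered.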


-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 



From ping@lfw.org  Tue Jan  9 02:00:08 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 18:00:08 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101081751530.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Paul Prescod wrote:
> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 

I suggest a built-in function "methods()" that works like this:

    from types import InstanceType, MethodType, BuiltinMethodType

    def methods(obj):
        if type(obj) is InstanceType: return methods(obj.__class__)
        results = []
        if hasattr(obj, '__bases__'):
            for base in obj.__bases__:
                results.extend(methods(base))
        results.extend(
            filter(lambda k, o=obj: type(getattr(o, k)) in
                   [MethodType, BuiltinMethodType], dir(obj)))
        return unique(results)

    def unique(seq):
        dict = {}
        for item in seq: dict[item] = 1
        results = dict.keys()
        results.sort()
        return results


    >>> import sys
    >>> 
    >>> methods(sys.stdin)
    ['close', 'fileno', 'flush', 'isatty', 'read', 'readinto', 'readline', 'readlines', 'seek', 'tell', 'truncate', 'write', 'writelines']
    >>>        
    >>> import SocketServer
    >>> 
    >>> methods(SocketServer.ForkingTCPServer)
    ['__init__', 'collect_children', 'fileno', 'finish_request', 'get_request', 'handle_error', 'handle_request', 'process_request', 'serve_forever', 'server_activate', 'server_bind', 'verify_request']
    >>> 



-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 



From gstein@lyra.org  Tue Jan  9 02:20:56 2001
From: gstein@lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 18:20:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.102,2.103
In-Reply-To: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 08, 2001 at 06:00:13PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010108182056.C4640@lyra.org>

On Mon, Jan 08, 2001 at 06:00:13PM -0800, Guido van Rossum wrote:
>...
> Modified Files:
> 	fileobject.c 
> Log Message:
> Tsk, tsk, tsk.  Treat FreeBSD the same as the other BSDs when defining
> a fallback for TELL64.  Fixes SF Bug #128119.
>...
> *** fileobject.c	2001/01/08 04:02:07	2.102
> --- fileobject.c	2001/01/09 02:00:11	2.103
> ***************
> *** 59,63 ****
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */
> --- 59,63 ----
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */

All of those #ifdefs could be tossed and it would be more robust (long term)
if an autoconf macro were used to specify when TELL64 should be defined.

[ I've looked thru fileobject.c and am a bit confused: the conditions for
  defining TELL64 do not match the conditions for *using* it. that would
  seem to imply a semantic error somewhere and/or a potential gotcha when
  they get skewed (like I assume what happened to FreeBSD). simplifying with
  an autoconf macro may help to rationalize it. ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim.one@home.com  Tue Jan  9 04:29:02 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:29:02 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>

[Andrew Kuchling]

I'll chop everything except while_readline (which is most affected by this
stuff):

> Linux: w/o USE_MS_GETLINE_HACK
> while_readline        0.184  0.180
>
> Linux w/ USE_MS_GETLINE_HACK:
> while_readline        0.183  0.190
>
> Solaris w/o USE_MS_GETLINE_HACK:
> while_readline        0.839  0.840
>
> Solaris w/ USE_MS_GETLINE_HACK:
> while_readline        0.769  0.770

So it's probably a wash.  In that case, do we want to maintain two hacks for
this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
Windows" approach probably works everywhere (although its speed relies on
the platform factoring out at least the locking/unlocking in fgets).

Both methods lack a refinement I would like to see, but can't achieve in
"the Windows way":  ensure that consistency is on no worse than a per-line
basis.  Right now, both methods lock/unlock the file only for the extent of
the current buffer size, so that two threads *can* get back different
interleaved pieces of a single long line.  Like so:

import thread

def read(f):
    x = f.readline()
    print "thread saw " + `len(x)` + " chars"
    m.release()

f = open("ga", "w") # a file with one long line
f.write("x" * 100000 + "\n")
f.close()

m = thread.allocate_lock()
for i in range(10):
    print i
    f = open("ga", "r")
    m.acquire()
    thread.start_new_thread(read, (f,))
    x = f.readline()
    print "main saw " + `len(x)` + " chars"
    m.acquire(); m.release()
    f.close()

Here's a typical run on Windows (current CVS Python):

0
main saw 95439 chars
thread saw 4562 chars
1
main saw 97941 chars
thread saw 2060 chars
2
thread saw 43801 chars
main saw 56200 chars
3
thread saw 8011 chars
main saw 91990 chars
4
main saw 46546 chars
thread saw 53455 chars
5
thread saw 53125 chars
main saw 46876 chars
6
main saw 98638 chars
thread saw 1363 chars
7
main saw 72121 chars
thread saw 27880 chars
8
thread saw 70031 chars
main saw 29970 chars
9
thread saw 27555 chars
main saw 72446 chars

So, yes, it's threadsafe now:  between them, the threads always see a grand
total of 100001 characters.  But what friggin' good is that <wink>?  If,
e.g., Guido wants multiple threads to chew over his giant logfile, there's
no guarantee that .readline() ever returns an actual line from the file.

Not that Python 2.0 was any better in this respect ...



From tim.one@home.com  Tue Jan  9 04:48:25 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:48:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108082210.A16149@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIAIHAA.tim.one@home.com>

[Tim]
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)

[NeilS]
> Have you tried pymalloc?

Not recently, and don't expect to find time for it this week.  IIRC,
Vladimir did get significant speedups-- lo those many years ago! --when he
tried it on Windows, though.  Maybe (or maybe not) that was due to
exploiting the global lock (i.e., exploiting that pymalloc didn't need to do
its own serialization, when called from the Python core).



From tim.one@home.com  Tue Jan  9 04:52:25 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:52:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIAIHAA.tim.one@home.com>

[Tim]
> ...
> Here's a typical run on Windows (current CVS Python):
>
> 0
> main saw 95439 chars
> thread saw 4562 chars
> 1
> main saw 97941 chars
> thread saw 2060 chars
> 2
> thread saw 43801 chars
> main saw 56200 chars
> 3
> thread saw 8011 chars
> main saw 91990 chars
> 4
> main saw 46546 chars
> thread saw 53455 chars
> 5
> thread saw 53125 chars
> main saw 46876 chars
> 6
> main saw 98638 chars
> thread saw 1363 chars
> 7
> main saw 72121 chars
> thread saw 27880 chars
> 8
> thread saw 70031 chars
> main saw 29970 chars
> 9
> thread saw 27555 chars
> main saw 72446 chars

Oops!  I lied.  That was the released 2.0.  Current CVS is either better or
worse, depending on whether you think "working" by accident more often is a
good thing or leads to false confidence <wink>:

0
main saw 100001 chars
thread saw 0 chars
1
main saw 100001 chars
thread saw 0 chars
2
main saw 100001 chars
thread saw 0 chars
3
main saw 100001 chars
thread saw 0 chars
4
main saw 100001 chars
thread saw 0 chars
5
thread saw 25802 chars
main saw 74199 chars
6
thread saw 802 chars
main saw 99199 chars
7
main saw 100001 chars
thread saw 0 chars
8
main saw 100001 chars
thread saw 0 chars
9
main saw 100001 chars
thread saw 0 chars



From mal@lemburg.com  Tue Jan  9 07:23:42 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jan 2001 08:23:42 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>
 <3A5A2528.C289BE1D@lemburg.com> <200101082223.RAA05858@cj20424-a.reston1.va.home.com>
Message-ID: <3A5ABC7E.E953962B@lemburg.com>

Guido van Rossum wrote:
> 
> > I was thinking an attack where knowledge of common temporary
> > execution locations is used to trick Python into executing
> > untrusted code -- the untrusted code would only have to be
> > copied to the known temporary execution directory and then
> > gets executed by Python next time the program using the temporary
> > location is invoked.
> 
> When does Python execute code from a predictable common temporary
> location?  When is that likely to be used from a Python script running
> as root?
> 
> Note that if you use tempfile.TemporaryFile(), you can create a
> temporary file that's not subvertible.

It's not Python itself that's running temporary files. Tools
like distutils, RPM, etc. tend to run Python code in temporary
locations during build stages. That's what I was thinking about.
OTOH, root should know where these tools run their code, so
I guess it's moot to discuss whose fault this really is, e.g.
distutils style distributions should never be unzipped to /tmp
for subsequent installation, but nobody will prevent root
from doing so.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Tue Jan  9 07:35:09 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 9 Jan 2001 02:35:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIHAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

I've got solid answers now, but I'll paraphrase them anonymously to save the
bother of untangling multi-person email etiquette snarls:

+ Yes, Perl uses platform stdio.  Usually.  Yes on Windows anyway.

+ But Perl "cheats" on Windows (well, everywhere it can ...), as I've
explained in great detail half a dozen times over the years.  No reason to
retract any of that.

+ The cheating is not thread-safe.

+ The last stab at threads accessible from Perl was an experiment that got
dropped.  There are no user-muckable threads in std Perl builds.

+ But there is a notion of threads available at the C level.

+ This latter notion of threads is used to implement Perl's fork() on
Windows, so can be exploited to test Windows Perl thread safety without
writing a Perl extension module in C.

+ This Perl program (very much like the 2-threaded one I just posted for
Python) uses that trick:

-------------------------------------------------------------------
sub counter {
    my $nc = 0;
    while (<FILE>) {
        $nc += length;
    }
    print "num bytes seen = $nc\n";
}

open(FILE, "ga");
binmode FILE;

fork();
&counter();
-------------------------------------------------------------------

Under the covers, that really shares the FILE filehandle on Windows via
threads.  Running it multiple times yields multiple wild results; the number
of bytes seen by parent and child rarely sum to the number of bytes actually
in the input file ("ga").  The most common output for me is that one thread
sees the entire file, while the other sees "a lot" of it (since the Perl
inner loop registerizes its FILE* struct member shadows for as long as
possible, that's actually what I expected).

So the code is exactly as thread-unsafe as it looked.

bosses-demand-answers-but-they-forget-their-questions<wink>-ly
    y'rs  - tim



From guido@python.org  Tue Jan  9 13:41:24 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 09 Jan 2001 08:41:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 08 Jan 2001 23:29:02 EST."
 <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>
Message-ID: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>

> So it's probably a wash.  In that case, do we want to maintain two hacks for
> this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
> Windows" approach probably works everywhere (although its speed relies on
> the platform factoring out at least the locking/unlocking in fgets).

I'm much more confident about the getc_unlocked() approach than about
fgets() -- with the latter we need much more faith in the C library
implementers.  (E.g. that fgets() never writes beyond the null bytes
it promises, and that it locks/unlocks only once.)  Also, you're
relying on blindingly fast memchr() and memset() implementations.

> Both methods lack a refinement I would like to see, but can't achieve in
> "the Windows way":  ensure that consistency is on no worse than a per-line
> basis.  [Example omitted]

The only portable way to ensure this that I can see, is to have a
separate mutex in the Python file object.  Since this is hardly a
common thing to do, I think it's better to let the application manage
that lock if they need it.

(Then why are we bothering with flockfile(), you may ask?  Because
otherwise, accidental multithreaded reading from the same file could
cause core dumps.)
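
A minimal sketch of such an application-managed lock, assuming a modern
Python 3 and the threading module; LockedLineReader is a hypothetical
wrapper name, not part of any library:

```python
import threading

class LockedLineReader:
    """Share one file object between threads, one whole line at a time."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Holding the lock for the entire call means no other thread using
        # this wrapper can pick up part of the line being read here.
        with self._lock:
            return self._file.readline()
```

Every thread has to go through the wrapper for this to help; a thread
that calls readline() on the underlying file directly bypasses the lock.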

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jan  9 15:48:13 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 9 Jan 2001 10:48:13 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
In-Reply-To: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 10:29:05AM -0500
References: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>
Message-ID: <20010109104813.D6203@kronos.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 10:29:05AM -0500, Guido van Rossum wrote:
> S   222  pep-0222.txt  Web Library Enhancements               Kuchling
>
>	  This is really up to Andrew.  It seems he plans to create
>	  new modules, so he won't be introducing incompatibilities in
>	  existing APIs.

I don't think PEP 222 will be worked on for 2.1; there have only been
a few reactions, and none at all on the python-web-modules mailing
list, so I don't think anyone really cares very much at this point.
Maybe for 2.2, or maybe I'll just write new classes for Quixote.

That leaves PEP 229 as the only PEP I need to work on for 2.1.

--amk


From tim.one@home.com  Tue Jan  9 21:12:42 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 9 Jan 2001 16:12:42 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>

[Guido]
> I'm much more confident about the getc_unlocked() approach than about
> fgets() -- with the latter we need much more faith in the C library
> implementers.  (E.g. that fgets() never writes beyond the null bytes
> it promises, and that it locks/unlocks only once.)  Also, you're
> relying on blindingly fast memchr() and memset() implementations.

Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
bit quicker on Solaris, despite that it's paying an extra layer of function
call per line, to keep it out of get_line proper).  That tells me the
assumptions are indeed mild.  The business about not writing beyond the null
byte is a concern only I would have raised:  the possibility is an
aggressively paranoid reading of the std (I do *lots* of things with libc
I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
things, it's hard to imagine any other vendor exploding ...

Still, I'd rather get rid of ms_getline_hack if I could, because the code is
so much more complicated.

>> Both methods lack a refinement I would like to see, but can't
>> achieve in "the Windows way":  ensure that consistency is on no
>> worse than a per-line basis.  [Example omitted]

> The only portable way to ensure this that I can see, is to have a
> separate mutex in the Python file object.  Since this is hardly a
> common thing to do, I think it's better to let the application manage
> that lock if they need it.

Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
file locked until the line was complete, and I wouldn't be opposed to making
life saner on platforms that allow it.  But there's another problem here:
part of the reason we release Python threads around the fgets is in case
some other thread is trying to write the data we're trying to read, yes?
But since FLOCKFILE is in effect, other threads *trying* to write to the
stream we're reading will get blocked anyway.  Seems to give us potential
for deadlocks.

> (Then why are we bothering with flockfile(), you may ask?

I wouldn't ask that, no <wink>.

> Because otherwise, accidental multithreaded reading from the same
> file could cause core dumps.)

Ugh ... turns out that on my box I can provoke core dumps anyway, with this
program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
anything new):

import thread

def read(f):
    import time
    time.sleep(.01)
    n = 0
    while n < 1000000:
        x = f.readline()
        n += len(x)
        print "r",
    print "read " + `n`
    m.release()

m = thread.allocate_lock()
f = open("ga", "w+")
print "opened"
m.acquire()
thread.start_new_thread(read, (f,))
n = 0
x = "x" * 113 + "\n"
while n < 1000000:
    f.write(x)
    print "w",
    n += len(x)
m.acquire()
print "done"

Typical run:

C:\Python20>\code\python\dist\src\pcbuild\python temp.py
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w r w r
w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w
r r w r w r w r w r w r

and then it dies in msvcrt.dll with a bad pointer.  Also dies under the
debugger (yay!) ... always dies like so:

+ We (Python) call the MS fwrite, from fileobject.c file_write.
+ MS fwrite succeeds with its _lock_str(stream) call.
+ MS fwrite then calls MS _fwrite_lk.
+ MS _fwrite_lk calls memcpy, which blows up for a non-obvious reason.

Looks like the stream's _cnt member has gone mildly negative, which
_fwrite_lk casts to unsigned and so treats like a giant positive count, and
so memcpy eventually runs off the end of the process address space.
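
To make that arithmetic concrete, here is a small illustration in
present-day Python 3 of a slightly negative signed count being
reinterpreted as unsigned.  The 32-bit width and the value -3 are
assumptions chosen for illustration only; the actual _cnt value seen in
the debugger is not given above.

```python
def as_unsigned32(n):
    """Reinterpret a signed integer as a 32-bit unsigned value, as a C cast would."""
    return n & 0xFFFFFFFF


cnt = -3  # hypothetical "mildly negative" stream count
print(as_unsigned32(cnt))  # 4294967293, i.e. just under 4 GB worth of bytes to copy
```

A count of that size is far larger than any stdio buffer, which is
consistent with memcpy walking off the end of the address space.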

Only thing I can conclude from this is that MS's internal stream-locking
implementation is buggy.  At least on W98SE.  Other flavors of Windows?
Other platforms?

Note that I don't claim the program above is *sensible*, just that it
shouldn't blow up.  Alas, short of indeed adding a separate mutex in Python
file objects-- or writing our own stdio --I don't believe I can fix this.
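
For reference, the "separate mutex" idea can be sketched at application
level roughly as follows.  This is present-day Python 3 (the threading
module rather than the thread module used in the program above), and the
class name LockedFile is hypothetical, not anything in fileobject.c.

```python
import io
import threading


class LockedFile:
    """Hypothetical wrapper: one mutex serializes every call on the wrapped file."""

    def __init__(self, f):
        self._f = f
        self._lock = threading.Lock()

    def readline(self):
        # The lock is held for the whole call, so each line is read atomically.
        with self._lock:
            return self._f.readline()

    def write(self, data):
        with self._lock:
            return self._f.write(data)


def count_lines(lf, counts, i):
    """Read lines until EOF and record how many this thread received."""
    n = 0
    while lf.readline():
        n += 1
    counts[i] = n


shared = LockedFile(io.StringIO(("x" * 113 + "\n") * 1000))
counts = [0] * 4
threads = [threading.Thread(target=count_lines, args=(shared, counts, i))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(counts))  # every line is handed to exactly one reader
```

The lock lives in the application, not in the file object, which is the
trade-off Guido suggested: only programs that share a file between
threads pay for it.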

the-best-thing-to-do-with-threads-is-don't-ly y'rs  - tim



From fdrake@acm.org  Tue Jan  9 22:58:49 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 9 Jan 2001 17:58:49 -0500 (EST)
Subject: [Python-Dev] Updated development documentation
Message-ID: <14939.38825.218757.535010@cj42289-a.reston1.va.home.com>

  I've just updated the development version of the documentation, but
am not sure the automated notice got sent.
  This version contains a wide variety of smaller updates, plus added
documentation on the fpectl and xreadlines modules.


        http://python.sourceforge.net/devel-docs/


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From MarkH@ActiveState.com  Wed Jan 10 00:00:03 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Tue, 9 Jan 2001 16:00:03 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>

> Only thing I can conclude from this is that MS's internal stream-locking
> implementation is buggy.  At least on W98SE.  Other flavors of Windows?
> Other platforms?

Same behaviour on Win2k for me.

Mark.


From tim.one@home.com  Wed Jan 10 00:55:11 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 9 Jan 2001 19:55:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com>

Final report (I've spent way more time on this than I can afford already, so
it's "final" by defn <0.3 wink>).  We started here (on my Win98SE box, using
Guido's test program):

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

Here's where we are today:

total 117615824 chars and 3237568 lines
count_chars_lines    14.670 14.667
readlines_sizehint    9.500  9.506
using_fileinput      28.670 28.708
while_readline       13.680 13.676
for_xreadlines        7.630  7.635

Same box, same input file, same test program except for this addition:

def for_xreadlines(fn):
    f = open(fn, MODE)
    for line in xreadlines.xreadlines(f):
        pass
    f.close()

This last is within 25% of Perl "while (<>)" speed, but-- unlike Perl --is
thread-safe.  Good show!  The other speedups are nothing to snort at either.

The strangest thing left to my eye is why xreadlines enjoys a significant
advantage over the double-loop buffering method (readlines_sizehint) on my
box; reducing the very large (1Mb) buffer in Guido's test program made no
material difference to that.
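
Guido's test program is not reproduced in this thread, so the following
is only a rough reconstruction of the two approaches being compared,
written in present-day Python 3.  The function name readlines_sizehint
comes from the timing tables; the bodies, the default bufsize, and the
name for_lines are assumptions.  The xreadlines module does not exist in
Python 3, where iterating over the file object itself plays the same
role.

```python
import os
import tempfile


def readlines_sizehint(fn, bufsize=1024 * 1024):
    """Double loop: fetch roughly bufsize bytes of whole lines, then walk them."""
    count = 0
    with open(fn) as f:
        while True:
            lines = f.readlines(bufsize)
            if not lines:  # an empty list means end of file
                break
            for line in lines:
                count += 1
    return count


def for_lines(fn):
    """Single loop over the file object, the modern stand-in for xreadlines."""
    count = 0
    with open(fn) as f:
        for line in f:
            count += 1
    return count


# Small demonstration on a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(("x" * 113 + "\n") * 5000)
    path = tmp.name
try:
    n_double = readlines_sizehint(path, bufsize=64 * 1024)
    n_single = for_lines(path)
finally:
    os.remove(path)
print(n_double, n_single)
```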

nothing's-ever-finished-but-everything-ends-ly y'rs  - tim



From tim.one@home.com  Wed Jan 10 05:46:24 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 10 Jan 2001 00:46:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFIHAA.tim.one@home.com>

[Tim]
> Only thing I can conclude from this is that MS's internal stream-
> locking implementation is buggy.  At least on W98SE.  Other flavors
> of Windows?  Other platforms?

[Mark Hammond]
> Same behaviour on Win2k for me.

Thanks, Mark!  I opened a bug on SF to record more clues:

http://sourceforge.net/bugs/?func=detailbug&bug_id=128210&group_id=5470

I didn't assign it to anyone because-- best I can tell --there's nothing
realistic we can do about it.  Probably won't happen in practice anyway
<wink>.

there's-a-reason-thread-problems-pop-up-on-windows-first-but-
    ms-isn't-it-ly y'rs  - tim



From billtut@microsoft.com  Wed Jan 10 09:10:51 2001
From: billtut@microsoft.com (Bill Tutt)
Date: Wed, 10 Jan 2001 01:10:51 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com>

With a nice simple C test case from Tim, I've submitted this one to internal
support.
I'll let everybody know what happens when I know more.

Bill



From m.favas@per.dem.csiro.au  Wed Jan 10 11:57:56 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Wed, 10 Jan 2001 19:57:56 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5C4E44.23B593E9@per.dem.csiro.au>

Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
behaviour as Tim's WinBox wrt the new xreadline and the double-loop
readlines (so it's not just something funny with MS (not that there's
not anything funny with MS...)):

total 131426612 chars and 514216 lines
count_chars_lines     5.450  5.066
readlines_sizehint    4.112  4.083
using_fileinput      10.928 10.916
while_readline       11.766 11.733
for_xreadlines        3.569  3.533

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From tismer@tismer.com  Wed Jan 10 11:06:42 2001
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 10 Jan 2001 13:06:42 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>
Message-ID: <3A5C4242.E445C3A1@tismer.com>


Ka-Ping Yee wrote:
> 
> On Mon, 8 Jan 2001, Skip Montanaro wrote:
> > Okay, how about this as a compromise first step?  Allow programmers to put
> > __exports__ lists in their modules but don't do anything with them *except*
> > modify dir() to respect that if it exists?
> 
> I'd say: Just have dir() and import * pay attention to __exports__.
> Don't mess with getattr or __dict__.

quadruple-nodd - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From mal@lemburg.com  Wed Jan 10 13:21:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 14:21:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C61D8.2E5D098C@lemburg.com>

Guido van Rossum wrote:
> 
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Can't we use the existing attribute __all__ (this is currently
only used for packages) for this kind of thing?  As others have already
remarked: I would rather like to see this attribute being used
as basis for 'from M import *' rather than enforce the access
restrictions like the patch suggests.

Access control mechanisms should be treated in different ways
such as wrapping objects using access-control proxies (see mx.Proxy
for an example of such an implementation) and on-demand only.
I wouldn't want to pay the performance hit for each and every
lookup in all my Python applications just because someone out
there feels that "from M import *" has a meaning in life
apart from being useful in interactive sessions to ease typing ;-)
 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.

Again, I'd rather see these implemented using different
techniques which are under programmer control and made
explicit and visible in the program flow. Proxies are ideal
for these things, since they allow great flexibility while
still providing reasonable security at Python level.

I have been using the proxy approach for years now and 
so far with great success. What's even better is that
weak references and garbage finalization aids come along with
it for free.
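
As an illustration of the general idea only, a minimal access-control
proxy might look like the sketch below (present-day Python 3).  This is
not mx.Proxy's API; the names AccessProxy, Account, and allowed are
hypothetical.

```python
class AccessProxy:
    """Hypothetical proxy exposing only a whitelisted set of attribute names."""

    def __init__(self, target, allowed):
        # Store via object.__setattr__ to bypass our own __setattr__ guard.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_allowed", frozenset(allowed))

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for every proxied name.
        if name not in self._allowed:
            raise AttributeError("access to %r is not permitted" % name)
        return getattr(self._target, name)

    def __setattr__(self, name, value):
        if name not in self._allowed:
            raise AttributeError("access to %r is not permitted" % name)
        setattr(self._target, name, value)


class Account:
    def __init__(self):
        self.balance = 100
        self.pin = "1234"


acct = AccessProxy(Account(), allowed=["balance"])
print(acct.balance)  # allowed name is forwarded to the wrapped object
try:
    acct.pin
except AttributeError as exc:
    print("blocked:", exc)
```

Only code that asks for a proxy pays the extra lookup cost.  Note that a
pure-Python proxy like this does not hide the wrapped object (_target
stays reachable), so it documents intent rather than enforcing it.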

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@python.org  Wed Jan 10 15:12:56 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 10:12:56 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 19:55:11 EST."
 <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com>
Message-ID: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>

> The strangest thing left to my eye is why xreadlines enjoys a significant
> advantage over the double-loop buffering method (readlines_sizehint) on my
> box; reducing the very large (1Mb) buffer in Guido's test program made no
> material difference to that.

I was baffled at this too (same difference on my box), until I
discovered that the buffer size is specified *twice*: once as a
default in the arg list of readlines_sizehint(), then *again* in the
call to timer() near the bottom of the file.

Take the latter one out and the times are comparable, in fact
readlines_sizehint() is a few percent quicker.
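
The pitfall is easy to reproduce.  In this sketch (present-day Python 3)
the function bodies, the timer() signature, and both buffer sizes are
hypothetical; the point is only that editing the default has no effect
while the call site still passes its own value.

```python
def readlines_sizehint(fn, bufsize=8 * 1024):
    """Stand-in for the real test: just report the buffer size actually used."""
    return bufsize


def timer(func, *args):
    """Hypothetical harness: forwards its arguments unchanged to func."""
    return func(*args)


# Relying on the default: the value in the def line is used.
print(timer(readlines_sizehint, "input.txt"))               # 8192

# Passing it again at the call site silently overrides the default.
print(timer(readlines_sizehint, "input.txt", 1024 * 1024))  # 1048576
```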

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jim@interet.com  Wed Jan 10 15:19:01 2001
From: jim@interet.com (James C. Ahlstrom)
Date: Wed, 10 Jan 2001 10:19:01 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>
Message-ID: <3A5C7D65.780065C6@interet.com>

Mark Hammond wrote:

> Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
> console window.  pythonw is used in this case.  COM uses pythonw.exe in just
> this way, and when executed by DCOM, it will be executed in a context where
> the user can not see any such dialog.
> 
> However, I would be happy to ensure the correct command-line is used to
> prevent this behaviour in this case.
> 
> Indeed, in _every_ case I use pythonw.exe I would disable this - but I
> accept that other users have simpler requirements.

It would be easier to have a pythonw2.exe where this feature is
built in, rather than a command line option.  But see below.
 
> > I do not view winstdout as a "newbie" feature, but rather a
> > generally useful C-language addition to Python.
> 
> Hrm.  I don't believe a commercial app, for example, would find this
> suitable - they would roll their own solution.
...
> > I guess I am saying, perhaps incorrectly, that the mechanism provided
> > will make further redirection of sys.stdout unnecessary 99% of the
> > time.
> 
> Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
...
> > If someone out there knows of a different example of sys.stdout
> > redirection in use in the real world, it would be helpful if
> > they would describe it.  Maybe it could be incorporated.
> 
> Sure.  Komodo to a file with a friendly dialog (sometimes ;-).
...
> I don't believe I have worked on 2 projects with the same requirement
> here!!!

Well, that is the problem.  Is this feature "generally useful"?
I am writing Windows programs in which Python is the "main"
and provides the GUI, so I find this useful.  And I do show
my users tracebacks.  But perhaps this is unique to me.  I
don't see users of wxPython nor tkinter replying "great idea"
so maybe they don't use pythonw.

Absent more support, I don't think this idea has enough
merit to justify a patch.

JimA


From guido@python.org  Wed Jan 10 16:39:34 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:39:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 01:10:51 PST."
 <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com>
References: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com>
Message-ID: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>

> With a nice simple C test case from Tim, I've submitted this one to internal
> support.
> I'll let everybody know what happens when I know more.

I bet you it's rejected on the basis of "the docs tell you not to mix
reading and writing on the same stream without intervening seek or
flush."  If I were on the support line I would do that.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jan 10 16:38:16 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:38:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 16:12:42 EST."
 <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>
Message-ID: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>

> [Guido]
> > I'm much more confident about the getc_unlocked() approach than about
> > fgets() -- with the latter we need much more faith in the C library
> > implementers.  (E.g. that fgets() never writes beyond the null bytes
> > it promises, and that it locks/unlocks only once.)  Also, you're
> > relying on blindingly fast memchr() and memset() implementations.

[Tim]
> Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
> bit quicker on Solaris, despite that it's paying an extra layer of function
> call per line, to keep it out of get_line proper).  That tells me the
> assumptions are indeed mild.  The business about not writing beyond the null
> byte is a concern only I would have raised:  the possibility is an
> aggressively paranoid reading of the std (I do *lots* of things with libc
> I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
> things, it's hard to imagine any other vendor exploding ...
> 
> Still, I'd rather get rid of ms_getline_hack if I could, because the code is
> so much more complicated.

Which is another argument to prefer the getc_unlocked() code when it
works -- it's obviously correct. :-)

> >> Both methods lack a refinement I would like to see, but can't
> >> achieve in "the Windows way":  ensure that consistency is on no
> >> worse than a per-line basis.  [Example omitted]
> 
> > The only portable way to ensure this that I can see, is to have a
> > separate mutex in the Python file object.  Since this is hardly a
> > common thing to do, I think it's better to let the application manage
> > that lock if they need it.
> 
> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
> file locked until the line was complete, and I wouldn't be opposed to making
> life saner on platforms that allow it.

Hm...  That would be possible, except for one unfortunate detail:
_PyString_Resize() may call PyErr_BadInternalCall() which touches
thread state.

> But there's another problem here:
> part of the reason we release Python threads around the fgets is in case
> some other thread is trying to write the data we're trying to read, yes?

NO, NO, NO!  Mixing reads and writes on the same stream isn't what we
are locking against at all.  (As you've found out, it doesn't even
work.)  We're only trying to protect against concurrent *reads*.

> But since FLOCKFILE is in effect, other threads *trying* to write to the
> stream we're reading will get blocked anyway.  Seems to give us potential
> for deadlocks.

Only if they are holding other locks at the same time.  I haven't done
a thorough survey of fileobject.c, but I've skimmed it, and I believe it's
religious about releasing the Global Interpreter Lock around I/O
calls.  But, of course, 3rd party C code might not be.

> > (Then why are we bothering with flockfile(), you may ask?
> 
> I wouldn't ask that, no <wink>.
> 
> > Because otherwise, accidental multithreaded reading from the same
> > file could cause core dumps.)
> 
> Ugh ... turns out that on my box I can provoke core dumps anyway, with this
> program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
> anything new):

Yeah.  But this is insane use -- see my comments on SF.  It's only
worth fixing because it could be used to intentionally crash Python --
but there are easier ways...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Wed Jan 10 16:41:47 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 10:41:47 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
Message-ID: <14940.37067.893679.750918@beluga.mojam.com>

I just noticed that the "Environment" options for Python on the SF site are
listed as

     Console (Text Based), Win32 (MS Windows), X11 Applications

Shouldn't something Macintosh-related be in that list as well?

Skip


From guido@python.org  Wed Jan 10 16:53:16 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:53:16 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 14:21:28 +0100."
 <3A5C61D8.2E5D098C@lemburg.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com>
Message-ID: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > 
> > Please have a look at this SF patch:
> > 
> > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > 
> > This implements control over which names defined in a module are
> > externally visible: if there's a variable __exports__ in the module,
> > it is a list of identifiers, and any access from outside the module to
> > names not in the list is disallowed.  This affects access using the
> > getattr and setattr protocols (which raise AttributeError for
> > disallowed names), as well as "from M import v" (which raises
> > ImportError).

[Marc-Andre]
> Can't we use the existing attribute __all__ (this is currently
> only used for packages) for this kind of thing?  As others have already
> remarked: I would rather like to see this attribute being used
> as basis for 'from M import *' rather than enforce the access
> restrictions like the patch suggests.

Yes -- I came up with the same thought.

So here's a plan: somebody please submit a patch that does only one
thing: from...import * looks for __all__ and if it exists, imports
exactly those names.  No changes to dir(), or anything.
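
For illustration, this is how the proposed behaviour looks in
present-day Python 3, where a module-level __all__ does govern
from...import *.  The module name demo_all and its attributes are
hypothetical; the module is built in memory so the example is
self-contained.

```python
import sys
import types

# Build a throwaway module with three names, two of them listed in __all__.
demo = types.ModuleType("demo_all")
demo.public = 1
demo.also_public = 2
demo.internal = 3
demo.__all__ = ["public", "also_public"]
sys.modules["demo_all"] = demo

# Run the star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_all import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)  # ['also_public', 'public']

# Explicit access is unaffected: nothing is hidden from getattr.
print(demo.internal)  # 3
```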

> Access control mechanisms should be treated in different ways
> such as wrapping objects using access-control proxies (see mx.Proxy
> for an example of such an implementation) and on-demand only.
> I wouldn't want to pay the performance hit for each and every
> lookup in all my Python applications just because someone out
> there feels that "from M import *" has a meaning in life
> apart from being useful in interactive sessions to ease typing ;-)

In the process of looking into Zope internals I've noticed that
proxies are indeed very useful!

I note that the IMPORT opcodes in ceval.c require that the imported
module (as found in sys.modules[name] or returned by __import__()) is
a real module object.  I think this is unnecessary -- at least
IMPORT_FROM should work even if the module is a proxy or some other
thing (I've been known to smuggle class instances into sys.modules :-)
and IMPORT_STAR should work with a non-module at least if it has an
__all__ attribute.
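
A small sketch of the smuggling trick, as it behaves in present-day
Python 3, where "from M import v" accepts any object found in
sys.modules.  The names Settings, fake_settings, and answer are
hypothetical.

```python
import sys


class Settings:
    """Hypothetical non-module object standing in for a module."""

    def __init__(self):
        self.answer = 42


# Register an ordinary class instance under a module name.
sys.modules["fake_settings"] = Settings()

# The import statement finds the instance and uses plain attribute lookup.
from fake_settings import answer

print(answer)  # 42
```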

> > I like it.  This has been asked for many times.  Does anybody see a
> > reason why this should *not* be added?
> > 
> > Tim remarked that introducing this will prompt demands for a similar
> > feature on classes and instances, where it will be hard to implement
> > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > extra dictionary lookup for each use of "M.v") even when it is not
> > used, but for accessing module variables that's acceptable.  I'm not
> > so sure about instance variable references.
> 
> Again, I'd rather see these implemented using different
> techniques which are under programmer control and made
> explicit and visible in the program flow. Proxies are ideal
> for these things, since they allow great flexibility while
> still providing reasonable security at Python level.
> 
> I have been using the proxy approach for years now and 
> so far with great success. What's even better is that
> weak references and garbage finalization aids come along with
> it for free.

Agreed.  Which reminds me -- would you mind reviewing Fred's new
version of PEP 205 (weak refs)?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Wed Jan 10 17:12:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 18:12:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C97F4.945D0C1@lemburg.com>

Guido van Rossum wrote:
> 
> > Guido van Rossum wrote:
> > >
> > > Please have a look at this SF patch:
> > >
> > > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > >
> > > This implements control over which names defined in a module are
> > > externally visible: if there's a variable __exports__ in the module,
> > > it is a list of identifiers, and any access from outside the module to
> > > names not in the list is disallowed.  This affects access using the
> > > getattr and setattr protocols (which raise AttributeError for
> > > disallowed names), as well as "from M import v" (which raises
> > > ImportError).
> 
> [Marc-Andre]
> > Can't we use the existing attribute __all__ (this is currently
> > only used for packages) for this kind of thing?  As others have already
> > remarked: I would rather like to see this attribute being used
> > as basis for 'from M import *' rather than enforce the access
> > restrictions like the patch suggests.
> 
> Yes -- I came up with the same thought.

Sorry, I didn't read the whole thread on the topic. Rereading the
above paragraph I guess I should have had some more coffee at the
time of writing ;-)
 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

+1 -- this won't be me though (at least not this week).
 
> > Access control mechanisms should be treated in different ways
> > such as wrapping objects using access-control proxies (see mx.Proxy
> > for an example of such an implementation) and on-demand only.
> > I wouldn't want to pay the performance hit for each and every
> > lookup in all my Python applications just because someone out
> > there feels that "from M import *" has a meaning in life
> > apart from being useful in interactive sessions to ease typing ;-)
> 
> In the process of looking into Zope internals I've noticed that
> proxies are indeed very useful!
> 
> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Cool.  This could make Python instances usable as "modules"
-- with full getattr() hook support !

For IMPORT_STAR I'd suggest first looking for __all__ and
then reverting to __dict__.items() in case this fails. 
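
In pure Python the suggested lookup order could be emulated roughly as
below.  This is a sketch in present-day Python 3, not the ceval.c code;
the helper name star_names and the two demo classes are hypothetical.
Skipping names that start with an underscore in the fallback case
mirrors what from...import * does for modules without __all__.

```python
def star_names(obj):
    """Names a star-import would bind: __all__ if present, else public __dict__ keys."""
    try:
        return list(obj.__all__)
    except AttributeError:
        return [name for name in vars(obj) if not name.startswith("_")]


class WithAll:
    __all__ = ["spam"]
    spam = 1
    eggs = 2


class Plain:
    def __init__(self):
        self.spam = 1
        self._hidden = 2


print(star_names(WithAll()))  # ['spam']
print(star_names(Plain()))    # ['spam']
```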

BTW, is __dict__ needed by the import mechanism or would
the getattr/setattr slots suffice ? And if yes, must it
be a real Python dictionary ?
 
> > > I like it.  This has been asked for many times.  Does anybody see a
> > > reason why this should *not* be added?
> > >
> > > Tim remarked that introducing this will prompt demands for a similar
> > > feature on classes and instances, where it will be hard to implement
> > > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > > extra dictionary lookup for each use of "M.v") even when it is not
> > > used, but for accessing module variables that's acceptable.  I'm not
> > > so sure about instance variable references.
> >
> > Again, I'd rather see these implemented using different
> > techniques which are under programmer control and made
> > explicit and visible in the program flow. Proxies are ideal
> > for these things, since they allow great flexibility while
> > still providing reasonable security at Python level.
> >
> > I have been using the proxy approach for years now and
> > so far with great success. What's even better is that
> > weak references and garbage finalization aids come along with
> > it for free.
> 
> Agreed.  Which reminds me -- would you mind reviewing Fred's new
> version of PEP 205 (weak refs)?

I'll have a look at it next week. Is that OK ?
 

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Wed Jan 10 17:37:58 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 10 Jan 2001 12:37:58 -0500 (EST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.37067.893679.750918@beluga.mojam.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
Message-ID: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > I just noticed that the "Environment" options for Python on the SF site are
 > listed as
 > 
 >      Console (Text Based), Win32 (MS Windows), X11 Applications
 > 
 > Shouldn't something Macintosh-related be in that list as well?

  Are the maintainers of the MacOS port using the SF bug tracker or
something else?  If they're using it, then by all means we should add
it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From thomas@xs4all.net  Wed Jan 10 18:06:06 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:06:06 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Tue, Jan 09, 2001 at 01:46:53PM -0800
References: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010110190606.T2467@xs4all.nl>

On Tue, Jan 09, 2001 at 01:46:53PM -0800, Guido van Rossum wrote:

> static void
> xreadlines_dealloc(PyXReadlinesObject *op) {
> 	Py_XDECREF(op->file);
> 	Py_XDECREF(op->lines);
> 	PyObject_DEL(op);
> }

I'm confuzzled. Is this breach of the style guidelines intentional,
accidental, or just not cared enough about ? The style isn't even consistent
in that single module!

> void
> initxreadlines(void)
> {
> 	PyObject *m;
> 
> 	m = Py_InitModule("xreadlines", xreadlines_methods);
> }


-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@mojam.com (Skip Montanaro)  Wed Jan 10 18:11:52 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 12:11:52 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
 <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
Message-ID: <14940.42472.174920.866172@beluga.mojam.com>

    Fred> Are the maintainers of the MacOS port using the SF bug tracker or
    Fred> something else?  If they're using it, then by all means we should
    Fred> add it.

Even if they aren't, I think it would be valuable to list.  There aren't all
that many tools (open source or otherwise) that run on Unix, Windows and Mac
and can be used as either a console app or a GUI.

I assume the reason Fred asks is that the Environment: list is generated
on-the-fly and somehow ties into use of the SF bug tracker.

Skip


From thomas@xs4all.net  Wed Jan 10 18:45:44 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:45:44 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 10, 2001 at 11:53:16AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010110194544.V2467@xs4all.nl>

On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:

> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
certain the expanding of IMPORT would make a lot of people very happy. Alex
Martelli only just discovered the fact you can populate sys.modules
yourself, with non-module objects, and was wondering about its legality and
compatibility.

I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
IMPORT_STAR case (try dict.items(), etc.)
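
For anyone who hasn't seen the sys.modules trick, here is a minimal
sketch of it as it behaves in present-day Python, where
"from ... import name" already accepts a non-module object.  The module
name fake_settings and the class Settings are made up for illustration.

```python
import sys


class Settings:
    """A plain class; its instance will stand in for a module."""

    def __init__(self):
        self.debug = True
        self.retries = 3


# Register the instance under a hypothetical module name.
sys.modules["fake_settings"] = Settings()

# The import machinery finds the entry in sys.modules and pulls the
# attribute off the instance; no real module object is involved.
from fake_settings import retries

print(retries)
```

The instance only has to answer attribute lookups for this to work,
which is the relaxation being asked for in the IMPORT_FROM case.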

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Wed Jan 10 18:49:40 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 10 Jan 2001 13:49:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com>

[Tim]
> The strangest thing left to my eye is why xreadlines enjoys a
> significant advantage over the double-loop buffering method
> (readlines_sizehint) on my box; reducing the very large
> (1Mb) buffer in Guido's test program made no material difference
> to that.

[Guido]
> I was baffled at this too (same difference on my box), until I
> discovered that the buffer size is specified *twice*: once as a
> default in the arg list of readlines_sizehint(), then *again* in
> the call to timer() near the bottom of the file.

Bingo!

> Take the latter one out and the times are comparable, in fact
> readlines_sizehint() is a few percent quicker.

They're indistinguishable then on my box (on one run xreadlines is .1
seconds (out of around 7.6 total) quicker, on another readlines_sizehint),
*provided* that I specify the same buffer size (8192) that xreadlines uses
internally.  However, if I even double that, readlines_sizehint is uniformly
about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
size to 4096.

I'm afraid Mysteries will remain no matter how many person-decades we spend
staring at this <0.5 wink> ...
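
For readers without the test program at hand, the double-loop buffering
method looks roughly like the sketch below.  This is not Guido's
script; it is a reconstruction in present-day Python, and the generator
form and the sample data are assumptions made for illustration.  The
8192 default only mirrors the buffer size mentioned above.

```python
import io


def readlines_sizehint(f, sizehint=8192):
    """Yield lines from f, fetching them in batches of about sizehint."""
    while True:
        # readlines(hint) returns whole lines totalling roughly `hint`
        # characters, or an empty list at end of file.
        lines = f.readlines(sizehint)
        if not lines:
            break
        # Inner loop: hand out the batch one line at a time.
        for line in lines:
            yield line


sample = io.StringIO("alpha\nbeta\ngamma\n" * 1000)
count = sum(1 for _ in readlines_sizehint(sample))
print(count)
```

The outer loop pays for one method call per batch rather than one per
line, which is where the advantage over a plain readline() loop comes
from.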



From guido@python.org  Wed Jan 10 18:50:10 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 13:50:10 -0500
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: Your message of "Wed, 10 Jan 2001 10:41:47 CST."
 <14940.37067.893679.750918@beluga.mojam.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
Message-ID: <200101101850.NAA29744@cj20424-a.reston1.va.home.com>

> I just noticed that the "Environment" options for Python on the SF site are
> listed as
> 
>      Console (Text Based), Win32 (MS Windows), X11 Applications
> 
> Shouldn't something Macintosh-related be in that list as well?

Yeah, except for two problems: :-)

(1) This is a selection from a drop-down menu that doesn't have a Mac
    option;

(2) There are only three slots allowed.

So this is the best we can do.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gstein@lyra.org  Wed Jan 10 18:53:32 2001
From: gstein@lyra.org (Greg Stein)
Date: Wed, 10 Jan 2001 10:53:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010110194544.V2467@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 10, 2001 at 07:45:44PM +0100
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com> <20010110194544.V2467@xs4all.nl>
Message-ID: <20010110105332.T4640@lyra.org>

On Wed, Jan 10, 2001 at 07:45:44PM +0100, Thomas Wouters wrote:
> On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:
> 
> > I note that the IMPORT opcodes in ceval.c require that the imported
> > module (as found in sys.modules[name] or returned by __import__()) is
> > a real module object.  I think this is unnecessary -- at least
> > IMPORT_FROM should work even if the module is a proxy or some other
> > thing (I've been known to smuggle class instances into sys.modules :-)
> > and IMPORT_STAR should work with a non-module at least if it has an
> > __all__ attribute.
> 
> Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
> certain the expanding of IMPORT would make a lot of people very happy. Alex
> Martelli only just discovered the fact you can populate sys.modules
> yourself, with non-module objects, and was wondering about its legality and
> compatibility.
> 
> I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
> IMPORT_STAR case (try dict.items(), etc.)

+1 ... I'm always up for removing type restrictions. Did that with the
bytecodes in function objects a while back.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From MarkH@ActiveState.com  Wed Jan 10 18:54:34 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Wed, 10 Jan 2001 10:54:34 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <20010110190606.T2467@xs4all.nl>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEGKCOAA.MarkH@ActiveState.com>

> I'm confuzzled. Is this breach of the style guidelines intentional,
> accidental, or just not cared enough about ?

I vote the latter!

Who-really-cares ly,

Mark.


From guido@python.org  Wed Jan 10 19:00:24 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:00:24 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 11:31:09 EST."
 <20010108113109.C7563@kronos.cnri.reston.va.us>
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
 <20010108113109.C7563@kronos.cnri.reston.va.us>
Message-ID: <200101101900.OAA30486@cj20424-a.reston1.va.home.com>

[me]
> >I expect Andrew's code to go in before 2.1 is released.  So I don't
> >see a reason why we should hurry and check in a stop-gap measure.

[Andrew]
> But it might not; the final version might be unacceptable or run into
> some intractable problem.  Assuming the patch is correct (I haven't
> looked at it), why not check it in?  The work has already been done to
> write it, after all.

OK, done.

It was more work than I had hoped for, because Eric apparently
(despite having developer privileges!) doesn't use the CVS tree -- he
sent in a diff relative to the 2.0 release.  I munged it into place,
adding the feature that readline, _curses and bsddb are built as
shared libraries by default.  You'd have to edit Setup.config.in to
change this.  Hope this doesn't break anybody's setup.  (Skip???)

Question for Eric: do you still want developer privileges?  They come
with responsibilities too.  Please check out the @#$%& CVS tree! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Wed Jan 10 19:03:07 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:03:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 19:49:35 CST."
 <20010101194935.19672@falcon.inetnebr.com>
References: <20010101194935.19672@falcon.inetnebr.com>
Message-ID: <200101101903.OAA30522@cj20424-a.reston1.va.home.com>

Hi Jeff,

I'm glad to tell you that I've accepted your xreadlines patches.  It's
all checked into the CVS tree now, except for your patch to
fileinput.py, where I had already checked in a similar change using
readlines(sizehint) directly.

Thanks again for your contribution!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From paulp@ActiveState.com  Wed Jan 10 20:08:31 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 10 Jan 2001 12:08:31 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5CC13F.DFB26A0B@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Why? From my point of view, the changes to dir() are much more
important. I seldom tell newbies about import * but I always tell them
how they can browse objects (especially modules) with dir. If dir() is
changed then IDEs and so forth would use that and inherit the right
behavior. If the module exporting behavior gets more sophisticated in a
future version of Python they will continue to inherit the behavior.

Also, dir() could look for an __all__ on all objects including "module
proxies", classes and "plain old instances". In other words we can
extend the convention to other objects "for free".
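
A rough sketch of the behaviour I have in mind, written as a helper
function rather than as a change to dir() itself.  The name
public_names and the fallback rule (hide names with a leading
underscore) are my own assumptions, not anything dir() does.

```python
import types


def public_names(obj):
    """Return obj.__all__ if present, else the non-underscore dir() names."""
    exported = getattr(obj, "__all__", None)
    if exported is not None:
        return sorted(exported)
    return sorted(name for name in dir(obj) if not name.startswith("_"))


# A throwaway module object built by hand for the demonstration.
demo = types.ModuleType("demo")
demo.helper = lambda: None
demo.main = lambda: None
demo.__all__ = ["main"]


class Plain:
    """A plain old instance without __all__."""

    def __init__(self):
        self.x = 1
        self._hidden = 2


print(public_names(demo))     # only the name listed in __all__
print(public_names(Plain()))  # falls back to filtering dir()
```

Because the helper only uses getattr(), it treats modules, proxies,
classes and instances alike.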

 Paul


From tim.one@home.com  Wed Jan 10 20:25:24 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 10 Jan 2001 15:25:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com>

[Tim]
>> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
>> to keep the file locked until the line was complete, and I
>> wouldn't be opposed to making life saner on platforms that allow it.

[Guido]
> Hm...  That would be possible, except for one unfortunate detail:
> _PyString_Resize() may call PyErr_BadInternalCall() which touches
> thread state.

FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
*exit* path thereafter.  We can block/unblock Python threads as often as
desired between those *file*-locking brackets.  The only thing the repeated
FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
for multiple readers to get partial lines of the file.
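
The bracketing can be sketched as a Python analogy, with a
threading.Lock standing in for the platform stream lock.  This is not
the fileobject.c code; SharedStream, getc_unlocked and reader are
illustrative names, and only the shape of get_line (lock once, release
on every exit path) is the point.

```python
import threading


class SharedStream:
    """Character source shared by several reader threads (illustrative)."""

    def __init__(self, text):
        self._chars = iter(text)
        self.lock = threading.Lock()  # plays the role of the stream lock

    def getc_unlocked(self):
        # Caller must hold self.lock, as with getc_unlocked() in C.
        return next(self._chars, "")


def get_line(stream):
    """Read one line while holding the stream lock for the whole line."""
    chars = []
    with stream.lock:  # acquired once; released on every exit path
        while True:
            c = stream.getc_unlocked()
            if c == "":
                break
            chars.append(c)
            if c == "\n":
                break
    return "".join(chars)


def reader(stream, out):
    while True:
        line = get_line(stream)
        if not line:
            return
        out.append(line)


stream = SharedStream("".join("line %d\n" % i for i in range(2000)))
results = []
threads = [threading.Thread(target=reader, args=(stream, results))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# No reader can interleave mid-line, so every line arrives intact.
print(len(results))
```

Taking and dropping the lock around each character instead would still
prevent corruption, but would let two readers split a line between them.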

> ...
> NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> are locking against at all.  (As you've found out, it doesn't even
> work.)

On Windows, yes, but that still seems to me to be a bug in MS's code.  If
anyone had reported a core dump on any other platform, I'd be more tractable
<wink> on this point.

> We're only trying to protect against concurrent *reads*.

As above, I believe that we could do a better job of that, then, on
platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
but also against .readline() not delivering an intact line from the file.

>> But since FLOCKFILE is in effect, other threads *trying* to write
>> to the stream we're reading will get blocked anyway.  Seems to give us
>> potential for deadlocks.

> Only if they are holding other locks at the same time.

I'm not being clear, then.  Thread X does f.readline(), on a
HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
the end of the stdio buffer, and does its platform's version of _filbuf.
_filbuf may wait (depending on the nature of the stream) for more input to
show up.  Simultaneously, thread Y attempts to write some data to f.  But
the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
waiting for Y to write data inside platform _filbuf, but Y is waiting for X
to release the platform stream lock inside some platform stream-output
routine (if I'm being clear now, Python locks have nothing to do with this
scenario:  it's the platform stream lock).

I think this is purely the user's fault if it happens.  Just pointing it out
as another insecurity we're probably not able to protect users from.

> ...
> Yeah.  But this is insane use -- see my comments on SF.  It's only
> worth fixing because it could be used to intentionally crash Python --
> but there are easier ways...

If it's unique to MS (as I suspect), I see no reason to even consider trying
to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.




From cgw@fnal.gov  Wed Jan 10 21:57:41 2001
From: cgw@fnal.gov (Charles G Waldman)
Date: Wed, 10 Jan 2001 15:57:41 -0600 (CST)
Subject: [Python-Dev] Interning filenames of imported modules
Message-ID: <14940.56021.646147.770080@buffalo.fnal.gov>

I have a question about the following code in compile.c:jcompile (line 3678)

		filename = PyString_InternFromString(sc.c_filename); 
		name = PyString_InternFromString(sc.c_name);

In the case of a long-running server which constantly imports modules,
this causes the interned string dict to grow without bound.  Is there
a strong reason that the filename needs to be interned?  How about the
module name?

How about some way to enforce a limit on the size of the interned
strings dictionary?
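
To show the kind of limit I mean, here is a sketch in pure Python of an
intern table that evicts its least recently used entry once it is full.
BoundedInterner and its max_size parameter are hypothetical; nothing
like this exists in the interpreter, and real interning would also have
to cope with code that relies on the identity of strings that get
evicted.

```python
from collections import OrderedDict


class BoundedInterner:
    """Intern table that evicts the least recently used entry when full."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._table = OrderedDict()

    def intern(self, s):
        try:
            canonical = self._table[s]
            self._table.move_to_end(s)  # mark as recently used
        except KeyError:
            canonical = self._table[s] = s
            if len(self._table) > self.max_size:
                self._table.popitem(last=False)  # drop the oldest entry
        return canonical

    def __len__(self):
        return len(self._table)


interner = BoundedInterner(max_size=100)
for i in range(10000):
    interner.intern("/tmp/generated_module_%d.py" % i)

print(len(interner))
```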



From mwh21@cam.ac.uk  Wed Jan 10 22:02:49 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: Wed, 10 Jan 2001 22:02:49 +0000 (GMT)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <Pine.SOL.4.21.0101102121460.10616-100000@red.csi.cam.ac.uk>

On Wed, 10 Jan 2001, Paul Prescod wrote:

> Guido van Rossum wrote:
> > 
> > ...
> > 
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Why? From my point of view, the changes to dir() are much more
> important. I seldom tell newbies about import * but I always tell them
> how they can browse objects (especially modules) with dir. If dir() is
> changed then IDEs and so forth would use that and inherit the right
> behavior. If the module exporting behavior gets more sophisticated in a
> future version of Python they will continue to inherit the behavior.

Changing dir would also make rlcompleter nicer - it's something of a pain
to use with a module that has, eg, "from TERMIOS import *"-ed.  This might
also make "from ... import *" less of a pariah...

Sounds good to me, IOW.

Cheers,
M.



From tim.one@home.com  Wed Jan 10 22:23:14 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 10 Jan 2001 17:23:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com>

[Guido]
> I bet you it's rejected on the basis of "the docs tell you not to mix
> reading and writing on the same stream without intervening seek or
> flush."  If I were on the support line I would do that.

So would I if I were a typical first-line support idiot <wink>.  But the
*implementers*-- if they ever see it --should be very keen to figure out how
they managed to let the _iobuf get corrupted.  *I'm* not mucking with their
internals, nor doing wild pointer stores, nor anything else sneaky to
subvert their locking protection.  I wasn't even trying to break it.  The
only code reading from or storing into the _iobuf is theirs.  They're
ordinary stdio calls with ordinary arguments, and if *any* sequence of those
can cause internal corruption, they've almost certainly got a problem that
will manifest in other situations too.

Think like an implementer here <0.5 wink>:  they've lost track of how many
characters are in the buffer despite a locking scheme whose purpose is to
prevent that.  If it were my implementation, that would be a top-priority
bug no matter how silly the first program I saw that triggered it.

but-willing-to-let-them-decide-whether-they-care-ly y'rs  - tim



From skip@mojam.com (Skip Montanaro)  Wed Jan 10 22:52:55 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 16:52:55 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com>
 <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
 <3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <14940.59335.723701.574821@beluga.mojam.com>

    Paul> Also, dir() could look for an __all__ on all objects including
    Paul> "module proxies", classes and "plain old instances". In other
    Paul> words we can extend the convention to other objects "for free".

The __exports__/dir() patch I submitted will do this if you remove the
PyModule_Check that guards it.

Skip





From tim.one@home.com  Wed Jan 10 23:06:05 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 10 Jan 2001 18:06:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5C4E44.23B593E9@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>

[Mark Favas]
> Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
> behaviour as Tim's WinBox wrt the new xreadline and the double-loop
> readlines (so it's not just something funny with MS (not that there's
> not anything funny with MS...)):
>
> total 131426612 chars and 514216 lines

You average over 255 chars/line?  Really?  What kind of file are you
reading?  I don't really want to measure the speed of line-at-a-time input
on binary files where "line" doesn't actually make sense <0.6 wink>.

> count_chars_lines     5.450  5.066
> readlines_sizehint    4.112  4.083
> using_fileinput      10.928 10.916
> while_readline       11.766 11.733
> for_xreadlines        3.569  3.533

Guido pointed out that his readlines_sizehint test forced use of a 1Mb
buffer (in the call, not only the default value).  For whatever reason, that
was significantly slower than using an 8Kb sizehint on my box.

Another oddity is that while_readline is slower than using_fileinput for
you.  From that I take it Python config does *not* #define

     HAVE_GETC_UNLOCKED

on your platform.  If that's true (or esp. if it's not!), would you do me a
favor?  Recompile fileobject.c with

     USE_MS_GETLINE_HACK

#define'd, try the timing test again (while_readline is the most interesting
test for this), and run the test_bufio.py std test to make sure you're
actually getting the right answers.

At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
whenever HAVE_GETC_UNLOCKED isn't available.  I'd be surprised if
ms_getline_hack failed to work correctly on any platform; a bigger unknown
(to me) is whether it will yield a speedup.  So far it yields a large
speedup on Windows, and looks like a speedup equal to getc_unlocked() yields
on Linux and Solaris.  Info on a platform from Mars (like Tru64 Unix <wink>)
would be valuable in deciding whether to boost +0.5.

don't-want-your-python-to-run-slower-than-possible-if-possible-ly
    y'rs  - tim



From tismer@tismer.com  Wed Jan 10 22:38:57 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 11 Jan 2001 00:38:57 +0200
Subject: [Python-Dev] [Stackless] ANN: Sourcecode for Stackless Python 2.0
Message-ID: <3A5CE481.24A7656@tismer.com>

On Monday, Jan 8th, I spake

"""
Source code and an update to the website will become available in
the next days.
"""

Now, here it is, together with a slightly updated website,
which tries to mention all the people who are helping
or sponsoring me (yes, there are sponsors!).
If somebody feels ignored by me, let me know. I'm good at
making mistakes.

Let me also know if there are problems building the code,
or if there are *no* problems understanding the code.
I don't expect either :-)

There is nearly no support for Unix, but Stackless *should*
build on Unix as it did before without problems.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From nas@arctrix.com  Wed Jan 10 18:15:45 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 10:15:45 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 10, 2001 at 06:06:05PM -0500
References: <3A5C4E44.23B593E9@per.dem.csiro.au> <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>
Message-ID: <20010110101545.A21305@glacier.fnational.com>

On Wed, Jan 10, 2001 at 06:06:05PM -0500, Tim Peters wrote:
> At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
> whenever HAVE_GETC_UNLOCKED isn't available.

Leave it to the timbot to use floating point votes. :)

Compare ms_getline_hack to what Perl does in order to speed up IO.
I think it's worth maintaining that piece of relatively portable
code given the benefit.  If the code has to be maintained then it
might as well be used.  If we find a platform that breaks we can
always disable it before the final release.

  Neil


From m.favas@per.dem.csiro.au  Thu Jan 11 01:28:59 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 09:28:59 +0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A5D0C5B.162F624A@per.dem.csiro.au>

[Tim produces a warped threader that crashes on MS OS's]
>> ...
>> NO, NO NO!  Mixing reads and writes on the same stream wasn't what
>> we are locking against at all.  (As you've found out, it doesn't 
>> even work.)

>On Windows, yes, but that still seems to me to be a bug in MS's code.
>If anyone had reported a core dump on any other platform, I'd be more
>tractable <wink> on this point.

On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
'w's) to the screen (but no crashes). If I reduce the size of the loop
counters from 1000000 to 3000, I get the following output:
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
done

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From m.favas@per.dem.csiro.au  Thu Jan 11 03:40:18 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 11:40:18 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5D2B22.B8028AC@per.dem.csiro.au>

[Tim responded]
>>
>> total 131426612 chars and 514216 lines

>You average over 255 chars/line?  Really?  What kind of file are you
>reading?  I don't really want to measure the speed of line-at-a-time
>input on binary files where "line" doesn't actually make sense <0.6 wink>.

Real-life input, my boy! It's actually a syslog from my mailserver,
consisting mainly of sendmail log messages, and I have a current need to
process these things (MS Exchange, corrupted database, clobbered backup
tapes), so this thread came along at the right time...

>Guido pointed out that his readlines_sizehint test forced use of a 1Mb
>buffer (in the call, not only the default value).  For whatever reason,
>that was significantly slower than using an 8Kb sizehint on my box.

Removing the buffer size arg in the call to readlines_sizehint results
in this (using up-to-the-minute CVS):
total 131426612 chars and 514216 lines
count_chars_lines     4.922  4.916
readlines_sizehint    3.881  3.850
using_fileinput      10.371 10.366
while_readline       10.943 10.916
for_xreadlines        2.990  2.967

and with an 8Kb sizehint:
total 131426612 chars and 514216 lines
count_chars_lines     5.241  5.216
readlines_sizehint    2.917  2.900
using_fileinput      10.351 10.333
while_readline       10.990 10.983
for_xreadlines        2.877  2.867


>Another oddity is that while_readline is slower than using_fileinput
>for you.  From that I take it Python config does *not* #define
>
>     HAVE_GETC_UNLOCKED
>
>on your platform.  If that's true 

Nope, HAVE_GETC_UNLOCKED is indeed #define'd

>(or esp. if it's not!), would you do me a
>favor?  Recompile fileobject.c with
>
>     USE_MS_GETLINE_HACK
>
>#define'd, try the timing test again (while_readline is the most
>interesting test for this), and run the test_bufio.py std test to make
>sure you're actually getting the right answers.

Sure:
With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd (although
defining the former makes the latter def irrelevant):
(test_bufio also OK)
total 131426612 chars and 514216 lines
count_chars_lines     5.056  5.050
readlines_sizehint    3.771  3.667
using_fileinput      11.128 11.116
while_readline        8.287  8.233
for_xreadlines        3.090  3.083

With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just for
completeness):
total 131426612 chars and 514216 lines
count_chars_lines     4.916  4.900
readlines_sizehint    3.875  3.867
using_fileinput      14.404 14.383
while_readline       322.728 321.837
for_xreadlines        7.113  7.100

So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
<grin>

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From nas@arctrix.com  Wed Jan 10 21:55:23 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 13:55:23 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 11:40:18AM +0800
References: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <20010110135523.A21894@glacier.fnational.com>

On Thu, Jan 11, 2001 at 11:40:18AM +0800, Mark Favas wrote:
[with getc_unlocked]
> while_readline       10.943 10.916

[without]
> while_readline       322.728 321.837

Holy crap.  Great work team.

  Neil


From tim.one@home.com  Thu Jan 11 05:03:51 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 00:03:51 -0500
Subject: [Python-Dev] Baffled on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>

In version 2.26 of mmapmodule.c, Guido replaced (as part of a contributed
Cygwin patch):

#ifdef MS_WIN32
__declspec(dllexport) void
#endif /* MS_WIN32 */
#ifdef UNIX
extern void
#endif

by:

DL_EXPORT(void)

before initmmap.

1. Windows Python can no longer import mmap:

>>> import mmap
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (initmmap)
>>>

This is because GetProcAddress returns NULL.

2. Everything's fine if I revert Guido's change (although I assume that
breaks Cygwin then).

3. DL_EXPORT(void) expands to "void".

4. The way mmapmodule.c is coded and built after Guido's change appears to
me to be the same as how every other non-builtin module is coded and built
on Windows.  For example, winsound.c, which uses DL_EXPORT(void) before its
initwinsound and where that macro also expands to "void".  But importing
winsound works fine.

Since what I'm seeing makes no consistent sense, I'm at a loss how to fix
it.  But then I'm punch-drunk too <0.7 wink>.

Any Windows geek got a clue?



From tim.one@home.com  Thu Jan 11 06:10:40 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 01:10:40 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>

[Tim, to MarkF]
>> You average over 255 chars/line?  [nag, nag, nag]

[Mark Favas]
> Real-life input, my boy! It's actually a syslog from my
> mailserver, consisting mainly of sendmail log messages, and I
> have a current need to process these things (MS Exchange,
> corrupted database, clobbered backup tapes), so this thread
> came along at the right time...

Hmm.  I tuned ms_getline_hack for Guido's logfiles, which he said don't
often exceed 160 chars/line.  I guess if you're on a 64-bit platform,
though, it must take about twice as many chars per line to record a log msg
<wink>.

> ...
> Removing the buffer size arg in the call to readlines_sizehint results
> in this (using up-to-the-minute CVS):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.922  4.916
> readlines_sizehint    3.881  3.850
> using_fileinput      10.371 10.366
> while_readline       10.943 10.916
> for_xreadlines        2.990  2.967
>
> and with an 8Kb sizehint:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.241  5.216
> readlines_sizehint    2.917  2.900
> using_fileinput      10.351 10.333
> while_readline       10.990 10.983
> for_xreadlines        2.877  2.867

That's sure consistent across platforms, then.  I guess we'll write it off
to "cache effects" (a catch-all explanation for any timing mystery -- go
ahead, just *try* to prove it's wrong <0.5 wink>).

[and Mark has HAVE_GETC_UNLOCKED on his Tru64 Unix box, yet
 using_fileinput is quicker than while_readline]

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd
> (although defining the former makes the latter def irrelevant):
> (test_bufio also OK)
> total 131426612 chars and 514216 lines
> count_chars_lines     5.056  5.050
> readlines_sizehint    3.771  3.667
> using_fileinput      11.128 11.116
> while_readline        8.287  8.233
> for_xreadlines        3.090  3.083

So ms_getline_hack is significantly faster on your box (I'm only looking at
while_readline:  11 using getc_unlocked, 8.3 using ms_getline_hack).  There
are only two reasons I can imagine for that:

1. Your vendor optimizes the inner loop in fgets (as all vendors should, but
few do).

and/or

2. Despite the long average length of your lines, many of them are
nevertheless shorter than 200 chars, and so all the pain ms_getline_hack
endures to avoid a realloc pays off.

Unfortunately, there's not enough info to figure out if either, both, or
none of those are on-target.  It's such a large percentage speedup, though,
that my bet goes primarily to #1 -- unless realloc is really pig slow on
your box.  Which some things *are*:

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just
> for completeness):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.916  4.900
> readlines_sizehint    3.875  3.867
> using_fileinput      14.404 14.383
> while_readline       322.728 321.837
> for_xreadlines        7.113  7.100
>
> So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
> <grin>

Yes, that's the "platform from Mars" evidence I was seeking:  if
ms_getline_hack survives test_bufio on *your* crazy box, it's as close to
provably correct as any algorithm in all of Computer Science <wink>.
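
For the record, the arithmetic behind the numbers being tossed around
can be checked with a few lines of Python.  The figures are copied from
the first timing column of the tables quoted above; the variable names
are mine.

```python
# Totals reported by the timing script.
total_chars = 131426612
total_lines = 514216

# while_readline timings in seconds, first column of each table.
while_readline_plain = 322.728     # neither macro defined
while_readline_unlocked = 10.943   # HAVE_GETC_UNLOCKED
while_readline_hack = 8.287        # USE_MS_GETLINE_HACK

chars_per_line = total_chars / total_lines
speedup_hack = while_readline_plain / while_readline_hack
speedup_unlocked = while_readline_plain / while_readline_unlocked

print("%.1f chars/line" % chars_per_line)
print("hack vs plain:     %.1fx" % speedup_hack)
print("unlocked vs plain: %.1fx" % speedup_unlocked)
```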

a-factor-of-39-is-almost-big-enough-to-notice!-ly y'rs  - tim



From m.favas@per.dem.csiro.au  Thu Jan 11 07:26:37 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 15:26:37 +0800
Subject: [Python-Dev] Re: xreadline speed vs readlines_sizehint
References: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>
Message-ID: <3A5D602D.9DC991CB@per.dem.csiro.au>

[Tim speculates on getc_unlocked and his ms_getline_hack]:
> 
> So ms_getline_hack is significantly faster on your box (I'm only
> looking at while_readline:  11 using getc_unlocked, 8.3 using 
> ms_getline_hack).  There are only two reasons I can imagine for that:
> 
> 1. Your vendor optimizes the inner loop in fgets (as all vendors
> should, but few do).

Digital engineering, Compaq management/marketing <0.6 wink>
> 
> and/or
> 
> 2. Despite the long average length of your lines, many of them are
> nevertheless shorter than 200 chars, and so all the pain
> ms_getline_hack endures to avoid a realloc pays off.
> 
> Unfortunately, there's not enough info to figure out if either, both,
> or none of those are on-target.  It's such a large percentage
> speedup, though, that my bet goes primarily to #1 -- unless realloc
> is really pig slow on your box.

The lines range in length from 96 to 747 characters, with 11% @ 233, 17%
@ 252 and 52% @ 254 characters, so #1 looks promising - most lines are
long enough to trigger a realloc. Cranking up INITBUFSIZE in
ms_getline_hack to 260 from 200 improves things again, by another 25%: 
total 131426612 chars and 514216 lines
count_chars_lines     5.081  5.066
readlines_sizehint    3.743  3.717
using_fileinput      11.113 11.100
while_readline        6.100  6.083
for_xreadlines        3.027  3.033
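
The ratios quoted in the sign-offs can be checked against the first-column figures in the tables in this thread. A small sketch of that arithmetic; the variable names are illustrative and the numbers are copied from the reported timings:

```python
# Timings in seconds, first column of the tables reported in this thread.
while_readline_plain = 322.728     # both USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED #undef'ed
while_readline_hack_200 = 8.287    # ms_getline_hack with INITBUFSIZE 200
while_readline_hack_260 = 6.100    # ms_getline_hack with INITBUFSIZE 260
for_xreadlines_260 = 3.027         # for_xreadlines in the INITBUFSIZE 260 run

# How much faster while_readline gets once ms_getline_hack is enabled.
speedup_hack = while_readline_plain / while_readline_hack_200

# Fraction of while_readline time saved by raising INITBUFSIZE to 260.
gain_260 = 1 - while_readline_hack_260 / while_readline_hack_200

# Plain while_readline compared with for_xreadlines.
speedup_xreadlines = while_readline_plain / for_xreadlines_260

print("plain -> ms_getline_hack:  %.1fx" % speedup_hack)
print("INITBUFSIZE 200 -> 260:    %.1f%% less time" % (100 * gain_260))
print("plain -> for_xreadlines:   %.1fx" % speedup_xreadlines)
```

This gives roughly 39x, about 26%, and a bit over 100x, in line with the "factor of 39", "another 25%" and "factor of 100" mentioned in the messages.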

Apart from the name <grin>, I like ms_getline_hack...

tho'-a-factor-of-100-makes-xreadlines-a-welcome-addition!-ly y'rs

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From m.favas@per.dem.csiro.au  Thu Jan 11 09:08:29 2001
From: m.favas@per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 17:08:29 +0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
Message-ID: <3A5D780D.62D0F473@per.dem.csiro.au>

On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
sysmodule.c produces the following errors:

cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
sysmodule.o sysmodule.c
cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
        PyObject *o, *stdout;
----------------------^
cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
(undeclared)
        if (!PyArg_ParseTuple(args, "O:displayhook", &o))
------------------------------------------------------^
cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
an lvalue, but occurs in a context that requires one. (needlvalue)
        stdout = PySys_GetObject("stdout");
--------^
cc: Warning: sysmodule.c, line 98: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        if (PyFile_WriteObject(o, stdout, 0) != 0)
----------------------------------^
cc: Warning: sysmodule.c, line 100: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        PyFile_SoftSpace(stdout, 1);
-------------------------^

The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
(stdin and stderr also are similarly #define'd).

-- 
Mark Favas  -   m.favas@per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA


From gstein@lyra.org  Thu Jan 11 09:18:44 2001
From: gstein@lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:18:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.216,2.217 sysmodule.c,2.80,2.81
In-Reply-To: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>; from moshez@users.sourceforge.net on Wed, Jan 10, 2001 at 09:41:29PM -0800
References: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010111011843.W4640@lyra.org>

On Wed, Jan 10, 2001 at 09:41:29PM -0800, Moshe Zadka wrote:
> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv21213/Python
> 
> Modified Files:
> 	ceval.c sysmodule.c
>...
> --- 1246,1269 ----
>   		case PRINT_EXPR:
>   			v = POP();
> ! 			w = PySys_GetObject("displayhook");
> ! 			if (w == NULL) {
> ! 				PyErr_SetString(PyExc_RuntimeError,
> ! 						"lost sys.displayhook");
> ! 				err = -1;
>   			}
> + 			if (err == 0) {
> + 				x = Py_BuildValue("(O)", v);
> + 				if (x == NULL)
> + 					err = -1;
> + 			}
> + 			if (err == 0) {
> + 				w = PyEval_CallObject(w, x);
> + 				if (w == NULL)
> + 					err = -1;
> + 			}
>   			Py_DECREF(v);
> + 			Py_XDECREF(x);

x was never initialized to NULL. In fact, the loop sets it to Py_None. If
you get an error in the initial "w" setup case, then you could erroneously
decref None.

Further, there is no DECREF for the CallObject result ("w"). But watch out:
you don't want to DECREF the PySys_GetObject result (that is a borrowed
reference).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From gstein@lyra.org  Thu Jan 11 09:28:16 2001
From: gstein@lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:28:16 -0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
In-Reply-To: <3A5D780D.62D0F473@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 05:08:29PM +0800
References: <3A5D780D.62D0F473@per.dem.csiro.au>
Message-ID: <20010111012815.X4640@lyra.org>

You're quite right! I've checked in a change, renaming it to "outf".

Cheers,
-g

On Thu, Jan 11, 2001 at 05:08:29PM +0800, Mark Favas wrote:
> On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
> sysmodule.c produces the following errors:
> 
> cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
> sysmodule.o sysmodule.c
> cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
>         PyObject *o, *stdout;
> ----------------------^
> cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
> (undeclared)
>         if (!PyArg_ParseTuple(args, "O:displayhook", &o))
> ------------------------------------------------------^
> cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
> an lvalue, but occurs in a context that requires one. (needlvalue)
>         stdout = PySys_GetObject("stdout");
> --------^
> cc: Warning: sysmodule.c, line 98: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         if (PyFile_WriteObject(o, stdout, 0) != 0)
> ----------------------------------^
> cc: Warning: sysmodule.c, line 100: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         PyFile_SoftSpace(stdout, 1);
> -------------------------^
> 
> The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
> (stdin and stderr also are similarly #define'd).
> 
> -- 
> Mark Favas  -   m.favas@per.dem.csiro.au
> CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA

-- 
Greg Stein, http://www.lyra.org/


From skip@mojam.com (Skip Montanaro)  Thu Jan 11 14:13:55 2001
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 08:13:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
References: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14941.49059.26189.733094@beluga.mojam.com>

    Moshe> * Did not DECREF result from displayhook function
    ...
    Moshe>   				w = PyEval_CallObject(w, x);
    Moshe> + 				Py_XDECREF(w);
    Moshe>   				if (w == NULL)
    ...


While it works, is it really kosher to test w's value after the DECREF?
Just seems like an odd construct to me.  I'm used to seeing the test
immediately after it's been set.

Skip





From guido@python.org  Thu Jan 11 14:44:58 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 09:44:58 -0500
Subject: [Python-Dev] Interning filenames of imported modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:57:41 CST."
 <14940.56021.646147.770080@buffalo.fnal.gov>
References: <14940.56021.646147.770080@buffalo.fnal.gov>
Message-ID: <200101111444.JAA14597@cj20424-a.reston1.va.home.com>

> I have a question about the following code in compile.c:jcompile (line 3678)
> 
> 		filename = PyString_InternFromString(sc.c_filename); 
> 		name = PyString_InternFromString(sc.c_name);
> 
> In the case of a long-running server which constantly imports modules,
> this causes the interned string dict to grow without bound.  Is there
> a strong reason that the filename needs to be interned?  How about the
> module name?

It's probably not *necessary* for the filename, but I know why I am
interning it: since a module typically contains a bunch of functions,
and each function has its own code object with a reference to the
filename, I'm trying to save memory (the filename is a C string
pointer in the "sc" structure, so it has to be turned into a Python
string when creating the code object).

The module name is used as an identifier elsewhere so will become
interned anyway.

> How about some way to enforce a limit on the size of the interned
> strings dictionary?

I've never thought of this -- but I suppose that a weak dictionary
could be used.  Fred's working on a PEP for weak references, so
there's a chance that we might use this eventually.

In the meantime, a possibility would be to provide a service function
that goes through the "interned" dictionary and looks for values with
a reference count of 1, and deletes them.  You could then explicitly
call this service function occasionally in your program.  I would let
it return a tuple: (number of values kept, number of values deleted).
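
A toy model of such a sweep, written in Python rather than C. It does not touch the interpreter's real interned dictionary or real reference counts: `table` and `refcounts` are hypothetical stand-ins, and `sweep_unreferenced` is an invented name. It only illustrates the proposed behaviour and return value.

```python
def sweep_unreferenced(table, refcounts):
    """Delete entries whose modelled reference count is 1.

    `table` models the interned-strings dictionary and `refcounts` models
    the interpreter's reference counts (both hypothetical). A count of 1
    is taken to mean that only the table still refers to the string.
    Returns (number of values kept, number of values deleted).
    """
    deleted = 0
    for name in list(table):       # iterate over a copy: we delete as we go
        if refcounts[name] == 1:
            del table[name]
            deleted += 1
    return len(table), deleted


table = {"spam": "spam", "eggs": "eggs", "/tmp/mod1.py": "/tmp/mod1.py"}
refcounts = {"spam": 3, "eggs": 1, "/tmp/mod1.py": 1}
result = sweep_unreferenced(table, refcounts)
print(result)   # (1, 2)
```

A long-running server could call something like this periodically and log the returned pair to see whether the table is still growing.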

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Jan 11 15:08:48 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:08:48 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 13:49:40 EST."
 <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com>
Message-ID: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>

> They're indistinguishable then on my box (on one run xreadlines is .1
> seconds  (out of around 7.6 total) quicker, on another readlines_sizehint),
> *provided* that I specify the same buffer size (8192) that xreadlines uses
> internally.  However, if I even double that, readlines_sizehint is uniformly
> about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
> size to 4096.
> 
> I'm afraid Mysteries will remain no matter how many person-decades we spend
> staring at this <0.5 wink> ...

8192 happens to be the size of the stack-allocated buffer readlines()
uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
SMALLCHUNK in fileobject.c.

Would it make sense to tie the two constants together more to tune
this optimally even when BUFSIZ is different?
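
One reading of "tie the two constants together", transposed to Python level as a sketch: derive the sizehint from the I/O layer's buffer size instead of hard-coding 8192. `io.DEFAULT_BUFFER_SIZE` is the modern-Python analogue used here as an assumption; it is not the SMALLCHUNK or BUFSIZ of fileobject.c, and `count_lines_sizehint` is an illustrative name.

```python
import io


def count_lines_sizehint(f, hint=io.DEFAULT_BUFFER_SIZE):
    """Count lines by reading batches of roughly `hint` bytes at a time."""
    total = 0
    while True:
        lines = f.readlines(hint)   # returns [] only at end of file
        if not lines:
            break
        total += len(lines)
    return total


# 1000 lines of 254 characters each, similar in length to the test data.
sample = io.StringIO(("x" * 253 + "\n") * 1000)
line_count = count_lines_sizehint(sample)
print(line_count)
```

Whether a matching hint is actually the fastest choice on a given platform is exactly the open question in this thread, so the default here is a starting point to measure, not a recommendation.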

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Thu Jan 11 15:09:54 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Thu, 11 Jan 2001 10:09:54 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
References: <200101080416.f084GrM10912@snark.thyrsus.com>
 <20010108074411.N2467@xs4all.nl>
 <20010108014945.A19516@thyrsus.com>
 <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
 <20010108113109.C7563@kronos.cnri.reston.va.us>
 <200101101900.OAA30486@cj20424-a.reston1.va.home.com>
Message-ID: <14941.52418.18484.898061@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> It was more work than I had hoped for, because Eric
    GvR> apparently (despite having developer privileges!) doesn't use
    GvR> the CVS tree -- he sent in a diff relative to the 2.0
    GvR> release.  I munged it into place, adding the feature that
    GvR> readline, _curses and bsddb are built as shared libraries by
    GvR> default.  You'd have to edit Setup.config.in to change this.
    GvR> Hope this doesn't break anybody's setup.  (Skip???)

We may need to move the dbm module to Setup.config from Setup and build it
shared too.  The problem I ran into when building the pybsddb3 module
was that even though I'd built the standard bsddb shared, I was also
building in dbm statically.  This pulled in a dependency to the old
db.so module (under RH6.1) and core dumped me during the test suite
for pybsddb.  Commenting out dbm did the trick, so building it shared
should work too.

Couple of things: dbm isn't enabled by default I believe so moving it
to Setup.config may not be the right thing after all (would that imply
an autoconf test and auto-enabling if it's detected?)  Also, Andrew's
distutils-based build procedure may obviate the need for this change.

-Barry



From ping@lfw.org  Thu Jan 11 15:14:17 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 07:14:17 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>

On Wed, 10 Jan 2001, Guido van Rossum wrote:
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Please don't use __all__.  At the moment, __all__ is the only way
to easily tell whether a particular module object really represents
a package, and the only way to get the list of submodule names.

If __all__ is overloaded to also represent exportable symbols in
modules, these two pieces of information will be impossible (or
require much ugly hackery) to obtain.


-- ?!ng



From guido@python.org  Thu Jan 11 15:23:26 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:23:26 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:25:24 EST."
 <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com>
Message-ID: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>

> [Tim]
> >> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
> >> to keep the file locked until the line was complete, and I
> >> wouldn't be opposed to making life saner on platforms that allow it.
> 
> [Guido]
> > Hm...  That would be possible, except for one unfortunate detail:
> > _PyString_Resize() may call PyErr_BadInternalCall() which touches
> > thread state.

[Tim]
> FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
> IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
> *exit* path thereafter.  We can block/unblock Python threads as often as
> desired between those *file*-locking brackets.  The only thing the repeated
> FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
> for multiple readers to get partial lines of the file.

I don't want to call FLOCKFILE while holding the Python lock, as this
means that *if* we're blocked in FLOCKFILE (e.g. we're reading from a
pipe or socket), no other Python thread can run!

> > ...
> > NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> > are locking against at all.  (As you've found out, it doesn't even
> > work.)
> 
> On Windows, yes, but that still seems to me to be a bug in MS's code.  If
> anyone had reported a core dump on any other platform, I'd be more tractable
> <wink> on this point.

Yes, it's a Windows bug.

> > We're only trying to protect against concurrent *reads*.
> 
> As above, I believe that we could do a better job of that, then, on
> platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
> but also against .readline() not delivering an intact line from the file.

See above for a reason why I think that's not safe.  I think that
applications that want to do this can do their own locking.  (They'll
find out soon enough that readline() isn't atomic. :-)
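
A minimal sketch of what "do their own locking" could look like at the application level, assuming several threads share one file object. `LockedLineReader` and `drain` are hypothetical names, and this says nothing about what fileobject.c does internally.

```python
import io
import threading


class LockedLineReader:
    """Serialize readline() calls so each caller gets a whole line."""

    def __init__(self, f):
        self._f = f
        self._lock = threading.Lock()

    def readline(self):
        with self._lock:            # only one thread reads at a time
            return self._f.readline()


def drain(reader, out):
    """Read lines until EOF, storing them in this thread's own list."""
    while True:
        line = reader.readline()
        if not line:
            break
        out.append(line)


expected = ["line %d\n" % i for i in range(1000)]
reader = LockedLineReader(io.StringIO("".join(expected)))
results = [[] for _ in range(4)]
threads = [threading.Thread(target=drain, args=(reader, results[i]))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_lines = [line for part in results for line in part]
print(len(all_lines), sorted(all_lines) == sorted(expected))
```

The lock is held only for the duration of one readline() call, so the order in which threads receive lines is still arbitrary; only the integrity of each line is protected.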

> >> But since FLOCKFILE is in effect, other threads *trying* to write
> >> to the stream we're reading will get blocked anyway.  Seems to give us
> >> potential for deadlocks.
> 
> > Only if they are holding other locks at the same time.
> 
> I'm not being clear, then.  Thread X does f.readline(), on a
> HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
> invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
> the end of the stdio buffer, and does its platform's version of _filbuf.
> _filbuf may wait (depending on the nature of the stream) for more input to
> show up.  Simultaneously, thread Y attempts to write some data to f.  But
> the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
> waiting for Y to write data inside platform _filbuf, but Y is waiting for X
> to release the platform stream lock inside some platform stream-output
> routine (if I'm being clear now, Python locks have nothing to do with this
> scenario:  it's the platform stream lock).

I don't think that _filbuf can possibly wait for another thread to
write data to the same stream object.  A single stream object doesn't
act like a pipe, even if it is open for simultaneous reading and
writing.  So if there's no more data in the file, _filbuf will simply
return with an EOF status, not wait for the data that the other thread
would write.

> I think this is purely the user's fault if it happens.  Just pointing it out
> as another insecurity we're probably not able to protect users from.

I don't think this can happen.

> > ...
> > Yeah.  But this is insane use -- see my comments on SF.  It's only
> > worth fixing because it could be used to intentionally crash Python --
> > but there are easier ways...
> 
> If it's unique to MS (as I suspect), I see no reason to even consider trying
> to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.

OK.  It's unique to MS.  So close the bug report with a "won't fix"
resolution.  There's no point in having bug reports remain open that
we know we can't fix.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Jan 11 15:27:05 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:27:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 17:23:14 EST."
 <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com>
Message-ID: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>

> Think like an implementer here <0.5 wink>:  they've lost track of how many
> characters are in the buffer despite a locking scheme whose purpose is to
> prevent that.  If it were my implementation, that would be a top-priority
> bug no matter how silly the first program I saw that triggered it.

The locking prevents concurrent threads accessing the stream.

But mixing reads and writes (without intervening fseek etc.) is
illegal use of the stream, and the C standard allows them to be lax
here, even if the program was single-threaded.

In other words: the locking is so good that it serializes the sequence
of reads and writes; but if the sequence of reads and writes is
illegal, they don't guarantee anything.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Thu Jan 11 15:28:23 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:28:23 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 09:28:59 +0800."
 <3A5D0C5B.162F624A@per.dem.csiro.au>
References: <3A5D0C5B.162F624A@per.dem.csiro.au>
Message-ID: <200101111528.KAA15021@cj20424-a.reston1.va.home.com>

> On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
> 'w's) to the screen (but no crashes).

Same here on Linux.

> If I reduce the size of the loop
> counters from 1000000 to 3000, I get the following output:
> opened
> w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
> done

I still get an infinite amount of 'r's.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jan 11 15:28:21 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:28:21 +0100
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 11:13:26PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com> <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>
Message-ID: <20010111162820.W2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:13:26PM -0500, Tim Peters wrote:

> I'm curious about how it performs (relative to the getc_unlocked hack) on
> other platforms.  If you'd like to try that, just recompile fileobject.c
> with

>     USE_MS_GETLINE_HACK

> #define'd.  It should *work* on any platform with fgets() meeting the
> assumption.  The new test_bufio.py std test gives it a pretty good
> correctness workout, if you're worried about that.

FreeBSD seems to work fine. Speed is practically the same as without
USE_MS_GETLINE_HACK (but with HAVE_GETC_UNLOCKED), though still not quite
the same as before all this hackery :-) Not by much though. For most tests
it's smaller than the margin of error, though the difference is still as
much as 20, 30% for the while_readline test. When using a second thread
somewhere in the test, the difference vanishes further.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Thu Jan 11 15:33:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 16:33:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A5DD248.8EE0DF63@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> >
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.

But __all__ has to be user-defined, so I don't buy that argument.
Note that the only true way to recognize a package is by looking
for an attribute "__path__" since Python adds this for packages
only.
 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Again, __all__ is not automatically generated, so trusting it
doesn't get you very far. To be able to find subpackages you will
always have to apply some hackery (based on __path__) in order
to be sure. It would be better to add a helper function to
packages to query this kind of information -- the package usually
knows best where to look and what to look for.

Note that __all__ was explicitly invented to be used by
from package import * so I think it is the right choice here.
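
For illustration, a sketch of the proposed behaviour for a plain module, which is how current Python behaves (at the time of this thread it was only a proposal). The module name `demo_exports` and its contents are made up for the example.

```python
import sys
import types

# Build a throwaway module in memory; "demo_exports" is a hypothetical name.
mod = types.ModuleType("demo_exports")
exec(
    "__all__ = ['public_api']\n"
    "def public_api():\n"
    "    return 'ok'\n"
    "def helper():\n"
    "    return 'internal'\n",
    mod.__dict__,
)
sys.modules["demo_exports"] = mod

# Run "from demo_exports import *" in a fresh namespace and see what arrives.
namespace = {}
exec("from demo_exports import *", namespace)

imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)   # ['public_api']
```

Only the names listed in __all__ are copied; `helper` stays reachable as `demo_exports.helper` but is not imported by the star form.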

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Thu Jan 11 15:37:19 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 11 Jan 2001 10:37:19 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <14941.52418.18484.898061@anthem.wooz.org>; from barry@digicool.com on Thu, Jan 11, 2001 at 10:09:54AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108113109.C7563@kronos.cnri.reston.va.us> <200101101900.OAA30486@cj20424-a.reston1.va.home.com> <14941.52418.18484.898061@anthem.wooz.org>
Message-ID: <20010111103719.A7191@thyrsus.com>

GvR> It was more work than I had hoped for, because Eric
GvR> apparently (despite having developer privileges!) doesn't use
GvR> the CVS tree -- he sent in a diff relative to the 2.0
GvR> release.

I'm using the CVS tree now.  I did that patch relative to 2.0 for
boring reasons having to do with the state of my laptop.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a
Gestapo preying upon defenseless citizens.
	-- Senator Edward V. Long


From thomas@xs4all.net  Thu Jan 11 15:48:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:48:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5DD248.8EE0DF63@lemburg.com>; from mal@lemburg.com on Thu, Jan 11, 2001 at 04:33:28PM +0100
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com>
Message-ID: <20010111164831.X2467@xs4all.nl>

On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:

> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> 
> But __all__ has to be user-defined, so I don't buy that argument.
> Note that the only true way to recognize a package is by looking
> for an attribute "__path__" since Python adds this for packages
> only.

Ehm.... What, exactly, prevents usercode from doing

__path__ = "neener, neener"

? In other words, even *that* isn't a true way to recognize a package. You
can see what isn't a package, but not what is.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Thu Jan 11 15:58:55 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:58:55 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 07:14:17 PST."
 <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>

> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.
> 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Marc-Andre already explained that __all__ is not to be trusted.

If you want a reasonably good test for package-ness, use the presence
of __path__.

For a really good test, check whether __file__ ends in __init__.py[c].
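
Both tests can be sketched as small helpers. The function names are illustrative, and `json` and `textwrap` are just convenient standard-library examples of a package and a plain module in a modern Python. The `fake` module reproduces the spoofed __path__ from Thomas's message.

```python
import json        # a package in the standard library
import os.path
import textwrap    # a plain module
import types


def has_path(mod):
    """Reasonably good test: Python adds __path__ to packages."""
    return hasattr(mod, "__path__")


def file_is_init(mod):
    """Stricter test: __file__ ends in __init__.py or __init__.pyc."""
    filename = getattr(mod, "__file__", None) or ""
    return os.path.basename(filename) in ("__init__.py", "__init__.pyc")


fake = types.ModuleType("fake")
fake.__path__ = "neener, neener"   # user code can set this on any module

for mod in (json, textwrap, fake):
    print(mod.__name__, has_path(mod), file_is_init(mod))
```

The spoofed module passes the first test and fails the second. Neither test is tamper-proof, since user code could assign __file__ as well.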

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Thu Jan 11 16:14:00 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 11:14:00 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
Message-ID: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>

I've put a new version of the setup.py script at
     http://www.mems-exchange.org/software/files/python/setup.py

(I'm at work and can't remember the password to get into
www.amk.ca. :) )

This version improves the detection of Tcl/Tk, handles the
_curses_panel module, and doesn't do a chdir().  Same drill as before:
just grab the script, drop it in the root of your Python source tree
(2.0 or current CVS), run "./python setup.py build", and look at the
modules it compiles.  I can try it on Linux, so I'm most interested in
hearing reports for other Unix versions (*BSD, HP-UX, etc.)

--amk




From ping@lfw.org  Thu Jan 11 16:36:36 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:36:36 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>

I'm pleased to announce a reasonable first pass at a documentation
utility for interactive use.  "pydoc" is usable in three ways:

1.  At the shell prompt, "pydoc <name>" displays documentation
    on <name>, very much like "man".

2.  At the shell prompt, "pydoc -k <keyword>" lists modules whose
    one-line descriptions mention the keyword, like "man -k".

3.  Within Python, "from pydoc import help" provides a "help"
    function to display documentation at the interpreter prompt.

All of them use sys.path in order to guarantee that the documentation
you see matches the modules you get.
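
pydoc later became part of the standard library. Assuming a modern Python with that stdlib module (not the 2001 scripts listed in this message), the text that "pydoc abs" pages to the terminal can also be obtained programmatically. This is a sketch of one way to do it:

```python
import pydoc

# render_doc returns the documentation as a string instead of paging it;
# the plaintext renderer avoids the overstrike bold intended for pagers.
text = pydoc.render_doc(abs, renderer=pydoc.plaintext)
print(text)
```

The exact layout of the output differs between Python versions, so code that consumes it should not depend on more than the docstring text being present.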

To try "pydoc", download:

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py
    http://www.lfw.org/python/textdoc.py
    http://www.lfw.org/python/inspect.py

I would very much appreciate your feedback, especially from testing
on non-Unix platforms.  Thank you!

I've pasted some examples from my shell below (when you actually
run pydoc, the output is piped through "less", "more", or a pager
implemented in Python, depending on what is available).



-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler



skuld[1268]% pydoc -k mail
mailbox - Classes to handle Unix style, MMDF style, and MH style mailboxes.
mailcap - Mailcap file handling.  See RFC 1524.
mimify - Mimification and unmimification of mail messages.
test.test_mailbox - (no description)

skuld[1269]% pydoc -k text
textdoc - Generate text documentation from live Python objects.
collab - Routines for collaboration, especially group editing of text documents.
gettext - Internationalization and localization support.
test.test_gettext - (no description)
curses.textpad - Simple textbox editing widget with Emacs-like keybindings.
distutils.text_file - text_file
ScrolledText - (no description)

skuld[1270]% pydoc -k html
htmldoc - Generate HTML documentation from live Python objects.
htmlentitydefs - HTML character entity references.
htmllib - HTML 2.0 parser.

skuld[1271]% pydoc md5

Python Library Documentation: built-in module md5

NAME
    md5

FILE
    (built-in)

DESCRIPTION
    This module implements the interface to RSA's MD5 message digest
    algorithm (see also Internet RFC 1321). Its use is quite
    straightforward: use the new() to create an md5 object. You can now
    feed this object with arbitrary strings using the update() method, and
    at any point you can ask it for the digest (a strong kind of 128-bit
    checksum, a.k.a. ``fingerprint'') of the concatenation of the strings
    fed to it so far using the digest() method.
    
    Functions:
    
    new([arg]) -- return a new md5 object, initialized with arg if provided
    md5([arg]) -- DEPRECATED, same as new, but for compatibility
    
    Special Objects:
    
    MD5Type -- type object for md5 objects

FUNCTIONS
    md5(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.
    
    new(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.

skuld[1272]% pydoc types

Python Library Documentation: module types

NAME
    types

FILE
    /home/ping/sw/Python-1.5.2/Lib/types.py

DESCRIPTION
    # Define names for all type symbols known in the standard interpreter.
    # Types that are part of optional modules (e.g. array) are not listed.

skuld[1273]% pydoc abs

Python Library Documentation: built-in function abs

abs (no arg info)
    abs(number) -> number
    
    Return the absolute value of the argument.

skuld[1274]% pydoc repr             

Python Library Documentation: built-in function repr

repr (no arg info)
    repr(object) -> string
    
    Return the canonical string representation of the object.
    For most object types, eval(repr(object)) == object.


Python Library Documentation: module repr

NAME
    repr - # Redo the `...` (representation) but with limits on most sizes.

FILE
    /home/ping/sw/Python-1.5.2/Lib/repr.py

CLASSES
    Repr
    
    class Repr
        __init__(self)
        
        repr(self, x)
        
        repr1(self, x, level)
        
        repr_dictionary(self, x, level)
        
        repr_instance(self, x, level)
        
        repr_list(self, x, level)
        
        repr_long_int(self, x, level)
        
        repr_string(self, x, level)
        
        repr_tuple(self, x, level)

FUNCTIONS
    repr(no arg info)

skuld[1275]% pydoc re.MatchObject

Python Library Documentation: class MatchObject in re

class MatchObject
    __init__(self, re, string, pos, endpos, regs)
    
    end(self, g=0)
        Return the end of the substring matched by group g
    
    group(self, *groups)
        Return one or more groups of the match
    
    groupdict(self, default=None)
        Return a dictionary containing all named subgroups of the match
    
    groups(self, default=None)
        Return a tuple containing all subgroups of the match object
    
    span(self, g=0)
        Return (start, end) of the substring matched by group g
    
    start(self, g=0)
        Return the start of the substring matched by group g

skuld[1276]% pydoc xml    

Python Library Documentation: package xml

NAME
    xml - Core XML support for Python.

FILE
    /home/ping/dev/python/dist/src/Lib/xml/__init__.py

DESCRIPTION
    This package contains three sub-packages:
    
    dom -- The W3C Document Object Model.  This supports DOM Level 1 +
           Namespaces.
    
    parsers -- Python wrappers for XML parsers (currently only supports Expat).
    
    sax -- The Simple API for XML, developed by XML-Dev, led by David
           Megginson and ported to Python by Lars Marius Garshol.  This
           supports the SAX 2 API.

VERSION
    1.8

skuld[1277]% pydoc lovelyspam
no Python documentation found for lovelyspam

skuld[1278]% python
Python 1.5.2 (#1, Dec 12 2000, 02:25:44)  [GCC egcs-2.91.66 19990314/Linux (egcs- on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>>          
>>> from pydoc import help
>>> help(int)
Help on built-in function int:

int (no arg info)
    int(x) -> integer
    
    Convert a string or number to an integer, if possible.
    A floating point argument will be truncated towards zero.

>>> help("urlparse.urljoin")
Help on function urljoin in module urlparse:

urljoin(base, url, allow_fragments=1)
    # Join a base URL and a possibly relative URL to form an absolute
    # interpretation of the latter.
>>> import random
>>> help(random.generator)
Help on class generator in module random:

class generator(whrandom.whrandom)
    Random generator class.
    
    __init__(self, a=None)
        Constructor.  Seed from current time or hashable value.
    
    seed(self, a=None)
        Seed the generator from current time or hashable value.
>>> 




From moshez@zadka.site.co.il  Fri Jan 12 00:48:30 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 02:48:30 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5C97F4.945D0C1@lemburg.com>
References: <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>

On Wed, 10 Jan 2001 18:12:20 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:

> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> +1 -- this won't be me though (at least not this week).

I'm working on it -- I'll have a patch ready as soon as my slow
modem manages to finish the "cvs diff".  Guido, I'll
assign it to you, OK?

> Cool.  This could make Python instances usable as "modules"
> -- with full getattr() hook support !

My patch already does that -- if the instance supports __all__.

> For IMPORT_STAR I'd suggest first looking for __all__ and
> then reverting to __dict__.items() in case this fails. 

That's what my patch is doing.

> BTW, is __dict__ needed by the import mechanism or would
> the getattr/setattr slots suffice ? And if yes, must it
> be a real Python dictionary ?

My patch works with getattr (no setattr) as long as there
is an __all__ attribute. 
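
[Editor's sketch, not Moshe's patch.] The lookup order discussed above (try
__all__ first, fall back to the namespace dictionary) can be illustrated in
plain Python. The function name star_import_names and the throwaway "demo"
module are hypothetical; the fallback rule of skipping names that start with
an underscore is the long-standing behaviour of from...import *.

```python
import types


def star_import_names(module):
    """Return the names that `from module import *` would be expected to bind."""
    try:
        # An explicit __all__ wins; only getattr() is needed to find it.
        return list(getattr(module, "__all__"))
    except AttributeError:
        # Fallback: every public (non-underscore) name in the namespace.
        return [name for name in vars(module) if not name.startswith("_")]


# A hypothetical module built on the fly, purely for illustration.
demo = types.ModuleType("demo")
demo.spam = 1
demo.eggs = 2
demo._private = 3
print(sorted(star_import_names(demo)))   # no __all__: public names only

demo.__all__ = ["spam"]
print(star_import_names(demo))           # __all__ present: exactly those names
```

Because only getattr() is used to find __all__, the same function would also
work on an instance that supplies __all__, which is the case M.-A. asked about.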

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From ping@lfw.org  Thu Jan 11 16:42:44 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:42:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.

Sorry, you're right.  I retract my comment about __all__.


-- ?!ng



From skip@mojam.com (Skip Montanaro)  Thu Jan 11 16:47:13 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 11 Jan 2001 10:47:13 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010111164831.X2467@xs4all.nl>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
 <3A5DD248.8EE0DF63@lemburg.com>
 <20010111164831.X2467@xs4all.nl>
Message-ID: <14941.58257.304339.437443@beluga.mojam.com>

    Thomas> __path__ = "neener, neener"

I believe correct English usage here is "neener, neener, neener", with a
little extra emphasis on the first syllable of the third "neener"...

does-that-help?-ly y'rs,

Skip


From MarkH@ActiveState.com  Fri Jan 12 16:55:29 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Fri, 12 Jan 2001 08:55:29 -0800
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>

> 4. The way mmapmodule.c is coded and built after Guido's change appears to
> me to be the same as how every other non-builtin module is coded and built
> on Windows.  For example, winsound.c, which uses DL_EXPORT(void)
> before its
> initwinsound and where that macro also expands to "void".  But importing
> winsound works fine.

winsound adds "/export:initwinsound" to the link line.  This is an
alternative to __declspec in the sources.

This all gets back to a discussion we had here nearly a year or so ago -
that "DL_EXPORT" isn't capturing our semantics, and that we should probably
create #defines that match the _intent_ of the definition, rather than the
implementation details - i.e., replace DL_EXPORT with (say) PY_API_DECL and
PY_MODULEINIT_DECL or some such.

I'm happy to think about this and help implement it if the time is now
right...

> Any Windows geek got a clue?

Isn't that question a paradox? ;-)

Mark.



From skip@mojam.com (Skip Montanaro)  Thu Jan 11 17:11:23 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Thu, 11 Jan 2001 11:11:23 -0600 (CST)
Subject: [Python-Dev] dir()/__all__/etc
Message-ID: <14941.59707.632995.224116@beluga.mojam.com>

I know Guido has said he doesn't want to fiddle with dir(), but my sense of
things from the overall discussion of the __exports__ concept tells me that
when used interactively dir() often presents confusing output for new Python
users.

I twiddled CGIHTTPServer to have __all__ and added the following dir()
function to my PYTHONSTARTUP file:

def dir(o,showall=0):
    if not showall and hasattr(o, "__all__"):
        x = list(o.__all__)
        x.sort()
        return x
    from __builtin__ import dir as d
    return d(o)

Compare its output with and without showall set:

  >>> dir(CGIHTTPServer)
  ['CGIHTTPRequestHandler', 'test']
  >>> dir(CGIHTTPServer,1)
  ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
   '__builtins__', '__doc__', '__file__', '__name__', '__version__',
   'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
   'urllib']
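
[Editor's sketch.] For readers on a current interpreter, roughly the same idea
can be written as below. This is an adaptation, not Skip's code: the name
public_dir and the "demo" module are hypothetical, and the only substantive
change is that the __builtin__ module is called builtins in Python 3.

```python
import builtins
import types


def public_dir(obj, showall=False):
    """Like dir(), but honour __all__ unless showall is true."""
    if not showall and hasattr(obj, "__all__"):
        return sorted(obj.__all__)
    return builtins.dir(obj)


# A hypothetical module standing in for CGIHTTPServer.
demo = types.ModuleType("demo")
demo.helper = object()
demo.test = lambda: None
demo.__all__ = ["test"]

print(public_dir(demo))                            # only the advertised names
print("helper" in public_dir(demo, showall=True))  # full listing includes helper
```

Giving the wrapper its own name, rather than shadowing dir(), keeps the
built-in available for code that relies on the unfiltered listing.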

I haven't demonstrated any great programming prowess with this little
function, but I rather suspect it may be beyond most brand new users.  If
Guido can't be convinced to allow dir() to change, how about adding a sample
PYTHONSTARTUP file to the distribution that contains little bits like this
and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
it does)?

Skip


From mal@lemburg.com  Thu Jan 11 17:25:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 18:25:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com> <20010111164831.X2467@xs4all.nl>
Message-ID: <3A5DEC80.596F0818@lemburg.com>

Thomas Wouters wrote:
> 
> On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:
> 
> > > Please don't use __all__.  At the moment, __all__ is the only way
> > > to easily tell whether a particular module object really represents
> > > a package, and the only way to get the list of submodule names.
> >
> > But __all__ has to be user-defined, so I don't buy that argument.
> > Note that the only true way to recognize a package is by looking
> > for an attribute "__path__" since Python adds this for packages
> > only.
> 
> Ehm.... What, exactly, prevents usercode from doing
> 
> __path__ = "neener, neener"
> 
> ? In other words, even *that* isn't a true way to recognize a package. You
> can see what isn't a package, but not what is.

Purists.... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From moshez@zadka.site.co.il  Fri Jan 12 02:06:37 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:06:37 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <14941.49059.26189.733094@beluga.mojam.com>
References: <14941.49059.26189.733094@beluga.mojam.com>, <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 08:13:55 -0600 (CST), Skip Montanaro <skip@mojam.com> wrote:

> While it works, is it really kosher to test w's value after the DECREF?

Yes. It may not point to anything valid, but it won't be NULL.

> Just seems like an odd construct to me.  I'm used to seeing the test
> immediately after it's been set.

It was more convenient that way. And I'm pretty certain the _DECREF
macros do not change their arguments.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From moshez@zadka.site.co.il  Fri Jan 12 02:09:13 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:09:13 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 07:14:17 -0800 (PST), Ka-Ping Yee <ping@lfw.org> wrote:
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package

Why not __init__? It has to be there, and is in no other module object.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From moshez@zadka.site.co.il  Fri Jan 12 02:23:16 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:23:16 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>
References: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>, <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
 <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112022316.BE682A82D@darjeeling.zadka.site.co.il>

On Fri, 12 Jan 2001, Moshe Zadka <moshez@zadka.site.co.il> wrote:

> I'm working on it -- I'll have a patch ready as soon as my slow
> modem will manage to finish the "cvs diff".  Guido, I'll
> assign it to you, OK?

OK, it's 103200.
Unfortunately, I couldn't assign it to Guido, since I couldn't
upload it at all (yeah, still those lynx problems). This time
I managed to get one specific person to upload for me, but someone
else will have to assign to Guido.

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From nas@arctrix.com  Thu Jan 11 11:42:51 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:42:51 -0800
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 11, 2001 at 11:14:00AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>
Message-ID: <20010111034251.A23512@glacier.fnational.com>

Here is what I get on my Debian Linux machine:

  _codecs.so        cPickle.so    imageop.so        pwd.so       termios.so
  _curses.so        cStringIO.so  linuxaudiodev.so  regex.so     time.so
  _curses_panel.so  cmath.so      math.so           resource.so  timing.so
  _locale.so        crypt.so      md5.so            rgbimg.so    ucnhash.so
  _socket.so        dbm.so        mmap.so           rotor.so     unicodedata.so
  _tkinter.so       errno.so      new.so            select.so    zlib.so
  array.so          fcntl.so      nis.so            sha.so
  audioop.so        fpectl.so     operator.so       signal.so
  binascii.so       gdbm.so       parser.so         strop.so
  bsddb.so          grp.so        pcre.so           syslog.so
  
I think that is every module which can be compiled on my machine.  Great work,
Andrew (and the Distutils developers).

  Neil


From nas@arctrix.com  Thu Jan 11 11:47:09 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:47:09 -0800
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>; from skip@mojam.com on Thu, Jan 11, 2001 at 11:11:23AM -0600
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010111034709.C23512@glacier.fnational.com>

I'm -1 on making dir() pay attention to __all__.  I'm +1 on
adding a help() function which pays attention to __all__ and
(optionally?) prints doc strings.

  Neil


From gstein@lyra.org  Thu Jan 11 19:38:50 2001
From: gstein@lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 11:38:50 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 10:58:55AM -0500
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <20010111113850.F4640@lyra.org>

On Thu, Jan 11, 2001 at 10:58:55AM -0500, Guido van Rossum wrote:
> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> > 
> > If __all__ is overloaded to also represent exportable symbols in
> > modules, these two pieces of information will be impossible (or
> > require much ugly hackery) to obtain.
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.
> 
> For a really good test, check whether __file__ ends in __init__.py[c].

Even that isn't safe: if the module was pulled from an archive, __file__
might not get set.

Determining whether something is a package is highly dependent upon how it
was brought into the system. It is entirely possible that you *can't* know
something represents a package.

You can get close by looking in sys.modules to look for modules "below" the
given module. But if none have been imported yet, then you're out of luck.
If you're using imputil, then you can look for __ispkg__ in the module.
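
[Editor's sketch.] The three heuristics mentioned in this thread (presence of
__path__, a __file__ that names __init__, and entries "below" the module in
sys.modules) can be collected side by side. The function name
package_evidence is hypothetical, and none of the three checks is
authoritative, which is exactly the point made above.

```python
import json
import os
import sys


def package_evidence(module):
    """Collect the package-ness hints discussed in the thread; none is proof."""
    name = module.__name__
    # __file__ may be missing or None, e.g. for modules loaded from an archive.
    filename = getattr(module, "__file__", None) or ""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return {
        "has_path": hasattr(module, "__path__"),
        "file_is_init": stem == "__init__",
        "submodule_imported": any(
            key.startswith(name + ".") for key in list(sys.modules)
        ),
    }


print(package_evidence(json))  # a real package
print(package_evidence(os))    # a plain module
```

Note that os is a plain module, yet sys.modules holds an "os.path" entry, so
the sys.modules check reports a hit for it. The heuristics can disagree in
either direction.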

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From thomas@xs4all.net  Thu Jan 11 19:50:24 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 20:50:24 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 12, 2001 at 04:09:13AM +0200
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>
Message-ID: <20010111205024.Z2467@xs4all.nl>

On Fri, Jan 12, 2001 at 04:09:13AM +0200, Moshe Zadka wrote:

> Why not __init__? It has to be there, and is in no other module object.

Wrong association... __init__ would be a method that gets executed. (At
least that's what I'd expect :)

'sides,-everyone-was-in-agreement-on-__all__-ly y'rs,

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From MarkH@ActiveState.com  Thu Jan 11 20:25:30 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Thu, 11 Jan 2001 12:25:30 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>
Message-ID: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>

> It was more convenient that way. And I'm pretty certain the _DECREF
> macros do not change their arguments.

Pretty certain???  That doesn't inspire confidence <wink>. How certain are
you that this will be true in the future?

I think it bad style indeed - for example, I could see benefit in having
DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
builds.  What if that decision is taken in the future?

I thought rules were pretty clear with reference counting - don't assume
_anything_ about the object unless you hold a reference (or are damn sure
someone else does!)

Mark.



From thomas@xs4all.net  Thu Jan 11 21:41:57 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 22:41:57 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Thu, Jan 11, 2001 at 12:25:30PM -0800
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010111224157.A2467@xs4all.nl>

On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:

> I thought rules were pretty clear with reference counting - dont assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

Moshe isn't breaking that rule. He isn't assuming anything about the object,
just about the value of the pointer to that object. I agree, though, that
it's bad practice to rely on it having the old value, after DECREFing it.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Thu Jan 11 21:48:46 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:48:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 08:42:44 PST."
 <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>

> Sorry, you're right.  I retract my comment about __all__.

Can you explain *why* you wanted to test for package-ness?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Thu Jan 11 21:55:24 2001
From: guido@python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:55:24 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 11:14:00 EST."
 <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>
Message-ID: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>

> I've put a new version of the setup.py script at
>      http://www.mems-exchange.org/software/files/python/setup.py
> 
> (I'm at work and can't remember the password to get into
> www.amk.ca. :) )
> 
> This version improves the detection of Tcl/Tk, handles the
> _curses_panel module, and doesn't do a chdir().  Same drill as before:
> just grab the script, drop it in the root of your Python source tree
> (2.0 or current CVS), run "./python setup.py build", and look at the
> modules it compiles.  I can try it on Linux, so I'm most interested in
> hearing reports for other Unix versions (*BSD, HP-UX, etc.)

Good work -- but I still can't run this inside a platform-specific
subdirectory.  Are you planning on supporting this?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@loewis.home.cs.tu-berlin.de  Thu Jan 11 21:20:45 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 11 Jan 2001 22:20:45 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>

> I would very much appreciate your feedback

At first glance, it looks *very* promising. I really look forward
to seeing it in 2.1.

However, robustness probably needs to be improved:

>>> help()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: not enough arguments to help(); expected 1, got 0    

Wasn't there even a proposal that

>>> help

should do something meaningful (by implementing __repr__)?

>>> import string
>>> help(string)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "pydoc.py", line 183, in help
    pager('Help on %s:\n\n' % desc + textdoc.document(thing))
  File "./textdoc.py", line 171, in document
    if inspect.ismodule(object): results = document_module(object)
  File "./textdoc.py", line 87, in document_module
    if (inspect.getmodule(value) or object) is object:
  File "./inspect.py", line 190, in getmodule
    file = getsourcefile(object)
  File "./inspect.py", line 204, in getsourcefile
    filename = getfile(object)
  File "./inspect.py", line 172, in getfile
    raise TypeError, 'arg is a built-in class'
TypeError: arg is a built-in class

Also, the tools could use some command line options:

martin@mira:~/pydoc > ./pydoc.py --help
Traceback (most recent call last):
  File "./pydoc.py", line 190, in ?
    opts[args[i][1:]] = args[i+1]
IndexError: list index out of range

At a minimum, I propose -h, --help, -v, -V.
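
[Editor's sketch, not pydoc's actual code.] The IndexError above comes from
assuming every option is followed by a value. One way to avoid that is to let
the standard getopt module decide which options take arguments. The option
letters follow the proposal above plus the -k keyword search; what -v and -V
would mean is left open here, and parse_args and USAGE are hypothetical names.

```python
import getopt

USAGE = "usage: pydoc [-h | --help] [-v] [-V] [-k keyword] [name ...]"


def parse_args(argv):
    """Return (options, names); exit with the usage text on malformed input."""
    try:
        # Only -k takes a value (the trailing colon); the rest are flags.
        pairs, names = getopt.getopt(argv, "hvVk:", ["help"])
    except getopt.GetoptError as err:
        raise SystemExit("%s\n%s" % (err, USAGE))
    opts = {}
    for flag, value in pairs:
        opts[flag.lstrip("-")] = value
    return opts, names


print(parse_args(["--help"]))
print(parse_args(["-k", "mail"]))
print(parse_args(["md5", "types"]))
```

A dangling "-k" with no keyword then produces the usage message instead of a
traceback.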

Regards,
Martin


From fdrake@acm.org  Thu Jan 11 22:11:24 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 17:11:24 -0500 (EST)
Subject: [Python-Dev] [PEP 205] Weak References PEP updated, patch available!
Message-ID: <14942.12172.129547.770776@cj42289-a.reston1.va.home.com>

  I've updated the Weak References PEP a little:

http://python.sourceforge.net/peps/pep-0205.html

  A preliminary version of the implementation and documentation is
available as well:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  Please send feedback on the PEP or implementation to me.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From akuchlin@mems-exchange.org  Thu Jan 11 22:26:33 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 17:26:33 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 04:55:24PM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>
Message-ID: <20010111172633.A26249@kronos.cnri.reston.va.us>

On Thu, Jan 11, 2001 at 04:55:24PM -0500, Guido van Rossum wrote:
>Good work -- but I still can't run this inside a platform-specific
>subdirectory.  Are you planning on supporting this?

I didn't really understand this when you pointed it out, but forgot to
ask for clarification.  What does your directory layout look like?

--amk



From ping@lfw.org  Thu Jan 11 22:26:53 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:26:53 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Martin v. Loewis wrote:
> 
> However, robustness probably needs to be improved:

Agreed.

> Wasn't there even a proposal that
> 
> >>> help
> 
> should do something meaningful (by implementing __repr__)?

There was.  I am planning to incorporate Paul Prescod's mechanism
for doing this; i just didn't have time to throw in that feature
yet, and wanted feedback on the man-like stuff first.
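
[Editor's sketch.] This is not Paul Prescod's mechanism, which is not shown in
this thread; it only illustrates the general idea of making a bare "help" at
the prompt print something useful via __repr__, while help() with no argument
no longer raises TypeError. The names Helper and help_sketch are hypothetical.

```python
import inspect

_MISSING = object()  # sentinel so that help_sketch(None) still documents None


class Helper:
    HINT = "Type help(object) for help about object."

    def __repr__(self):
        # The interactive prompt shows repr() of a bare expression,
        # so typing the object's name alone prints the hint.
        return self.HINT

    def __call__(self, thing=_MISSING):
        if thing is _MISSING:
            print(self.HINT)
            return
        print(inspect.getdoc(thing) or "(no description)")


help_sketch = Helper()
print(repr(help_sketch))  # what the prompt would show for a bare name
help_sketch()             # no argument: hint instead of a TypeError
help_sketch(abs)          # with an argument: the object's docstring
```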

My next two targets are:
    1.  Generating text from the HTML documentation files
        using Paul Prescod's stuff in onlinehelp.py.

    2.  Running a background HTTP server that produces its
        pages using htmldoc.py.

Both are pieces we already have and only need to integrate; i just
wanted to get at least a working candidate done first.

Did using pydoc like "man" work okay for you?

> >>> import string
> >>> help(string)
> Traceback (most recent call last):
...
> TypeError: arg is a built-in class

Mine doesn't do this for me.  I think i may have left up an older version
of inspect.py by mistake.  Try downloading

    http://www.lfw.org/python/inspect.py

again -- apologies for the hassle.

> Also, the tools could use some command line options:
> 
> martin@mira:~/pydoc > ./pydoc.py --help
> Traceback (most recent call last):
>   File "./pydoc.py", line 190, in ?
>     opts[args[i][1:]] = args[i+1]
> IndexError: list index out of range
> 
> At a minimum, I propose -h, --help, -v, -V.

Okay.  There is usage help already; i just failed to make it sufficiently
robust about deciding when to show it.

    skuld[1010]% pydoc
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler



From ping@lfw.org  Thu Jan 11 22:28:44 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:28:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> > Sorry, you're right.  I retract my comment about __all__.
> 
> Can you explain *why* you wanted to test for package-ness?

Auto-generating documentation.  pydoc.py currently tests for __path__,
and looks for the presence of __init__.py in a subdirectory to mean
that the subdirectory name is a package name.  Is it safe on all platforms
to just list all .py files in the subdirectory to get all submodules?
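
[Editor's sketch, not pydoc's code.] The directory-listing approach described
above might look like this. The function name list_submodules is
hypothetical. It only sees .py files and subdirectories containing
__init__.py, so it would miss compiled extension modules and anything loaded
from an archive, which is part of why the portability question is a fair one.

```python
import os
import xml


def list_submodules(package):
    """List submodule and subpackage names by scanning the package directories."""
    found = set()
    for directory in getattr(package, "__path__", []):
        if not os.path.isdir(directory):
            continue  # e.g. a path entry inside an archive
        for entry in os.listdir(directory):
            full = os.path.join(directory, entry)
            stem, ext = os.path.splitext(entry)
            if ext == ".py" and stem != "__init__":
                found.add(stem)
            elif os.path.isdir(full) and os.path.exists(
                os.path.join(full, "__init__.py")
            ):
                found.add(entry)  # a subdirectory that is itself a package
    return sorted(found)


print(list_submodules(xml))
```

A plain module has no __path__, so the function returns an empty list for it
rather than failing.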


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler



From tim.one@home.com  Thu Jan 11 23:17:06 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 18:17:06 -0500
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBJIIAA.tim.one@home.com>

[Mark Hammond]
> winsound adds "/export:initwinsound" to the link line.  This is an
> alternative to __declspec in the sources.

Yup/arghghghgh.  It's fixed now.  Thanks!

> This all gets back to a discussion we had here nearly a year
> or so ago -

Yup/arghghghgh.
.
> that "DL_EXPORT" isnt capturing our semantics, and that we should
> probably create #defines that match the _intent_ of the
> definition, rather than the implementation details - ie, replace
> DL_EXPORT with (say) PY_API_DECL and PY_MODULEINIT_DECL or some
> such.

Yup/noarghghghgh.

> I'm happy to think about this and help implement it if the time
> is now right...

Same here.  Now how can we tell whether the time is right?  I must say, it
hasn't gotten better by leaving it alone for a year.  I think we need a Unix
dweeb to play along, though -- if only to confirm that their compilers are
no help.

>> Any Windows geek got a clue?

> Isn't that question a paradox? ;-)

Well, nobody else will understand this, but *we* know that Windows geeks
need more clues than everyone else put together just to get the box booted
each day (or hour <0.9 wink>).



From michel@digicool.com  Fri Jan 12 01:15:52 2001
From: michel@digicool.com (Michel Pelletier)
Date: Thu, 11 Jan 2001 20:15:52 -0500
Subject: [Python-Dev] New Draft PEP: Python Interfaces
Message-ID: <web-555709@digicool.com>

Hello,

I have roughed out a draft PEP that proposes the extension
of Python to include an interface framework.  It is posted
online here:

http://www.zope.org/Members/michel/InterfacesPEP/PEP.txt

This is my first revision and stab at a PEP.  I'd like to
find out what you think about the PEP and maybe discuss it
some more offline on a different list.

Thanks!

-Michel


From martin@loewis.home.cs.tu-berlin.de  Fri Jan 12 01:15:25 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 02:15:25 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
 (message from Ka-Ping Yee on Thu, 11 Jan 2001 14:26:53 -0800 (PST))
References: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>

> Did using pydoc like "man" work okay for you?

Yes, that is very impressive.

> Mine doesn't do this for me.  I think i may have left up an older version
> of inspect.py by mistake.  Try downloading
> 
>     http://www.lfw.org/python/inspect.py
> 
> again -- apologies for the hassle.

No need to apologize. It works fine now.

Thanks,
Martin


From moshez@zadka.site.co.il  Fri Jan 12 09:53:35 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:53:35 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
References: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010112095335.E8A15A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, "Mark Hammond" <MarkH@ActiveState.com> wrote:

> I think it bad style indeed - for example, I could see benefit in having
> DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
> builds.  What if that decision is taken in the future?
> 
> I thought rules were pretty clear with reference counting - don't assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

I'm not assuming anything about the object -- I'm assuming something
about the pointer. And macros should not change their arguments --
DECREF is basically a wrapper around _Py_Dealloc((PyObject *)(op)).

Just like

	free(pointer);
	if (pointer == NULL)
		do_something();

is perfectly legal C.

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From moshez@zadka.site.co.il  Fri Jan 12 09:57:32 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:57:32 +0200 (IST)
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010112095732.1F65BA82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 11:11:23 -0600 (CST), Skip Montanaro <skip@mojam.com> wrote:
> 
> I know Guido has said he doesn't want to fiddle with dir(), but my sense of
> things from the overall discussion of the __exports__ concept tells me that
> when used interactively dir() often presents confusing output for new Python
> users.
> 
> I twiddled CGIHTTPServer to have __all__ and added the following dir()
> function to my PYTHONSTARTUP file:
> 
> def dir(o,showall=0):
>     if not showall and hasattr(o, "__all__"):
>         x = list(o.__all__)
>         x.sort()
>         return x
>     from __builtin__ import dir as d
>     return d(o)
> 
> Compare its output with and without showall set:
> 
>   >>> dir(CGIHTTPServer)
>   ['CGIHTTPRequestHandler', 'test']
>   >>> dir(CGIHTTPServer,1)
>   ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
>    '__builtins__', '__doc__', '__file__', '__name__', '__version__',
>    'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
>    'urllib']
> 
> I haven't demonstrated any great programming prowess with this little
> function, but I rather suspect it may be beyond most brand new users.  If
> Guido can't be convinced to allow dir() to change, how about adding a sample
> PYTHONSTARTUP file to the distribution that contains little bits like this
> and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
> it does)?

And, while we're at it, the following bit too can be in the PYTHONSTARTUP:

def display(x):
	import __builtin__
	__builtin__._ = None
	if type(x) == type(''):
		print `x`
	else:
		print x
	__builtin__._ = x

import sys
sys.displayhook = display
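
For readers on a current interpreter, here is a rough Python 3 rendering of
the two PYTHONSTARTUP snippets above.  It is a sketch, not what either poster
wrote: the name dir_all is made up here (so the builtin dir() is left alone),
the builtins module stands in for __builtin__, and skipping None is borrowed
from the default display hook rather than from the original.

```python
import builtins
import sys


def dir_all(obj, showall=False):
    """Like dir(), but prefer obj.__all__ unless showall is true."""
    if not showall and hasattr(obj, "__all__"):
        return sorted(obj.__all__)
    return builtins.dir(obj)


def display(value):
    """Display hook: show strings via repr(), everything else via str()."""
    if value is None:
        # Assumption: mirror the default hook, which prints nothing for None.
        return
    builtins._ = None
    if isinstance(value, str):
        print(repr(value))
    else:
        print(value)
    builtins._ = value


# To activate in an interactive session (left commented out on purpose):
# sys.displayhook = display
```

With the hook installed, typing a string at the prompt still shows its quoted
form, while other objects are shown through str() instead of repr().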

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From tim.one@home.com  Fri Jan 12 02:33:59 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 21:33:59 -0500
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <20010111034709.C23512@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIIAA.tim.one@home.com>

[Neil Schemenauer]
> I'm -1 on making dir() pay attention to __all__.

Me too.  The original __exports__ idea was an ironclad guarantee about which
names were externally visible for *any* purpose.  Then it made sense to
restrict dir() accordingly.  But if __all__ is just "a hint" (to be ignored
or honored at whim, by whoever chooses), the introspective uses of dir()
must be served too.

> I'm +1 on adding a help() function which pays attention to
> __all__ and (optionally?) prints doc strings.

I can't be +1 on anything that vague -- although I'm +1 on each part of it
if done in exactly the way I envision <wink>.



From ping@lfw.org  Fri Jan 12 02:51:54 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 18:51:54 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>

On Fri, 12 Jan 2001, Martin v. Loewis wrote:
> > Did using pydoc like "man" work okay for you?
> 
> Yes, that is very impressive.

Good.  What platform did you try it on?

I have updated the scripts now to provide a very rudimentary HTTP server
feature:

    skuld[1316]% pydoc -p 8080
    starting server on port 8080

This starts a server on port 8080 that generates HTML documentation for
modules on the fly.  The root page (http://localhost:8080/) shows an
index of modules -- it badly needs some cleaning up, but at least it
provides access to all the documentation.

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py

Also, as you requested:

    skuld[1324]% pydoc -h
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.

    /home/ping/bin/pydoc -p <port>
        Start an HTTP server on the given port on the local machine.


More to come.


-- ?!ng



From fdrake@acm.org  Fri Jan 12 03:02:00 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 22:02:00 -0500 (EST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
References: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
 <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>

Ka-Ping Yee writes:
 > My next two targets are:
 >     1.  Generating text from the HTML documentation files
 >         using Paul Prescod's stuff in onlinehelp.py.

  You mean the ones I publish as the standard documentation?  Relying
on the structure of that HTML is pure folly!  I don't think I can make
any guarantees that the HTML structures won't change as the
processing evolves.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Fri Jan 12 03:49:47 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 22:49:47 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com>

[Guido]
> I don't want to call FLOCKFILE while holding the Python lock, as
> this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> from a pipe or socket), no other Python thread can run!

Ah, good point!  Doesn't appear an essential point, though:  the
HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
release before the (dynamically only) FLOCKFILE and the last thread grab
after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
since that's lacking I'll drop it.

> ...
> I don't think that _filbuf can possibly wait for another thread to
> write data to the same stream object.

OK, I'll buy that.  Dropped too.

> ...
> OK.  It's unique to MS.  So close the bug report with a "won't fix"
> resolution.  There's no point in having bug reports remain open that
> we know we can't fix.

We don't really have a policy about that.  Perhaps you're articulating one
here, though!  I've always left bugs open if they're (a) bugs, and (b) open
<wink>.  For example, I left the Norton Blue-Screen crash bug open (although
I see now you eventually closed that).  Ditto the "Rare hangs in
w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
Just other examples of things we'll almost certainly never fix ourselves (we
have no handle on them, and all evidence says the OS is screwing up).

My view has been that if a user comes to the bug site, it's most helpful for
them if active (== "still happens") crashes and hangs appear among the open
problems.  Now that your view of it is clearer, I'll switch to yours.

too-easy<wink>-ly y'rs  - tim







From tim.one@home.com  Fri Jan 12 04:22:40 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 11 Jan 2001 23:22:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECGIIAA.tim.one@home.com>

[Guido]
> The locking prevents concurrent threads accessing the stream.
>
> But mixing reads and writes (without intervening fseek etc.) is
> illegal use of the stream, and the C standard allows them to be lax
> here, even if the program was single-threaded.
>
> In other words: the locking is so good that it serializes the
> sequence of reads and writes; but if the sequence of reads and
> writes is illegal, they don't guarantee anything.

We're never going to agree on this one, you know.

My definition of "bug" here has nothing to do with the std:  something's "a
bug" if it's not functioning as designed.  That's all.  So if the
implementers would say "oops!  that should not have happened!", then to me
it's "a bug".  It so happens I believe the MS implementers would consider
this to be a bug under that defn.  Multi-threaded libraries have to be
written to a much higher level than the C std guarantees (been there, done
that, and so have you), and this is specifically corruption in a crucial
area vulnerable to races.  They have a timing hole!  That's clear.  If the
MS implementers don't believe that's "a bug", then I'd say they're too
unprofessional to be allowed in the same country as a multithreaded library
<0.1 wink>.

Your definition of "bug" seems to be more "I don't want it in Python's open
bug list, so I'll do what Tim usually does and appeal to the std in a
transparent effort to convince someone that it's not really 'a bug' -- then
maybe I'll get it off of Python's bug list".

I'm sure you'll agree that's a fair summary of both sides <wink>.

it's-a-bug-and-it's-no-longer-on-python's-open-bug-list-ly y'rs
    - tim



From tim.one@home.com  Fri Jan 12 06:54:47 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 12 Jan 2001 01:54:47 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECMIIAA.tim.one@home.com>

[Tim, on for_xreadlines vs readlines_sizehint, after disabling the
 default 1Mb buffer size in the latter]
> They're indistinguishable then on my box (on one run xreadlines
> is .1 seconds  (out of around 7.6 total) quicker, on another
> readlines_sizehint), *provided* that I specify the same buffer
> size (8192) that xreadlines uses internally.  However, if I even
> double that, readlines_sizehint is uniformly about 10% slower.  It's
> also a tiny bit slower if I cut the sizehint buffer size to 4096.

[Guido]
> 8192 happens to be the size of the stack-allocated buffer readlines()
> uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
> SMALLCHUNK in fileobject.c.
>
> Would it make sense to tie the two constants together more to tune
> this optimally even when BUFSIZ is different?

Have to repeat what I first said:

> I'm afraid Mysteries will remain no matter how many
> person-decades we spend staring at this <0.5 wink> ...

I'm repeating that because BUFSIZ is 4096 on WinTel, but SMALLCHUNK (8192)
worked best for me.  Now we're in some complex balancing act among how often
the outer loop needs to refill the readlines_sizehint buffer; how out of
whack the latter is with the platform stdio buffer; whether platform malloc
takes only twice as long to allocate space for 2*N strings as for N; and, if
the readlines buffer is too large, at exactly which point the known Win9x
eventually-quadratic-time behavior of PyList_Append starts to kick in.  I
can't out-think all that.  Indeed, I can't out-think any of it <frown>.

After staring at the code, I expect my "only a tiny bit slower" was an
illusion:  if 0 < sizehint <= SMALLCHUNK, sizehint appears to have no effect
on the operation of file_readline.

BTW, changing fileobject.c's SMALLCHUNK to a copy of BUFSIZ didn't make any
difference on Windows.
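
For anyone who hasn't seen the idiom being timed, the readlines_sizehint
variant is roughly the loop below.  This is a sketch in present-day Python 3,
not the benchmark code from this thread; the helper name lines_by_chunk and
the sample data are made up for illustration.

```python
import io


def lines_by_chunk(stream, sizehint=8192):
    """Yield lines from stream, refilling via readlines(sizehint) each time."""
    while True:
        # readlines() stops collecting lines once roughly sizehint
        # bytes/characters have been read, so memory use stays bounded.
        chunk = stream.readlines(sizehint)
        if not chunk:
            break
        for line in chunk:
            yield line


sample = io.StringIO("".join("line %d\n" % i for i in range(1000)))
count = sum(1 for _ in lines_by_chunk(sample, sizehint=64))
print(count)
```

The outer loop is the "refill" step mentioned above; how often it runs for a
given sizehint is exactly the tuning knob under discussion.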



From moshez@zadka.site.co.il  Fri Jan 12 16:03:58 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 18:03:58 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
In-Reply-To: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, Thomas Wouters <twouters@users.sourceforge.net> wrote:

> Noone but me cares, but Guido said to go ahead and fix it if it bothered me.

I think you meant no one. Noone is an archaic spelling of noon.

quid-pro-quo-ly y'rs, Z.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From fredrik@effbot.org  Fri Jan 12 08:17:11 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 12 Jan 2001 09:17:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net> <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>
Message-ID: <012a01c07c70$11aac700$e46940d5@hagrid>

> > Noone but me cares, but Guido said to go ahead and fix it if it bothered me.
> 
> I think you meant no one. Noone is an archaic spelling of noon.

no, he meant me.  I care.

</F>



From martin@loewis.home.cs.tu-berlin.de  Fri Jan 12 08:09:00 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 09:09:00 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
 (message from Ka-Ping Yee on Thu, 11 Jan 2001 18:51:54 -0800 (PST))
References: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120809.f0C890B00802@mira.informatik.hu-berlin.de>

> Good.  What platform did you try it on?

Linux, in a Konsole. I guess that is an environment you'd been using
as well :-)

Martin



From jack@oratrix.nl  Fri Jan 12 09:57:27 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 12 Jan 2001 10:57:27 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: Message by Ka-Ping Yee <ping@lfw.org> ,
 Thu, 11 Jan 2001 08:36:36 -0800 (PST) , <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl>

> I'm pleased to announce a reasonable first pass at a documentation
> utility for interactive use.  "pydoc" is usable in three ways:
[...]
> I would very much appreciate your feedback, especially from testing
> on non-Unix platforms.  Thank you!

Wow, I'm impressed!

To make it run on the mac I had to add tests for the existence of os.system 
only. (So all statements "if os.system(...) > 0:" got to be "if hasattr(os, 
"system") and os.system(...) > 0:").

There are however various other niceties that could be added to make it more 
useful, can this be put into the repository or something?
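
The os.system guard described above could be wrapped up once instead of being
repeated at every call site.  A minimal sketch, assuming a helper name
(command_failed) that does not appear in pydoc itself:

```python
import os


def command_failed(command):
    """True only when os.system exists and reports a non-zero status."""
    # On platforms without os.system the command is never run and
    # the result is simply False.
    return hasattr(os, "system") and os.system(command) > 0
```

Each "if os.system(...) > 0:" test would then become
"if command_failed(...):".
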
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 




From gstein@lyra.org  Fri Jan 12 10:31:53 2001
From: gstein@lyra.org (Greg Stein)
Date: Fri, 12 Jan 2001 02:31:53 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010111224157.A2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 11, 2001 at 10:41:57PM +0100
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com> <20010111224157.A2467@xs4all.nl>
Message-ID: <20010112023153.Q4640@lyra.org>

On Thu, Jan 11, 2001 at 10:41:57PM +0100, Thomas Wouters wrote:
> On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:
> 
> > I thought rules were pretty clear with reference counting - don't assume
> > _anything_ about the object unless you hold a reference (or are damn sure
> > someone else does!)
> 
> Moshe isn't breaking that rule. He isn't assuming anything about the object,
> just about the value of the pointer to that object. I agree, though, that
> it's bad practice to rely on it having the old value, after DECREFing it.

Oh, that is just so much baloney.

If I said Py_DECREF(&ptr), *then* I'd be worried. But if I ever call
Py_DECREF(foo) and it modifies foo, then I'd be quite upset. "functions"
just aren't supposed to do that.

-g

-- 
Greg Stein, http://www.lyra.org/


From guido@python.org  Fri Jan 12 13:51:51 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:51:51 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 17:26:33 EST."
 <20010111172633.A26249@kronos.cnri.reston.va.us>
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>
 <20010111172633.A26249@kronos.cnri.reston.va.us>
Message-ID: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>

> >Good work -- but I still can't run this inside a platform-specific
> >subdirectory.  Are you planning on supporting this?
> 
> I didn't really understand this when you pointed it out, but forgot to
> ask for clarification.  What does your directory layout look like?

Ah.  It's very simple.  I create a directory "linux" as a subdirectory
of the Python source tree (i.e. at the same level as Lib, Objects,
etc.).  Then I chdir into that directory, and I say "../configure".
The configure script creates subdirectories to hold the object files
for me: Grammar, Parser, Objects, Python, Modules, and sticks
Makefiles in them.  The "srcdir" variable in the Makefiles is set to
"..".  Then I say "make" and it builds Python.  The source directories
are used but no files are created or modified there: all files are
created in the "linux" directory.  This lets me have several separate
configurations: the feature used to be intended for sharing a source
tree between multiple platforms, but now I use it to have threaded,
nonthreaded, debugging, and regular builds under a single source tree.

This also works where the build directory is completely outside the
source tree (some people apparently mount the source tree read-only).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Jan 12 13:54:12 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:54:12 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 14:28:44 PST."
 <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101121354.IAA19700@cj20424-a.reston1.va.home.com>

> > Can you explain *why* you wanted to test for package-ness?
> 
> Auto-generating documentation.  pydoc.py currently tests for __path__,
> and looks for the presence of __init__.py in a subdirectory to mean
> that the subdirectory name is a package name.  Is it safe on all platforms
> to just list all .py files in the subdirectory to get all submodules?

Yes, that should work.  Of course there could also be extension
modules or .pyc-only files there -- you could use imp.get_suffixes()
to find out all modules (even if that means you don't always have the
source code available).
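
A sketch of that directory scan for present-day Python 3 follows.  It is an
illustration only: the imp module has since been removed (in 3.12), so
importlib.machinery.all_suffixes() is used in its place, and the function
name list_submodules is made up here.

```python
import importlib.machinery
import os


def list_submodules(package_dir):
    """Return sorted names of modules found directly inside package_dir."""
    # Longest suffix first, so "foo.cpython-XY-....so" yields "foo"
    # rather than being matched by the plain ".so" suffix.
    suffixes = sorted(importlib.machinery.all_suffixes(), key=len, reverse=True)
    names = set()
    for entry in os.listdir(package_dir):
        for suffix in suffixes:
            if entry.endswith(suffix):
                name = entry[: -len(suffix)]
                if name and name != "__init__":
                    names.add(name)
                break
    return sorted(names)
```

This picks up source files, bytecode-only files, and extension modules alike,
which is the point of asking the import machinery for its suffix list rather
than globbing for "*.py".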

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Fri Jan 12 14:07:30 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:07:30 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 22:49:47 EST."
 <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com>
Message-ID: <200101121407.JAA19781@cj20424-a.reston1.va.home.com>

> [Guido]
> > I don't want to call FLOCKFILE while holding the Python lock, as
> > this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> > from a pipe or socket), no other Python thread can run!

[Tim]
> Ah, good point!  Doesn't appear an essential point, though:  the
> HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
> FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
> release before the (dynamically only) FLOCKFILE and the last thread grab
> after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
> since that's lacking I'll drop it.

Yes, but if the line is very long, you'd have to use malloc() -- you
can't use _PyString_Resize() since that can access the thread state.
You're right that I don't want to do this.

> > OK.  It's unique to MS.  So close the bug report with a "won't fix"
> > resolution.  There's no point in having bug reports remain open that
> > we know we can't fix.
> 
> We don't really have a policy about that.  Perhaps you're articulating one
> here, though!  I've always left bugs open if they're (a) bugs, and (b) open
> <wink>.  For example, I left the Norton Blue-Screen crash bug open (although
> I see now you eventually closed that).  Ditto the "Rare hangs in
> w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
> Just other examples of things we'll almost certainly never fix ourselves (we
> have no handle on them, and all evidence says the OS is screwing up).

Yes, as I was thinking about this I realized that that was the policy
I wanted.  So, yes, the w9xpopen popen bug can be closed as WontFix too.

> My view has been that if a user comes to the bug site, it's most helpful for
> them if active (== "still happens") crashes and hangs appear among the open
> problems.  Now that your view of it is clearer, I'll switch to yours.

I find it more important that the bug list gives us developers an
overview of tasks to be tackled.  The problems that won't go away can
be listed in the Python 2.0 MoinMoin web!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Fri Jan 12 14:27:43 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:27:43 -0500
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:57:27 +0100."
 <20010112095727.C56D13BD8B0@snelboot.oratrix.nl>
References: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl>
Message-ID: <200101121427.JAA20034@cj20424-a.reston1.va.home.com>

> There are however various other niceties that could be added to make it more 
> useful, can this be put into the repository or something?

Ping, do you think you could check this into the nondist tree?
nondist/sandbox/help would seem a good name (next to Paul's
nondist/sandbox/doctools).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Fri Jan 12 16:37:57 2001
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 12 Jan 2001 10:37:57 -0600 (CST)
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
Message-ID: <14943.13029.103771.261362@beluga.mojam.com>

    Guido> Summary: Cygwin Check Import Case Patch
    ...
    Guido> But I believe the solution is that the TERMIOS module should be
    Guido> renamed.

Isn't this a general problem?  As I recall, the convention when generating
Python modules from C header files is to simply convert the base name to
upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:

    # Without filename arguments, acts as a filter.
    # If one or more filenames are given, output is written to corresponding
    # filenames in the local directory, translated to all uppercase, with
    # the extension replaced by ".py".

Perhaps the convention should be instead to append "d" or "data" to the base
name (errno.h -> errnodata.py).
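
To make the clash concrete, here are the two naming schemes side by side.
This is only a sketch: neither function is taken from h2py.py, and both names
are invented for the illustration.

```python
import os


def h2py_name(header):
    """Current convention: base name upper-cased, ".h" replaced by ".py"."""
    base = os.path.splitext(os.path.basename(header))[0]
    return base.upper() + ".py"


def suffixed_name(header, suffix="data"):
    """Suggested convention: keep the case, append a suffix to the base."""
    base = os.path.splitext(os.path.basename(header))[0]
    return base + suffix + ".py"


for header in ("errno.h", "termios.h"):
    print(header, "->", h2py_name(header), "or", suffixed_name(header))
```

On a case-insensitive filesystem TERMIOS.py and termios.py are the same file
name, while termiosdata.py collides with nothing.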

Skip


From guido@python.org  Fri Jan 12 17:47:46 2001
From: guido@python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 12:47:46 -0500
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:37:57 CST."
 <14943.13029.103771.261362@beluga.mojam.com>
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
 <14943.13029.103771.261362@beluga.mojam.com>
Message-ID: <200101121747.MAA27504@cj20424-a.reston1.va.home.com>

>     Guido> Summary: Cygwin Check Import Case Patch
>     ...
>     Guido> But I believe the solution is that the TERMIOS module should be
>     Guido> renamed.
> 
> Isn't this a general problem?  As I recall, the convention when generating
> Python modules from C header files is to simply convert the base name to
> upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:
> 
>     # Without filename arguments, acts as a filter.
>     # If one or more filenames are given, output is written to corresponding
>     # filenames in the local directory, translated to all uppercase, with
>     # the extension replaced by ".py".
> 
> Perhaps the convention should be instead to append "d" or "data" to the base
> name (errno.h -> errnodata.py).

An even better solution is to get rid of those generated headers and
incorporate the desired symbols directly in the C extension modules.
That's happened for errno and socket, for example; maybe it's time to
do that for termios, too!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Fri Jan 12 18:54:47 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 12 Jan 2001 13:54:47 -0500
Subject: [Python-Dev] Patch 103216 - dbmmodule Setup changes
Message-ID: <14943.21239.382891.661026@anthem.wooz.org>

I've just uploaded patch 103216 to the Python project at SF.  This
does a couple of things.  First, it auto-detects (in configure)
whether dbmmodule can be built, and if so whether the -lndbm library
needs to be specified.  Second, it moves the entry for dbmmodule to
Setup.conf, after the *shared* key so that it'll be built as a dynamic
library by default.

This should fix the problem where compiling in dbmmodule sets up a
dependency to libdb which later hoses pybsddb3.

I'd have just checked it in, but I'd like someone else to just proof
it first.  I've only tested this with the current CVS tree on a fairly
stock RH6.1.

BTW, I didn't include the changes to configure in the patch, because
it's large and made SF's patch manager cough.  Besides it can be
generated from configure.in and config.h.in which are included in the
patch.

Cheers,
-Barry



From martin@loewis.home.cs.tu-berlin.de  Fri Jan 12 22:19:57 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 23:19:57 +0100
Subject: [Python-Dev] PEP 205 comments
Message-ID: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de>

Before commenting on the patch itself, I'd like to comment on the
PEP describing it.

I'm missing a discussion as to why weak references don't act as
proxies (or why they do now). A weak proxy would provide the same
attributes as the object which it encapsulates, so it could be used
transparently in place of the original object. I can think of a number
of reasons why it is not done this way (e.g. complete transparency is
impossible to achieve); now that a revision of the patch provides
proxies, the documentation should state which features are forwarded
to the proxy and which aren't (it lists the type() as a difference,
but I doubt that is the only difference - repr is also different).

Next, I wonder whether weakref.new is allowed to return an existing
weak reference to the same object. If that is not acceptable, I'd like
to know why - if it was acceptable, then weakref.new(instance)
(i.e. without callback) could return the same weak reference all the
time. A smart implementation might choose to put the weak reference
with no callback in the start of the list, so creation of additional
weak references to the same object would be inexpensive.
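
For later readers: the sharing question can be tried out against the weakref
module as it eventually shipped.  The sketch below uses weakref.ref (the
released spelling, not the weakref.new of the patch under review), and whether
two callback-free references are the same object is an implementation detail
that CPython happens to exhibit, not something to rely on.

```python
import weakref


class Node:
    pass


node = Node()
first = weakref.ref(node)
second = weakref.ref(node)

# Both references resolve to the same live object.
print(first() is node, second() is node)

# Implementation detail: CPython may hand back one shared reference
# object when no callback is given.
print(first is second)

# A reference created with a callback is a distinct object.
with_callback = weakref.ref(node, lambda ref: None)
print(with_callback is first)
```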

Likewise, I'd like to know the rationale for the clear method. Why is
it desirable to drop the object, yet keep the weak reference? Isn't it
easier for the application to either ignore clearing altogether, or
dropping the reference to the weak reference? So I'd propose to kill
the clear method.

Again on proxies, there is no discussion or documentation of the
ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
AttributeError seem to be just as fine or better.
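
The proxy questions can likewise be poked at with the weakref module as it
exists in present-day Python 3.  This is a sketch for illustration, not the
behaviour of the patch under review; the Node class is made up, and the
gc.collect() call is there only for implementations without reference
counting.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name


node = Node("spam")
proxy = weakref.proxy(node)

print(proxy.name)            # attribute access is forwarded to the referent
print(type(proxy) is Node)   # the proxy reports its own type

del node
gc.collect()  # not needed on CPython, harmless elsewhere

try:
    proxy.name
except ReferenceError as exc:
    print("referent is gone:", type(exc).__name__)

# Where ReferenceError sits in the exception hierarchy today.
print(issubclass(ReferenceError, RuntimeError))
```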

On to the type type extensions: Should there be a type flag indicating
presence of tp_weaklistoffset? It appears that the type structure had
tp_xxx7 for a long time, so likely all in-use binary modules have
that field set to zero. Is that sufficient?

Thanks for reading all of this message,

Martin


From skip@mojam.com (Skip Montanaro)  Sat Jan 13 15:37:55 2001
From: skip@mojam.com (Skip Montanaro)
Date: Sat, 13 Jan 2001 09:37:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tempfile.py,1.23,1.24
In-Reply-To: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
References: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14944.30291.658931.489979@beluga.mojam.com>

    Tim> On Linux, someone please run that standalone with more files and/or
    Tim> more threads; e.g.,

    Tim>     python lib/test/test_threadedtempfile.py -f 1000 -t 10

    Tim> to run with 10 threads each creating (and deleting) 1000 temp files.

After capitalizing "Lib", it worked fine for me:

    % ./python Lib/test/test_threadedtempfile.py -f 1000 -t 10
    Creating
    Starting
    Reaping
    Done: errors 0 ok 10000
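
For anyone wanting to try something similar without the test suite at hand,
the idea is roughly the following.  This is a sketch in present-day Python 3
and not the contents of test_threadedtempfile.py; the use of
tempfile.mkstemp() and the names worker and run are assumptions made for the
illustration.

```python
import os
import tempfile
import threading


def worker(count, results, lock):
    """Create and delete count temp files, tallying successes and errors."""
    ok = errors = 0
    for _ in range(count):
        try:
            fd, path = tempfile.mkstemp()
            os.close(fd)
            os.remove(path)
            ok += 1
        except OSError:
            errors += 1
    with lock:
        results["ok"] += ok
        results["errors"] += errors


def run(num_threads=10, files_per_thread=100):
    """Run the workers concurrently and return the combined tallies."""
    results = {"ok": 0, "errors": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(files_per_thread, results, lock))
        for _ in range(num_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run())
```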

Skip


From dkwolfe@pacbell.net  Sat Jan 13 18:48:21 2001
From: dkwolfe@pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 10:48:21 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G740027Q6Q1KL@mta6.snfc21.pbi.net>

Howdy Folks,

I need some help here. I'd like to see Python build out of the box with a 
./configure, make, make test, and make install on Darwin and Mac OS X.  
Having it build out of the box will make it easier to be incorporated 
into both Darwin and the base Mac OS X distribution - although not for 
the initial release of the latter but definitely doable for subsequent 
releases. In order to do this, I need to have it build cleanly on HFS and 
UFS filesystems.

Under the HFS filesystem, I've got a name conflict due to case insensitivity 
between the build target and the "Python" directory that forces me to 
build with the --with-suffix configure option on HFS and manually change the 
name after install - which is an automatic knockout factor when it comes to 
incorporating it in an automatic build system. Not to mention a problem 
with unix newbies trying to build from source...

Last night, I did some quick investigation to determine the best way to 
fix this problem as documented in PEP-42 in the build section and 
Sourceforge bug 122215 and determined that the easiest and least error 
prone way was to change the directory name Python to PyCore.

It's apparent from the comments that I'm missing something here as the 
reaction has been negative so far - to the point where Guido has rejected 
the patch. Can someone explain what I'm missing that's causing such
strong feelings?

My second question is how do I resolve the name conflict in an approved 
way?  It's been suggested that a build directory be created (/src/build 
?) and that the target be placed there. The problem that I had with this
suggestion is that it would require an additional layer to execute the
target and I wasn't sure what impact it would have on running python
from a new directory... which is the reason I took the more known path. 
:-)

Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
Python be easily built "out of the box" on these machines - rather than come
with a haphazard list of instructions or commands as currently needed
for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
into the base Mac OS X installation...

- Dan Wolfe


From esr@thyrsus.com  Sat Jan 13 20:23:50 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 15:23:50 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
Message-ID: <20010113152350.A17338@thyrsus.com>

I have a new goodie for the 2.1 standard library, a module called
"simil" that supports computation of similarity indices between
strings such as one might use for recovery-matching of misspellings
against a dictionary.

The three methods supported are stemming, normalized Hamming
similarity, and (the star of the show) Ratcliff-Obershelp gestalt
subpattern matching.  The latter is spookily effective for detecting
not just substitution typos but insertions and deletions.  The module is
a C extension (my first!) for speed and because the Ratcliff-Obershelp
implementation uses pointer arithmetic heavily.

It's documented, tested, and ready to go.  But having written it, I
now have a question: why is soundex marked obsolete?  Is there
something wrong with the algorithm or implementation?  If not, then
it would be natural for simil to absorb the existing soundex 
implementation as a fourth entry point.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Whether the authorities be invaders or merely local tyrants, the
effect of such [gun control] laws is to place the individual at the 
mercy of the state, unable to resist.
        -- Robert Anson Heinlein, 1949

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Americans have the right and advantage of being armed - unlike the citizens
of other countries whose governments are afraid to trust the people with arms.
	-- James Madison, The Federalist Papers


From tim.one@home.com  Sat Jan 13 21:34:10 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 13 Jan 2001 16:34:10 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113152350.A17338@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>

[Eric S. Raymond]
> I have a new goodie for the 2.1 standard library, a module called
> "simil" that supports computation of similarity indices between
> strings such as one might use for recovery-matching of misspellings
> against a dictionary.

My guess is that Guido won't accept it.

> The three methods supported are stemming, normalized Hamming
> similarity, and (the star of the show) Ratcliff-Obershelp gestalt
> subpattern matching.  The latter is spookily effective for detecting
> not just substitution typos but insertions and deletions.  The module is
> a C extension (my first!) for speed and because the Ratcliff-Obershelp
> implementation uses pointer arithmetic heavily.

Never heard of R-O, so tracked down some C code via google.  It appears I
invented the same algorithm at Cray Research in the early 80's for a diff
generator, which later got reincarnated in my ndiff.py (in the
Tools/scripts/ directory).  ndiff generates "human-friendly" diffs between
text files, at both the "file is a sequence of lines" and "line is a
sequence of characters" levels.  I didn't have the hyperbolic marketing
genius to call it "gestalt subpattern matching", though <wink> -- I thought
of it as what Unix diff *would* do if it constrained itself to matching
*contiguous* subsequences, and under the theory people would find that more
natural because contiguity is something the human visual system naturally
latches on to.  ndiff can be spookily natural in practice too.
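
For readers who want to try the contiguous-matching idea, here is a
minimal sketch.  It assumes the difflib module, which entered the
standard library only after this thread (Python 2.1); its documentation
describes SequenceMatcher as similar to, and a little fancier than,
Ratcliff-Obershelp.  The helper name similarity is hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return 2*M/T in [0.0, 1.0], where M is the number of characters in
    matching contiguous blocks and T is the combined length of a and b."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # "bcd" is the longest contiguous match: 2 * 3 / (4 + 4)
    print(similarity("abcd", "bcde"))
```

The first argument to SequenceMatcher is an optional "junk" predicate;
passing None means no element is treated as junk.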

> It's documented, tested, and ready to go.  But having written it, I
> now have a question: why is soundex marked obsolete?  Is there
> something wrong with the algorithm or implementation?

What is the soundex algorithm?  Not joking.  Skip Montanaro and I were
unable to find the algorithm implemented by soundex.c anywhere in the
literature, and I never found *any* two definitions that were the same.
Even Knuth changed his description of Soundex between editions 2 and 3 of
volume 3.  Skip eventually merged my and Fred Drake's Python implementations
of Knuth Vol 3 Ed 3 Soundex (see the Vaults of Parnassus).
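
To make the disagreement concrete, here is a sketch of one commonly
described Soundex variant.  It is an illustration only: it is not the
text of Knuth Vol 3 Ed 3, not what soundex.c computed, and not the
merged implementation mentioned above.  The handling of "h" and "w" is
one of the points on which published descriptions differ; this sketch
assumes they do not separate letters with equal codes.

```python
# Letter groups that share a Soundex digit (vowels, h, w, y have none).
_CODES = {}
for digit, letters in (("1", "bfpv"), ("2", "cgjkqsxz"), ("3", "dt"),
                       ("4", "l"), ("5", "mn"), ("6", "r")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(name):
    """Return a 4-character code: first letter plus up to three digits."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # assumption: h and w do not break a run of equal codes
        code = _CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets the run, so equal codes may repeat
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Robert", "Rupert", "Ashcraft", "Tymczak"):
        print(name, soundex(name))
```

Dropping the "hw" test changes the code for a name such as "Ashcraft",
which is exactly the kind of variation that makes "the" algorithm hard
to pin down.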

> If not, then it would be natural for simil to absorb the existing
> soundex implementation as a fourth entry point.

Well, soundex.c doesn't match any other Soundex on earth, so it's not worth
reproducing in new code.  Guido doesn't want to be in the middle of fighting
over ill-defined algorithms, so booted Soundex entirely.  Another candidate
for inclusion is the NYSIIS algorithm, which is probably in more "serious"
use than Soundex anyway.  Same thing with NYSIIS, though (i.e., what--
exactly --is "the NYSIIS algorithm"?), except that Knuth didn't do us the
favor of making up his own variation that will *become* "the std" via force
of reputation.  Sean True implemented *a* NYSIIS in Python (and again see
the Vaults for a link to that).

So that's why the module is unlikely to make it into the core:

+ There are any number of algorithms people may want to see (I don't know
what "normalized Hamming similarity" means, but if it's not the same as
Levenshtein edit distance then add the latter to the pot too).

+ Each algorithm on its own is likely controversial.

+ Computing string similarity is something few apps need anyway.

Lots of hassle + little demand == not a natural for the core.  ndiff is in
the core only because many people found the *app* useful; its
SequenceMatcher class isn't even advertised.

may-never-understand-how-bigints-got-into-python<wink>-ly
    y'rs  - tim



From fdrake@acm.org  Sat Jan 13 21:45:12 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 13 Jan 2001 16:45:12 -0500 (EST)
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
References: <20010113152350.A17338@thyrsus.com>
 <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > + Computing string similarity is something few apps need anyway.

  And this is a biggie.

 > Lots of hassle + little demand == not a natural for the core.  ndiff is in

  But it *is* an excellent type of thing to have around -- Eric: just
post it on your Web site and register it with the Vaults.

 > the core only because many people found the *app* useful; its
 > SequenceMatcher class isn't even advertised.

  Did you ever write documentation for it?  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From nas@arctrix.com  Sat Jan 13 15:17:58 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sat, 13 Jan 2001 07:17:58 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 02:06:07PM -0800
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010113071758.C28643@glacier.fnational.com>

[Guido van Rossum on Demo/embed/loop]
> (Except it still leaks, but that's probably a separate issue.)

Could this be caused by modules adding things to their dict and
then forgetting to decref them?  I know I've been guilty of that.

  Neil


From esr@thyrsus.com  Sat Jan 13 22:15:28 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 17:15:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 04:34:10PM -0500
References: <20010113152350.A17338@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <20010113171528.A17480@thyrsus.com>

OK, now I understand why soundex isn't in the core -- there's no canonical 
version.

Tim Peters <tim.one@home.com>:
> + There are any number of algorithms people may want to see (I don't know
> what "normalized Hamming similarity" means, but if it's not the same as
> Levenshtein edit distance then add the latter to the pot too).

Normalized Hamming similarity: it's an inversion of Hamming distance
-- number of pairwise matches in two strings of the same length,
divided by the common string length.  Gives a measure in [0.0, 1.0].
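
That definition is short enough to sketch directly.  This is a plain
Python illustration, not the C code from simil; the function name and
the choice to treat two empty strings as identical are assumptions.

```python
def hamming_similarity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    if not a:
        return 1.0  # assumption: two empty strings count as identical
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


if __name__ == "__main__":
    print(hamming_similarity("karolin", "kathrin"))
```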

I've looked up "Levenshtein edit distance" and you're right.  I'll add it
as a fourth entry point as soon as I can find C source to crib.  (Would
you happen to have a pointer?)

> + Each algorithm on its own is likely controversial.

Not these.  There *are* canonical versions of all these, and exact
equivalents are all heavily used in commercial OCR software.

> + Computing string similarity is something few apps need anyway.

Tim, this isn't true.  Any time you need to validate user input
against a controlled vocabulary and give feedback on probable right
choices, R/O similarity is *very* useful.  I've had it in my personal
toolkit for a decade and used it heavily for this -- you take your
unknown input, check it against a dictionary and kick "maybe you meant
foo?" to the user for every foo with an R/O similarity above 0.6 or so.

The effects look like black magic.  Users love it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"I hold it, that a little rebellion, now and then, is a good thing, and as 
necessary in the political world as storms in the physical."
	-- Thomas Jefferson, Letter to James Madison, January 30, 1787


From guido@python.org  Sat Jan 13 22:25:12 2001
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:25:12 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: Your message of "Sat, 13 Jan 2001 07:17:58 PST."
 <20010113071758.C28643@glacier.fnational.com>
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>
 <20010113071758.C28643@glacier.fnational.com>
Message-ID: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>

> [Guido van Rossum on Demo/embed/loop]
> > (Except it still leaks, but that's probably a separate issue.)
> 
> Could this be caused by modules adding things to their dict and
> then forgetting to decref them?  I know I've been guilty of that.

Do you have a tool that detects leaks?  Barry has one: Insure++.  It's
expensive and we don't have a site license, so I'll ask Barry to
investigate this.

(Barry: go to Demo/embed and do "make looptest".  Then in another
shell window use "top" to watch the "loop" process grow slowly.  I'd
love to find out what's the problem here.  It's not dependent on what
you ask it to loop over; "./loop pass" also grows.  Of course it could
be one of the modules loaded during initialization...)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Sat Jan 13 22:33:34 2001
From: guido@python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:33:34 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: Your message of "Sat, 13 Jan 2001 10:48:21 PST."
 <0G740027Q6Q1KL@mta6.snfc21.pbi.net>
References: <0G740027Q6Q1KL@mta6.snfc21.pbi.net>
Message-ID: <200101132233.RAA03229@cj20424-a.reston1.va.home.com>

> Howdy Folks,
> 
> I need some help here. I'd like to see Python build out of the box with a 
> ./configure, make, make test, and make install on Darwin and Mac OS X.  
> Having it build out of the box will make it easier to be incorporated 
> into both Darwin and the base Mac OS X distribution - although not for 
> the initial release of the latter but definitely doable for subsequent 
> releases. In order to do this, I need to have it build cleanly on HFS and 
> UFS filesystems.
> 
> Under the HFS system, I've got a name conflict due to case insensitivity
> between the build target and the "Python" directory that forces me to 
> build with a -with-suffix command on HFS and manually change the name 
> after install - which is an automatic knockout factor when it comes to 
> incorporating it in an automatic build system. Not to mention a problem 
> with unix newbies trying to build from source...
> 
> Last night, I did some quick investigation to determine the best way to 
> fix this problem as documented in PEP-42 in the build section and 
> Sourceforge bug 122215 and determined that the easiest and least error 
> prone way was to change the directory name Python to PyCore.
> 
> It's apparent from the comments that I'm missing something here as the 
> reaction has been negative so far - to the point where Guido has rejected 
> the patch. Can someone explain what I'm missing that's causing such
> strong feelings?

We use CVS to manage the sources.  CVS makes it very hard to move a
directory; it doesn't have a command for this, so you have to do the
move directly in the repository, which will then break checkouts for
everyone who has a work directory linked to the CVS repository.  Using
SourceForge makes it a bit harder still: we have to ask the SF
sysadmins to do the move for us.

And if we did the move, it would be much harder to reproduce old
versions of the source tree with a single CVS command.  A way around
that would be to do a copy instead of a move, but that would cause the
directory "PyCore" to pop up in all old versions, too.

I just don't want to go through this hassle in order to make building
easier for one relatively little-used platform.

> My second question is how do I resolve the name conflict in an approved 
> way?  It's been suggested that a build directory be created (/src/build 
> ?) and that the target be placed there. The problem that I had with this
> suggestion is that it would require an additional layer to execute the
> target and I wasn't sure what impact it would have on running python
> from a new directory... which is the reason I took the more known path. 
> :-)

I don't understand what you are proposing here; I can't imagine that
an extra directory level could cause a slowdown.

A suggestion I would be open to: change the executable name during
build (currently a .exe suffix is added), but change it back (removing
the .exe suffix) during the install.  That should be a small change to
the Makefile.

> Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
> July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
> Python be easily built "out of the box" on these machines - rather than come
> with a haphazard list of instructions or commands as currently needed
> for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
> into the base Mac OS X installation...

Just get Apple to include Python with their standard distribution and
nobody will *have* to build Python on Mac OSX. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Sat Jan 13 23:59:44 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 13 Jan 2001 18:59:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113171528.A17480@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>

[Eric]
> OK, now I understand why soundex isn't in the core -- there's no
> canonical version.

Actually, I think Knuth Vol 3 Ed 3 is canonical *now* -- nobody would dare
to oppose him <0.5 wink>.

> Normalized Hamming similarity: it's an inversion of Hamming distance
> -- number of pairwise matches in two strings of the same length,
> divided by the common string length.  Gives a measure in [0.0, 1.0].
>
> I've looked up "Levenshtein edit distance" and you're right.  I'll add
> it as a fourth entry point as soon as I can find C source to crib.
> (Would you happen to have a pointer?)

If you throw almost everything out of Unix diff, that's what you'll be left
with.  Offhand I don't know of unencumbered, industrial-strength C source; a
problem is that writing a program to compute this is a std homework exercise
(it's a common first "dynamic programming" example), so you can find tons of
bad C source.

Caution:  many people want small variations of "edit distance", usually via
assigning different weights to insertions, replacements and deletions.  A
less common but still popular variant is to say that a transposition ("xy"
vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
is really a family of algorithms.
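
The "family" can be seen in a small dynamic-programming sketch.  This
is an illustration under stated assumptions, not anyone's production
code: the function name and parameter names are hypothetical, the full
cost matrix is held in memory, and the optional transposition step is
the restricted form (adjacent characters swapped once), which is only
one of several ways to define it.

```python
def edit_distance(a, b, insert=1, replace=1, delete=1, transpose=None):
    """Minimum total cost of turning string a into string b."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete      # delete every character of a[:i]
    for j in range(1, cols):
        cost[0][j] = j * insert      # insert every character of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            best = min(
                cost[i - 1][j] + delete,
                cost[i][j - 1] + insert,
                cost[i - 1][j - 1] + (0 if same else replace),
            )
            # Optional variant: swapping two adjacent characters.
            if (transpose is not None and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                best = min(best, cost[i - 2][j - 2] + transpose)
            cost[i][j] = best
    return cost[-1][-1]


if __name__ == "__main__":
    print(edit_distance("xy", "yx"))               # two replacements
    print(edit_distance("xy", "yx", transpose=1))  # one transposition
```

With all weights equal to 1 and no transposition this is the classic
Levenshtein distance; every other choice of arguments is a different
member of the family.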

>> + Each algorithm on its own is likely controversial.

> Not these.  There *are* canonical versions of all these,

See the "edit distance" gloss above.

> and exact equivalents are all heavily used in commercial OCR
> software.

God forbid that core Python may lose the commercial OCR developer market
<wink>.  It's not accepted that for every field F, core Python needs to
supply the algorithms F uses heavily.  Heck, core Python doesn't even ship
with an FFT!  Doesn't bother the folks working in signal processing.

>> + Computing string similarity is something few apps need anyway.

> Tim, this isn't true.  Any time you need to validate user input
> against a controlled vocabulary and give feedback on probable right
> choices,

Which is something few apps need anyway -- in my experience, but more so in
my *primary* role here of trying to channel for you (& Guido) what Guido
will say.  It should be clear that I've got some familiarity with these
schemes, so it should also be clear that Guido is likely to ask me about
them whenever they pop up.  But Guido has hardly ever asked me about them
over the past decade, with the exception of the short-lived Soundex
brouhaha.  From that I guess hardly anyone ever asks *him* about them, and
that's how channeling works:  if this were an area where Guido felt core
Python needed beefier libraries, I'm pretty sure I would have heard about it
by now.

But now Guido can speak for himself.  There's no conceivable argument that
could change what I *predict* he'll say.

> R/O similarity is *very* useful.  I've had it in my personal
> toolkit for a decade and used it heavily for this -- you take your
> unknown input, check it against a dictionary and kick "maybe you meant
> foo?" to the user for every foo with an R/O similarity above 0.6 or so.
>
> The effects look like black magic.  Users love it.

I believe that.  And I'd guess we all have things in our personal toolkits
our users love.  That isn't enough to get into the core, as I expect Guido
will belabor on the next iteration of this <wink>.

doesn't-mean-the-code-isn't-mondo-cool-ly y'rs  - tim



From dkwolfe@pacbell.net  Sun Jan 14 00:19:56 2001
From: dkwolfe@pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 16:19:56 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>

>CVS makes it very hard to move a directory...
>which will then break checkouts for everyone...

with the potential to cause development code to be lost

>Using SourceForge...have to ask the SF sysadmins

I understand... we also use CVS and periodically (usually pre alpha) 
reorganize the source... going thru SF sysadmin makes it doubly hard...
yuck!

However, since you have "released" tarball archives, it seems to me that 
the loss of the diffs and log notes is more troubling than the need to
create an old version.... at least that's been my experience when 
building software. ;-)

>I just don't want to go through this hassle in order to make building
>easier for one relatively little-used platform.

humph. Ok, I'll accept that for now as we've only sold 100,000 Beta 
copies of Mac OS X... but if we're not over 1 million users by this time
next year... I'll eat my words. ;-)

>> It's been suggested that a build directory be created (/src/build ?) 
>> and that the target be place here. 

>I don't understand what you are proposing here; I can't imagine that
>an extra directory level could cause a slowdown.

moshez suggested this in his comment on the patch - moving the target to 
a separate directory. I'm not sure of the implications of doing this
however, and wondered if it might affect the running of the regression
suite and the executable before it was installed.

>A suggestion I would be open to: change the executable name during
>build (currently a .exe suffix is added), but change it back (removing
>the .exe suffix) during the install.  That should be a small change to
>the Makefile.

You mean without using the -with-suffix command? That can probably be 
done... but based on my readings, I'd thought you'd reject it as not being
"clean" and complicating the build process more than it should - not to 
mention renaming the executable behind the builder's back...  Lesser of 
two evils I guess - I'll investigate this however...

>> I'd like to see Python be easily built on "out of the box"...
>> [and] incorporated into the base Mac OS X installation...
>
>Just get Apple to include Python with their standard distribution and
>nobody will *have* to build Python on Mac OSX. :-)

Easier said than done as they already have the other P language
installed. ;-) But then on the other hand, there are quite a few 
Pythonatic including me who use it in daily work at Apple. 

As I mentioned, the road to getting it in Mac OS X begins with getting it 
to build cleanly with the automated build system... so I've got to get 
this problem fixed before I start working on getting it in the build.

- Dan
  (yes, I work for Apple, but this is something that I'm doing on my own!)



From mwh21@cam.ac.uk  Sun Jan 14 00:41:35 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 14 Jan 2001 00:41:35 +0000
Subject: [Python-Dev] a readline replacement?
In-Reply-To: Michael Hudson's message of "17 Dec 2000 18:18:24 +0000"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <m3snmn3qyo.fsf_-_@atrus.jesus.cam.ac.uk>

Michael Hudson <mwh21@cam.ac.uk> writes:

> It wouldn't be particularly hard to rewrite editline in Python (we
> have termios & the terminal handling functions in curses - and even
> ioctl if we get really keen).
> 
> I've been hacking on my own Python line reader on and off for a while;
> it's still pretty buggy, but if you're feeling brave you could look at:
> 
> http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

As I secretly planned <wink>, the embarrassment of having code that
full of holes publicly accessible spurred me to writing a much better
version, to be found at:

  http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.2.0.tar.gz

(or, now rsync works there again, in the equivalent place on the
starship...).

If you unpack it and execute

$ python python_reader.py

you should get something that closely mimics the current interpreter
top level.  It supports a wide range of cursor motion commands,
built-in support for multiple line input and history (including
incremental search).  It doesn't do completion, basically because I
haven't got round to it yet, and it will get into severe trouble if
you enter an input that is taller than your terminal (I think this
should be surmountable, but I haven't gotten round to this either).
Another thing that I haven't gotten round to yet is documentation.
After I've tackled these points I'll probably stick it up on
parnassus.

I've been using it as my standard python shell for a week or so, and
quite like it, though the lack of completion is a drag.

It is probably staggeringly unportable, so I'd appreciate finding out
how it breaks on systems other than Linux with terminals other than
xterms...

Have the changes to enable use of editline been checked in yet?  I
worry that the licensing situation around the readline module is grey
at best...

Cheers,
M.

-- 
  That's why the smartest companies use Common Lisp, but lie about it
  so all their competitors think Lisp is slow and C++ is fast.  (This
  rumor has, however, gotten a little out of hand. :)
                                        -- Erik Naggum, comp.lang.lisp



From esr@thyrsus.com  Sun Jan 14 00:58:08 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 19:58:08 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 06:59:44PM -0500
References: <20010113171528.A17480@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>
Message-ID: <20010113195808.B17712@thyrsus.com>

Tim Peters <tim.one@home.com>:
> If you throw almost everything out of Unix diff, that's what you'll be left
> with.  Offhand I don't know of unencumbered, industrial-strength C source; a
> problem is that writing a program to compute this is a std homework exercise
> (it's a common first "dynamic programming" example), so you can find tons of
> bad C source.

I found some formal descriptions of the algorithm and some unencumbered 
Oberon source.  I'm coding up C now.  It's not complicated if you're willing 
to hold the cost matrix in memory, which is reasonable for a string comparator
in a way it wouldn't be for a file diff.
 
> Caution:  many people want small variations of "edit distance", usually via
> assigning different weights to insertions, replacements and deletions.  A
> less common but still popular variant is to say that a transposition ("xy"
> vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
> is really a family of algorithms.

Which about collapse into one if your function has three weight
arguments for insert/replace/delete weights, as mine does.  It don't
get more general than that -- I can see that by looking at the formal
description.  

OK, so I'll give you that I don't weight transpositions separately,
but neither does any other variant I found on the web nor the formal
descriptions.  A fourth optional weight argument someday, maybe :-).

> God forbid that core Python may lose the commercial OCR developer market
> <wink>.  It's not accepted that for every field F, core Python needs to
> supply the algorithms F uses heavily.

That's not my point -- I don't see OCR as a big Python market either.
My point in observing that OCR uses Ratcliff/Obershelp heavily was
simply to show that it's a well-established algorithm, not
`controversial'.

>                      Heck, core Python doesn't even ship
> with an FFT!  Doesn't bother the folks working in signal processing.

It probably won't surprise you that I considered writing an FFT extension
module at one point :-).  

> > Tim, this isn't true.  Any time you need to validate user input
> > against a controlled vocabulary and give feedback on probable right
> > choices,
> 
> Which is something few apps need anyway

I fundamentally disagree.  Few application designers *know* they need
it, but user interfaces would get a hell of a lot better if the
technique were more commonly applied -- and that's why I want it in
the Python library, so doing the right thing in Python will be a
minimum-effort proposition.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What if you were an idiot, and what if you were a member of Congress?
But I repeat myself.
        -- Mark Twain


From tim.one@home.com  Sun Jan 14 03:17:34 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 13 Jan 2001 22:17:34 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHNIIAA.tim.one@home.com>

[Fred]
>   Did you ever write documentation for it?  ;-)

A lot more than you did <wink>.

just-show-me-"write-docs"-in-my-job-description-ly y'rs  - tim



From tim.one@home.com  Sun Jan 14 04:39:59 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 13 Jan 2001 23:39:59 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113195808.B17712@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>

[Eric, on "edit distance"]
> I found some formal descriptions of the algorithm and some
> unencumbered Oberon source.  I'm coding up C now.  It's not
> complicated if you're willing to hold the cost matrix in memory,
> which is reasonable for a string comparator in a way it wouldn't
> be for a file diff.

All agreed, and it should be a straightforward task then.  I'm assuming it
will work with Unicode strings too <wink>.

[on differing weights]
> Which about collapse into one if your function has three weight
> arguments for insert/replace/delete weights, as mine does.  It don't
> get more general than that -- I can see that by looking at the formal
> description.
>
> OK, so I'll give you that I don't weight transpositions separately,
> but neither does any other variant I found on the web nor the formal
> descriptions.  A fourth optional weight argument someday, maybe :-).
> ...
> and that's why I want it in the Python library, so doing the right
> thing in Python will be a minimum-effort proposition.

Guido will depart from you at a different point.  I depart here:  it's not
"the right thing".  It's a bunch of hacks that appeal not because they solve
a problem, but because they're cute algorithms that are pretty easy to
implement and kinda solve part of a problem.   "The right thing"-- which you
can buy --at least involves capturing a large base of knowledge about
phonetics and spelling.  In high school, one of my buddies was Dan
Pryzbylski.  If anyone who knew him (other than me <wink>) were to type his
name into the class reunion guy's web page, they'd probably spell it the way
they remember him pronouncing it:  sha-bill-skey (and that's how he
pronounced "Dan" <wink>).  If that hit on the text string "Pryzbylski",
*then* it would be "the right thing" in a way that makes sense to real
people, not just to implementers.

Working six years in commercial speech recog really hammered that home to
me:  95% solutions are on the margin of unsellable, because an error one try
in 20 is intolerable for real people.  Developers writing for developers get
"whoa! cool!" where my sisters walk away going "what good is that?".  Edit
distance doesn't get within screaming range of 95% in real life.

Even for most developers, it would be better to package up the single best
approach you've got (f(list, word) -> list of possible matches sorted in
confidence order), instead of a module with 6 (or so) functions they don't
understand and a pile of equally mysterious knobs.  Then it may actually get
used!  Developers of the breed who would actually take the time to
understand what you've done are, I suggest, similar to us:  they'd skim the
docs, ignore the code, and write their own variations.  Or, IOW:

> so doing the right thing in Python will be a minimum-effort
> proposition.

Make someone think first, and 95% of developers will just skip over it too.

BTW, the theoretical literature ignored transposition at first, because it
didn't fit well in the machinery.  IIRC, I first read about it in an issue
of SP&E (Software Practice & Experience), where the authors were forced into
it because the "traditional" edit sequence measure sucked in their practice.
They were much happier after taking transposition into account.  The
theoreticians have more than caught up since, and research is still active;
e.g., 1997's

    PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
    DELETIONS AND GENERALIZED TRANSPOSITIONS
    B. J. Oommen and R. K. S. Loke
    http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

is a good read.  As they say there,

    If one views the elements of the confusion matrices as
    probabilities, this [treating each character independent
    of all others, as "edit distance" does] is equivalent to
    assuming that the transformation probabilities at each
    position in the string are statistically independent and
    possess first-order Markovian characteristics. This model
    is usually assumed for simplicity rather it [sic] having
    any statistical significance.

IOW, because it's easy to analyze, not because it solves a real problem --
and they're complaining about an earlier generalization of edit distance
that makes the weights depend on the individual symbols involved as well as
on the edit/delete/insert distinction (another variation trying to make this
approach genuinely useful in real life).  The Oommen-Loke algorithm appears
much more realistic, taking into account the observed probabilities of
mistyping specific letter pairs (although it still ignores phonetics), and
they report accuracies approaching 98% in correctly identifying mangled
words.
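
For readers who want to see what "taking transposition into account" means
in the simplest case, here is a minimal sketch. It is not the Oommen-Loke
algorithm and not anything from Eric's module; it is the textbook
dynamic-programming edit distance with one extra case for swapping two
adjacent characters (often called "optimal string alignment"). The function
name and the `transpositions` flag are made up for illustration.

```python
def edit_distance(a, b, transpositions=True):
    """Unit-cost edit distance between sequences a and b.

    Insert, delete and replace each cost 1.  When transpositions is true,
    swapping two adjacent characters also costs 1 instead of 2.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i                      # delete everything in a[:i]
    for j in range(cols):
        d[0][j] = j                      # insert everything in b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            best = min(d[i - 1][j] + 1,          # delete
                       d[i][j - 1] + 1,          # insert
                       d[i - 1][j - 1] + cost)   # replace or match
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                best = min(best, d[i - 2][j - 2] + 1)  # adjacent swap
            d[i][j] = best
    return d[-1][-1]


print(edit_distance("recieve", "receive"))                        # 1
print(edit_distance("recieve", "receive", transpositions=False))  # 2
```

Every symbol still gets the same weight here, which is exactly the
simplification the quoted passage complains about.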

98% (more than twice as good as 95% -- the error rate is actually more
useful to think about, 2% vs 5%) is truly useful for non-geek end users, and
the state of the art here is far beyond what's easy to find and dead easy to
implement.

> ...
> It probably won't surprise you that I considered writing an FFT
> extension module at one point :-).

Nope!  More power to you, Eric.  At least FFTs *are* state of the art,
although *coding* them optimally is likely beyond human ability on modern
machines:

    http://www.fftw.org/

(short course:  they've generally got the fastest FFTs available, and their
code is generated by program, systematically *trying* every trick in the
book, timing it on a given box, and synthesizing a complete strategy out of
the quickest pieces).

sooner-or-later-the-only-code-real-people-will-use-won't-be-written-
    by-people-at-all-ly y'rs  - tim



From tim.one@home.com  Sun Jan 14 05:38:52 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 00:38:52 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>

[Dan Wolfe]
> ...
> As I mentioned, the road to getting it in Mac OS X begins with
> getting it to build cleanly with the automated build system... so
> I've got to get  this problem fixed before I start working on
> getting it in the build.
>
> - Dan
>   (yes, I work for Apple, but this is something that I'm doing
>    on my own!)

Hang in there, Dan!  I did the first Python port to the KSR-1 on my own time
too, despite working for the visionless bastards at the time.  The rest is
history:  the glory, the fame, the riches, the groupies, the adulation of my
peers.  We won't mention the financial scandal and subsequent bankruptcy
lest it discourage you for no good reason <wink>.

BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
little ugly.  Better that than force hundreds of Python-builders to get
divorced from a decade-old directory naming scheme.



From esr@thyrsus.com  Sun Jan 14 07:08:57 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 02:08:57 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 11:39:59PM -0500
References: <20010113195808.B17712@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>
Message-ID: <20010114020857.E19782@thyrsus.com>

Tim Peters <tim.one@home.com>:
> All agreed, and it should be a straightforward task then.  I'm assuming it
> will work with Unicode strings too <wink>.

Thought about that.  Want to get it working for 8 bits first.
 
> Guido will depart from you at a different point.  I depart here:  it's not
> "the right thing".  It's a bunch of hacks that appeal not because they solve
> a problem, but because they're cute algorithms that are pretty easy to
> implement and kinda solve part of a problem.

Again, my experience says differently.  I have actually *used*
Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
I Mean) -- and had it work very well for non-geek users.  That's why I
want other Python programmers to have easy access to the capability.

> Working six years in commercial speech recog really hammered that home to
> me:  95% solutions are on the margin of unsellable, because an error one try
> in 20 is intolerable for real people.  Developers writing for developers get
> "whoa! cool!" where my sisters walk away going "what good is that?".  Edit
> distance doesn't get within screaming range of 95% in real life.

I suspect your speech recognition experience has given you an
unhelpful bias.  For English, what you say is certainly true -- but
that's a gross worst-case application of R/O and Levenshtein that I'm
not interested in pursuing.  Nor do I expect Python hackers to use
my module for that.

Where techniques like Ratcliff-Obershelp really shine (and what I
expect the module to be used for) is with controlled vocabularies such
as command interfaces.  These tend to have better orthogonality than
NL, so antinoise filtering by R/O or Levenshtein distance (a kindred
technique I somehow didn't learn until today -- there are
disadvantages to being an autodidact) can really go to town on them.

(Actually, my gut after thinking about both algorithms hard is that
R/O is still a better technique than Levenshtein for the kind of
application I have in mind.  But I also suspect the difference is
marginal.)

(Other good uses for algorithms in this class include cladistics and
genomic analysis.)

> Even for most developers, it would be better to package up the single best
> approach you've got (f(list, word) -> list of possible matches sorted in
> confidence order), instead of a module with 6 (or so) functions they don't
> understand and a pile of equally mysterious knobs.

That's why good documentation, with motivating usage hints, is important.
I write good documentation, Tim.

>     PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
>     DELETIONS AND GENERALIZED TRANSPOSITIONS
>     B. J. Oommen and R. K. S. Loke
>     http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

Thanks for the pointer; I've downloaded it and will read it.  If the 
description of Oommen's algorithm is good enough, I'll implement it and
add it to the module.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857


From dkwolfe@pacbell.net  Sun Jan 14 07:48:51 2001
From: dkwolfe@pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 23:48:51 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>
Message-ID: <0G75009ZD6UYYE@mta5.snfc21.pbi.net>

On Saturday, January 13, 2001, at 09:38 PM, Tim Peters wrote:

> [Dan Wolfe]
>> ...
>> As I mentioned, the road to getting it in Mac OS X begins with
>> getting it to build cleanly with the automated build system... so
>> I've got to get  this problem fixed before I start working on
>> getting it in the build.
>>
>> - Dan
>> (yes, I work for Apple, but this is something that I'm doing
>> on my own!)
>
> Hang in there, Dan!  I did the first Python port to the KSR-1 on my own 
> time
> too, despite working for the visionless bastards at the time.

Well, I won't go that far..... some of them are quite visionaries (I 
can't stop drooling over a Ti portable....).

> The rest is
> history:  the glory, the fame, the riches, the groupies, the adulation 
> of my
> peers.  We won't mention the financial scandal and subsequent bankruptcy
> lest it discourage you for no good reason <wink>.

You left out the part where they turn ya into a timbot... <wink><wink>

> BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
> little ugly.  Better that than force hundreds of Python-builders to get
> divorced from a decade-old directory naming scheme.

Well the mv Python to PyCore was the simplest... but obviously the most 
painful.... The longer ugly fix is working but it's such a hack that I'd 
rather not show it off...I need to fix it so that it allows nice things 
such as allowing the -with-suffix to be used...and then test all the 
edge cases such as clobber, etc so that I don't break anything. :-)

appreciating-your-note-after-attempting-to-understand-makefiles-on-Saturday-night'
ly yours,

- Dan












From tim.one@home.com  Sun Jan 14 10:45:53 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 05:45:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114020857.E19782@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>

[Tim]
>> ...It's a bunch of hacks that appeal not because they solve
>> a problem, but because they're cute algorithms that are pretty
>> easy to implement and kinda solve part of a problem.

[Eric]
> Again, my experience says differently.  I have actually *used*
> Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
> I Mean) -- and had it work very well for non-geek users.  That's why I
> want other Python programmers to have easy access to the capability.
> ...
> Where techniques like Ratcliff-Obershelp really shine (and what I
> expect the module to be used for) is with controlled vocabularies
> such as command interfaces.

Yet the narrower the domain, the less call for a library with multiple
approaches.  If R-O really shone for you, why bother with anything else?
Seriously.  You haven't used some (most?) of these.  The core isn't a place
for research modules either (note that I have no objection whatsoever to
writing any module you like -- the only question here is what belongs in the
core, and any algorithm *nobody* here has experience with in your target
domain is plainly a poor *core* candidate for that reason alone -- we have
to maintain, justify and explain it for years to come).

> I suspect your speech recognition experience has given you an
> unhelpful bias.

Try to think of it as a helpfully different perspective <0.5 wink>.  It's in
favor of measuring error rate by controlled experiments, skeptical of
intuition, and dismissive of anecdotal evidence.  I may well agree you don't
need all that heavy machinery if I had a clear definition of what problem it
is you're trying to solve (I've learned it's not the kinds of problems *I*
had in mind when I first read your description!).

BTW, telephone speech recog requires controlled vocabularies because phone
acoustics are too poor for the customary close-talking microphone approaches
to work well enough.  A std technique there is to build a "confusability
matrix" of the words *in* the vocabulary, to spot trouble before it happens:
if two words are acoustically confusable, it flags them and bounces that
info back to the vocabulary designer.  A similar approach should work well
in your domain:  if you get to define the cmd interface, run all the words
in it pairwise through your similarity measure of choice, and dream up new
words whenever a pair is "too close".  That all but ensures that even a
naive similarity algorithm will perform well (in telephone speech recog, the
unconstrained error rate is up to 70% on cell phones; by constraining the
vocabulary with the aid of confusability measures, we cut that to under 1%).
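
The pairwise check described above might look something like the sketch
below. It assumes a modern Python where SequenceMatcher lives in the standard
`difflib` module; the function name, the 0.6 threshold and the sample command
vocabulary are all invented for illustration, and any other similarity
measure could be dropped in.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(vocabulary, threshold=0.6):
    """Return (score, word1, word2) for each pair that is "too close".

    Every pair of distinct words is scored once; pairs scoring at or
    above threshold are returned, most confusable first.
    """
    flagged = []
    for w1, w2 in combinations(sorted(set(vocabulary)), 2):
        score = SequenceMatcher(None, w1.lower(), w2.lower()).ratio()
        if score >= threshold:
            flagged.append((score, w1, w2))
    flagged.sort(reverse=True)
    return flagged


# Hypothetical command vocabulary, checked before it ships.
commands = ["open", "close", "copy", "move", "remove", "rename"]
for score, w1, w2 in confusable_pairs(commands):
    print(f"{w1!r} vs {w2!r}: {score:.2f}")
```

Anything this prints is a candidate for renaming before users ever see it.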

> ...
> (Actually, my gut after thinking about both algorithms hard is that
> R/O is still a better technique than Levenshtein for the kind of
> application I have in mind.  But I also suspect the difference is
> marginal.)

So drop Levenshtein -- go with your best shot.  Do note that they both
(usually) consider a single transposition to be as much a mutation as two
replacements (or an insert plus a delete -- "pure" Levenshtein treats those
the same).

What happens when the user doesn't enter an exact match?  Does the kind of
app you have in mind then just present them with a list of choices?  If
that's all (as opposed to, e.g., substituting its best guess for what the
user actually typed and proceeding as if the user had given that from the
start), then the evidence from studies says users are almost as pleased when
the correct choice appears somewhere in the first three choices as when it
appears as *the* top choice.  A well-designed vocabulary can almost
guarantee that happy result (note that most of the current research is aimed
at the much harder job of getting the intended word into the #1 slot on the
choice list).

> (Other good uses for algorithms in this class include cladistics and
> genomic analysis.)

I believe you'll find current work in those fields has moved far beyond
these simplest algorithms too, although they remain inspirational (for
example, see
"Protein Sequence Alignment and Database Scanning" at

    http://barton.ebi.ac.uk/papers/rev93_1/rev93_1.html

Much as in typing, some mutations are more likely than others for *physical*
reasons, so treating all pairs of symbols in the alphabet alike is too gross
a simplification.).

>> Even for most developers, it would be better to package up the
>> single best approach you've got (f(list, word) -> list of possible
>> matches sorted in confidence order), instead of a module with 6
>> (or so) functions they don't understand and a pile of equally
>> mysterious knobs.

> That's why good documentation, with motivating usage hints, is
> important.  I write good documentation, Tim.

You're not going to find offense here even if you look for it, Eric <wink>:
while only a small percentage of developers don't read docs at all, everyone
else spaces out at least in linear proportion to the length of the docs.
Most people will be looking for "a solution", not for "a toolkit".  If the
docs read like a toolkit, it doesn't matter how good they are, the bulk of
the people you're trying to reach will pass on it.  If you really want this
to be *used*, supply one class that does *all* the work, including making
the expert-level choices of which algorithm is used under the covers and how
it's tuned.  That's good advice.

I still expect Guido won't want it in the core before wide use is a
demonstrated fact, though (and no, that's not a chicken-vs-egg thing:  "wide
use" for a thing outside the core is narrower than "wide use" for a thing in
the core).  An exception would likely get made if he tried it and liked it a
lot.  But to get it under his radar, it's again much easier if the usage
docs are no longer than a couple paragraphs.

I'll attach a tiny program that uses ndiff's SequenceMatcher to guess which
of the 147 std 2.0 top-level library modules a user may be thinking of (and
best I can tell, these are the same results case-folding R/O would yield):

Module name? random
Hmm.  My best guesses are random, whrandom, anydbm
(BTW, the first choice was an exact match)
Module name? disect
Hmm.  My best guesses are bisect, dis, UserDict
Module name? password
Hmm.  My best guesses are keyword, getpass, asyncore
Module name? chitchat
Hmm.  My best guesses are whichdb, stat, asynchat
Module name? xml
Hmm.  My best guesses are xmllib, mhlib, xdrlib

[So far so good]

Module name? http
Hmm.  My best guesses are httplib, tty, stat

[I was thinking of httplib, but note that it missed
 SimpleHTTPServer:  a name that long just isn't going to score
 high when the input is that short]

Module name? dictionary
Hmm.  My best guesses are Bastion, ConfigParser, tabnanny

[darn, I *think* I was thinking of UserDict there]

Module name? uuencode
Hmm.  My best guesses are code, codeop, codecs

[Missed uu]

Module name? parse
Hmm.  My best guesses are tzparse, urlparse, pre
Module name? browser
Hmm.  My best guesses are webbrowser, robotparser, user
Module name? brower
Hmm.  My best guesses are webbrowser, repr, reconvert
Module name? Thread
Hmm.  My best guesses are threading, whrandom, sched
Module name? pickle
Hmm.  My best guesses are pickle, profile, tempfile
(BTW, the first choice was an exact match)
Module name? shelf
Hmm.  My best guesses are shelve, shlex, sched
Module name? katmandu
Hmm.  My best guesses are commands, random, anydbm

[I really was thinking of "commands"!]

Module name? temporary
Hmm.  My best guesses are tzparse, tempfile, fpformat

So it gets what I was thinking of into the top 3 very often, and despite
some wildly poor guesses at the correct spelling -- you'd *almost* think it
was doing a keyword search, except the *unintended* choices on the list are
so often insane <wink>.

Something like that may be a nice addition to Paul/Ping's help facility
someday too.

Hard question:  is that "good enough" for what you want?  Checking against
147 things took no perceptible time, because SequenceMatcher is already
optimized for "compare one thing against N", doing preprocessing work on the
"one thing" that greatly speeds the N similarity computations (I suspect
you're not -- yet).  It's been tuned and tested in practice for years; it
works for any sequence type with hashable elements (so Unicode strings are
already covered); it works for long sequences too.  And if R-O is the best
trick we've got, I believe it already does it.  Do we need more?  Of course
*I'm* not convinced we even need *it* in the core, but packaging a
match-1-against-N class is just a few minutes' editing of what follows.

something-to-play-with-anyway-ly y'rs  - tim


NDIFFPATH = "/Python20/Tools/Scripts"
LIBPATH = "/Python20/Lib"

import sys, os

sys.path.append(NDIFFPATH)
from ndiff import SequenceMatcher

modules = {}  # map lowercase module stem to module name
for f in os.listdir(LIBPATH):
    if f.endswith(".py"):
        f = f[:-3]
        modules[f.lower()] = f

def match(fname, numchoices=3):
    lower = fname.lower()
    s = SequenceMatcher()
    s.set_seq2(lower)
    scores = []
    for lowermod, mod in modules.items():
        s.set_seq1(lowermod)
        scores.append((s.ratio(), mod))
    scores.sort()
    scores.reverse()
    return modules.has_key(lower), [x[1] for x in scores[:numchoices]]

while 1:
    name = raw_input("Module name? ")
    is_exact, choices = match(name)
    print "Hmm.  My best guesses are", ", ".join(choices)
    if is_exact:
        print "(BTW, the first choice was an exact match)"
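
The script above is Python 2.0 and imports SequenceMatcher from the ndiff
tool. As a hedged sketch only, the same `match` logic can be written for a
current Python, where SequenceMatcher is available from the standard `difflib`
module. The candidate list is passed in explicitly here rather than read from
a library directory, and the short module list is just sample data.

```python
from difflib import SequenceMatcher


def match(name, candidates, numchoices=3):
    """Return (is_exact, best guesses) for name against candidates.

    Comparison is case-insensitive; guesses keep their original spelling.
    """
    by_lower = {c.lower(): c for c in candidates}
    lower = name.lower()
    s = SequenceMatcher()
    s.set_seq2(lower)          # seq2 is analysed once and reused
    scores = []
    for lowermod, mod in by_lower.items():
        s.set_seq1(lowermod)
        scores.append((s.ratio(), mod))
    scores.sort(reverse=True)  # best score first
    return lower in by_lower, [mod for _, mod in scores[:numchoices]]


# Sample data only: a handful of names from the transcript above.
modules = ["bisect", "dis", "UserDict", "random", "pickle",
           "shelve", "webbrowser"]
is_exact, choices = match("disect", modules)
print("Hmm.  My best guesses are", ", ".join(choices))
if is_exact:
    print("(BTW, the first choice was an exact match)")
```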



From esr@thyrsus.com  Sun Jan 14 12:15:33 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 07:15:33 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 05:45:53AM -0500
References: <20010114020857.E19782@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>
Message-ID: <20010114071533.A5812@thyrsus.com>

Tim Peters <tim.one@home.com>:
> Yet the narrower the domain, the less call for a library with multiple
> approaches.  If R-O really shone for you, why bother with anything else?

Well, I was bothering with Levenshtein because *you* suggested it. :-)

I put in Hamming similarity and stemming because they're O(n) where
R/O is quadratic, and both widely used in situations where a fast sloppy
job is preferable to a good but slow one.  My documentation page is explicit
about the tradeoff.

> Seriously.  You haven't used some (most?) of these. 

I've used stemming and R-O.  Haven't used Hamming or Levenshtein.

>                                   The core isn't a place
> for research modules either (note that I have no objection whatsoever to
> writing any module you like -- the only question here is what belongs in the
> core, and any algorithm *nobody* here has experience with in your target
> domain is plainly a poor *core* candidate for that reason alone -- we have
> to maintain, justify and explain it for years to come).

Fair point.  I read it, in this context, as good advice to drop the Hamming 
entry point and forget about the Levenshtein implementation -- stick to what
I've used and know is useful as opposed to what I think might be useful.

>                                                I may well agree you don't
> need all that heavy machinery if I had a clear definition of what problem it
> is you're trying to solve (I've learned it's not the kinds of problems *I*
> had in mind when I first read your description!).

I think you have it by now, judging by the following...

> What happens when the user doesn't enter an exact match?  Does the kind of
> app you have in mind then just present them with a list of choices? 

Yes.  I've used this technique a lot.  It gives users not just guidance 
but warm fuzzy feelings -- they react as though there's a friendly 
homunculus inside the software looking out for them.  Actually, in my
experience, the less techie they are the more they like this.

> If that's all (as opposed to, e.g., substituting its best guess for what the
> user actually typed and proceeding as if the user had given that from the
> start), then the evidence from studies says users are almost as pleased when
> the correct choice appears somewhere in the first three choices as when it
> appears as *the* top choice.

Interesting.  That does fit what I've seen.

>                    A well-designed vocabulary can almost
> guarantee that happy result (note that most of the current research is aimed
> at the much harder job of getting the intended word into the #1 slot on the
> choice list).

Yes.  One of my other tricks is to design command vocabularies so the
first three characters are close to unique.  This means R/O will almost
always nail the right thing.

> Much as in typing, some mutations are more likely than others for *physical*
> reasons, so treating all pairs of symbols in the alphabet alike is too gross
> a simplification.).

Indeed.  Couple weeks ago I was a speaker at a conference called "After the
Genome 6" at which one of the most interesting papers was given by a lady
mathematician who designs algorithms for DNA sequence matching.  She made
exactly this point.

> > That's why good documentation, with motivating usage hints, is
> > important.  I write good documentation, Tim.
> 
> You're not going to find offense here even if you look for it, Eric <wink>:

No worries, I wasn't looking. :-)

> Most people will be looking for "a solution", not for "a toolkit".  If the
> docs read like a toolkit, it doesn't matter how good they are, the bulk of
> the people you're trying to reach will pass on it.  If you really want this
> to be *used*, supply one class that does *all* the work, including making
> the expert-level choices of which algorithm is used under the covers and how
> it's tuned.  That's good advice.

I don't think that's possible in this case -- the proper domains for
stemming and R-O are too different.  But maybe this is another nudge to drop
the Hamming code.

>       But to get it under his radar, it's again much easier if the usage
> docs are no longer than a couple paragraphs.

How's this?

\section{\module{simil} -- 
         String similarity metrics}

\declaremodule{standard}{simil}
\moduleauthor{Eric S. Raymond}{esr@thyrsus.com}
\modulesynopsis{String similarity metrics.}

\sectionauthor{Eric S. Raymond}

The \module{simil} module provides similarity functions for
approximate word or string matching.  One important application is for
checking input words against a dictionary to match possible
misspellings with the right terms in a controlled vocabulary.

The entry points provide different tradeoffs ranging from crude and
fast (stemming) to effective but slow (Ratcliff-Obershelp gestalt
subpattern matching).  The latter is one of the standard techniques
used in commercial OCR software.

The \module{simil} module defines the following functions:

\begin{funcdesc}{stem}{}
Returns the length of the longest common prefix of two strings divided
by the length of the longer.  Similarity scores range from 0.0 (no
common prefix) to 1.0 (identity).  Running time is linear in string
length.
\end{funcdesc}

\begin{funcdesc}{hamming}{}
Computes a normalized Hamming similarity between two strings of equal
length -- the number of pairwise matches in the strings, divided by
their common length.  It returns None if the strings are of unequal
length.  Similarity scores range from 0.0 (no positions equal) to 1.0
(identity).  Running time is linear in string length.
\end{funcdesc}

\begin{funcdesc}{ratcliff}{}
Returns a Ratcliff/Obershelp gestalt similarity score based on
co-occurrence of subpatterns.  Similarity scores range from 0.0 (no
common subpatterns) to 1.0 (identity).  Running time is best-case
linear, worst-case quadratic in string length.
\end{funcdesc}
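
To make the two linear-time entry points concrete, here is a pure-Python
sketch written only from the descriptions above. It is not the simil
module's actual code; the two-argument signatures and the handling of empty
strings are assumptions.

```python
def stem(s1, s2):
    """Longest common prefix length divided by the longer string's length."""
    longer = max(len(s1), len(s2))
    if longer == 0:
        return 1.0          # assumption: two empty strings are identical
    common = 0
    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            break
        common += 1
    return common / longer


def hamming(s1, s2):
    """Fraction of positions that agree, or None for unequal lengths."""
    if len(s1) != len(s2):
        return None
    if not s1:
        return 1.0          # assumption: two empty strings are identical
    matches = sum(c1 == c2 for c1, c2 in zip(s1, s2))
    return matches / len(s1)


print(stem("http", "httplib"))
print(hamming("shelf", "shlex"))
print(hamming("shelf", "shelve"))
```

Both functions make a single pass over the strings, which is the linear
running time the descriptions promise.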

> Module name? http
> Hmm.  My best guesses are httplib, tty, stat
> 
> [I was thinking of httplib, but note that it missed
>  SimpleHTTPServer:  a name that long just isn't going to score
>  high when the input is that short]

>>> simil.ratcliff("http", "httplib")
0.72727274894714355
>>> simil.ratcliff("http", "tty")
0.57142859697341919
>>> simil.ratcliff("http", "stat")
0.5
>>> simil.ratcliff("http", "simplehttpserver")
0.40000000596046448

So with the 0.6 threshold I normally use R-O does better at eliminating
the false matches but doesn't catch SimpleHTTPServer (case is, I'm
sure you'll agree, an irrelevant detail here).
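
A sketch of threshold filtering of this kind follows. Since simil is not
generally available, it uses `difflib.SequenceMatcher` from a modern Python
standard library as a stand-in for `simil.ratcliff`; that substitution, the
function name and the default threshold are assumptions, and scores from the
two implementations need not agree in every case.

```python
from difflib import SequenceMatcher


def filter_matches(word, candidates, threshold=0.6):
    """Return (score, candidate) pairs scoring at or above threshold.

    Comparison is case-insensitive; results are sorted best first.
    """
    word = word.lower()
    scored = [(SequenceMatcher(None, c.lower(), word).ratio(), c)
              for c in candidates]
    return sorted(((s, c) for s, c in scored if s >= threshold),
                  reverse=True)


candidates = ["httplib", "tty", "stat", "SimpleHTTPServer"]
for score, name in filter_matches("http", candidates):
    print(f"{name}: {score:.3f}")
```

The standard library's `difflib.get_close_matches` takes a similar `cutoff`
argument, although it compares case-sensitively.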
 
> Module name? dictionary
> Hmm.  My best guesses are Bastion, ConfigParser, tabnanny
> 
> [darn, I *think* I was thinking of UserDict there]

>>> simil.ratcliff("dictionary", "bastion")
0.47058823704719543
>>> simil.ratcliff("dictionary", "configparser")
0.45454546809196472
>>> simil.ratcliff("dictionary", "tabnanny")
0.4444444477558136
>>> simil.ratcliff("dictionary", "userdict")
0.4444444477558136

R-O would have booted all of these.  Highest score to configparser.
Interesting -- I'm beginning to think R-O overweights lots of small
subpattern matches relative to a few big ones, something I didn't notice
before because the statistics of my vocabularies masked it.

> Module name? uuencode
> Hmm.  My best guesses are code, codeop, codecs

>>> simil.ratcliff("uuencode", "code")
0.66666668653488159
>>> simil.ratcliff("uuencode", "codeops")
0.53333336114883423
>>> simil.ratcliff("uuencode", "codecs")
0.57142859697341919
>>> simil.ratcliff("uuencode", "uu")
0.40000000596046448

R-O would pick "code" and boot the rest.

> [Missed uu]
> 
> Module name? parse
> Hmm.  My best guesses are tzparse, urlparse, pre

>>> simil.ratcliff("parse", "tzparse")
0.83333331346511841
>>> simil.ratcliff("parse", "urlparse")
0.76923078298568726
>>> simil.ratcliff("parse", "pre")
0.75

Same result.

> Module name? browser
> Hmm.  My best guesses are webbrowser, robotparser, user

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

Big win for R-O.  Picks the right one, boots the wrong two.

> Module name? brower
> Hmm.  My best guesses are webbrowser, repr, reconvert

>>> simil.ratcliff("brower", "webbrowser")
0.75
>>> simil.ratcliff("brower", "repr")
0.60000002384185791
>>> simil.ratcliff("brower", "reconvert")
0.53333336114883423

Small win for R/O -- boots reconvert, and repr squeaks in under the wire.

> Module name? Thread
> Hmm.  My best guesses are threading, whrandom, sched

>>> simil.ratcliff("thread", "threading")
0.80000001192092896
>>> simil.ratcliff("thread", "whrandom")
0.57142859697341919
>>> simil.ratcliff("thread", "sched")
0.54545456171035767

Big win for R-O.

> Module name? pickle
> Hmm.  My best guesses are pickle, profile, tempfile

>>> simil.ratcliff("pickle", "pickle")
1.0
>>> simil.ratcliff("pickle", "profile")
0.61538463830947876
>>> simil.ratcliff("pickle", "tempfile")
0.57142859697341919

R-O wins again.

> (BTW, the first choice was an exact match)
> Module name? shelf
> Hmm.  My best guesses are shelve, shlex, sched

>>> simil.ratcliff("shelf", "shelve")
0.72727274894714355
>>> simil.ratcliff("shelf", "shlex")
0.60000002384185791
>>> simil.ratcliff("shelf", "sched")
0.60000002384185791

Interesting.  Shelve scores highest, both the others squeak in.

> Module name? katmandu
> Hmm.  My best guesses are commands, random, anydbm
>
> [I really was thinking of "commands"!]

>>> simil.ratcliff("commands", "commands")
1.0
>>> simil.ratcliff("commands", "random")
0.4285714328289032
>>> simil.ratcliff("commands", "anydbm")
0.4285714328289032

R-O wins big.
 
> Module name? temporary
> Hmm.  My best guesses are tzparse, tempfile, fpformat

>>> simil.ratcliff("temporary", "tzparse")
0.5
>>> simil.ratcliff("temporary", "tempfile")
0.47058823704719543
>>> simil.ratcliff("temporary", "fpformat")
0.47058823704719543

R-O boots all of these.  

> Hard question:  is that "good enough" for what you want?

Um...notice that R-O filtering, even though it seems to be
underweighting large matches, did a rather better job on your examples!
With an 0.66 threshold it would have done *much* better.
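
A minimal sketch of that kind of threshold filtering, assuming
difflib.SequenceMatcher.ratio() as a stand-in for simil.ratcliff (the two
are reported elsewhere in this thread to give the same scores).  The helper
name best_guesses and its defaults are illustrative, not part of either
module:

```python
from difflib import SequenceMatcher

def best_guesses(word, candidates, cutoff=0.66, limit=3):
    """Return up to `limit` candidates scoring at least `cutoff`, best first."""
    scored = [(SequenceMatcher(None, word, c).ratio(), c) for c in candidates]
    kept = [(score, c) for score, c in scored if score >= cutoff]
    kept.sort(reverse=True)
    return [c for score, c in kept[:limit]]

# Candidate lists taken from the transcripts above.
print(best_guesses("brower", ["webbrowser", "repr", "reconvert"]))
print(best_guesses("temporary", ["tzparse", "tempfile", "fpformat"]))
```

With the cutoff at 0.66 only webbrowser survives for "brower", and nothing
survives for "temporary".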

I think you've just made an argument for replacing your SequenceMatcher
with simil.ratcliff.  Mine's even documented. :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Militias, when properly formed, are in fact the people themselves and
include all men capable of bearing arms. [...] To preserve liberty it is
essential that the whole body of the people always possess arms and be
taught alike, especially when young, how to use them.
        -- Senator Richard Henry Lee, 1788, on "militia" in the 2nd Amendment


From ping@lfw.org  Sun Jan 14 12:38:42 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 04:38:42 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>

Sorry i'm being forgetful -- could someone please refresh my memory:

Was there a good reason for allowing both lowercase and capital 'r'
as a prefix for raw-strings?  I assume that the availability of both
r'' and R'' is what led to having both u'' and U''.  Is there any
good reason for that either?

This just seems to lead to ambiguity and unneeded complexity:
more cases in tokenize.py, more cases in tokenize.c, more work
for IDLE, more annoying when searching for u' in your editor.
(I was about to fix the lack of u'' support in tokenize.py and
that made me think about this.)

What happened to TOOWTDI?

Would you believe we now have 36 different ways of starting a string:

    '      "      '''    """
    r'     r"     r'''   r"""
    u'     u"     u'''   u"""
    ur'    ur"    ur'''  ur"""
    R'     R"     R'''   R"""
    U'     U"     U'''   U"""
    uR'    uR"    uR'''  uR"""
    Ur'    Ur"    Ur'''  Ur"""
    UR'    UR"    UR'''  UR"""

Would it be outrageous to suggest deprecating the last five rows?
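
The count is simply nine prefix rows (including the empty prefix) times four
quote styles.  A small sketch that enumerates them; the list names are
illustrative only:

```python
from itertools import product

quotes = ["'", '"', "'''", '"""']
# The empty prefix, followed by the eight prefixes from the table above.
prefixes = ["", "r", "u", "ur", "R", "U", "uR", "Ur", "UR"]

openers = [prefix + quote for prefix, quote in product(prefixes, quotes)]
print(len(openers))   # 9 rows * 4 quote styles = 36
```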


-- ?!ng

[1] We started with 4.  Perl has (by my count) 381 ways of starting
    a string literal, so we're halfway there, logarithmically speaking.
    Perl has 757 if you count the fancier operators qx, qw, s, and tr.



From mal@lemburg.com  Sun Jan 14 13:33:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:33:29 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <3A61AAA9.F6F1EA9F@lemburg.com>

[Lots of talk about interesting algorithms for "human" pattern matching]

I just want to add my 2 cents to the discussion:

* Eric's package seems very useful for pattern matching, but that
  is a very specific domain -- not mainstream

* I would opt to create a neat distutils style package for it
  for people to install at their own liking (I would certainly
  like it :)

* If wrapped up as a separate package, I'd suggest adding all
  known algorithms to the package and also making it Unicode
  aware. There are similar packages for e.g. RNGs on Parnassus.

BTW, are there less English centric "sounds alike" matchers
around ? The NIST soundex algorithm as published on the internet:

    http://physics.nist.gov/cuu/Reference/soundex.html

works fine for English texts, but other languages of course
have different letter coding requirements (or even different
alphabets).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Sun Jan 14 13:53:03 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:53:03 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A61AF3F.EE6DAB88@lemburg.com>

Ka-Ping Yee wrote:
> 
> Sorry i'm being forgetful -- could someone please refresh my memory:
> 
> Was there a good reason for allowing both lowercase and capital 'r'
> as a prefix for raw-strings?  I assume that the availability of both
> r'' and R'' is what led to having both u'' and U''. 

Right.

> Is there any
> good reason for that either?

No idea... I have never used anything other than the lowercase
versions.
 
> This just seems to lead to ambiguity and unneeded complexity:
> more cases in tokenize.py, more cases in tokenize.c, more work
> for IDLE, more annoying when searching for u' in your editor.
> (I was about to fix the lack of u'' support in tokenize.py and
> that made me think about this.)
> 
> What happened to TOOWTDI?
> 
> Would you believe we now have 36 different ways of starting a string:
> 
>     '      "      '''    """
>     r'     r"     r'''   r"""
>     u'     u"     u'''   u"""
>     ur'    ur"    ur'''  ur"""
>     R'     R"     R'''   R"""
>     U'     U"     U'''   U"""
>     uR'    uR"    uR'''  uR"""
>     Ur'    Ur"    Ur'''  Ur"""
>     UR'    UR"    UR'''  UR"""
>
> Would it be outrageous to suggest deprecating the last five rows?

No. + 1 on the idea.
 
> -- ?!ng
> 
> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev@python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Sun Jan 14 14:24:08 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 14 Jan 2001 15:24:08 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Sun, Jan 14, 2001 at 04:38:42AM -0800
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010114152408.G1005@xs4all.nl>

On Sun, Jan 14, 2001 at 04:38:42AM -0800, Ka-Ping Yee wrote:

> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.

Don't forget 'qr//', which is quite like a raw string, except that Perl uses
it to 'precompile' regular expressions as a side effect. 

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Sun Jan 14 17:08:28 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 12:08:28 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:53:03 +0100."
 <3A61AF3F.EE6DAB88@lemburg.com>
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
 <3A61AF3F.EE6DAB88@lemburg.com>
Message-ID: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > Sorry i'm being forgetful -- could someone please refresh my memory:
> > 
> > Was there a good reason for allowing both lowercase and capital 'r'
> > as a prefix for raw-strings?  I assume that the availability of both
> > r'' and R'' is what led to having both u'' and U''. 
> 
> Right.
> 
> > Is there any
> > good reason for that either?
> 
> No idea... I have never used anything other than the lowercase
> versions.

It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
well as 0l.  So does Python (and also 0j == 0J).

> > This just seems to lead to ambiguity and unneeded complexity:
> > more cases in tokenize.py, more cases in tokenize.c, more work
> > for IDLE, more annoying when searching for u' in your editor.
> > (I was about to fix the lack of u'' support in tokenize.py and
> > that made me think about this.)
> > 
> > What happened to TOOWTDI?
> > 
> > Would you believe we now have 36 different ways of starting a string:
> > 
> >     '      "      '''    """
> >     r'     r"     r'''   r"""
> >     u'     u"     u'''   u"""
> >     ur'    ur"    ur'''  ur"""
> >     R'     R"     R'''   R"""
> >     U'     U"     U'''   U"""
> >     uR'    uR"    uR'''  uR"""
> >     Ur'    Ur"    Ur'''  Ur"""
> >     UR'    UR"    UR'''  UR"""
> >
> > Would it be outrageous to suggest deprecating the last five rows?
> 
> No. + 1 on the idea.

Why bother?  All that does is outdate a bunch of documentation.  I
don't see the extra effort in various parsers as a big deal.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Sun Jan 14 17:53:32 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 18:53:32 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
Message-ID: <010f01c07e52$e9801fc0$e46940d5@hagrid>

The name database portions of SF task 17335 ("add
compressed unicode database") were postponed to
2.1.

My current patch replaces the ~450k large ucnhash
module with a new ~160k large module.  (See earlier
posts for more info on how the new database works).

Should I check it in?

</F>



From skip@mojam.com (Skip Montanaro)  Sun Jan 14 17:51:52 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sun, 14 Jan 2001 11:51:52 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
Message-ID: <14945.59192.400783.403810@beluga.mojam.com>

Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
standard distribution.

Biggest hook for me:

   1. execute "pydoc -p 3200"
   2. visit "http://localhost:3200/"
   3. knock yourself out

Skip


From martin@mira.cs.tu-berlin.de  Sun Jan 14 17:57:57 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 14 Jan 2001 18:57:57 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <200101141757.f0EHvvt01407@mira.informatik.hu-berlin.de>

> > Would it be outrageous to suggest deprecating the last five rows?
> Why bother?  All that does is outdate a bunch of documentation.

He suggested to deprecate it, not to remove it. By the time it is
removed, the documentation still mentioning it should be outdated for
other reasons (e.g. the string module might have disappeared).

In general, the rationale for deprecating things would be that the
simplification will make everybody's life easier in the long run. In
the case of a small change (such as this one), that advantage would be
small. OTOH, the hassle for users that rely on the then-removed
feature will also be small; I see it as quite unlikely that anybody
uses that feature actively (although I do think that people use 0X10
and 100L; the latter is common since 100l is oft confused with 1001).

Regards,
Martin


From tim.one@home.com  Sun Jan 14 19:00:21 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:00:21 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114071533.A5812@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>

Very quick (swamped):

> I think you've just made an argument for replacing your
> SequenceMatcher with simil.ratcliff.

Actually, I'm certain they're the same algorithm now, except the C is
showing through in ratcliff to the floating-point eye <wink>.  For
demonstration, I *always* printed the top three scorers (that's logic in the
little driver I posted, not in SequenceMatcher), without any notion of
cutoff (ndiff does use a cutoff).  Add this line before the return (in the
posted driver) to see the actual scores:

    print scores[:numchoices]

For example:

Module name? browser
[(0.82352941176470584, 'webbrowser'),
 (0.55555555555555558, 'robotparser'),
 (0.54545454545454541, 'user')]
Hmm.  My best guesses are webbrowser, robotparser, user
Module name?

On this example you reported:

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

which strongly suggests you're using C floats instead of Python floats to
compute the final score.  I didn't try every example in your email, but it's
the same story on the three I did try (scores identical modulo
simil.ratcliff dropping about 30 of the low-order result bits -- which is
about the difference between a C double and a C float on most boxes).
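
The effect is easy to reproduce from Python.  This is only a sketch: the
struct module is used here to force a value through a C float, and
as_c_float is a made-up helper name, not something in simil or difflib:

```python
import struct
from difflib import SequenceMatcher

def as_c_float(x):
    """Round a Python float (a C double) to the nearest C float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

score = SequenceMatcher(None, "browser", "webbrowser").ratio()
print(repr(score))               # score computed and kept in double precision
print(repr(as_c_float(score)))   # same score after a trip through a C float
```

The two printed values agree in the leading digits and differ in the
low-order ones, which is the pattern seen in the scores quoted above.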

> Mine's even documented. :-).

Which I appreciate!  I dreamt up the SequenceMatcher algorithm going on 20
years ago for a friendly diff generator, and never even considered using it
for other purposes.  But then I may have mentioned that these other purposes
never come up in my apps <wink>.

or-at-least-they-haven't-in-contexts-where-r/o-would-have-been-
    strong-enough-ly y'rs  - tim



From bckfnn@worldonline.dk  Sun Jan 14 19:00:33 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 19:00:33 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3a61f12a.36601630@smtp.worldonline.dk>

On Sun, 14 Jan 2001 18:53:32 +0100, you wrote:

>The name database portions of SF task 17335 ("add
>compressed unicode database") were postponed to
>2.1.
>
>My current patch replaces the ~450k large ucnhash
>module with a new ~160k large module.  (See earlier
>posts for more info on how the new database works).

Do you have a link or an approx date of these earlier posts? I must have
missed it. The patch on sourceforge seems a bit empty:

https://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100899&group_id=5470

As a result I invented my own compression format for the ucnhash for
jython. I managed to achieve ~100k but that probably has different
performance properties.

regards,
finn


From esr@thyrsus.com  Sun Jan 14 19:09:01 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 14:09:01 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:00:21PM -0500
References: <20010114071533.A5812@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>
Message-ID: <20010114140901.A6431@thyrsus.com>

Tim Peters <tim.one@home.com>:
> > I think you've just made an argument for replacing your
> > SequenceMatcher with simil.ratcliff.
> 
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

Take a look:

/*****************************************************************************
 *
 * Ratcliff-Obershelp common-subpattern similarity.
 *
 * This code first appeared in a letter to the editor in Doctor
 * Dobbs's Journal, 11/1988.  The original article on the algorithm,
 * "Pattern Matching by Gestalt" by John Ratcliff, had appeared in the
 * July 1988 issue (#181) but the algorithm was presented in assembly.
 * The main drawback of the Ratcliff-Obershelp algorithm is the cost
 * of the pairwise comparisons.  It is significantly more expensive
 * than stemming, Hamming distance, soundex, and the like.
 *
 * Running time quadratic in the data size, memory usage constant.
 *
 *****************************************************************************/

static int RatcliffObershelp(char *st1, char *end1, char *st2, char *end2)
{
    register char *a1, *a2;
    char *b1, *b2; 
    char *s1 = st1, *s2 = st2;	/* initializations are just to pacify GCC */
    short max, i;

    if (end1 <= st1 || end2 <= st2)
	return(0);
    if (end1 == st1 + 1 && end2 == st2 + 1)
	return(0);
		
    max = 0;
    b1 = end1; b2 = end2;
	
    for (a1 = st1; a1 < b1; a1++)
    {
	for (a2 = st2; a2 < b2; a2++)
	{
	    if (*a1 == *a2)
	    {
		/* determine length of common substring */
		for (i = 1; a1[i] && (a1[i] == a2[i]); i++) 
		    continue;
		if (i > max)
		{
		    max = i; s1 = a1; s2 = a2;
		    b1 = end1 - max; b2 = end2 - max;
		}
	    }
	}
    }
    if (!max)
	return(0);
    max += RatcliffObershelp(s1 + max, end1, s2 + max, end2);	/* rhs */
    max += RatcliffObershelp(st1, s1, st2, s2);			/* lhs */
    return max;
}

static float ratcliff(char *s1, char *s2)
/* compute Ratcliff-Obershelp similarity of two strings */
{
    short l1, l2;

    l1 = strlen(s1);
    l2 = strlen(s2);
	
    /* exact match end-case */
    if (l1 == 1 && l2 == 1 && *s1 == *s2)
	return(1.0);
			
    return 2.0 * RatcliffObershelp(s1, s1 + l1, s2, s2 + l2) / (l1 + l2);
}

static PyObject *
simil_ratcliff(PyObject *self, PyObject *args)
{
    char *str1, *str2;
    
    if(!PyArg_ParseTuple(args, "ss:ratcliff", &str1, &str2))
        return NULL;

    return Py_BuildValue("f", ratcliff(str1, str2));
}
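
For readers who prefer Python, a rough sketch of the same recursion follows.
It is not a line-for-line port of the C above: it bounds every match at the
end of the current slice, it drops the C code's single-character special
cases, and the function names are illustrative only.

```python
def matched_chars(a, b):
    """Count the characters matched by the Ratcliff-Obershelp recursion."""
    if not a or not b:
        return 0
    best, best_i, best_j = 0, 0, 0
    # Find the longest common substring (first one found wins ties).
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best:
                best, best_i, best_j = k, i, j
    if best == 0:
        return 0
    # Recurse on the pieces to the left and to the right of the match.
    left = matched_chars(a[:best_i], b[:best_j])
    right = matched_chars(a[best_i + best:], b[best_j + best:])
    return best + left + right

def ratcliff(a, b):
    """Similarity in [0, 1]: twice the matched characters over total length."""
    if not a and not b:
        return 1.0
    return 2.0 * matched_chars(a, b) / (len(a) + len(b))

print(ratcliff("brower", "webbrowser"))   # 0.75
```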
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Taking my gun away because I might shoot someone is like cutting my tongue
out because I might yell `Fire!' in a crowded theater."
        -- Peter Venetoklis


From fredrik@effbot.org  Sun Jan 14 19:31:06 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 20:31:06 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk>
Message-ID: <040e01c07e60$8c74d100$e46940d5@hagrid>

finn wrote:
> As a result I invented my own compression format for the ucnhash for
> jython. I managed to achive ~100k but that probably have different
> performance properties.

here's the description:

---

From: "Fredrik Lundh" <effbot@telia.com>
Date: Sun, 16 Jul 2000 20:40:46 +0200

/.../

    The unicodenames database consists of two parts: a name
    database which maps character codes to names, and a code
    database, mapping names to codes.

* The Name Database (getname)

    First, the 10538 text strings are split into 42193 words,
    and combined into a 4949-word lexicon (a 29k array).

    Each word is given a unique index number (common words get
    lower numbers), and there's a "lexicon offset" table mapping
    from numbers to words (10k).

    To get back to the original text strings, I use a "phrase
    book".  For each original string, the phrase book stores a
    list of word numbers.  Numbers 0-127 are stored in one byte,
    higher numbers (less common words) use two bytes.  At this
    time, about 65% of the words can be represented by a single
    byte.  The result is a 56k array.

    The final data structure is an offset table, which maps code
    points to phrase book offsets.  Instead of using one big
    table, I split each code point into a "page number" and a
    "line number" on that page.

      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]

    Since the unicode space is sparsely populated, it's possible
    to split the code so that lots of pages get no contents.  I
    use a brute force search to find the optimal SHIFT value.

    In the current database, the page table has 1024 entries
    (SHIFT is 6), and there are 199 unique pages in the line
    table.  The total size of the offset table is 26k.
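
The page/line split can be sketched as follows.  This is an illustration
under assumptions, not the actual generator: offset 0 is taken to mean "no
name", the code space is taken to be 16 bits, sizes are counted in entries
rather than bytes, and the function names are made up.

```python
def build_offset_table(mapping, shift, code_space=0x10000):
    """Split {code: offset} into a page table and a table of unique pages."""
    page_size = 1 << shift
    page = []    # page number -> index of that page's contents in `line`
    line = []    # the unique pages, stored back to back
    seen = {}    # page contents -> index, so identical pages are shared
    for start in range(0, code_space, page_size):
        contents = tuple(mapping.get(code, 0)
                         for code in range(start, start + page_size))
        if contents not in seen:
            seen[contents] = len(seen)
            line.extend(contents)
        page.append(seen[contents])
    return page, line

def lookup(page, line, code, shift):
    """The lookup formula from the text above."""
    mask = (1 << shift) - 1
    return line[(page[code >> shift] << shift) + (code & mask)]

def best_shift(mapping, shifts=range(1, 12)):
    """Brute-force search for the shift giving the smallest tables."""
    def size(shift):
        page, line = build_offset_table(mapping, shift)
        return len(page) + len(line)
    return min(shifts, key=size)

mapping = {0x41: 5, 0x42: 9, 0x3B1: 17}
page, line = build_offset_table(mapping, 6)
print(len(page), len(line), lookup(page, line, 0x3B1, 6))
```

With a shift of 6 and a 16-bit code space the page table has 1024 entries,
and all empty pages share a single page in the line table.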

* The code database (getcode)

    For the code table, I use a straight-forward hash table to store
    name to code mappings.  It's basically the same implementation
    as in Python's dictionary type, but a different hash algorithm.
    The table lookup loop simply uses the name database to check
    for hits.

    In the current database, the hash table is 32k.
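
A rough sketch of a hash table that stores only codes and uses the name
database to confirm hits.  Linear probing and the toy hash function are
simplifications chosen for brevity, not the scheme described above, and all
names are illustrative:

```python
def name_hash(name, size):
    """Toy string hash; the real module uses its own hash algorithm."""
    h = 0
    for ch in name:
        h = (h * 47 + ord(ch)) & 0xFFFFFFFF
    return h % size

def build_code_table(names_by_code, size):
    """Place each code in the slot chosen by hashing its name."""
    if size <= len(names_by_code):
        raise ValueError("table must have at least one free slot")
    table = [None] * size
    for code, name in names_by_code.items():
        slot = name_hash(name, size)
        while table[slot] is not None:      # linear probing on collision
            slot = (slot + 1) % size
        table[slot] = code
    return table

def getcode(name, table, getname):
    """Probe the table, asking the name database whether each hit is real."""
    size = len(table)
    slot = name_hash(name, size)
    while table[slot] is not None:
        if getname(table[slot]) == name:
            return table[slot]
        slot = (slot + 1) % size
    raise KeyError(name)

names_by_code = {
    0x41: "LATIN CAPITAL LETTER A",
    0x61: "LATIN SMALL LETTER A",
    0x3B1: "GREEK SMALL LETTER ALPHA",
}
table = build_code_table(names_by_code, 8)
print(hex(getcode("GREEK SMALL LETTER ALPHA", table, names_by_code.get)))
```

Because the table holds no strings at all, its size depends only on the
number of slots, at the cost of one name lookup per probe.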

/.../

</F>



From tim.one@home.com  Sun Jan 14 19:46:44 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:46:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A61AAA9.F6F1EA9F@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>

[M.-A. Lemburg]
> BTW, are there less English centric "sounds alike" matchers
> around ?

Yes, but if anything there are far too many of them:  like Soundex, they're
just heuristics, and *everybody* who cares adds their own unique twists,
while proper studies are almost non-existent.  Few variants appear to be in
use much beyond their inventor's friends; one notable exception in the
Jewish community is the Daitch-Mokotoff variation, originally tailored to
their unique needs but later generalized; a brief description here:

    http://www.avotaynu.com/soundex.html

The similarly involved NYSIIS algorithm (New York State Identification
Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
field of about two dozen competing algorithms, after measuring their
effectiveness on assorted databases maintained by the state of New York.
Since New York has a large immigrant population, NYSIIS isn't as
Anglocentric as Soundex either.

But state-of-the-art has given up on purely computational algorithms for
these purposes:  proper names are simply too much a mess.  For example, if I
search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
actually use just aren't reducible to pure computation -- it takes a large
knowledge base to capture what people "just know".  You may enjoy visiting
this commercial site (AFAIK, nobody is giving away state-of-the-art for
free):

    http://www.las-inc.com/

> ...
>     http://physics.nist.gov/cuu/Reference/soundex.html
>
> works fine for English texts,

If that were true, the English-speaking researchers would have declared
victory 120 years ago <wink>.  But English pronunciation is *notoriously*
difficult to predict from spelling, partly because English is the Perl of
human languages.

or-maybe-the-borg-assuming-there's-a-difference<wink>-ly y'rs  - tim



From esr@thyrsus.com  Sun Jan 14 20:17:53 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:17:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:46:44PM -0500
References: <3A61AAA9.F6F1EA9F@lemburg.com> <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <20010114151753.A6671@thyrsus.com>

Tim Peters <tim.one@home.com>:
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Actually, according to the Oxford Encyclopedia of Linguistics, this is
an urban myth.  The orthography of English is, in fact, quite
consistent; it looks much more wacked out than it is because the
maddening irregularities are concentrated in the 400 most commonly
used words.

The situation is much like that with French verb forms -- most French
verbs have a very regular inflection pattern, but the twenty or so
exceptions are the most commonly used ones.  In fact it's a general
rule in language evolution that irregularities are preserved in common
forms and not rare ones -- in the rare ones they get forgotten.

American personal names are a problem precisely because they sometimes
do *not* have English orthography.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

  "...quemadmodum gladius neminem occidit, occidentis telum est."
[...a sword never kills anybody; it's a tool in the killer's hand.]
        -- (Lucius Annaeus) Seneca "the Younger" (ca. 4 BC-65 AD),


From tim.one@home.com  Sun Jan 14 20:31:06 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 15:31:06 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114140901.A6431@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>

[Tim]
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

[Eric]
> Take a look:

Yup, same thing, except:

> static float ratcliff(char *s1, char *s2)

accounts for the numeric differences (change "float"->"double" and they'd be
the same; Python has to convert it to a double anyway, lacking any internal
support for C's floats; and the C code is *computing* in double regardless,
cutting it back to a float upon return just because of the "float" decl).

The code in SequenceMatcher doesn't *look* anything like it, though, due to
years of dreaming up faster ways to do this (in its original role as a diff
generator, it routinely had to deal with sequences containing 10s of
thousands of elements, and code very much like the code you posted was just
too slow for that).

One simple trick that can enormously speed the worst cases:  the "find the
longest match starting here" innermost loop is guarded by

> 	    if (*a1 == *a2)

However, it can't possibly find a *bigger* max unless it's also the case
that

    a1[max] == a2[max]

That's usually false in real life, so by adding that test to the guard you
usually get to skip the innermost loop entirely.  Probably more important in
a diff-generator role, though.
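
A Python sketch of a longest-match search with that extra guard.  The bounds
checks stand in for the NUL terminator that the C version relies on, and the
function name is illustrative:

```python
def longest_match(a, b):
    """Return (length, i, j) of the longest common substring of a and b."""
    best, best_i, best_j = 0, 0, 0
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Extra guard: a match starting here can beat `best` only if it
            # is at least best + 1 long, so the characters `best` places
            # further along must exist and agree as well.
            if (i + best >= len(a) or j + best >= len(b)
                    or a[i + best] != b[j + best]):
                continue
            k = 1
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best:
                best, best_i, best_j = k, i, j
    return best, best_i, best_j

print(longest_match("brower", "webbrowser"))   # (4, 0, 3)
```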

SequenceMatcher's prime trick is to preprocess one of the strings, in linear
time building up a hash table mapping each character in the string to a list
of the indices at which it appears.  Then the second-innermost loop is saved
from needing to do any search:  when we get to, e.g., 'x' in the other
string, the precomputed hash table tells us directly where to find all the
x's in the original string.  And in the match-1-against-N case, this hash
table can be computed once & reused N times.  That's a monster win.
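
The indexing idea can be sketched like this.  It shows only the
precomputation and how it removes the search in the second-innermost loop;
it is not the actual SequenceMatcher code, and the names are made up:

```python
from collections import defaultdict

def index_positions(b):
    """Map each character of b to the list of indices where it appears."""
    positions = defaultdict(list)
    for j, ch in enumerate(b):
        positions[ch].append(j)
    return positions

def longest_match_indexed(a, b, positions):
    """Longest common substring, visiting only the j where b[j] == a[i]."""
    best, best_i, best_j = 0, 0, 0
    for i, ch in enumerate(a):
        for j in positions.get(ch, ()):   # no scan over all of b needed
            k = 1
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best:
                best, best_i, best_j = k, i, j
    return best, best_i, best_j

# Index one string once, then reuse the table for several queries.
target = "webbrowser"
positions = index_positions(target)
for query in ("brower", "browser", "web"):
    print(query, longest_match_indexed(query, target, positions))
```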

However, I never had the patience to code that in C, so I never *did* that
before I reimplemented my stuff in Python.  Now the Python ndiff runs
circles around the old Pascal and C versions.  I'm sure that has nothing to
do with machines having gotten 100x faster in the meantime <wink>.

for-short-1-against-1-matches-yours-will-certainly-be-quicker-ly
    y'rs  - tim



From guido@python.org  Sun Jan 14 20:55:21 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 15:55:21 -0500
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: Your message of "Sun, 14 Jan 2001 11:51:52 CST."
 <14945.59192.400783.403810@beluga.mojam.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
Message-ID: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>

> Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
> standard distribution.
> 
> Biggest hook for me:
> 
>    1. execute "pydoc -p 3200"
>    2. visit "http://localhost:3200/"
>    3. knock yourself out

Yes, wow!

Now, if we could somehow get this to show both the docs that Fred
maintains and the stuff that Ping extracts from the source code, that
would be even better!  (I think that Ping's stuff should also run on
the python.org site, by the way.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Sun Jan 14 20:59:28 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:59:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 03:31:06PM -0500
References: <20010114140901.A6431@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>
Message-ID: <20010114155928.A6793@thyrsus.com>

Tim Peters <tim.one@home.com>:
> [Tim]
> > Actually, I'm certain they're the same algorithm now, except the C is
> > showing through in ratcliff to the floating-point eye <wink>.
> 
> [Eric]
> > Take a look:
> 
> Yup, same thing, except:
> 
> > static float ratcliff(char *s1, char *s2)
> 
> accounts for the numeric differences (change "float"->"double" and they'd be
> the same; Python has to convert it to a double anyway, lacking any internal
> support for C's floats; and the C code is *computing* in double regardless,
> cutting it back to a float upon return just because of the "float" decl).

OK, so the right answer is to make your version visible and documented
in the library.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256


From tim.one@home.com  Sun Jan 14 21:01:19 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:01:19 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJDIIAA.tim.one@home.com>

[?!ng]
> [1] We started with 4.

Na, *we* started with two, just ' and ".  And at the time, I thought that
was arguably one too many already <wink>.  Allowing the modifiers to be
case-insensitive seems to me much more Pythonic than the original sin of
making ' and " mean the same thing.  OTOH, if only " had been allowed at the
start, we'd probably spell raw strings with ' today, and that doesn't really
scream that they're so very different from " strings.

leaving-this-one-be-ly y'rs  - tim



From barry@digicool.com  Sun Jan 14 21:02:07 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sun, 14 Jan 2001 16:02:07 -0500
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
Message-ID: <14946.5071.92879.789400@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@mojam.com> writes:

    SM> Ping's pydoc is awesome!  Move it out of the sandbox and put
    SM> it in the standard distribution.

    SM> Biggest hook for me:

    |    1. execute "pydoc -p 3200"
    |    2. visit "http://localhost:3200/"
    |    3. knock yourself out

Whoa.  Awesome.



From ping@lfw.org  Sun Jan 14 21:01:45 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:01:45 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Guido van Rossum wrote:
> 
> It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
> well as 0l.  So does Python (and also 0j == 0J).

I just did a little test.  Neither Python, Perl, nor Tcl support
"\X66", only "\x66".  Perl doesn't support 0X1234, only 0x1234.
Tcl's "expr" routine does support 0X1234.  Javascript supports
0X1234, but not "\X66".  I'd bet that no one really relies on or
expects the uppercase forms except L.


-- ?!ng



From ping@lfw.org  Sun Jan 14 21:14:34 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:14:34 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141309320.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Fred L. Drake, Jr. wrote:
> Ka-Ping Yee writes:
>  > My next two targets are:
>  >     1.  Generating text from the HTML documentation files
>  >         using Paul Prescod's stuff in onlinehelp.py.
> 
> You mean the ones I publish as the standard documentation?  Relying
> on the structure of that HTML is pure folly!

Paul's onlinehelp.py is using the HTMLParser and AbstractFormatter
to turn HTML into text.  It also contains paths to specific files,
e.g. help('assert') looks for "ref/assert.html".  Are you okay with
this technique?  Have you tried onlinehelp.py?  I was planning to
do the same to provide help on the language in pydoc.


-- ?!ng



From skip@mojam.com (Skip Montanaro)  Sun Jan 14 21:26:48 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sun, 14 Jan 2001 15:26:48 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
 <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <14946.6552.542015.620760@beluga.mojam.com>

    Guido> Now, if we could somehow get this to show both the docs that Fred
    Guido> maintains and the stuff that Ping extracts from the source code,
    Guido> that would be even better!

I had exactly the same thought.  I suspect that if the install target were
modified to install the html-ized sections of the lib reference manual pydoc
could grovel around in sys and find the root of the library reference manual
pretty easily.  If not, it could simply redirect to the relevant section of
http://www.python.org/doc/current/lib/.

Skip



From tim.one@home.com  Sun Jan 14 21:45:48 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:45:48 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJGIIAA.tim.one@home.com>

[?!ng]
> ...
> I'd bet that no one really relies on or expects the uppercase
> forms except L.

And 0X.  I don't think it's in the std library, but I've certainly seen
Python code do stuff like

    magic = 0XFEEDFACE

Plus it's always good for a language to be able to parse the stuff it prints,
and "0X..." is generated by Python's %#X format code.

Don't believe I've ever seen the "u" or "r" string modifiers in uppercase,
though, but really don't see the harm in allowing that.
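
A short sketch of the round trip in question, run under a current
Python 3 (an assumption; the thread predates it).  The names magic,
printed and parsed are illustrative only.

```python
# The uppercase hex prefix is accepted in a literal ...
magic = 0XFEEDFACE

# ... and it is exactly what the %#X format code produces.
printed = "%#X" % magic

# int() accepts the 0X prefix when the base is 16 (or 0, meaning
# "work the base out from the prefix"), so the output parses back.
parsed = int(printed, 16)

# The other uppercase spellings mentioned in the thread are accepted too.
same_imaginary = (0J == 0j)
same_raw = (R"\n" == r"\n")
same_unicode = (U"abc" == u"abc")
```

The 0L long-integer suffix is not shown because Python 3 no longer has it.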



From ping@lfw.org  Sun Jan 14 21:50:43 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:50:43 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <14946.5071.92879.789400@anthem.wooz.org>
Message-ID: <Pine.LNX.4.10.10101141349040.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Barry A. Warsaw wrote:
> Whoa.  Awesome.

Thanks!

Two things added recently: constants (any numbers, lists, tuples,
strings, or types) in modules are shown; and packages are listed
in the index as they should be.


-- ?!ng



From bckfnn@worldonline.dk  Sun Jan 14 22:20:51 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 22:20:51 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <040e01c07e60$8c74d100$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk> <040e01c07e60$8c74d100$e46940d5@hagrid>
Message-ID: <3a622615.50148579@smtp.worldonline.dk>

[/F]

>here's the description:

Thanks.

>From: "Fredrik Lundh" <effbot@telia.com>
>Date: Sun, 16 Jul 2000 20:40:46 +0200
>
>/.../
>
>    The unicodenames database consists of two parts: a name
>    database which maps character codes to names, and a code
>    database, mapping names to codes.
>
>* The Name Database (getname)
>
>    First, the 10538 text strings are split into 42193 words,
>    and combined into a 4949-word lexicon (a 29k array).

I only added a word to the lexicon if it was used more than once and if
the length was larger than the lexicon index. I ended up with 1385
entries in the lexicon (a 7k array).

>    Each word is given a unique index number (common words get
>    lower numbers), and there's a "lexicon offset" table mapping
>    from numbers to words (10k).

My lexicon offset table is 3k and I also use 4k on a perfect hash of the
words.

>    To get back to the original text strings, I use a "phrase
>    book".  For each original string, the phrase book stores a
>    list of word numbers.  Numbers 0-127 are stored in one byte,
>    higher numbers (less common words) use two bytes.  At this
>    time, about 65% of the words can be represented by a single
>    byte.  The result is a 56k array.

Because not all words are looked up in the lexicon, I used the values
0-38 for the letters and numbers, 39-250 for a one-byte lexicon
index, and 251-255 combined with the following byte to form a two-byte
index.  This also results in a 57k array.

So far it is only minor variations.
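
The lexicon and phrase book quoted above can be sketched roughly as
follows.  This follows /F's description (indexes 0-127 in one byte,
higher ones in two), not my 0-38/39-250/251-255 split.  The exact
two-byte layout is not given in the description, so the high-bit
marker used here is an assumption, as are all the function names.

```python
from collections import Counter


def build_lexicon(names):
    """Return the distinct words, most frequent first."""
    counts = Counter(word for name in names for word in name.split())
    return [word for word, _ in counts.most_common()]


def encode_phrase(name, index):
    """Encode a name as a byte string of lexicon indexes.

    Indexes below 128 take one byte.  Larger ones take two bytes, with
    the high bit of the first byte set (assumed layout, 15-bit index).
    """
    out = bytearray()
    for word in name.split():
        n = index[word]
        if n < 128:
            out.append(n)
        else:
            out.append(0x80 | (n >> 8))
            out.append(n & 0xFF)
    return bytes(out)


def decode_phrase(data, lexicon):
    """Inverse of encode_phrase()."""
    words = []
    i = 0
    while i < len(data):
        n = data[i]
        i += 1
        if n & 0x80:
            n = ((n & 0x7F) << 8) | data[i]
            i += 1
        words.append(lexicon[n])
    return " ".join(words)
```

Because common words get the low numbers, most entries in the phrase
book cost one byte, which is where the "about 65%" figure comes from.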

>    The final data structure is an offset table, which maps code
>    points to phrase book offsets.  Instead of using one big
>    table, I split each code point into a "page number" and a
>    "line number" on that page.
>
>      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]
>
>    Since the unicode space is sparsely populated, it's possible
>    to split the code so that lots of pages get no contents.  I
>    use a brute force search to find the optimal SHIFT value.
>
>    In the current database, the page table has 1024 entries
>    (SHIFT is 6), and there are 199 unique pages in the line
>    table.  The total size of the offset table is 26k.
>
>* The code database (getcode)
>
>    For the code table, I use a straight-forward hash table to store
>    name to code mappings.  It's basically the same implementation
>    as in Python's dictionary type, but a different hash algorithm.
>    The table lookup loop simply uses the name database to check
>    for hits.
>
>    In the current database, the hash table is 32k.

I chose to split a unicode name into words even when looking up a
unicode name. Each word is hashed to a lexicon index and a "phrase book
string" is created. The sorted phrase book is then searched with a binary
search among 858 entries that can be addressed directly, followed by a
sequential search among 12 entries. The phrase book search index is 8k
and a table that maps phrase book indexes to codepoints is another 20k.

The searching I do makes jython slower than the direct calculation you
do. I'll take another look at this after jython 2.0 to see if I can
improve performance with your page/line number scheme and a total
hashing of all the unicode names.
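
For reference, the page/line number scheme quoted earlier in this
message can be sketched like this.  It is an illustration of the
formula offset = line[(page[code>>SHIFT]<<SHIFT) + (code&MASK)], not
/F's generator script.  The function names, the use of 0 for "no
name", and the entry-count cost measure are all assumptions.

```python
def build_tables(offsets, shift, code_space=0x10000):
    """Split a sparse {code: offset} map into page and line tables.

    Identical pages (in particular the many empty ones) are stored
    only once in the line table.
    """
    size = 1 << shift
    mask = size - 1
    page = []   # page number for each block of `size` code points
    line = []   # concatenated contents of the unique pages
    seen = {}   # page contents -> page number
    for start in range(0, code_space, size):
        block = tuple(offsets.get(code, 0)
                      for code in range(start, start + size))
        if block not in seen:
            seen[block] = len(seen)
            line.extend(block)
        page.append(seen[block])
    return page, line, mask


def lookup(code, page, line, shift, mask):
    """Map a code point to its phrase book offset (0 if it has none)."""
    return line[(page[code >> shift] << shift) + (code & mask)]


def best_shift(offsets, code_space=0x10000):
    """Brute-force search for the shift giving the fewest table entries."""
    def cost(shift):
        page, line, _ = build_tables(offsets, shift, code_space)
        return len(page) + len(line)
    return min(range(1, 16), key=cost)
```

With a 16-bit code space and a shift of 6 this gives a page table of
1024 entries, matching the figure in the description.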

regards,
finn


From ping@lfw.org  Sun Jan 14 22:44:47 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 14:44:47 -0800 (PST)
Subject: [Python-Dev] SourceForge and long patches
Message-ID: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org>

Okay, this is getting really annoying.  SourceForge won't accept
any patches > 16k.  Why not?  Is there a way around this?

    SourceForge: Exiting with Error

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

I'm trying to submit the update to tokenize.py, but it's too long
because i've changed test/output/test_tokenize and that's a big file.


-- ?!ng



From guido@python.org  Sun Jan 14 22:58:03 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 17:58:03 -0500
Subject: [Python-Dev] SourceForge and long patches
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:44:47 PST."
 <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101142258.RAA13606@cj20424-a.reston1.va.home.com>

> Okay, this is getting really annoying.  SourceForge won't accept
> any patches > 16k.  Why not?  Is there a way around this?

I have no idea why; can only assume it's a limitation in the database
package they use.

The standard workaround is to upload a URL pointing to the patch. :-(

>     SourceForge: Exiting with Error
> 
>     ERROR
> 
>     Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Sun Jan 14 23:35:51 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 00:35:51 +0100
Subject: [Python-Dev] Where's Greg Ward ?
Message-ID: <3A6237D7.673BBB30@lemburg.com>

He seems to be offline and the people on the distutils list have some
patches and other things which would be nice to have in distutils 
for 2.1.

I suppose we could simply check in the patches, but we still want
to get his OK on things before applying patches to the distutils
tree.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Sun Jan 14 23:57:45 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 14 Jan 2001 18:57:45 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>

[MAL]
> He seems to be offline and the people on the distutils list have
> some patches and other things which would be nice to have in
> distutils for 2.1.

Greg's somewhere near the end of the process of moving from Virginia to
Canada; I expect he'll become visible again Real Soon.

> I suppose we could simply check in the patches, but we still want
> to get his OK on things before applying patches to the distutils
> tree.

The distutils SIG could elect a Shadow Dictator in his place; if everyone
agrees to vote for Andrew, you save the effort of counting votes <wink>.



From tismer@tismer.com  Mon Jan 15 01:35:57 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 15 Jan 2001 02:35:57 +0100
Subject: [Python-Dev] Minor Bug-fix release for Stackless Python 2.0
Message-ID: <3A6253FD.E9B30462@tismer.com>

Wolfgang Lipp reported that Microthreads were executing
sequentially with SLP 2.0.

The bug fix is available on the website.
Please use this new version, or microthreads will not
give you much fun.

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc-src-010115.zip

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tommy@ilm.com  Mon Jan 15 02:18:20 2001
From: tommy@ilm.com (Captain Senorita)
Date: Sun, 14 Jan 2001 18:18:20 -0800 (PST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
 <14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <14946.23981.694472.406438@mace.lucasdigital.com>

Charles G Waldman writes:
| 
|              P=NP (Python is not Perl)

Is it too late to suggest this for the SPAM9 t-shirt? :)


From guido@python.org  Mon Jan 15 02:24:36 2001
From: guido@python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 21:24:36 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: Your message of "Sun, 14 Jan 2001 18:18:20 PST."
 <14946.23981.694472.406438@mace.lucasdigital.com>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>
 <14946.23981.694472.406438@mace.lucasdigital.com>
Message-ID: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
> | 
> |              P=NP (Python is not Perl)
> 
> Is it too late to suggest this for the SPAM9 t-shirt? :)

By just about a day -- I haven't seen the new design yet, but Just &
Eric were supposed to design it today and hand in the final proofs
tomorrow.  I believe the slogan will be "it fits your brain" (or "it
fits my brain").

But if you print a bunch of P=NP shirts, I'm sure you can sell them
with a profit, both in Long Beach and in San Diego (at the O'Reilly
Open Source conference)...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Mon Jan 15 06:35:05 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 01:35:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <20010110101545.A21305@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEKGIIAA.tim.one@home.com>

[Timmy]
> At this point I'm +0.5 on the idea of fileobject.c using
> ms_getline_hack whenever HAVE_GETC_UNLOCKED isn't available.

[NeilS, from Wednesday]
> Compare ms_getline_hack to what Perl does in order to speed up IO.

Believe me, I have <wink>.

> I think it's worth maintaining that piece of relatively portable
> code given the benefit.  If the code has to be maintained then it
> might as well be used.  If we find a platform where it breaks we can
> always disable it before the final release.

Given that hearty encouragement, and the utterly non-scary results so far, I
just checked in a new scheme:

On a platform with getc_unlocked():
    By default, use getc_unlocked().
    If you want to use fgets() instead, #define USE_FGETS_IN_GETLINE.
        [so motivated people can use fgets() instead if it's faster
         on their platform]
On a platform without getc_unlocked():
    By default, use fgets().
    If you don't want to use fgets(), #define DONT_USE_FGETS_IN_GETLINE.
        [so if we stumble into a platform it fails on between
         releases, the user will have an easy time turning it off
         themself]
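
The scheme above amounts to a small decision table.  Here it is as a
Python sketch (the real thing is C preprocessor logic in
fileobject.c).  choose_getline is a made-up name, and the "getc"
fallback for the last case is an assumption, since the scheme only
says fgets() is turned off there.

```python
def choose_getline(have_getc_unlocked, defines=()):
    """Return the name of the C routine line reading would be built on.

    `defines` is the set of preprocessor symbols the user has #defined.
    """
    if have_getc_unlocked:
        # Default to getc_unlocked(), but let motivated people opt in
        # to fgets() if it is faster on their platform.
        if "USE_FGETS_IN_GETLINE" in defines:
            return "fgets"
        return "getc_unlocked"
    # No getc_unlocked(): default to the fgets() trick, with an easy
    # off switch in case some platform chokes on it.
    if "DONT_USE_FGETS_IN_GETLINE" in defines:
        return "getc"  # assumed fallback, not stated in the scheme
    return "fgets"
```

Note that each symbol only matters on one kind of platform:
USE_FGETS_IN_GETLINE is ignored without getc_unlocked(), and
DONT_USE_FGETS_IN_GETLINE is ignored with it.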



From gstein@lyra.org  Mon Jan 15 07:18:20 2001
From: gstein@lyra.org (Greg Stein)
Date: Sun, 14 Jan 2001 23:18:20 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 08:55:35AM -0800
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010114231820.C6081@lyra.org>

On Sat, Jan 13, 2001 at 08:55:35AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv14586
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> SF Patch #103225 by Ping: httplib: smallest Python patch ever
>...

Not so small:

>...
> *** 333,337 ****
>               i = host.find(':')
>               if i >= 0:
> !                 port = int(host[i+1:])
>                   host = host[:i]
>               else:
> --- 333,340 ----
>               i = host.find(':')
>               if i >= 0:
> !                 try:
> !                     port = int(host[i+1:])
> !                 except ValueError, msg:
> !                     raise socket.error, str(msg)
>                   host = host[:i]
>               else:


Did you intend to commit this?
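
For readers without the file at hand, the changed lines do roughly the
following.  This is a standalone sketch in present-day syntax, not the
httplib source: split_host_port is a made-up name, and the body of the
else branch (falling back to a default port) is an assumption because
the diff cuts off there.

```python
import socket


def split_host_port(host, default_port):
    """Split "host:port" and report a bad port as a socket error."""
    i = host.find(':')
    if i >= 0:
        try:
            port = int(host[i+1:])
        except ValueError as msg:
            # The change in the diff: turn the ValueError from int()
            # into a socket.error carrying the same message.
            raise socket.error(str(msg))
        host = host[:i]
    else:
        port = default_port  # assumed; not visible in the diff
    return host, port
```

The effect is that callers catching socket.error also see malformed
port numbers, instead of getting a stray ValueError.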

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From moshez@zadka.site.co.il  Mon Jan 15 15:53:58 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 17:53:58 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>
References: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>
 <14946.23981.694472.406438@mace.lucasdigital.com>
Message-ID: <20010115155358.86E5AA828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 21:24:36 -0500, Guido van Rossum <guido@python.org> wrote:

> But if you print a bunch of P=NP shirts, I'm sure you can sell them
> with a profit, both in Long Beach and in San Diego (at the O'Reilly
> Open Source conference)...

And the Libre Software Meeting (http://lsm.abul.org), which has a Python
subtopic too.
(Since it's in France, no one is calling it "free", so it's probable you
can sell those T-shirts there...)
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From mal@lemburg.com  Mon Jan 15 09:44:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:44:14 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3A62C66E.2BB69E61@lemburg.com>

Fredrik Lundh wrote:
> 
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

Since the Unicode character names are probably
not used for performance-sensitive tasks, I suggest
checking in the smallest version possible.

If it is too much work to get Finn's version recoded in C
(presuming it's written in Java), then I'd suggest checking
in your version until someone comes up with a yet smaller
edition.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 15 09:48:49 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:48:49 +0100
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
 <200101142055.PAA13041@cj20424-a.reston1.va.home.com> <14946.6552.542015.620760@beluga.mojam.com>
Message-ID: <3A62C781.22240D3C@lemburg.com>

Skip Montanaro wrote:
> 
>     Guido> Now, if we could somehow get this to show both the docs that Fred
>     Guido> maintains and the stuff that Ping extracts from the source code,
>     Guido> that would be even better!
> 
> I had exactly the same thought.  I suspect that if the install target were
> modified to install the html-ized sections of the lib reference manual pydoc
> could grovel around in sys and find the root of the library reference manual
> pretty easily.  If not, it could simply redirect to the relevant section of
> http://www.python.org/doc/current/lib/.

Since Fred remarked that the URLs for the different docs are
not fixed, how about adding a __onlinedocs__ attribute to the
standard Python modules providing the correct URL ?

Or, alternatively, pass the module's name through some Google-like
"I feel lucky" documentation search engine...
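
A rough sketch of how a tool might use such an attribute.  Note that
__onlinedocs__ is only a proposal here and no module actually defines
it; docs_url and LIB_DOCS_ROOT are illustrative names, and the
fallback is simply the library reference root Skip mentioned.

```python
# Root of the online library reference, as given earlier in the thread.
LIB_DOCS_ROOT = "http://www.python.org/doc/current/lib/"


def docs_url(module):
    """Return the module's own documentation URL if it advertises one.

    __onlinedocs__ is the hypothetical attribute proposed above.
    Modules without it fall back to the library reference root.
    """
    return getattr(module, "__onlinedocs__", LIB_DOCS_ROOT)
```

This keeps the URL knowledge with the module itself, so the tool does
not have to guess at the layout of the generated HTML.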

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 15 09:51:40 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:51:40 +0100
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
Message-ID: <3A62C82C.EA25AAF5@lemburg.com>

[CCed to distutils, since it matters there]
Tim Peters wrote:
> 
> [MAL]
> > He seems to be offline and the people on the distutils list have
> > some patches and other things which would be nice to have in
> > distutils for 2.1.
> 
> Greg's somewhere near the end of the process of moving from Virginia to
> Canada; I expect he'll become visible again Real Soon.

Great :)
 
> > I suppose we could simply check in the patches, but we still want
> > to get his OK on things before applying patches to the distutils
> > tree.
> 
> The distutils SIG could elect a Shadow Dictator in his place; if everyone
> agrees to vote for Andrew, you save the effort of counting votes <wink>.

Ok, let's agree to vote for Andrew :)

Andrew, is that OK with you ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Mon Jan 15 10:52:09 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 05:52:09 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D602D.9DC991CB@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELAIIAA.tim.one@home.com>

[Mark Favas]
> ...
> The lines range in length from 96 to 747 characters, with
> 11% @ 233, 17% @ 252 and 52% @ 254 characters, so #1 [a vendor
> who actually optimized fgets()] looks promising - most lines are
> long enough to trigger a realloc.

Plus as soon as you spill over the stack buffer, I make you pay for filling
1024 new bytes with newlines before the next fgets() call, and almost all of
those are irrelevant to you.  It doesn't degrade gracefully.  Alas, I tried
several "adaptive" schemes (adjusting how much of the initial segment of a
larger stack buffer they would use, based on the actual line lengths seen in
the past), but the costs always exceeded the savings on my box.

> Cranking up INITBUFSIZE in ms_getline_hack to 260 from 200
> improves things again, by another 25%:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.081  5.066
> readlines_sizehint    3.743  3.717
> using_fileinput      11.113 11.100
> while_readline        6.100  6.083
> for_xreadlines        3.027  3.033

Well, I couldn't let you forego *all* of 25%.  The current fileobject.c has
a stack buffer of 300 bytes, but only uses 100 of them on the first fgets()
call.  On a very quiet machine, that saved 3-4% of the runtime on *my* test
case, whose line lengths are typical of the text files I crunch over, so I'm
happy for me.  If 100 bytes aren't enough, it must call fgets() again, but
just appends the next call into the full 300-byte buffer.  So it saves the
realloc for lines under 300 chars.

> Apart from the name <grin>, I like ms_getline_hack...

Ya, it's now the non-pejorative getline_via_fgets().  I hate that I became a
grown-up <0.9 wink>.

time-to-pick-wings-off-of-flies-ly y'rs  - tim



From ping@lfw.org  Mon Jan 15 11:11:16 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 03:11:16 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 httplib.py,1.26,1.27
In-Reply-To: <20010114231820.C6081@lyra.org>
Message-ID: <Pine.LNX.4.10.10101150310100.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Greg Stein wrote:
> Not so small:
> 
> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:

The above changes were not part of the patch i submitted;
the patch i submitted was exactly a one-character change.
Guido has already edited the file, so there's no need to
commit anything further here.



-- ?!ng



From mal@lemburg.com  Mon Jan 15 11:56:37 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 12:56:37 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <3A62E575.9A584108@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > BTW, are there less English centric "sounds alike" matchers
> > around ?
> 
> Yes, but if anything there are far too many of them:  like Soundex, they're
> just heuristics, and *everybody* who cares adds their own unique twists,
> while proper studies are almost non-existent.  Few variants appear to be in
> use much beyond their inventor's friends; one notable exception in the
> Jewish community is the Daitch-Mokotoff variation, originally tailored to
> their unique needs but later generalized; a brief description here:
> 
>     http://www.avotaynu.com/soundex.html
> 
> The similarly involved NYSIIS algorithm (New York State Identification
> Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
> field of about two dozen competing algorithms, after measuring their
> effectiveness on assorted databases maintained by the state of New York.
> Since New York has a large immigrant population, NYSIIS isn't as
> Anglocentric as Soundex either.

Thanks for the pointer. I'll add that module to my lib :)

       http://metagram.webreply.com/downloads/nysiis.py

Perhaps Eric ought to add this one to his package as well  ?!
BTW, where can I find your package on the web, Eric ? I'd like
to give it a ride under German language conditions ;)
 
> But state-of-the-art has given up on purely computational algorithms for
> these purposes:  proper names are simply too much a mess.  For example, if I
> search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
> searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
> actually use just aren't reducible to pure computation -- it takes a large
> knowledge base to capture what people "just know".  You may enjoy visiting
> this commercial site (AFAIK, nobody is giving away state-of-the-art for
> free):
> 
>     http://www.las-inc.com/

Sad -- "patent pending" algorithms don't help anyone on this
planet :(
 
> > ...
> >     http://physics.nist.gov/cuu/Reference/soundex.html
> >
> > works fine for English texts,
> 
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Then Dutch must be the Python of human languages... ;)
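
For anyone who has not met it, classic American Soundex fits in a few
lines.  This is a minimal sketch of the textbook rules, not the code
of the soundex module under discussion; soundex and _CODES are names
chosen for the example.

```python
# Consonant groups of classic Soundex.  Vowels, Y, H and W get no code.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _letter in _letters:
        _CODES[_letter] = _digit


def soundex(name):
    """Return the 4-character Soundex code of a name ('' if no letters)."""
    name = "".join(ch for ch in name.upper() if ch.isalpha())
    if not name:
        return ""
    result = name[0]
    prev = _CODES.get(name[0], "")
    for ch in name[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so repeated codes count again
    return (result + "000")[:4]
```

It also shows the limitation Tim describes: "Robert" and "Rupert" get
the same code, but "Richard" and "Dick" do not, and no amount of
letter shuffling will fix that.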

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From moshez@zadka.site.co.il  Mon Jan 15 20:13:18 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:13:18 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one@users.sourceforge.net> wrote:
> Modified Files:
> 	tabnanny.py 
> Log Message:
> Whitespace normalization.

hmmmmmm.......
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From mal@lemburg.com  Mon Jan 15 12:10:30 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:10:30 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27
 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16
 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62E8B6.3DFC1FA2@lemburg.com>

Moshe Zadka wrote:
> 
> On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one@users.sourceforge.net> wrote:
> > Modified Files:
> >       tabnanny.py
> > Log Message:
> > Whitespace normalization.
> 
> hmmmmmm.......

Perhaps you ought to make this a CRON job ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From moshez@zadka.site.co.il  Mon Jan 15 20:24:48 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:24:48 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <3A62E8B6.3DFC1FA2@lemburg.com>
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115202448.38F60A828@darjeeling.zadka.site.co.il>

I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
Of course, the real culprit is the person who fixed up the reply-to in
the checkin messages to point to python-dev. Why was it done, and
isn't there a better way? This makes it painful to personally comment
on people's checkin messages. I suggest instead adding a mail-followup-to
header.

(Didn't anyone read "Reply-To Munging Considered Harmful"?)
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From esr@thyrsus.com  Mon Jan 15 12:23:25 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 07:23:25 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A62E575.9A584108@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:56:37PM +0100
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com> <3A62E575.9A584108@lemburg.com>
Message-ID: <20010115072325.A10377@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> Perhaps Eric ought to add this one to his package as well  ?!

Actually, at this point, my plan is to give Tim a decent interval to
refactor ndiff so his SequenceMatcher class is exposed and documented --
otherwise *I'll* go in and do it (har! waving a bloody knife!).

His turns out to be the same as the Ratcliff-Obershelp technique I was
using, except Tim had his bullshit threshold set too low (:-)) and let
through matches I wouldn't have.
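
For readers following along with a later Python: SequenceMatcher did
end up exposed in the standard library's difflib module, and the
threshold in question is just a cutoff parameter there.  A minimal
sketch, with similarity and close_names as illustrative wrappers:

```python
from difflib import SequenceMatcher, get_close_matches


def similarity(a, b):
    """Return a similarity ratio between 0.0 (nothing shared) and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def close_names(name, candidates, cutoff=0.6):
    """Return up to three candidates whose ratio is at least `cutoff`.

    Raising the cutoff makes the matcher stricter, letting fewer
    doubtful matches through.
    """
    return get_close_matches(name, candidates, n=3, cutoff=cutoff)
```

With the default cutoff, "Lemberg" still finds "Lemburg" in a list of
names; with a cutoff of 0.9 it no longer does.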
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The only purpose for which power can be rightfully exercised over any
member of a civilized community, against his will, is to prevent harm
to others. His own good, either physical or moral, is not a sufficient
warrant
	-- John Stuart Mill, "On Liberty", 1859


From mal@lemburg.com  Mon Jan 15 12:26:59 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:26:59 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62EC93.9AA60ABA@lemburg.com>

Moshe Zadka wrote:
> 
> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead adding a mail-followup-to
> header.
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Naa, no one needs to be shot in the foot ;)

In fact, I like that replies go to python-dev ... after all,
that's where these things should be discussed.

BTW, in case you misunderstood my reply: it would indeed make
sense to automate these kinds of checks (tabnanny et al).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From moshez@zadka.site.co.il  Mon Jan 15 20:42:15 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:42:15 +0200 (IST)
Subject: [Python-Dev] Re: Someone should be shot
In-Reply-To: <3A62EC93.9AA60ABA@lemburg.com>
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:
 
> In fact I like it, that replies go to python-dev ... after all,
> that's where these things should be discussed.

Well, that's the mailing list where things should be discussed.
But when I press the "Reply" button (as opposed to "Reply to List" button)
I expect my e-mail to go to the person originating the e-mail. 
Reply-To: means "I'd like to get replies to some other address".
What if, say, a checkin message relates to some private topic
I'd discussed with someone: I'd like to reply to him personally.

I agree that responses to Python-Checkins should be handled on Python-Dev:
that's what the mail-followup-to header is for.

> BTW, in case you misunderstood my reply: it would indeed make
> sense to automate these kinds of check (tabnanny et al).

Oh, ok. The "cron" part threw me off (why cron?) 
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From barry@digicool.com  Mon Jan 15 13:15:28 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:15:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
 <3A62C82C.EA25AAF5@lemburg.com>
Message-ID: <14946.63472.282750.828218@anthem.wooz.org>

>>>>> "M" == M  <mal@lemburg.com> writes:

    >>  The distutils SIG could elect a Shadow Dictator in his place;
    >> if everyone agrees to vote for Andrew, you save the effort of
    >> counting votes <wink>.

    M> Ok, let's agree to vote for Andrew :)

    M> Andrew, is that OK with you ?

He's got my vote.  I've been experiencing some weird problems with the
distutils installation of pybsddb3 out of the current Python cvs
tree.  It'd be nice if the outstanding distutils patches are
integrated before I dive in.  I don't see anything relevant in patches
or bugs, but I don't know if there are other repositories of distutils
fixes (like the archives?).

-Barry



From barry@digicool.com  Mon Jan 15 13:27:02 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:27:02 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
 <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
 <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <14946.64166.348139.425223@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

    MZ> I'm sorry! I meant to reply to tim alone, and ended up
    MZ> spamming python-dev!  Of course, the real culprit is the
    MZ> person who fixed up the reply-to in the checkin messages to
    MZ> point to python-dev. Why was it done, and isn't there a better
    MZ> way? This makes it painful to personally comment on people's
    MZ> checkin messages. I suggest instead to add a mail-followup-to
    MZ> header

    MZ> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Or how about

    http://www.metasystema.org/essays/reply-to-useful.mhtml

for a dissenting view.  Of course Mail-Followup-To is completely
non-standard, but even if it were, having the mailing list munge it in
isn't recommended:

    http://cr.yp.to/proto/replyto.html

Bottom line (IMHO), this is just something about email that is and
will forever remain broken.  Given that, it was voted a long while
back to make Reply-To for checkins point to python-dev so until
there's a hue and cry to change it back, I'll leave it as is.  And
yeah, it bites me sometimes too!

-Barry



From tony@lsl.co.uk  Mon Jan 15 14:18:36 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 15 Jan 2001 14:18:36 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>
Message-ID: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>

<fx: jumps up and down in glee>

Neat stuff. Ka-Ping Yee strikes again. And it works with Python 1.5.2.

<fx: more of the same>

Running on NT (4.00.1381) in an "MS-DOS" window, using Python 1.5.2
installed in the effbot manner, it works, with the slight strangeness
that if I do:

	python pydoc.py <name>

I get the documentation for <name> OK, but it is preceded with a line
claiming that:

	The system cannot find the path specified.

I don't have the time to pursue this at the moment - it's possibly an
artefact of our system?

(one minor "prettiness" hack - those of us who have been tainted by
Emacs Lisp programming tend to start module documentation off with a
line of the form:

	<name>.py -- information about the module

which, when pydoc'ed, results in a NAME line which starts with <name>
twice...
Of course, if I'm the only person doing this, I'll just have to, well,
stop...)

A request - a "-f" switch to allow the user to specify a particular
Python file (i.e., something not on the PYTHONPATH).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)



From jack@oratrix.nl  Mon Jan 15 14:32:02 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 15 Jan 2001 15:32:02 +0100
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
 Sat, 13 Jan 2001 17:33:34 -0500 , <200101132233.RAA03229@cj20424-a.reston1.va.home.com>
Message-ID: <20010115143203.A44B63C2031@snelboot.oratrix.nl>

Also note that the problem only occurs when trying to build a unix-Python 
out-of-the-box on MacOSX. If you're building a Carbon Python from the 
MacPython sources (something very few people can do right now:-) the 
executable isn't called "python". And when a real MacOSX-Python will be done 
it'll have all the nifty packaging stuff that will also make sure that there's 
nothing called "python" in the toplevel folder.

And the two workarounds (1-Use a UFS filesystem, 2-Put a ".exe" extension in 
the Makefile) work fine in the meantime.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 




From guido@python.org  Mon Jan 15 14:33:23 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:33:23 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: Your message of "Sun, 14 Jan 2001 23:18:20 PST."
 <20010114231820.C6081@lyra.org>
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>
 <20010114231820.C6081@lyra.org>
Message-ID: <200101151433.JAA17944@cj20424-a.reston1.va.home.com>

> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:
> 
> Did you intend to commit this?

Oops.  That was a patch submitted a while ago that I applied as an
experiment but then decided I didn't like (argument: why bother).
I've reverted it.
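
[For readers following along: a minimal sketch, in present-day Python, of
what the reverted patch was after. The helper name split_host_port and the
default port are assumptions made for illustration; this is not httplib's
actual code. In Python 3, socket.error is an alias of OSError, so the
sketch raises OSError directly.]

```python
def split_host_port(host, default_port=80):
    """Split 'host:port' into (host, port).

    Hypothetical helper for illustration only, not part of httplib.
    """
    i = host.find(':')
    if i >= 0:
        try:
            port = int(host[i + 1:])
        except ValueError as exc:
            # The reverted patch turned the ValueError into socket.error;
            # in Python 3 that name is an alias of OSError.
            raise OSError(str(exc)) from exc
        host = host[:i]
    else:
        port = default_port
    return host, port
```

Whether converting the exception is worth the extra lines is exactly the
"why bother" question raised above; the sketch only shows the mechanics.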

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jan 15 14:40:30 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:40:30 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:24:48 +0200."
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <200101151440.JAA18045@cj20424-a.reston1.va.home.com>

> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead to add a mail-followup-to
> header
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

I agree with you, but Barry (who set this up) seems to believe that
there's a good reason to do it this way.  Barry, do you still feel
that way?  The auto-reply-all has probably tripped me up more than
anyone.  Anyone else have a strong reason why this should be set?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Mon Jan 15 23:03:25 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 01:03:25 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>
References: <14946.64166.348139.425223@anthem.wooz.org>, <3A62E8B6.3DFC1FA2@lemburg.com>
 <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
 <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115230325.1C7F5A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 08:27:02 -0500, barry@digicool.com (Barry A. Warsaw) wrote:
> 
> Or how about
> 
>     http://www.metasystema.org/essays/reply-to-useful.mhtml

     If your mailer doesn't have this option, you should request it from
     its development team. Any mailer, whose development team refuses
     this simple request due to some ideological position, cannot be
     said to be reasonable.

As some people here know, I'm my mailer's "development team". I refuse to add
it due to an ideological position. Anyone who knows me knows I'm quite 
unreasonable. Hmmm....I'm not making much headway, am I ;-)

> for a dissenting view.  Of course Mail-Followup-To is completely
> non-standard, but even if it were, having the mailing list munge it in
> isn't recommended:
> 
>     http://cr.yp.to/proto/replyto.html

This has no relevance to the current case, since python-checkin 
messages are machine-generated -- so this is closer to doing this in
the script generating the checkin message, and only differs in 
implementation.

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I won't continue this thread, but remember that my vote is "no".
I simply shudder at the thought that I might send someone e-mail
with something like "nice bugfix. Didn't know you were back from
the sex-change operation", and it would be broadcast out to all
Python-Dev *and* the archives, for posterity.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From thomas@xs4all.net  Mon Jan 15 15:31:22 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 16:31:22 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:27:02AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org>
Message-ID: <20010115163122.I1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:27:02AM -0500, Barry A. Warsaw wrote:

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I've said this before, on the Mailman-devel list, but I'll repeat it here
for the record (in case this issue ever comes up for vote again :)

The main bite (for me) is that to reply to a person in private, you have to
cut&paste the 'From' header from the original mail, and edit your new mail's
headers, in order to reply to a specific person. My mailer is mature enough
to have a 'reply', 'reply-group' and 'reply-list' keybinding, so the
'Reply-To' only interferes. There probably is a
'reply-to-from-ignoring-replyto' keybinding in there, too, somewhere, or it
could be added, but remembering to type that different key is almost as much
trouble as typing the email address by hand ;P

So, my vote, like Moshe's, is just back from a sex change, and reads 'no'.

Recount-recount-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Mon Jan 15 15:38:01 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 10:38:01 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 08:27:02 EST."
 <14946.64166.348139.425223@anthem.wooz.org>
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
 <14946.64166.348139.425223@anthem.wooz.org>
Message-ID: <200101151538.KAA21937@cj20424-a.reston1.va.home.com>

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

It sounds like a hue and cry to change it to me!  It looks like it's
time for a BDFL Pronouncement.  I pronounce:

Given that:

- we all know how to mail to python-dev;

- replying to the sender is by far the most common kind of reply;

- the mistake of replying to the sender when a reply-all was intended
  does much less potential harm than the mistake of replying to all
  when reply-to-sender was intended,

the reply-to header shall be removed.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Mon Jan 15 16:57:19 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 11:57:19 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <14946.63472.282750.828218@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:15:28AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com> <3A62C82C.EA25AAF5@lemburg.com> <14946.63472.282750.828218@anthem.wooz.org>
Message-ID: <20010115115719.B919@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 08:15:28AM -0500, Barry A. Warsaw wrote:
>tree.  It'd be nice if the outstanding distutils patches are
>integrated before I dive in.  I don't see anything relevant in patches
>or bugs, but I don't know if there are other repositories of distutils
>fixes (like the archives?).

There are a few patches buried in the back archives, but I don't know
of any outstanding bugfixes, so please report whatever problem you're
seeing.

Oh, and Barry, did the issue holding up your patch for adding shar
support (#102313) ever get resolved?

--amk


From guido@python.org  Mon Jan 15 16:02:39 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:02:39 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 08 Jan 2001 18:20:56 PST."
 <20010108182056.C4640@lyra.org>
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>
 <20010108182056.C4640@lyra.org>
Message-ID: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>

Greg Stein noticed me checking in *yet* another system that needs
the fallback TELL64() definition in fileobject.c, and wrote:

> All of those #ifdefs could be tossed and it would be more robust (long term)
> if an autoconf macro were used to specify when TELL64 should be defined.
> 
> [ I've looked thru fileobject.c and am a bit confused: the conditions for
>   defining TELL64 do not match the conditions for *using* it. that would
>   seem to imply a semantic error somewhere and/or a potential gotcha when
>   they get skewed (like I assume what happened to FreeBSD). simplifying with
>   an autoconf macro may help to rationalize it. ]

I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
universal fallback, why not just define TELL64 to be that if it's not
previously defined (currently only MS_WIN64 has a different
definition)?  It isn't always *used* (the conditions under which
_portable_fseek() uses it are quite complex), but *when* it is used,
this seems to be the most common definition...

Patch:

*** fileobject.c	2001/01/15 10:36:56	2.106
--- fileobject.c	2001/01/15 16:02:06
***************
*** 58,66 ****
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
! /* NOTE: this is only used on older
!    NetBSD prior to f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif
  
--- 58,65 ----
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #else
! /* Fallback for older systems that don't have the f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif


I'll check this in after 24 hours unless a better idea comes up.
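
[The idea behind the fallback macro can be tried from Python as well. A
minimal sketch, assuming only the standard os and tempfile modules; the
function name tell_fd is made up for illustration. Seeking zero bytes
relative to the current position moves nothing and returns the absolute
offset, which is what lseek((fd),0,SEEK_CUR) does in the patch.]

```python
import os
import tempfile


def tell_fd(fd):
    """Return the current offset of a file descriptor.

    Mirrors the fallback TELL64(fd) definition: a zero-byte seek
    relative to the current position reports where we are.
    """
    return os.lseek(fd, 0, os.SEEK_CUR)


with tempfile.TemporaryFile() as f:
    fd = f.fileno()
    # Write through the descriptor directly so no buffering is involved.
    os.write(fd, b"hello")
    offset_after_write = tell_fd(fd)
```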

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Mon Jan 15 16:17:07 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:17:07 -0500
Subject: [Python-Dev] PEP 205 comments
In-Reply-To: Your message of "Fri, 12 Jan 2001 23:19:57 +0100."
 <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de>
References: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de>
Message-ID: <200101151617.LAA22359@cj20424-a.reston1.va.home.com>

I'll leave most of this to Fred, but I'll reply to two items (Fred can
add these replies to the PEP):

> Again on proxies, there is no discussion or documentation of the
> ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
> AttributeError seem to be just as fine or better.

RuntimeError was my suggestion.  The error doesn't really qualify as a
LookupError in my view (there's no key that could be valid or invalid)
and ValueError seems too general (that's typically used for
out-of-range arguments and unparseable strings and the like).  Do you
have a reason why RuntimeError is inappropriate?
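
[A minimal sketch of the situation under discussion, written against
today's weakref module rather than the PEP 205 draft. The class name Node
is an assumption for illustration. In current Python 3, ReferenceError is
a built-in that derives directly from Exception, so the base class debated
here may not match what you see today.]

```python
import gc
import weakref


class Node:
    """Plain class; instances of it can be weakly referenced."""


node = Node()
proxy = weakref.proxy(node)
proxy.value = 1          # attribute access is forwarded to the live object

del node                 # drop the only strong reference
gc.collect()             # not needed on CPython, harmless elsewhere

try:
    proxy.value          # the referent is gone, so this must fail
    dead_proxy_error = None
except ReferenceError as exc:
    dead_proxy_error = exc
```

There is no key or out-of-range argument involved in the failure, which is
the point made above against LookupError and ValueError.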

> On to the type type extensions: Should there be a type flag indicating
> presence of tp_weaklistoffset? It appears that the type structure had
> tp_xxx7 for a long time, so likely all in-use binary modules have
> that field set to zero. Is that sufficient?

Yes, that should be sufficient.  (I'm also going to claim tp_xxx7 for
the rich comparison function slot, but either patch can be modified to
use tp_xxx8 instead.)  Maybe it's time to add a bunch of new spares?

> Thanks for reading all of this message,

You're welcome.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Mon Jan 15 16:39:03 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:39:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
 <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
 <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
 <14946.64166.348139.425223@anthem.wooz.org>
 <200101151538.KAA21937@cj20424-a.reston1.va.home.com>
Message-ID: <14947.10151.575008.869188@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    GvR> the reply-to header shall be removed.

I'm more than happy to do this (I remember adding the reply-to munging
reluctantly).  Understand one thing: anybody who naively replies to
the whole list will send those replies to python-checkins, not
python-dev.

Still want it?

-Barry



From barry@digicool.com  Mon Jan 15 16:46:28 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:46:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
 <3A62C82C.EA25AAF5@lemburg.com>
 <14946.63472.282750.828218@anthem.wooz.org>
 <20010115115719.B919@kronos.cnri.reston.va.us>
Message-ID: <14947.10596.733726.995351@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

    AK> There are a few patches buried in the back archives, but I
    AK> don't know of any outstanding bugfixes, so please report
    AK> whatever problem you're seeing.

Okay, will do.

    AK> Oh, and Barry, did the issue holding up your patch for adding
    AK> shar support (#102313) ever get resolved?

No, but I'll try to take another poke at it.

-Barry



From moshez@zadka.site.co.il  Tue Jan 16 01:07:48 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 03:07:48 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010116010748.41869A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001, Guido van Rossum <gvanrossum@users.sourceforge.net> wrote:

> Modified Files:
> 	Meta.py 
> Log Message:
> Geoffrey Gerrietts discovered that a KeyError was caught that probably
> should have been a NameError.  I'm checking in a change that catches
> both, just to be sure -- I can't be bothered trying to understand this
> code any more. :-)
...
> !             except (KeyError, AttributeError):

Ummmm....can you be bothered to make sure you really meant AttributeError
when you said NameError? <wink>
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From guido@python.org  Mon Jan 15 17:06:07 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 12:06:07 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 11:39:03 EST."
 <14947.10151.575008.869188@anthem.wooz.org>
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com>
 <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <200101151706.MAA22884@cj20424-a.reston1.va.home.com>

> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.
> 
> Still want it?

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Mon Jan 15 17:11:29 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 12:11:29 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
 <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
 <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
 <14946.64166.348139.425223@anthem.wooz.org>
 <200101151538.KAA21937@cj20424-a.reston1.va.home.com>
 <14947.10151.575008.869188@anthem.wooz.org>
 <200101151706.MAA22884@cj20424-a.reston1.va.home.com>
Message-ID: <14947.12097.613433.580928@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

    >> I'm more than happy to do this (I remember adding the reply-to
    >> munging reluctantly).  Understand one thing: anybody who
    >> naively replies to the whole list will send those replies to
    >> python-checkins, not python-dev.  Still want it?

    GvR> Yes.

Done.



From thomas@xs4all.net  Mon Jan 15 17:34:37 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 18:34:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib ftplib.py,1.47,1.48
In-Reply-To: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 15, 2001 at 08:32:52AM -0800
References: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115183437.J1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:32:52AM -0800, Guido van Rossum wrote:

> This is slightly controversial, but after reading the argumentation in
> the bug tracker for and against, I believe this is the right solution.

It's really only slightly controversial. 'mfisk' convinced me too, and I
used to use ftp to a server behind a firewall :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Mon Jan 15 18:21:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 19:21:54 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>
Message-ID: <3A633FC2.11F90E94@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal@lemburg.com> wrote:
> 
> > In fact I like it, that replies go to python-dev ... after all,
> > that's where these things should be discussed.
> 
> Well, that's the mailing list where things should be discussed.
> But when I press the "Reply" button (as opposed to "Reply to List" button)
> I expect my e-mail to go to the person originating the e-mail.
> Reply-To: means "I'd like to get replies to some other address".
> What if, say, a checkin message relates to some private topic
> I'd discussed with someone: I'd like to reply to him personally.
> 
> I agree that responses to Python-Checkins should be handled on Python-Dev:
> that's what the mail-followup-to header is for.

Ah, ok. I thought you pressed Reply-All and then wondered why
your message got copied to python-dev...
 
> > BTW, in case you misunderstood my reply: it would indeed make
> > sense to automate these kinds of check (tabnanny et al).
> 
> Oh, ok. The "cron" part threw me off (why cron?)

CRON is what's used on Unix to implement jobs which run
on a regular basis... perhaps we just need to set up the
CRON job in timbot though ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@python.org  Mon Jan 15 18:35:54 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 13:35:54 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: Your message of "Tue, 16 Jan 2001 03:07:48 +0200."
 <20010116010748.41869A828@darjeeling.zadka.site.co.il>
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
 <20010116010748.41869A828@darjeeling.zadka.site.co.il>
Message-ID: <200101151835.NAA26712@cj20424-a.reston1.va.home.com>

> > Modified Files:
> > 	Meta.py 
> > Log Message:
> > Geoffrey Gerrietts discovered that a KeyError was caught that probably
> > should have been a NameError.  I'm checking in a change that catches
> > both, just to be sure -- I can't be bothered trying to understand this
> > code any more. :-)
> ...
> > !             except (KeyError, AttributeError):
> 
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

The code is correct.  Ignore the comment. :-(
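
[A small sketch of why the distinction matters; it is not taken from
Meta.py, and the function name classify is made up. A missing dictionary
key, a missing attribute and an unbound name raise three different
exceptions, and a tuple in the except clause catches only the types it
lists.]

```python
def classify(action):
    """Run action() and report how the except clauses treat its failure."""
    try:
        action()
    except (KeyError, AttributeError) as exc:
        # Same tuple as in the checkin: both lookup failures land here.
        return "caught " + type(exc).__name__
    except NameError as exc:
        # A NameError would have slipped past the tuple above.
        return "missed " + type(exc).__name__
    return "ok"


missing_key = classify(lambda: {}["spam"])
missing_attr = classify(lambda: object().spam)
missing_name = classify(lambda: this_name_is_not_defined)
```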

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@arctrix.com  Mon Jan 15 11:55:51 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 03:55:51 -0800
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 11:39:03AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <20010115035550.B4336@glacier.fnational.com>

[Barry on removing the reply-to header on python-checkins messages]
> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.

Could you make the script generate mail-followup-to instead of
reply-to?  I know it's not a standard header, but some MUAs
understand it and it is exactly what is needed to solve this
problem.  I think promoting it is a good thing.

  Neil


From thomas@xs4all.net  Mon Jan 15 18:59:12 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 19:59:12 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <20010115035550.B4336@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 15, 2001 at 03:55:51AM -0800
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org> <20010115035550.B4336@glacier.fnational.com>
Message-ID: <20010115195912.K1005@xs4all.nl>

On Mon, Jan 15, 2001 at 03:55:51AM -0800, Neil Schemenauer wrote:
> [Barry on removing the reply-to header on python-checkins messages]
> > I'm more than happy to do this (I remember adding the reply-to munging
> > reluctantly).  Understand one thing: anybody who naively replies to
> > the whole list will send those replies to python-checkins, not
> > python-dev.

> Could you make the script generate mail-followup-to instead of
> reply-to?  I know it's not a standard header, but some MUAs
> understand it and it is exactly what is needed to solve this
> problem.  I think promoting it is a good thing.

The script just calls '/bin/mail'. The Reply-To munging is done by Mailman,
which is slightly more than 'a script'. syncmail could do it, but that would
mean using sendmail instead of mail, and writing all headers itself.
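
[A minimal sketch of what "writing all headers itself" could look like
with today's email package. This is not syncmail's code: the function
name, the list addresses and the sample values are assumptions for
illustration, and Mail-Followup-To remains a non-standard header that only
some mail clients honour.]

```python
from email.message import EmailMessage


def build_checkin_mail(sender, subject, body):
    """Build a checkin notification with explicit headers.

    Hypothetical helper; the addresses below are assumed values.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "python-checkins@python.org"
    msg["Subject"] = subject
    # Leave Reply-To alone so a plain "reply" reaches the committer;
    # clients that understand Mail-Followup-To send group replies here.
    msg["Mail-Followup-To"] = "python-dev@python.org"
    msg.set_content(body)
    return msg


mail = build_checkin_mail(
    "committer@example.com",
    "CVS: python/dist/src/Lib example.py,1.1,1.2",
    "Log Message:\nExample change.\n",
)
```

The resulting message could then be handed to sendmail or smtplib instead
of /bin/mail, which is the extra work mentioned above.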

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@python.org  Mon Jan 15 19:17:27 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 14:17:27 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 14:14:49 EST."
 <14934.7465.360749.199433@localhost.localdomain>
References: <14934.7465.360749.199433@localhost.localdomain>
Message-ID: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>

There doesn't seem to be a lot of enthusiasm for a Unittest
bakeoff...  Certainly I don't think I'll get to this myself before the
conference.

How about the following though: talking of low-hanging fruit, Tim's
doctest module is an excellent thing even if it isn't a unit testing
framework!  (I found this out when I played with it -- it's real easy
to get used to...)

Would anyone object against Tim checking this in?  Since it isn't a
contender in the unit test bake-off, it shouldn't affect the outcome
there at all.
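
For anyone who has not seen the doctest idea: examples written in a
docstring as an interactive session are re-run and their output compared.
A minimal sketch follows; the function add and the helper run_examples
are made up for the illustration, and the calls shown are from the
doctest module as it exists in the current standard library, which may
differ from the version under discussion here.

```python
import doctest


def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add('py', 'thon')
    'python'
    """
    return a + b


def run_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Arguments: docstring, globals for the examples, name, filename, lineno.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


print(run_examples(add))
```

In everyday use one would more likely call doctest.testmod() at the
bottom of the module instead of driving the runner by hand.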

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Mon Jan 15 19:40:03 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 14:40:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
 <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
 <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
 <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
 <14946.64166.348139.425223@anthem.wooz.org>
 <200101151538.KAA21937@cj20424-a.reston1.va.home.com>
 <14947.10151.575008.869188@anthem.wooz.org>
 <20010115035550.B4336@glacier.fnational.com>
 <20010115195912.K1005@xs4all.nl>
Message-ID: <14947.21011.310090.686632@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

    >> Could you make the script generate mail-followup-to instead of
    >> reply-to?  I know it's not a standard header but some MUA
    >> understand it and it is exactly what is needed to solve this
    >> problem.  I think promoting it is a good thing.

    TW> The script just calls '/bin/mail'. The Reply-To munging is
    TW> done by Mailman, which is slightly more than 'a
    TW> script'. syncmail could do it, but that would mean using
    TW> sendmail instead of mail, and writing all headers itself.

I'm sure Fred or I would be happy to review such a patch to syncmail
<wink>.

-Barry


From jeremy@alum.mit.edu  Mon Jan 15 19:31:44 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 14:31:44 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
 <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <14947.20512.140859.119597@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido@python.org> writes:

  GvR> There doesn't seem to be a lot of enthusiasm for a Unittest
  GvR> bakeoff...  Certainly I don't think I'll get to this myself
  GvR> before the conference.

Let's have all the interested parties vote now, then.  It would
certainly be helpful to have the new unittest module in the alpha
release of 2.1.  I'd like to write some new tests and I'd rather use
the new stuff than the old stuff.

Jeremy


From tim.one@home.com  Mon Jan 15 20:01:52 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:01:52 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIIAA.tim.one@home.com>

[Barry]
> ...
> Understand one thing: anybody who naively replies to the whole
> list will send those replies to python-checkins, not python-dev.

IIRC, that's why the redirect to python-dev was added to begin with:  of
course people will reply to python-checkins, and then the next guy x-posts
to python-dev too, and the next three in turn variously remove one or the
other groups, or keep both or add c.l.py too.  In the end, no single archive
contains a coherent record on its own, and the random mix of "[Python-Dev]"
and "[Python-checkins]" Subject tags even make it impossible to sort by
(true) subject easily in your own mail client.

> Still want it?

Don't care <wink -- but simple tech approaches to human carelessness (the
true problem here!) don't work no matter which way you flip the switch>.



From tim.one@home.com  Mon Jan 15 20:08:15 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:08:15 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMOIIAA.tim.one@home.com>

[<tim_one@users.sourceforge.net>]
> Modified Files:
> 	tabnanny.py
> Log Message:
> Whitespace normalization.

[Moshe]
> hmmmmmm.......

LOL!  I was hoping nobody would notice that <0.7 wink>.  The appalling truth
is that late in tabnanny's development I deliberately indented a large block
of code by one column, and actually thought it was a good idea at the time.
I'm as delighted to see that finally fixed as I am embarrassed by the
necessity.

although-perhaps-more-appalled-that-there-was-followup-
    debate-about-followups-containing-more-msgs-than-there-
    were-characters-in-moshe's-followup-ly y'rs  - tim



From ping@lfw.org  Mon Jan 15 20:10:10 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 12:10:10 -0800 (PST)
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
 outside of Python)
In-Reply-To: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Tony J Ibbs (Tibs) wrote:
> I get the documentation for <name> OK, but it is preceded with a line
> claiming that:
> 
> 	The system cannot find the path specified.

Thanks for the NT testing.  That's funny -- i put in a special case
for Windows to avoid messages like the above a couple of days ago.
How recently did you download pydoc.py?  Does your copy contain:

    if hasattr(sys, 'winver'):
        return lambda text: tempfilepager(text, 'more')

?

> 	<name>.py -- information about the module
> 
> which, when pydoc'ed, results in a NAME line which starts with <name>
> twice...
> Of course, if I'm the only person doing this, I'll just have to, well,
> stop...)

I think i'm going to ask you to stop, unless Guido prefers
otherwise.  Guido, do you have a style pronouncement for module
docstrings?

> A request - a "-f" switch to allow the user to specify a particular
> Python file (i.e., something not on the PYTHONPATH).

Yes, it's on my to-do list.

So you can see what i'm up to, here's my current to-do list:

    make boldness optional (only if using more/less?  only Unix?)
    document a .py file given on the command line
  + webserver in background
    help should have a repr
    write a better htmlrepr (\n should look special, max length limit, etc.)
    generate docs from lib HTML
    generate HTML index from precis and __path__ and package contents list
    have help(...) produce a directory of available things to ask for help on
    curses.wrapper is broken: both function and package
    respect package __all__
    coherent answer to .py vs .pyc: do we show .pyc?
    fix getcomments() bug: last two lines stuck together
  + grey out shadowed modules/packages
    refactor .py/.pyc/.module.so/.module.so.1 listers in htmldoc, textdoc
    skip __main__ module
  + index built-in modules too
    Windows and Mac testing
    default to HTTP mode on GUI platforms?  (win, mac)

The ones marked with + i consider done.  Feel free to comment on
or suggest priorities for the others; in particular, what do you
think of the last one?  The idea is that double-clicking on
pydoc.py in Windows or MacOS could launch the server and then open
the localhost URL using webbrowser.py to display the documentation
index.  Should it do this by default?


-- ?!ng



From guido@python.org  Mon Jan 15 20:41:25 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 15:41:25 -0500
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Mon, 15 Jan 2001 12:10:10 PST."
 <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>

> > 	<name>.py -- information about the module
> > 
> > which, when pydoc'ed, results in a NAME line which starts with <name>
> > twice...
> > Of course, if I'm the only person doing this, I'll just have to, well,
> > stop...)
> 
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

I'm with Ping.  None of the examples in the style guide start the
docstring with the function name.  Almost none of the standard library
modules start their module docstring with the module name (codecs is
an exception, but I didn't write it :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From bckfnn@worldonline.dk  Mon Jan 15 20:45:02 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Mon, 15 Jan 2001 20:45:02 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <3A62C66E.2BB69E61@lemburg.com>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com>
Message-ID: <3a636122.45847835@smtp.worldonline.dk>

[Fredrik Lundh]

> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

[M.-A. Lemburg]

>Since the Unicode character names are probably
>not used for performance sensitive tasks, I suggest to
>checkin the smallest version possible.
>
>If it is too much work to get Finn's version recoded in C
>(presuming it's written in Java), then I'd suggest checking
>in your version until someone comes up with a yet smaller
>edition.

FWIW, I agree that the 160k module should be used. Please, nobody should
use the jython compression as an argument to delay any improvements in
CPython. 

I certainly didn't post because I wanted to complicate your processes. I
just wanted to show off <wink>.

regards,
finn


From fredrik@effbot.org  Mon Jan 15 20:58:11 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 15 Jan 2001 21:58:11 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com> <3a636122.45847835@smtp.worldonline.dk>
Message-ID: <001f01c07f35$e2c09500$e46940d5@hagrid>

mal, finn:
> >If it is too much work to get Finn's version recoded in C
> >(presuming it's written in Java), then I'd suggest checking
> >in your version until someone comes up with a yet smaller
> >edition.
> 
> FWIW, I agree that the 160k module should be used. Please, nobody should
> use the jython compression as an argument to delay any improvements in
> CPython. 

okay, unless someone throws in a -1 vote, I'll check
this in tomorrow.

Cheers /F



From tim.one@home.com  Mon Jan 15 20:57:26 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:57:26 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>

[Fredrik Lundh]
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
>
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
>
> Should I check it in?

Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
days to deal with surprises before the alpha release.  With 300K sitting on
the table waiting to be taken, it's not worth delaying one hour to worry
about 60K additional that may or may not be achievable later.



From ping@lfw.org  Mon Jan 15 21:02:38 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 13:02:38 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses
 Meta.py,1.3,1.4
In-Reply-To: <20010116010748.41869A828@darjeeling.zadka.site.co.il>
Message-ID: <Pine.LNX.4.10.10101151302070.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Moshe Zadka wrote:
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

Nice bugfix.  Didn't know you were back from the sex-change operation.


-- ?!ng



From tim.one@home.com  Mon Jan 15 21:15:54 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 16:15:54 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENFIIAA.tim.one@home.com>

[Guido]
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...

I'm enthusiastic, but ...

> Certainly I don't think I'll get to this myself before the
> conference.

Ditto.  Takes time that's not there.

> ...
> Would anyone object against Tim checking [doctest] in?

You suggested that before, and so it was already on my 2.1a1 todo list.
Hoped to get to it over the weekend but didn't.  Hope to get to it today,
but won't <wink - I hope>.  On the chance that I do, anyone inclined to
object should do so before the sun sets in Reston.

or-if-it-never-sets-the-world-ends-anyway-ly y'rs  - tim



From akuchlin@mems-exchange.org  Mon Jan 15 21:26:19 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 16:26:19 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.20512.140859.119597@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 15, 2001 at 02:31:44PM -0500
References: <14934.7465.360749.199433@localhost.localdomain> <200101151917.OAA29687@cj20424-a.reston1.va.home.com> <14947.20512.140859.119597@localhost.localdomain>
Message-ID: <20010115162619.A19484@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>Let's have all the interested parties vote now, then.  It would
>certainly be helpful to have the new unittest module in the alpha
>release of 2.1.  I'd like to write some new tests and I'd rather use
>the new stuff than the old stuff.

Huh?  If no one has tried the different modules, what's the point of
having a vote?  (Given that doctest is going to be added, though, it 
should be checked in ASAP.)

--amk


From trentm@ActiveState.com  Mon Jan 15 22:10:26 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 14:10:26 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:02:39AM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com>
Message-ID: <20010115141026.I29870@ActiveState.com>

On Mon, Jan 15, 2001 at 11:02:39AM -0500, Guido van Rossum wrote:
> Greg Stein noticed me checking in *yet* another system that needs
> the fallback TELL64() definition in fileobject.c, and wrote:
> 
> > All of those #ifdefs could be tossed and it would be more robust (long term)
> > if an autoconf macro were used to specify when TELL64 should be defined.
> > 
> > [ I've looked thru fileobject.c and am a bit confused: the conditions for
> >   defining TELL64 do not match the conditions for *using* it. that would
> >   seem to imply a semantic error somewhere and/or a potential gotcha when
> >   they get skewed (like I assume what happened to FreeBSD). simplifying with
> >   an autoconf macro may help to rationalize it. ]

The problem is that these systems lie when they "say" (according to Python's
configure tests for HAVE_LARGEFILE_SUPPORT) that they have largefile support.
This seems to have happened for a particular release of BSD (which has since
been fixed). I think that the Right(tm) (meaning the cleanest solution where
the tests and definitions in the code actually represent the truth) answer is
a proper configure test (sort of as Greg suggests). I don't really feel
comfortable writing that patch (because (1) lack of time and (2) inability to
test, I don't have any access to any of these BSD machines).

[Guido]
> 
> I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
> universal fallback, why not just define TELL64 to be that if it's not
> previously defined (currently only MS_WIN64 has a different
> definition)?  It isn't always *used* (the conditions under which
> _portable_fseek() uses it are quite complex), but *when* it is used,
> this seems to be the most common definition...

While I agree that it is annoying that the build breaks for these platforms I
think that it is appropriate that the build breaks. Having to put these:
    #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
definitions here gives a nice list of those platforms that *do* lie. I would
prefer that to having an "#else" block that just captures all other cases,
but that is just my opinion.
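
The fallback itself is just "seek zero bytes from the current position
and report where you are". A minimal sketch of that idea in Python, where
os.lseek wraps the same system call; the helper name tell_via_lseek is
made up here, and this is an illustration, not the C macro from
fileobject.c:

```python
import os
import tempfile


def tell_via_lseek(fd):
    """Return the current offset of fd without moving it."""
    return os.lseek(fd, 0, os.SEEK_CUR)


with tempfile.TemporaryFile() as f:
    f.write(b"0123456789")
    f.flush()  # push buffered bytes so the descriptor's offset is current
    position = tell_via_lseek(f.fileno())

print(position)
```

Whether the returned offset can exceed 32 bits depends on the width of
off_t in lseek() on the platform, which is the point being argued.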

Options (in order of preference):

(1) Update the configure test for HAVE_LARGEFILE_SUPPORT such that the proper
    versions of these OSes do *not* #define it.
(2) Guido's suggestion.
(2) Keep extending the "#elif" list.

 ^---- using (2) twice was intentional


Trent

> 
> *** fileobject.c	2001/01/15 10:36:56	2.106
> --- fileobject.c	2001/01/15 16:02:06
> ***************
> *** 58,66 ****
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
> ! /* NOTE: this is only used on older
> !    NetBSD prior to f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
>   
> --- 58,65 ----
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #else
> ! /* Fallback for older systems that don't have the f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
> 
> 
> I'll check this in after 24 hours unless a better idea comes up.
> 

Better idea but no patch. :(

Trent


-- 
Trent Mick
TrentM@ActiveState.com


From skip@mojam.com (Skip Montanaro)  Mon Jan 15 22:10:36 2001
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 15 Jan 2001 16:10:36 -0600 (CST)
Subject: [Python-Dev] should we start instrumenting modules with __all__?
Message-ID: <14947.30044.934204.951564@beluga.mojam.com>

I see the from-import-* patch for __all__ has been checked in.  Should we
make an effort to add __all__ to at least some modules before 2.1a1?
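
As a reminder of what __all__ buys: when a module defines it, a star
import binds only the listed names. A minimal sketch, with the module
name demo_mod and its two functions invented for the illustration (the
module is built in memory so the example is self-contained):

```python
import sys
import types

# Source of a throwaway module; "demo_mod" is a made-up name.
source = '''
__all__ = ["public"]

def public():
    return "exported"

def helper():
    return "not exported"
'''
demo = types.ModuleType("demo_mod")
exec(source, demo.__dict__)
sys.modules["demo_mod"] = demo

# Star-import into a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Without the __all__ line, helper would be imported as well, since its
name does not start with an underscore.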

Skip


From akuchlin@mems-exchange.org  Mon Jan 15 22:13:03 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 17:13:03 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 12, 2001 at 08:51:51AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>
Message-ID: <20010115171303.A23626@kronos.cnri.reston.va.us>

On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
>Ah.  It's very simple.  I create a directory "linux" as a subdirectory
>of the Python source tree (i.e. at the same level as Lib, Objects,
>etc.).  Then I chdir into that directory, and I say "../configure".
>The configure script creates subdirectories to hold the object files ...
>Then I say "make" and it builds Python.  

This doesn't work at all for me in my copy of the CVS tree.  Are there
other steps or requirements to make this work?  (Transcript available
upon request, but I suspect I'm missing something simple.)

--amk



From tim.one@home.com  Mon Jan 15 22:32:51 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 17:32:51 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>

[Jeremy]
> Let's have all the interested parties vote now, then.  It would
> certainly be helpful to have the new unittest module in the alpha
> release of 2.1.  I'd like to write some new tests and I'd rather use
> the new stuff than the old stuff.

[Andrew]
> Huh?  If no one has tried the different modules, what's the point of
> having a vote?

Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
would agree this is not an ideal decision procedure.

the-question-is-whether-it's-better-than-paralysis-ly y'rs  - tim



From ping@lfw.org  Mon Jan 15 22:35:47 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 14:35:47 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
Message-ID: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>

I don't know whether this is going to be obvious or controversial,
but here goes.  Most of the time we're used to seeing a newline as
'\n', not as '\012', and newlines are typed in as '\n'.

A newcomer to Python is likely to do

    >>> 'hello\n'
    'hello\012'

and ask "what's \012?" -- whereupon one has to explain that it's an
octal escape, that 012 in octal equals 10, and that chr(10) is
newline, which is the same as '\n'.  You're bound to run into this,
and you'll see \012 a lot, because \n is such a common character.
Aside from being slightly more frightening, '\012' also takes up
twice as many characters as necessary.

So... i'm submitting a patch that causes the three most common
special whitespace characters, '\n', '\r', and '\t', to appear in
their natural form rather than as octal escapes when strings are
printed and repr()ed.

Mm?
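
A small sketch of the behaviour being proposed, as it can be observed in
current Python releases, where repr() uses the short escapes. The sample
strings are arbitrary.

```python
samples = ["hello\n", "a\tb", "line\r\n"]

# repr() of each sample, collected so the results can be inspected.
rendered = [repr(s) for s in samples]
for text in rendered:
    print(text)

# The octal escape and the short escape denote the same character.
same_char = ("\012" == "\n")
```

The octal form is still accepted on input; only the output form differs.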


-- ?!ng



From esr@thyrsus.com  Mon Jan 15 23:15:50 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 18:15:50 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Mon, Jan 15, 2001 at 02:35:47PM -0800
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010115181550.A11566@thyrsus.com>

Ka-Ping Yee <ping@lfw.org>:
> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
standard escape set (hmmm...am I missing any?)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.


From thomas@xs4all.net  Mon Jan 15 23:49:30 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:49:30 +0100
Subject: [Python-Dev] time functions
Message-ID: <20010116004930.L1005@xs4all.nl>

Maybe this is a dead and buried subject, but I'm going to try anyway, since
everyone's been in such a wonderful 'let's fix ugly but harmless nits' mood
lately :)

Why do we need the following atrocity <wink>:

  timestr = time.strftime("<format>", time.localtime(time.time()))

To do the simple task of 'date +<format>' ?  I never really understood why
there isn't a way to get a timetuple directly from C, rather than converting
a float that we got from C a bytecode before, even though the higher level
almost always deals with timetuples. How about making the float-to-tuple
functions (time.localtime, time.gmtime) accept 0 arguments as well, and
defaulting to time.time() in that case ? Even better, how about doing the
same for the other functions, too ? (where it makes sense, of course :) 

Actually, I'll split it up in three proposals:

- Making the time in time.strftime default to 'now', so that the above
  becomes the ever so slightly confusing:

  timestr = time.strftime("<format>")
  (confusing because it looks a bit like a regexp constructor...)

- Making the time in time.asctime and time.ctime optional, defaulting to
  'now', so you can just call 'time.ctime()' without having to pass
  time.time() (which are about half the calls in my own code :)

- Making the time in time.localtime and time.gmtime default to 'now'.

I'm 0/+1/+1 myself :)
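
Side by side, the long spelling and the three proposed shortcuts look
like the sketch below. It is written against the time module in current
Python releases, where these optional arguments exist; the format string
is an arbitrary example.

```python
import time

fmt = "%Y-%m-%d %H:%M"

# The long spelling: fetch a float, convert it to a tuple, then format it.
long_form = time.strftime(fmt, time.localtime(time.time()))

# Proposal 1: the time tuple in strftime defaults to "now".
short_form = time.strftime(fmt)

# Proposal 2: ctime and asctime default to "now".
stamp = time.ctime()

# Proposal 3: localtime and gmtime default to "now".
local_now = time.localtime()
utc_now = time.gmtime()

print(long_form, short_form, stamp)
```

The two formatted strings can differ only if the clock crosses a minute
boundary between the two calls.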

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Mon Jan 15 23:55:36 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:55:36 +0100
Subject: [Python-Dev] TELL64
In-Reply-To: <20010115141026.I29870@ActiveState.com>; from trentm@ActiveState.com on Mon, Jan 15, 2001 at 02:10:26PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com>
Message-ID: <20010116005536.M1005@xs4all.nl>

On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:

> > > [ I've looked thru fileobject.c and am a bit confused: the conditions
> > >   for defining TELL64 do not match the conditions for *using* it. that
> > >   would seem to imply a semantic error somewhere and/or a potential
> > >   gotcha when they get skewed (like I assume what happened to
> > >   FreeBSD). simplifying with an autoconf macro may help to rationalize
> > >   it. ]

> The problem is that these systems lie when they "say" (according to
> Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> largefile support. This seems to have happened for a particular release of
> BSD (which has since been fixed). I think that the Right(tm) (meaning the
> cleanest solution where the tests and definitions in the code actually
> represent the truth) answer is a proper configure test (sort of as Greg
> suggests). I don't really feel comfortable writing that patch (because (1)
> lack of time and (2) inability to test, I don't have any access to any of
> these BSD machines).

There is no (longer any) 'single BSD release', so I doubt it has 'since been
fixed' :) We should consider the different BSD derived OSes as separate, if
slightly related, systems (much like SunOS <-> BSD.) The problem in the BSDI
case is really simple: the autoconf test doesn't test whether the fs really
supports large files, but rather whether the system has an off_t type that
is 64 bits. BSDI has that type, but does not actually use it in any of the
seek/tell functions. This has not been 'fixed' as far as I know, precisely
because it isn't 'broken' :)

I tried to fix the test, but I have been completely unable to find a proper
test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
out what, say, 'zsh' uses -- black autoconf magic, for sure.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From trentm@ActiveState.com  Tue Jan 16 00:24:54 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 16:24:54 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116005536.M1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:55:36AM +0100
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>
Message-ID: <20010115162454.D3864@ActiveState.com>

On Tue, Jan 16, 2001 at 12:55:36AM +0100, Thomas Wouters wrote:
> On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:
> 
> > The problem is that these systems lie when they "say" (according to
> > Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> > largefile support. This seems to have happened for a particular release of
> > BSD (which has since been fixed). I think that the Right(tm) (meaning the
> > cleanest solution where the tests and definitions in the code actually
> > represent the truth) answer is a proper configure test (sort of as Greg
> > suggests). I don't really feel comfortable writing that patch (because (1)
> > lack of time and (2) inability to test, I don't have any access to any of
> > these BSD machines).
> 
> There is no (longer any) 'single BSD release', so I doubt it has 'since been
> fixed' :) 

Okay sure (showing my ignorance). My only understanding was that this
"lying" was the case for some unspecified BSDs a while ago but that the
latest releases of any of them *did* have largefile support.

> 
> I tried to fix the test, but I have been completely unable to find a proper
> test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
> out what, say, 'zsh' uses -- black autoconf magic, for sure.

Hmmm... if one could encode whether or not a 64-bit fseek could be
implemented (either using fseek, fseeko, fseek64, _fseek, fsetpos/fgetpos,
etc.) in a short C program then that would be the test (or at least most of
the test, might have to see if ftell could be implemented as well). Or are
there other requirements?


Trent

-- 
Trent Mick
TrentM@ActiveState.com


From esr@thyrsus.com  Tue Jan 16 01:26:14 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 20:26:14 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:49:30AM +0100
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <20010115202614.A11732@thyrsus.com>

Thomas Wouters <thomas@xs4all.net>:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Likewise.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Never trust a man who praises compassion while pointing a gun at you.


From barry@digicool.com  Tue Jan 16 02:14:33 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 21:14:33 -0500
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <14947.44681.254332.976234@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

    TW> I'm 0/+1/+1 myself :)

Maybe I'm an inch on the +0/+1/+1 side. :)


From jeremy@alum.mit.edu  Tue Jan 16 00:11:59 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 19:11:59 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
References: <14934.7465.360749.199433@localhost.localdomain>
 <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
 <14947.20512.140859.119597@localhost.localdomain>
 <20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <14947.37327.395622.66435@localhost.localdomain>

>>>>> "AMK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:

  AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
  >> Let's have all the interested parties vote now, then.  It would
  >> certainly be helpful to have the new unittest module in the alpha
  >> release of 2.1.  I'd like to write some new tests and I'd rather
  >> use the new stuff than the old stuff.

  AMK> Huh?  If no one has tried the different modules, what's the
  AMK> point of having a vote?  (Given that doctest is going to be
  AMK> added, though, it should be checked in ASAP.)

Guido is the only person that said he hadn't tried anything.  If
others have given it a whirl, they ought to chime in now.  If very few
people have given them a try, we should decide whether we wait for
them or proceed without them.  We can't wait indefinitely.  I'm not
sure when we need to decide.

Jeremy


From nas@arctrix.com  Mon Jan 15 19:40:55 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 11:40:55 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Jan 13, 2001 at 05:25:12PM -0500
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net> <20010113071758.C28643@glacier.fnational.com> <200101132225.RAA03197@cj20424-a.reston1.va.home.com>
Message-ID: <20010115114055.A5879@glacier.fnational.com>

On Sat, Jan 13, 2001 at 05:25:12PM -0500, Guido van Rossum wrote:
> Do you have a tool that detects leaks?

debauch is showing promise although it is still pretty rough
around the edges.  memprof is another option.  It looks like
init_exceptions may be leaking memory.  Some debauch output:

 1      Leaked Memory 0x0849cf98, size 44 (from 0x0) AllocTime: 79269 FreeTime: 43436      
        return stack:
                ???:?? (0x40016005) 
                classobject.c:84 (0x805c16d) <PyClass_New+631>
                exceptions.c:337 (0x8088594) <make_Exception+250>
                exceptions.c:1061 (0x80898dc) <init_exceptions+232>
                pythonrun.c:151 (0x8053581) <Py_Initialize+573>
                loop.c:23 (0x8053305) <main+101>

I haven't figured out if this is a real leak yet.

  Neil


From michel@digicool.com  Tue Jan 16 06:33:00 2001
From: michel@digicool.com (Michel Pelletier)
Date: Mon, 15 Jan 2001 22:33:00 -0800 (PST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.37327.395622.66435@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101152216200.2373-100000@localhost.localdomain>

On Mon, 15 Jan 2001, Jeremy Hylton wrote:

> >>>>> "AMK" == Andrew Kuchling <akuchlin@mems-exchange.org> writes:
> 
>   AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>   >> Let's have all the interested parties vote now, then.  It would
>   >> certainly be helpful to have the new unittest module in the alpha
>   >> release of 2.1.  I'd like to write some new tests and I'd rather
>   >> use the new stuff than the old stuff.
> 
>   AMK> Huh?  If no one has tried the different modules, what's the
>   AMK> point of having a vote?  (Given that doctest is going to be
>   AMK> added, though, it should be checked in ASAP.)
> 
> Guido is the only person that said he hadn't tried anything.  If
> others have given it a whirl, they ought to chime in now.  

I have used pyunit to create a simple set of tests.  It seemed to do the
job well and it was very easy. I'd never done it before and the docs were
fat and A+.

I can only give a one-sided opinion.  I know of AMK's work but I have not
used it.  Are there others?

-Michel



From akuchlin@mems-exchange.org  Tue Jan 16 03:03:31 2001
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Mon, 15 Jan 2001 22:03:31 -0500
Subject: [Python-Dev] Detecting install time
Message-ID: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com>

For PEP 229, the setup.py script needs to figure out if it's running
from the build directory, because then distutils.sysconfig needs to
look at different config files; ./Modules/Makefile instead of
/usr/lib/python2.0/config/Makefile, and so forth.  Is there a
simple/clean way to do this?

--amk





From guido@python.org  Tue Jan 16 03:21:43 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:21:43 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Mon, 15 Jan 2001 17:13:03 EST."
 <20010115171303.A23626@kronos.cnri.reston.va.us>
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>
 <20010115171303.A23626@kronos.cnri.reston.va.us>
Message-ID: <200101160321.WAA00648@cj20424-a.reston1.va.home.com>

> On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
> >Ah.  It's very simple.  I create a directory "linux" as a subdirectory
> >of the Python source tree (i.e. at the same level as Lib, Objects,
> >etc.).  Then I chdir into that directory, and I say "../configure".
> >The configure script creates subdirectories to hold the object files ...
> >Then I say "make" and it builds Python.  
> 
> This doesn't work at all for me in my copy of the CVS tree.  Are there
> other steps or requirements to make this work?  (Transcript available
> upon request, but I suspect I'm missing something simple.)

You can't start doing this in a tree where you have already built
Python using the default way -- you have to use a pristine tree.  The
reason is the funny way Make's VPATH feature works: it sees the .o
files in the source directory and then thinks it doesn't have to create
the .o file in the build directory.  I think a "make clobber" at the
top level would probably eradicate everything that confuses Make.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jan 16 03:24:04 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:24:04 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 14:35:47 PST."
 <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101160324.WAA00677@cj20424-a.reston1.va.home.com>

> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

+1 on the idea; no time to study the patch tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 03:28:38 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:28:38 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 18:15:50 EST."
 <20010115181550.A11566@thyrsus.com>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
 <20010115181550.A11566@thyrsus.com>
Message-ID: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>

> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> standard escape set (hmmm...am I missing any?)

You missed \f [*].  Unclear to me whether it's a good idea to add the
lesser-known ones; they are just as likely binary gobbledegook rather
than what their escapes stand for.

[*] http://www.python.org/doc/current/ref/strings.html

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 03:31:19 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:31:19 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 00:49:30 +0100."
 <20010116004930.L1005@xs4all.nl>
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <200101160331.WAA00780@cj20424-a.reston1.va.home.com>

> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'lets fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :) 
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

I don't see the confusion.

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Yes, I've wondered this myself too.  I guess the current API is based
too much on the C API...

+1/+1/+1.

--Guido van Rossum (home page: http://www.python.org/~guido/)
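
A minimal sketch of what the three proposals amount to, written as
Python wrappers rather than as the C patch under discussion.  The
wrapper names (strftime_now, ctime_now, localtime_now, gmtime_now) are
hypothetical and exist only for illustration; the one assumption is
that a missing time argument means "now".

```python
import time

def strftime_now(fmt, timetuple=None):
    """Like time.strftime, but use the current local time if no tuple is given."""
    if timetuple is None:
        timetuple = time.localtime(time.time())
    return time.strftime(fmt, timetuple)

def ctime_now(seconds=None):
    """Like time.ctime, but use the current time if no float is given."""
    if seconds is None:
        seconds = time.time()
    return time.ctime(seconds)

def localtime_now(seconds=None):
    """Like time.localtime, but default to the current time."""
    if seconds is None:
        seconds = time.time()
    return time.localtime(seconds)

def gmtime_now(seconds=None):
    """Like time.gmtime, but default to the current time."""
    if seconds is None:
        seconds = time.time()
    return time.gmtime(seconds)

# The "atrocity" from the original post collapses to a single call.
timestr = strftime_now("%Y-%m-%d %H:%M:%S")
```

Passing an explicit value still works as before, so existing callers
would not need to change.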


From guido@python.org  Tue Jan 16 03:47:32 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:47:32 -0500
Subject: [Python-Dev] Detecting install time
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:03:31 EST."
 <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com>
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com>
Message-ID: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>

> For PEP 229, the setup.py script needs to figure out if it's running
> from the build directory, because then distutils.sysconfig needs to
> look at different config files; ./Modules/Makefile instead of
> /usr/lib/python2.0/config/Makefile, and so forth.  Is there a
> simple/clean way to do this?

You could check for the presence of config.status -- that file is not
installed.

--Guido van Rossum (home page: http://www.python.org/~guido/)
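
A minimal sketch of the config.status check suggested above.  The
function name and the choice of which directory to probe are
assumptions for illustration; this is not the code that went into
setup.py or distutils.sysconfig.

```python
import os

def running_from_build_dir(directory):
    """Return True if `directory` looks like a Python build directory.

    Heuristic from the thread: configure writes config.status into the
    build directory, and that file is not installed.
    """
    return os.path.isfile(os.path.join(directory, "config.status"))

# Probe the current working directory, as a setup script might.
in_build_tree = running_from_build_dir(os.getcwd())
```

A caller could then pick ./Modules/Makefile or the installed config
Makefile depending on the result.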


From tim.one@home.com  Tue Jan 16 03:53:16 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 22:53:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>

[?!ng]
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

-1 on doing that when they're printed (although I probably misunderstand
what you mean there).

+1 for changing repr() as suggested.

-0 on generalizing to \a \b \f \v too (I've never used one of those in a
string literal in my life, so would be more baffled by seeing one come back
than I would the octal equivalent).

I would also be +1 on using hex escapes instead of octal (I grew up on 36-
and 60-bit machines, but that was the last time octal looked *natural*!).
Octal and hex escapes both consume 4 characters, so I can't imagine what
octal has going for it in the 21st century <wink>.

377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim


PS:  Note that C doesn't define what numerical values \a etc have, just
that:

    Each of these escape sequences shall produce a unique
    implementation-defined value which can be stored in a single
    char object. The external representations in a text file need
    not be identical to the internal representations, and are
    outside the scope of this International Standard.

The current method does have the advantage of extreme clarity.



From guido@python.org  Tue Jan 16 04:08:46 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:08:46 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 15 Jan 2001 16:24:54 PST."
 <20010115162454.D3864@ActiveState.com>
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>
 <20010115162454.D3864@ActiveState.com>
Message-ID: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>

Looking at the code (in _portable_fseek()) that uses TELL64, I don't
understand why it can't use fgetpos().  That code is used only when
fpos_t -- the type used by fgetpos() and fsetpos() -- is 64-bit.

Trent, you wrote that code.  Why wouldn't this work just as well?

(your code):
			if ((pos = TELL64(fileno(fp))) == -1L)
				return -1;
(my suggestion):
			if (fgetpos(fp, &pos) != 0)
				return -1;

It can't be because fgetpos() doesn't exist or is otherwise unusable,
because the SEEK_CUR case uses it.

We also know that offset is 64-bit capable (the #if around the
declaration of _portable_fseek() ensures that).

I would even go as far as to collapse the entire switch as follows:

	fpos_t pos;
	switch (whence) {
	case SEEK_END:
		/* do a "no-op" seek first to sync the buffering so that
		   the low-level tell() can be used correctly */
		if (fseek(fp, 0, SEEK_END) != 0)
			return -1;
		/* fall through */
	case SEEK_CUR:
		if (fgetpos(fp, &pos) != 0)
			return -1;
		offset += pos;
		break;
	/* case SEEK_SET: break; */
	}
	return fsetpos(fp, &offset);

--Guido van Rossum (home page: http://www.python.org/~guido/)
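
The control flow of the collapsed switch can be sketched in Python,
with tell() standing in for fgetpos() and an absolute seek() standing
in for fsetpos().  This is an illustration of the logic only, under
the assumption that positions are plain integers; the name
portable_seek is hypothetical.  Note that the C version's
"offset += pos" likewise assumes fpos_t is an arithmetic type, which
the C standard does not guarantee.

```python
import io
import os

def portable_seek(fp, offset, whence=os.SEEK_SET):
    """Turn any seek into an absolute seek, mirroring the collapsed switch."""
    if whence == os.SEEK_END:
        # Seek to the end first so the position query below is meaningful,
        # then handle it exactly like SEEK_CUR (the C "fall through").
        fp.seek(0, os.SEEK_END)
        offset += fp.tell()
    elif whence == os.SEEK_CUR:
        offset += fp.tell()
    # SEEK_SET needs no adjustment: offset is already absolute.
    fp.seek(offset, os.SEEK_SET)
    return fp.tell()

buf = io.BytesIO(b"0123456789")
portable_seek(buf, 3)                # absolute position 3
portable_seek(buf, 2, os.SEEK_CUR)   # two past the current position
portable_seek(buf, -1, os.SEEK_END)  # last byte of the buffer
```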


From guido@python.org  Tue Jan 16 04:13:40 2001
From: guido@python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:13:40 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:53:16 EST."
 <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>
Message-ID: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>

> [?!ng]
> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> -1 on doing that when they're printed (although I probably misunderstand
> what you mean there).

Ping was using imprecise language here -- he meant repr() and "printed
at the command line prompt."

> +1 for changing repr() as suggested.
> 
> -0 on generalizing to \a \b \f \v too (I've never used one of those in a
> string literal in my life, so would be more baffled by seeing one come back
> than I would the octal equivalent).
> 
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).

Me too.  One summer vacation while in college I had nothing better to
do than decode the Pascal runtime system for the University's CDC-6600
from an octal dump into assembly.  Learned lots!

> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Originally, using \x for these was impractical (at least) because of
the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
of the \x escape.  Now we've fixed this, I agree.

> 377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim
> 
> 
> PS:  Note that C doesn't define what numerical values \a etc have, just
> that:
> 
>     Each of these escape sequences shall produce a unique
>     implementation-defined value which can be stored in a single
>     char object. The external representations in a text file need
>     not be identical to the internal representations, and are
>     outside the scope of this International Standard.
> 
> The current method does have the advantage of extreme clarity.

Python doesn't support non-ASCII machines, like the C standard
(pretends to).

--Guido van Rossum (home page: http://www.python.org/~guido/)
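
A small sketch of the two points above: a \x escape that consumes
exactly two hex digits, and the equal width of octal and hex escapes.
The helper names are hypothetical, and 8-bit character values are
assumed.

```python
# With fixed-width \x escapes, "\x0a" is a newline and "bc" stays literal.
s = "\x0abc"
assert s == "\nbc"
assert len(s) == 3

def octal_escape(ch):
    """Spell one 8-bit character as a three-digit octal escape."""
    return "\\%03o" % ord(ch)

def hex_escape(ch):
    """Spell one 8-bit character as a two-digit hex escape."""
    return "\\x%02x" % ord(ch)

# Both spellings take four characters of output.
assert octal_escape("\n") == "\\012"
assert hex_escape("\n") == "\\x0a"
assert octal_escape("\xff") == "\\377"
assert hex_escape("\xff") == "\\xff"
```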


From esr@thyrsus.com  Tue Jan 16 04:26:13 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:26:13 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:28:38PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com>
Message-ID: <20010115232613.B12166@thyrsus.com>

Guido van Rossum <guido@python.org>:
> > > So... i'm submitting a patch that causes the three most common
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> > standard escape set (hmmm...am I missing any?)
> 
> You missed \f [*].  Unclear to me whether it's a good idea to add the
> lesser-known ones; they are just as likely binary gobbledegook rather
> than what their escapes stand for.
> 
> [*] http://www.python.org/doc/current/ref/strings.html

Truth is, Guido, I'm kind of iffy about whether there'd be a gain in
clarity myself.  But I find I'm rather attached to the idea of
maintaining strictest possible symmetry between what Python handles on
input and what it emits on output.

So unless we think adding \f, \v, \b, and \a to the special set would
actually produce a *loss* of clarity relative to octal gibberish (!),
I say do 'em all.  Aesthetically, that feels to me like the right thing, 
and the *Pythonic* thing, to do here.

Have I erred in my intuition, O BDFL?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862


From nas@arctrix.com  Mon Jan 15 21:45:28 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 13:45:28 -0800
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115232613.B12166@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 15, 2001 at 11:26:13PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com> <20010115232613.B12166@thyrsus.com>
Message-ID: <20010115134528.B6193@glacier.fnational.com>

On Mon, Jan 15, 2001 at 11:26:13PM -0500, Eric S. Raymond wrote:
> [...] I find I'm rather attached to the idea of maintaining
> strictest possible symmetry between what Python handles on
> input and what it emits on output.
> 
> So unless we think adding \f, \v, \b, and \a to the special set would
> actually produce a *loss* of clarity relative to octal gibberish (!),
> I say do 'em all.

Symmetry is good but I bet most people who would see \f, \v, \b,
\a wouldn't have entered those characters using escapes.  Most
likely those characters would have been read from a binary file.

That said, I don't really mind either way.

  Neil


From tim.one@home.com  Tue Jan 16 04:43:06 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 15 Jan 2001 23:43:06 -0500
Subject: [Python-Dev] Whitespace normalization
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOOIIAA.tim.one@home.com>

You may have noticed that I checked in changes to most of the modules in the
top level of Lib yesterday (Sunday).  This is part of a Crusade that was
supposed to happen before 2.0a1, but got dropped on the floor then due to
misunderstandings:  make the Python code we distribute adhere to Guido's
style guide (4-space indents, no hard tabs), + clean up minor whitespace
nits (no stray blank lines at the ends of files, no trailing whitespace on
lines, last line of the file should end with a newline).

It would be nice if people cleaned up their code this way too; I'm not going
to go thru the entire distribution doing this.  So, if you give a rip, pick
a directory or some modules you're fond of, and clean 'em up.

The program Tools/scripts/reindent.py does all of the above for you, so it's
not hard.  But it takes some care in two areas, which is why I did the top
level of Lib one file at a time by hand, and studied diffs by eyeball before
checking in any changes:

+ It's unlikely but possible that some program file *depends* on trailing
whitespace.  That plain sucks (it's *going* to break sooner or later), but
reindent.py can't help you there.

+ While reindent should never otherwise damage program logic, very strange
commenting or docstring styles may get mangled by it, making code and/or
docs hard to read.  reindent works very hard to do a good job on that, and
indeed I found no need to make manual changes to anything it did in the top
level of Lib.  But check anyway.  Especially some of the very oldest modules
are littered with ugly stuff like

    #

all over the place, from back when nobody had an editor smart enough to skip
over preceding blank lines when suggesting indentation for the current line.
Then again, maybe we should just drop the Irix5 directory <wink>.

voice-in-the-wilderness-ly y'rs  - tim
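
A rough sketch of the minor whitespace rules listed above (no hard
tabs, no trailing whitespace, no stray blank lines at the end of the
file, final line ends with a newline).  This is not reindent.py: it
does not re-indent code to 4-space levels, and the function name and
tab size are assumptions.  It also expands tabs blindly, so a tab
inside a string literal would be changed too, which is one more
reason to study the diffs by eyeball.

```python
def normalize_whitespace(text, tabsize=8):
    """Apply the minor whitespace cleanups to a whole file's text."""
    # Expand hard tabs and drop trailing whitespace on every line.
    lines = [line.expandtabs(tabsize).rstrip() for line in text.splitlines()]
    # Drop stray blank lines at the end of the file.
    while lines and not lines[-1]:
        lines.pop()
    if not lines:
        return ""
    # Make sure the last line ends with a newline.
    return "\n".join(lines) + "\n"

cleaned = normalize_whitespace("def f():   \n\treturn 1\n\n\n")
```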



From esr@thyrsus.com  Tue Jan 16 04:43:24 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:43:24 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 15, 2001 at 10:53:16PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>
Message-ID: <20010115234324.C12166@thyrsus.com>

Tim Peters <tim.one@home.com>:
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).
> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Tim, on the level of aesthetic preference I'm totally with you.  I've always
found octal really ugly myself.  Hex fits my brain better; somehow I find it
easier to visualize the bit patterns from.

Sadly, there are so many other related ways in which Python
intelligently follows C/Unix conventions that I think changing to a default
of hex escapes rather than octal would violate the Rule of Least
Surprise.

One of the things I like about Python is precisely its conservatism in
areas like string escapes, that Guido refrained from inventing new OS
APIs or new conventions for things like string escapes in places where
Unix and C did them in a well-established and reasonable way.  He didn't
make the mistake, all too typical in academic languages, of confusing
novelty with value...

This conservatism is valuable because it frees the C-experienced
programmer's mind from having to think about where the language is
trivially different, so he can concentrate on where it's importantly
different.  It's worth maintaining.

On the other hand, the change would mesh well with the Unicode support.
Hmm.  Tough call.  I could go either way, I guess.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat


From tim.one@home.com  Tue Jan 16 05:07:16 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 16 Jan 2001 00:07:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115234324.C12166@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>

[Eric]
> Tim, on the level of aesthetic preference I'm totally with you.
> I've always found octal really ugly myself.  Hex fits my brain
> better;  somehow I find it easier to visualize the bit patterns from.
>
> Sadly, there are so many other related ways in which Python
> intelligently follows C/Unix conventions that I think changing to
> a default of hex escapes rather than octal would violate the Rule
> of Least Surprise.
>
> ... [and skipping nice stuff I *do* agree with <wink>] ...

The saving grace here is that repr() is a form of ASCII dump.  C has nothing
to say about that, while last time I used Unix it was real easy to get dumps
in hex (and indeed that's what everyone I knew routinely did).  I expect
that od retains both its name and its octal defaults on most systems simply
due to inertia.  An octal dump would be infinitely surprising on Windows
(I'm not sure I can even get one without writing it myself).

Do people actually use octal dumps on Unices anymore?  I'd be surprised, if
they're running on power-of-2 boxes.  Defaults aren't conventions when
*everyone* overrides them, they're just old and in the way.

takes-one-to-know-one<wink>-ly y'rs  - tim



From ping@lfw.org  Tue Jan 16 05:27:33 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:27:33 -0800 (PST)
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101152126120.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Thomas Wouters wrote:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.

I like all of these suggestions.  Go for it!


-- ?!ng



From esr@thyrsus.com  Tue Jan 16 05:31:14 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 00:31:14 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 16, 2001 at 12:07:16AM -0500
References: <20010115234324.C12166@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>
Message-ID: <20010116003114.A12365@thyrsus.com>

Tim Peters <tim.one@home.com>:
> Do people actually use octal dumps on Unices anymore? 

Well, we do when we momentarily forget to give od(1) the -x escape :-)

This so annoyed me that back around 1983 I wrote my own hex dumper
specifically to emulate the 16-hex-bytes-with-midpage-gutter-and-ASCII-
over-on-the-right-side format that CP/M used and DOS inherited.  It's
still available at <http://www.tuxedo.org/~esr/hex/>.

Do you know the history on this?  C speaks octal because a bunch of 
mode fields in the PDP-11 instruction word were three bits wide.
Time was it was actually useful to have the output from (say)
core files chunk that way. But I haven't seen an octal code dump 
in over a decade, probably pushing fifteen years now.  
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In the absence of any evidence tending to show that possession 
or use of a 'shotgun having a barrel of less than eighteen inches 
in length' at this time has some reasonable relationship to the 
preservation or efficiency of a well regulated militia, we cannot 
say that the Second Amendment guarantees the right to keep and bear 
such an instrument. [...] The Militia comprised all males 
physically capable of acting in concert for the common defense.  
        -- Majority Supreme Court opinion in "U.S. vs. Miller" (1939)
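
For readers who have not seen the dump layout described in the
message above (16 hex bytes per line, a gutter after the eighth byte,
ASCII on the right), here is a minimal sketch.  It is not the tool at
the URL above; the function names, the offset column, and the "|"
delimiters are assumptions.

```python
def hexdump_line(offset, chunk):
    """Format up to 16 bytes as one dump line."""
    hexes = ["%02x" % b for b in chunk]
    # Pad short final lines so the ASCII column stays aligned.
    hexes += ["  "] * (16 - len(hexes))
    left = " ".join(hexes[:8])
    right = " ".join(hexes[8:])
    # Show printable ASCII as-is and everything else as a dot.
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return "%08x  %s  %s  |%s|" % (offset, left, right, text)

def hexdump(data):
    """Return the dump of a bytes object as a list of lines."""
    return [hexdump_line(i, data[i:i + 16]) for i in range(0, len(data), 16)]

for line in hexdump(b"hello\n"):
    print(line)
```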


From ping@lfw.org  Tue Jan 16 05:33:42 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:33:42 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > -1 on doing that when they're printed (although I probably misunderstand
> > what you mean there).
> 
> Ping was using imprecise language here -- he meant repr() and "printed
> at the command line prompt."

Yes, i referred to "when strings are printed and repr()ed" as two cases
because both string_print() and string_repr() have to be changed.

(Side question: when are *_print() and *_repr() ever different, and why?)

> Originally, using \x for these was impractical (at least) because of
> the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> of the \x escape.  Now we've fixed this, I agree.

Oh, now i understand.  Good point.  I'll update the patch to do hex.

0xdeadbeef-ly yours,


-- ?!ng
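
A minimal sketch of the behaviour the thread converges on: '\n',
'\r' and '\t' shown by name, and other unprintable characters shown
as hex escapes.  This is a Python illustration, not the C patch to
string_repr(); the function name is hypothetical, the output is
always single-quoted, and 8-bit characters are assumed.

```python
def proposed_repr(s):
    """Render a string with named escapes for \\n, \\r, \\t and hex for the rest."""
    named = {"\n": "\\n", "\r": "\\r", "\t": "\\t", "\\": "\\\\", "'": "\\'"}
    out = []
    for ch in s:
        if ch in named:
            out.append(named[ch])
        elif " " <= ch <= "~":
            # Printable ASCII passes through unchanged.
            out.append(ch)
        else:
            # Everything else becomes a two-digit hex escape.
            out.append("\\x%02x" % ord(ch))
    return "'" + "".join(out) + "'"

print(proposed_repr("hello\n"))
```

The lesser-known escapes (\a, \b, \f, \v) fall into the hex branch
here, which matches the more cautious position taken in the thread.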



From fredrik@effbot.org  Tue Jan 16 07:11:38 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 08:11:38 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <00b201c07f8b$93996820$e46940d5@hagrid>

thomas wrote:
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

where "now" is local time, I assume?

since you're assuming a time zone, you could make it accept
an integer as well...

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)

same here.

</F>



From thomas@xs4all.net  Tue Jan 16 07:18:38 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 08:18:38 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <00b201c07f8b$93996820$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 16, 2001 at 08:11:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid>
Message-ID: <20010116081838.N1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:
> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)

> where "now" is local time, I assume?

Yes. See the patch I'll upload later today (meetings first, grrr)

> since you're assuming a time zone, you could make it accept
> an integer as well...

Could, yes... I'll include it in the 2nd revision of the patch, it can be
rejected (or accepted) separately.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Tue Jan 16 08:22:11 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 09:22:11 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <20010116081838.N1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 08:18:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>
Message-ID: <20010116092211.O1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:18:38AM +0100, Thomas Wouters wrote:
> On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:

> > >   timestr = time.strftime("<format>")

> > since you're assuming a time zone, you could make it accept
> > an integer as well...

> Could, yes... 

Actually, on second thought, let's not, not just yet anyway. Doing that for
all functions in the time module would continue to pollute the already toxic
waters of a C API translated into Python :P Who knows what 'ctime' stands
for, anyway ? And 'asctime' ? How can we expect Python programmers who think
'C' is a high note or average grade, to understand how the time module is
supposed to be used ? :)

We now have:
time() -- return current time in seconds since the Epoch as a float
gmtime() -- convert seconds since Epoch to UTC tuple
localtime() -- convert seconds since Epoch to local time tuple
asctime() -- convert time tuple to string
ctime() -- convert time in seconds to string
mktime() -- convert local time tuple to seconds since Epoch
strftime() -- convert time tuple to string according to format specification

where asctime and ctime are basically wrappers around strftime, and would do
the exact same thing if they both accepted tuples and floats. 

I think we should have something like:
time() -- current time in float
timetuple() -- current (local) time in timetuple
tuple2time(tuple) -- tuple -> float
time2tuple(float, tz=local) -- float -> tuple using timezone tz
stringtime(time=now, format="ctimeformat") -- convert time value to string

Those are just working names, to make the point, I don't have time to think
up better ones :) I'm not sure if the timezone support in the above list is
extensive enough, mostly because I hardly use timezones myself. Also,
tuple2time() could be merged with time(), and likewise for time2tuple() and
timetuple(). I think keeping strftime() and maybe ctime() for ease-of-use is
a good idea, but the rest could eventually be deprecated.
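
To make the list above concrete, here is a rough sketch of how those working
names could be layered on top of the existing time module. None of these
functions exist anywhere; the names are just the placeholders from the list,
the "utc" flag is an assumed stand-in for the proposed "tz=local" argument,
and the default format only approximates what ctime() prints.

```python
import time

# Approximation of ctime()'s layout (ctime pads the day with a space, not a zero).
CTIME_LIKE_FORMAT = "%a %b %d %H:%M:%S %Y"

def timetuple():
    """Current local time as a time tuple (hypothetical working name)."""
    return time.localtime(time.time())

def tuple2time(tup):
    """Local time tuple -> seconds since the Epoch, as a float."""
    return time.mktime(tup)

def time2tuple(seconds, utc=False):
    """Seconds since the Epoch -> time tuple, in local time or UTC."""
    if utc:
        return time.gmtime(seconds)
    return time.localtime(seconds)

def stringtime(seconds=None, format=CTIME_LIKE_FORMAT):
    """Format a time value (default: now) as a local-time string."""
    if seconds is None:
        seconds = time.time()
    return time.strftime(format, time.localtime(seconds))
```

With something like this, the common case collapses to stringtime(), and the
tuple and float forms convert with one call in each direction.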

Off-to-important-meetings-*cough*-ly y'rs
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fredrik@effbot.org  Tue Jan 16 08:30:28 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 09:30:28 +0100
Subject: [Python-Dev] unit testing bake-off
References: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <01ba01c07f96$967b7870$e46940d5@hagrid>

Tim Peters wrote:
> At least you, Jeremy and Fredrik have tried them, and
> if that's all there can't be a tie <wink>.

let me guess:

    Jeremy: PyUnit
    Andrew: unittest
    Fredrik: unittest

(I find pyunit a bit unpythonic, and both overengineered
and underengineered at the same time...  hard to explain,
but I strongly prefer unittest)

> I would agree this is not an ideal decision procedure.

well, any decision procedure that comes up with what I
want just has to be ideal ;-)

</F>



From andy@reportlab.com  Tue Jan 16 09:20:45 2001
From: andy@reportlab.com (Andy Robinson)
Date: Tue, 16 Jan 2001 09:20:45 -0000
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115204701.11972EA6B@mail.python.org>
Message-ID: <PGECLPOBGNBNKHNAGIJHAEELCGAA.andy@reportlab.com>

> Subject: Re: [Python-Dev] unit testing bake-off
> From: Guido van Rossum <guido@python.org>
> Date: Mon, 15 Jan 2001 14:17:27 -0500
> 
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...  Certainly I don't think I'll get to this myself before the
> conference.
> 
> How about the following though: talking of low-hanging fruit, Tim's
> doctest module is an excellent thing even if it isn't a unit testing
> framework!  (I found this out when I played with it -- it's real easy
> to get used to...)
> 
> Would anyone object against Tim checking this in?  Since it isn't a
> contender in the unit test bake-off, it shouldn't affect the outcome
> there at all.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

I think it should definitely go in.  Ditto with whatever testing
framework and documentation tools (pydoc etc.) shortly emerge
as "best of breed".  I spend my time on corporate consulting
projects, and saying things like "Python has standard tools for
unit testing and documentation" is even better than saying 
"We have standard tools for unit testing and documentation".

BTW, ReportLab has recently adopted PyUnit's unittest.py.
It feels a bit Java-like to me - a few more lines of code
than needed - but it certainly works.   One key feature is
aggregating test suites; a big app we installed on a
customer site can run the test suite for itself, the ReportLab
library (whose test suite we are just getting to work on)
and four or five dependent utilities; another is that
people have heard of JUnit.
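
For anyone who hasn't seen the aggregation feature, a minimal sketch of what
it looks like, written against the unittest module as it ships with current
Python (an assumption; the PyUnit release discussed here may differ in
details). The two test classes are invented stand-ins for "the app" and
"the library".

```python
import unittest

class LibraryTests(unittest.TestCase):
    """Stand-in for a library's own test case (hypothetical)."""
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

class AppTests(unittest.TestCase):
    """Stand-in for an application's test case (hypothetical)."""
    def test_upper(self):
        self.assertEqual("pdf".upper(), "PDF")

def build_suite():
    """Collect the tests of several components into one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(LibraryTests))
    suite.addTests(loader.loadTestsFromTestCase(AppTests))
    return suite

if __name__ == "__main__":
    # One runner invocation covers every aggregated component.
    unittest.TextTestRunner(verbosity=2).run(build_suite())
```

Suites nest, so each dependent utility can export its own build_suite() and
the installed app just adds them all to a top-level suite.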

Just my 2p worth,
Andy Robinson



From tony@lsl.co.uk  Tue Jan 16 09:47:01 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 09:47:01 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
 outside of Python)
In-Reply-To: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>
Message-ID: <003901c07fa1$46e10c70$f05aa8c0@lslp7o.int.lsl.co.uk>

In the context of my starting doc strings in an Emacs Lisp manner,
Ka-Ping Yee said:
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

and since Guido replied
> I'm with Ping.  None of the examples in the style guide start the
> docstring with the function name.  Almost none of the standard library
> modules start their module docstring with the module name (codecs is
> an exception, but I didn't write it :-).

I shall indeed stop (of course, my habit started before we HAD
documentation tools, and if we're going to browse things with pydoc, et
al, then there's no need for it). To be honest, it's the answer I
expected.

Oh dear, another item for my TO DO list (i.e., remove the offending
nits). Still, if it's only me it's hardly high impact!

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Which is safer, driving or cycling?
Cycling - it's harder to kill people with a bike...
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)



From tony@lsl.co.uk  Tue Jan 16 10:13:31 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 10:13:31 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
 outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>
Message-ID: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>

I mentioned a "spurious"
>	The system cannot find the path specified.

on NT, and Ka-Ping Yee said:
> Thanks for the NT testing.  That's funny -- i put in a special case
> for Windows to avoid messages like the above a couple of days ago.
> How recently did you download pydoc.py?  Does your copy contain:
>
>     if hasattr(sys, 'winver'):
>         return lambda text: tempfilepager(text, 'more')

Hmm. I downloaded it when I read the email message announcing it, which
was yesterday some time. But it doesn't look like the lines you mention
are there - I'll try re-downloading...

...I've redownloaded the files from http://www.lfw.org/python/pydoc.py,
etc., and done a grep for hasattr within them. There's no check such as
the one you mention, so I guess it's "download impedance".

> So you can see what i'm up to, here's my current to-do list:
>
>     make boldness optional (only if using more/less?  only Unix?)

probably sensible. By the way, I don't get boldness on the NT box - any
chance (he says, not intending to help *at all* in doing it!) of it
happening there as well? (or would that depend on what curses support is
built into the Python?)

>     document a .py file given on the command line

also allow for a directory module (i.e., something with __init__.py in
it) given on the command line?

>     write a better htmlrepr (\n should look special, max
>     length limit, etc.)

yes, but these things can always get better - the fact it's working
allows for improoooovement down the line.

>     generate HTML index from precis and __path__ and package

a neat idea - definitely Good Stuff!

>     contents list

well, I always do these, so I'm for this one as well

>     have help(...) produce a directory of available things to
>     ask for help on

bouncy fun!

>     Windows and Mac testing

I'm running Windows 98 with Python 1.5.2 at home, and will willingly try
it out on that (after all, it's not a very big download) - although it
might sometimes take a day or two to get round to it (for instance, I
haven't yet done so!). But I suspect I shan't be a very demanding
user...

>     default to HTTP mode on GUI platforms?  (win, mac)
>
> The ones marked with + i consider done.  Feel free to comment on
> or suggest priorities for the others; in particular, what do you
> think of the last one?  The idea is that double-clicking on
> pydoc.py in Windows or MacOS could launch the server and then open
> the localhost URL using webbrowser.py to display the documentation
> index.  Should it do this by default?

I'll leave that to better designers than myself (although if one is to
*have* a double click action, that seems sensible to me).

(looks up webbrowser.py - ah, a 2.0 module). Personally, I'd also like
to have the option of having a "mini-browser" supported directly,
perhaps in Tkinter, so I don't need to start up a whole web browser. But
again I may be odd in that wish (I can't remember what IDLE does).

Oh - that also means "integrate into IDLE" presumably goes on at least a
WishList as well...

Other ideas:
* command line switch to *output* HTML to a file (i.e., documentation
generation) (presumably something like "-o <name>.html", where the
"html" indicates the output format - an alternative being "txt")
* if I ever finish the docutils effort (I should be getting back to it
soon) then use that to format the texts (this would mean I need not
worry about the "frontend" to docutils too much, since pydoc is already
doing so much). Or maybe the docutils tool should be importing pydoc...

Tibs (must do some (paid) work now!)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)




From mal@lemburg.com  Tue Jan 16 10:18:44 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:18:44 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <3A642004.F6197E86@lemburg.com>

Thomas Wouters wrote:
> 
> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'lets fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :)
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

+1 all the way -- though these days I tend not to use the
time module anymore. mxDateTime already does everything I want
and there date/time values are objects rather than Python integers
or tuples... ok, I'm just showing off a little :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Tue Jan 16 10:32:21 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:32:21 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <3A642335.82358B02@lemburg.com>

Minor nit about this idea: it makes decoding repr() style
strings harder for external tools and it could cause breakage
(e.g. if "\n" is used by the encoding for some other purpose).

BTW, since there are a gazillion ways to encode strings into
7-bit ASCII, why not use the new codec design to add additional
output schemes for 8-bit strings ?!

Strings have an .encode() method as well...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From ping@lfw.org  Tue Jan 16 10:37:42 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:37:42 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101160236330.5846-100000@skuld.kingmanhall.org>

Before somebody decides to shoot us for spamming both lists,
i'm taking this thread off of python-dev and solely to doc-sig.
Please continue further discussion there...


-- ?!ng



From ping@lfw.org  Tue Jan 16 10:47:02 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:47:02 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Ka-Ping Yee wrote:
> On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > Originally, using \x for these was impractical (at least) because of
> > the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> > of the \x escape.  Now we've fixed this, I agree.
> 
> Oh, now i understand.  Good point.  I'll update the patch to do hex.

I assume you would like Unicode strings to do the same (\n, \t, \r,
and \xff rather than \377).

Guido, do you have a Pronouncement on \v, \f, \b, \a?

By the way, why do Unicode escapes appear in capitals?

    >>> u'\uface'
    u'\uFACE'

(If someone tells me that there happens to be a picture of a face at
that code point, i'll laugh.  Is there a cow at \uBEEF?)

Does anyone care that \x will be followed by lowercase and \u by uppercase?

I noticed that the tutorial claims Unicode strings can be str()-ified
and will encode themselves using UTF-8 as default.  But this doesn't
actually work for me:

    >>> us = u'\uface'
    >>> us
    u'\uFACE'
    >>> str(us)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode('UTF-8')
    '\xef\xab\x8e'

Assuming i have understood this correctly, i have submitted a patch
to correct tut.tex.
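
For comparison, a sketch of the same experiment under a present-day Python 3
interpreter (an assumption; it is not the interpreter used in the session
above). There every str is Unicode and encode() with no argument defaults to
UTF-8, but asking for ASCII explicitly still fails in the same way:

```python
us = "\uface"

# Explicit UTF-8 gives the same three bytes as the session above.
utf8 = us.encode("utf-8")

# An explicit ASCII encoding cannot represent the character.
try:
    us.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(utf8)
print(ascii_error)
```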


-- ?!ng




From bckfnn@worldonline.dk  Tue Jan 16 10:52:10 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Tue, 16 Jan 2001 10:52:10 GMT
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <3a642768.6426631@smtp.worldonline.dk>

[Ping]

>I don't know whether this is going to be obvious or controversial,
>but here goes.  Most of the time we're used to seeing a newline as
>'\n', not as '\012', and newlines are typed in as '\n'.
>
>A newcomer to Python is likely to do
>
>    >>> 'hello\n'
>    'hello\012'
>
>and ask "what's \012?" -- whereupon one has to explain that it's an
>octal escape, that 012 in octal equals 10, and that chr(10) is
>newline, which is the same as '\n'.  You're bound to run into this,
>and you'll see \012 a lot, because \n is such a common character.
>Aside from being slightly more frightening, '\012' also takes up
>twice as many characters as necessary.
>
>So... i'm submitting a patch that causes the three most common
>special whitespace characters, '\n', '\r', and '\t', to appear in
>their natural form rather than as octal escapes when strings are
>printed and repr()ed.

I like it, because it removes yet another difference between Python and
Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
and \r.
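
The equivalence Ping describes is easy to check. A small sketch, assuming a
present-day Python 3 interpreter rather than the ones under discussion; its
repr() shows \n, \t and \r in short form and falls back to hex escapes for
\b and \f:

```python
# Octal 012 is decimal 10, and chr(10) is the newline character.
octal_value = int("012", 8)
same_char = "\012" == "\n" == chr(10)

# What repr() produces for the characters mentioned in this thread.
newline_repr = repr("hello\n")
whitespace_repr = repr("\t\r")
other_repr = repr("\b\f")

print(octal_value, same_char)
print(newline_repr, whitespace_repr, other_repr)
```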

regards,
finn


From esr@thyrsus.com  Tue Jan 16 10:53:00 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 05:53:00 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <3A642004.F6197E86@lemburg.com>; from mal@lemburg.com on Tue, Jan 16, 2001 at 11:18:44AM +0100
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com>
Message-ID: <20010116055300.C12847@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> +1 all the way -- though these days I tend not to use the
> time module anymore. mxDateTime already does everything I want
> and there date/time values are objects rather than Python integers
> or tuples... ok, I'm just showing off a little :)

mxDateTime is on my short list of "why isn't this in the Python library
already?"  Has it ever been discussed?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

You need only reflect that one of the best ways to get yourself 
a reputation as a dangerous citizen these days is to go about 
repeating the very phrases which our founding fathers used in the 
great struggle for independence.
	-- Attributed to Charles Austin Beard (1874-1948)


From mal@lemburg.com  Tue Jan 16 11:18:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 12:18:24 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com> <20010116055300.C12847@thyrsus.com>
Message-ID: <3A642E00.BD330647@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > +1 all the way -- though these days I tend not to use the
> > time module anymore. mxDateTime already does everything I want
> > and there date/time values are objects rather than Python integers
> > or tuples... ok, I'm just showing off a little :)
> 
> mxDateTime is on my short list of "why isn't this in the Python library
> already?"  Has it ever been discussed?

Yes. I'd rather keep it separate from the standard dist for
various reasons. One of these reasons is that I will be moving
the mx tools into a new packaging scheme built on distutils --
installing it should then boil down to a simple RPM install
or maybe a "python setup.py install" thanks to distutils. The
package will then become a subpackage of the mx package.

BTW, I see distutils as a strong argument for *not* including
more exotic packages in Python's stdlib. If this catches on,
I expect that together with the Vaults we are not far away
from having our own CPAN style archive of add-on packages.
I also expect the commercial vendors like ActiveState et al.
to take care of wrapping SUMO distributions of Python and
the existing add-ons.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Tue Jan 16 11:20:18 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <3a642768.6426631@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 16, 2001 at 10:52:10AM +0000
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>
Message-ID: <20010116062018.A12935@thyrsus.com>

Finn Bock <bckfnn@worldonline.dk>:
> I like it, because it removes yet another difference between Python and
> Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> and \r.

This is an argument for adding \b and \f to the special set in
CPython.  If the BDFL looks benignly on adding \v and \a, those
should go into Jython's special set too.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address


From fredrik@pythonware.com  Tue Jan 16 11:37:10 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Tue, 16 Jan 2001 12:37:10 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>
Message-ID: <03eb01c07fb0$aaaa19e0$0900a8c0@SPIFF>

ping wrote:
> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'
> 
> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

iirc, 0xFACE and 0xBEEF are part of the CJK and
Hangul spaces.  not sure 0xFACE is assigned, but
0xBEEF glyph looks like a ribcage with four legs...

you'll find faces at 0x263A etc.

</F>



From skip@mojam.com (Skip Montanaro)  Tue Jan 16 13:09:51 2001
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 07:09:51 -0600 (CST)
Subject: [Python-Dev] bummer - regsub/regex no longer in module index
Message-ID: <14948.18463.971334.401426@beluga.mojam.com>

I am now getting deprecation warnings about regsub so I decided to start
replacing it with more zeal than I had previously.  First thing I wanted to
replace were some regsub.split calls.  I went to the module index to look up
the description but regsub was nowhere to be found.  (I know, I know.  I can
use pydoc.)

Still... how about continuing to include deprecated modules in the library
reference manual but in a separate Deprecated Modules section and annotate
them as such in the module index?

Skip


From guido@python.org  Tue Jan 16 13:44:01 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:44:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 08:11:38 +0100."
 <00b201c07f8b$93996820$e46940d5@hagrid>
References: <20010116004930.L1005@xs4all.nl>
 <00b201c07f8b$93996820$e46940d5@hagrid>
Message-ID: <200101161344.IAA04513@cj20424-a.reston1.va.home.com>

> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)
> 
> where "now" is local time, I assume?
> 
> since you're assuming a time zone, you could make it accept
> an integer as well...

What would the integer mean?

> > - Making the time in time.asctime and time.ctime optional, defaulting to
> >   'now', so you can just call 'time.ctime()' without having to pass
> >   time.time() (which are about half the calls in my own code :)
> 
> same here.

Same what here?  "now" == local time, sure.  But accept an integer?
It already accepts an integer!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 13:55:01 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:55:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 09:22:11 +0100."
 <20010116092211.O1005@xs4all.nl>
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>
 <20010116092211.O1005@xs4all.nl>
Message-ID: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>

Let's not redesign the time module API too much.  I'm all for adding
the default argument values that Thomas proposes.  Then, instead of
changing the API, we should look into a higher-level Python module.
That's how those things typically go.
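
A sketch of what those default argument values amount to, written against
the time module as it behaves in current Python releases (an assumption
relative to the interpreter discussed in this thread). The format string is
only an example.

```python
import time

fmt = "%Y-%m-%d"

# The long spelling: fetch the float, convert to a tuple, then format.
long_form = time.strftime(fmt, time.localtime(time.time()))

# With defaults, each function falls back to "now" in local time.
short_form = time.strftime(fmt)
now_tuple = time.localtime()
now_string = time.ctime()

# For an explicit value, ctime(secs) matches asctime(localtime(secs)).
epoch_string = time.ctime(0)

print(long_form, short_form, now_string)
```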

Digital Creations has its own time extension type somewhere in Zope, a
bit similar to mxDateTime.  I looked into making this a standard
Python extension but quickly gave up.  The problem with these things
seems to be that it's hard to come up with a design that makes
everyone happy: some people want small objects (because they have a
lot of them around, e.g. a timestamp on almost every other object);
others want timezone support; yet others want microsecond resolution;
leap-second support; pre-Christian era support; support for
nonstandard calendars; interval arithmetic; support for dates without
times or times without dates...

Python could use a better time type, but we'll have to look into which
requirements make sense for a generalized type, and which don't.  I
fear that a committee could easily pee away years designing an
interface to satisfy absolutely every wish.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 14:02:29 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:02:29 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 21:33:42 PST."
 <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>

> Yes, i referred to "when strings are printed and repr()ed" as two cases
> because both string_print() and string_repr() have to be changed.
> 
> (Side question: when are *_print() and *_repr() ever different, and why?)

You mean the tp_print and tp_str function slots in type objects,
right?  tp_print *should* always render exactly the same as tp_str.
tp_print is used by the print statement, not by value display at the
interactive prompt.

tp_print and tp_str have differed historically for 3rd party extension
types by accident.

So, string_print most definitely should *not* be changed -- only
string_repr!
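
At the Python level the distinction looks like this. A minimal sketch,
assuming current Python 3, where print() takes the place of the print
statement; the Temperature class is invented for illustration.

```python
import ast

class Temperature:
    """Hypothetical type with distinct str() and repr() renderings."""
    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        # Used by print(): meant for humans.
        return "%s degrees" % (self.degrees,)

    def __repr__(self):
        # Used for value display at the interactive prompt.
        return "Temperature(%r)" % (self.degrees,)

t = Temperature(21)
text = "hello\n"

printed_form = str(text)    # raw characters, including the real newline
literal_form = repr(text)   # a string literal with the newline escaped

print(str(t), repr(t))
print(literal_form)
```

Only the repr() side is affected by the choice of escapes; the printed form
is the string's own characters.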

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 14:06:23 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:06:23 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 02:47:02 PST."
 <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101161406.JAA05153@cj20424-a.reston1.va.home.com>

> I assume you would like Unicode strings to do the same (\n, \t, \r,
> and \xff rather than \377).

Yeah.

> Guido, do you have a Pronouncement on \v, \f, \b, \a?

Practicality beats purity: these will remain octal.

> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'

Could it be just that that's what Unicode folks are expecting?

> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

I'm laughing even though I don't see pictures. :-)

> Does anyone care that \x will be followed by lowercase and \u by uppercase?

It's mildly weird, and I think hex escapes in lowercase are more
Pythonic than in upper case.

> I noticed that the tutorial claims Unicode strings can be str()-ified
> and will encode themselves using UTF-8 as default.  But this doesn't
> actually work for me:
> 
>     >>> us = u'\uface'
>     >>> us
>     u'\uFACE'
>     >>> str(us)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode('UTF-8')
>     '\xef\xab\x8e'
> 
> Assuming i have understood this correctly, i have submitted a patch
> to correct tut.tex.

Yeah, I guess that part of the tutorial was written before we changed
our minds about this. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Tue Jan 16 14:09:56 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:09:56 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 11:32:21 +0100."
 <3A642335.82358B02@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
 <3A642335.82358B02@lemburg.com>
Message-ID: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>

> Minor nit about this idea: it makes decoding repr() style
> strings harder for external tools and it could cause breakage
> (e.g. if "\n" is used by the encoding for some other purpose).

Such a tool would be broken.  If it accepts string literals it should
accept all forms of escapes.

> BTW, since there are a gazillion ways to encode strings into
> 7-bit ASCII, why not use the new codec design to add additional
> output schemes for 8-bit strings ?!
> 
> Strings have an .encode() method as well...

Good idea!  This could also be used to "hexify" a string, for which
currently one of the quickest ways is still the hack

    "%02x"*len(s) % tuple(map(ord, s))

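A sketch of the same hexifying idea in present-day Python 3 (an assumption;
this thread predates it). There, iterating over a bytes object yields
integers, so the formatting trick needs no ord() conversion, and the result
can be compared with bytes.hex() and binascii.hexlify():

```python
import binascii

s = b"\x00\xffPy"

# One "%02x" per byte, filled in from the byte values.
hack = "%02x" * len(s) % tuple(s)

# The same result from the library.
via_method = s.hex()
via_binascii = binascii.hexlify(s).decode("ascii")

print(hack, via_method, via_binascii)
```
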
--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@python.org  Tue Jan 16 14:11:53 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:11:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 06:20:18 EST."
 <20010116062018.A12935@thyrsus.com>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>
 <20010116062018.A12935@thyrsus.com>
Message-ID: <200101161411.JAA05336@cj20424-a.reston1.va.home.com>

> Finn Bock <bckfnn@worldonline.dk>:
> > I like it, because it removes yet another difference between Python and
> > Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> > and \r.

[ESR]
> This is an argument for adding \b and \f to the special set in
> CPython.  If the BDFL looks benignly on adding \v and \a, those
> should go into Jython's special set too.

No, I think Jython should remove \b and \f.  Or the language standard
could allow implementations some freedom here (as long as the output
is a string literal).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fdrake@acm.org  Tue Jan 16 15:06:34 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 16 Jan 2001 10:06:34 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
References: <20010115162619.A19484@kronos.cnri.reston.va.us>
 <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <14948.25466.698063.240902@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
 > Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
 > would agree this is not an ideal decision procedure.

  I've been using PyUNIT some, but haven't tried the Quixote unittest
module, which tells me I can't make a particularly informed
recommendation (vote, whatever).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From thomas@xs4all.net  Tue Jan 16 15:23:52 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 16:23:52 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:55:01AM -0500
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl> <20010116092211.O1005@xs4all.nl> <200101161355.IAA04802@cj20424-a.reston1.va.home.com>
Message-ID: <20010116162350.A21010@xs4all.nl>

On Tue, Jan 16, 2001 at 08:55:01AM -0500, Guido van Rossum wrote:

> Let's not redesign the time module API too much.

[snip]

Agreed.

> I fear that a committee could easily pee away years designing an
> interface to satisfy absolutely every wish.

A committee is a life form with six or more legs and no brain.
    Lazarus Long in "Time Enough For Love", by R. A. Heinlein.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@mojam.com (Skip Montanaro)  Tue Jan 16 17:23:56 2001
From: skip@mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 11:23:56 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net>
 <m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14948.33708.332464.107009@beluga.mojam.com>

    Michael> ... (or I'll just call it pyttyinput)

Which, like "Guido", when properly pronounced should leave your monitor
slightly moist... ;-)

Skip



From thomas@xs4all.net  Tue Jan 16 17:36:03 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 18:36:03 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <14948.33708.332464.107009@beluga.mojam.com>; from skip@mojam.com on Tue, Jan 16, 2001 at 11:23:56AM -0600
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net> <m366jf4esw.fsf@atrus.jesus.cam.ac.uk> <14948.33708.332464.107009@beluga.mojam.com>
Message-ID: <20010116183603.B2776@xs4all.nl>

On Tue, Jan 16, 2001 at 11:23:56AM -0600, Skip Montanaro wrote:

> Which, like "Guido", when properly pronounced should leave your monitor
> slightly moist... ;-)

Nono, 'Guido' should be pronounced using a hard, back-of-your-throat 'G',
more like a growl than a hiss. The less moisture the better :)

You-were-thinking-of-Centraal-Wiskunde-Instituut-(cwi.nl)-ly y'rs,

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From trentm@ActiveState.com  Tue Jan 16 18:36:29 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 10:36:29 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:08:46PM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>
Message-ID: <20010116103626.D30209@ActiveState.com>

On Mon, Jan 15, 2001 at 11:08:46PM -0500, Guido van Rossum wrote:
> 
> Trent, you wrote that code.  Why wouldn't this work just as well?
> 
> (your code):
> 			if ((pos = TELL64(fileno(fp))) == -1L)
> 				return -1;
> (my suggestion):
> 			if (fgetpos(fp, &pos) != 0)
> 				return -1;

I agree, that looks to me like it would. I guess I just missed that when I
wrote it.

> 
> I would even go as far as to collapse the entire switch as follows:
> 
> 	fpos_t pos;
> 	switch (whence) {
> 	case SEEK_END:
> 		/* do a "no-op" seek first to sync the buffering so that
> 		   the low-level tell() can be used correctly */
> 		if (fseek(fp, 0, SEEK_END) != 0)
> 			return -1;
> 		/* fall through */
> 	case SEEK_CUR:
> 		if (fgetpos(fp, &pos) != 0)
> 			return -1;
> 		offset += pos;
> 		break;
> 	/* case SEEK_SET: break; */
> 	}
> 	return fsetpos(fp, &offset);

Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
longer applicable. I am not setup to test this on Win64 right and I don't
suppose there are a lot of you out there with your own Win64 setups. I will
be able to test this before the scheduled 2.1 beta (late Feb), though.
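
The collapsed switch is easier to follow when written out in Python.  The
sketch below is only an illustration of the same control flow (turn a
relative seek into an absolute one, then seek once); the function name
seek_absolute is made up for this example and is not part of the patch.

```python
import io
import os


def seek_absolute(fp, offset, whence=os.SEEK_SET):
    """Resolve (offset, whence) to an absolute position, then seek once."""
    if whence == os.SEEK_END:
        # Move to the end first so that tell() reports the file size.
        fp.seek(0, os.SEEK_END)
        offset += fp.tell()
    elif whence == os.SEEK_CUR:
        # Relative to the current position.
        offset += fp.tell()
    # os.SEEK_SET needs no adjustment: offset is already absolute.
    return fp.seek(offset, os.SEEK_SET)


buf = io.BytesIO(b"0123456789")
buf.seek(4)
from_current = seek_absolute(buf, 2, os.SEEK_CUR)
from_end = seek_absolute(buf, -3, os.SEEK_END)
from_start = seek_absolute(buf, 1, os.SEEK_SET)
```

In the C version the SEEK_END case falls through into the SEEK_CUR case;
Python has no fall-through, so the position lookup is repeated instead.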

Trent


-- 
Trent Mick
TrentM@ActiveState.com


From trentm@ActiveState.com  Tue Jan 16 19:34:17 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 11:34:17 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116103626.D30209@ActiveState.com>; from trentm@ActiveState.com on Tue, Jan 16, 2001 at 10:36:29AM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com> <20010116103626.D30209@ActiveState.com>
Message-ID: <20010116113417.I30209@ActiveState.com>

On Tue, Jan 16, 2001 at 10:36:29AM -0800, Trent Mick wrote:
> Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
> longer applicable. I am not setup to test this on Win64 right and I don't

s/right/right now/


Trent

-- 
Trent Mick
TrentM@ActiveState.com


From cgw@fnal.gov  Tue Jan 16 20:19:09 2001
From: cgw@fnal.gov (Charles G Waldman)
Date: Tue, 16 Jan 2001 14:19:09 -0600 (CST)
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
Message-ID: <14948.44221.876681.838046@buffalo.fnal.gov>

Fredrik - I noticed that you chose to check in a slightly different
patch than the one I submitted.

I wonder why you chose to do this?  In particular at line 1238 I had:

    if (PyErr_Occurred()) {
        Py_DECREF(self);
        return NULL;
    }

and you changed this to 

    if (PyErr_Occurred()) {
        PyObject_DEL(self);
        return NULL;
    }

Can you explain why you made this (seemingly arbitrary) change? 

I think that since "self" was created via:

 self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);

which calls PyObjectINIT, which in turn calls _Py_NewReference, which
increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
to de-allocate it -- won't this screw up the value of _Py_RefTotal?

Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
defined - I just wanted to check to make sure my understanding of
reference counting w.r.t. memory allocation and deallocation is
correct - if the above is in error, I'd appreciate any corrections...



From guido@python.org  Tue Jan 16 20:53:41 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 15:53:41 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Tue, 16 Jan 2001 10:36:29 PST."
 <20010116103626.D30209@ActiveState.com>
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>
 <20010116103626.D30209@ActiveState.com>
Message-ID: <200101162053.PAA13099@cj20424-a.reston1.va.home.com>

> I agree, that looks to me like it would. I guess I just missed that when I
> wrote it.

Excellent!  I've checked this in now -- we'll hear if it breaks
anywhere soon enough.

>I am not setup to test this on Win64 right [now] and I don't
> suppose there are a lot of you out there with your own Win64 setups.

What happened to ActiveState's Itanium boxes?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Tue Jan 16 21:53:22 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 16 Jan 2001 16:53:22 -0500
Subject: [Python-Dev] Re: Detecting install time
In-Reply-To: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:47:32PM -0500
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> <200101160347.WAA01132@cj20424-a.reston1.va.home.com>
Message-ID: <20010116165322.B29674@kronos.cnri.reston.va.us>

[CC'ing to the distutils-sig]

On Mon, Jan 15, 2001 at 10:47:32PM -0500, Guido van Rossum wrote:
>> For PEP 229, the setup.py script needs to figure out if it's running
>> from the build directory, because then distutils.sysconfig needs to
>
>You could check for the presence of config.status -- that file is not
>installed.

This isn't a check suitable for inclusion in distutils.sysconfig,
though, because it's so liable to being fooled (consider a
Distutils-packaged module that comes with a configure script to build
some library).  Right now I'm using a hacked version of sysconfig with
several patches like this:

@@ -120,12 +121,16 @@
 def get_config_h_filename():
     """Return full pathname of installed config.h file."""
     inc_dir = get_python_inc(plat_specific=1)
+    # XXX
+    if 1: inc_dir = '.'
     return os.path.join(inc_dir, "config.h")
 
One hackish approach would be to add an assume_build_directories() to
distutils.sysconfig, a little back door to be used by the setup.py
script that comes with Python, so the above would become 'if
build_time_flag: ...'.  Anyone have a cleaner idea?
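
A minimal sketch of what such a back door might look like, written as a
standalone module so it can be run on its own.  Everything here is
hypothetical: the flag name, the signature of assume_build_directories(),
and the installed_inc_dir parameter (standing in for the result of
get_python_inc()) are assumptions for illustration, not the real
distutils.sysconfig API.

```python
import os

# Hypothetical module-level switch; False means "use installed paths".
_build_time_flag = False


def assume_build_directories(flag=True):
    """Tell the config helpers to look in the build tree (hypothetical)."""
    global _build_time_flag
    _build_time_flag = flag


def get_config_h_filename(installed_inc_dir):
    """Return the path of config.h, honouring the build-time switch."""
    inc_dir = "." if _build_time_flag else installed_inc_dir
    return os.path.join(inc_dir, "config.h")


installed = get_config_h_filename("/usr/local/include/python2.1")
assume_build_directories()
in_build_tree = get_config_h_filename("/usr/local/include/python2.1")
```

Only the setup.py shipped with Python would call the switch, so ordinary
Distutils-packaged modules would keep getting the installed paths.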

--amk



From akuchlin@mems-exchange.org  Wed Jan 17 01:46:47 2001
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Tue, 16 Jan 2001 20:46:47 -0500
Subject: [Python-Dev] PEP 229 issues
Message-ID: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com>

I'm in a quandary about the patch implementing PEP 229.  The patch is
quite close to being ready, with only a few minor issues remaining,
but to fix those issues, I need to make some changes to the Distutils,
such as the sysconfig modification I recently suggested. 

Problem: I believe the patch *must* go in at the alpha stage, because
there are bound to be lots of platform-specific problems that will
show up; it should not be added in the beta stage, because it'll need
time to get tested and debugged, and I wouldn't be surprised if it has
to be reverted later because of some insurmountable problem.

Problem: Greg Ward, the Distutils maintainer, is away at the moment.
I can check in changes to the Distutils without his say-so, but when
Greg gets back he might shriek in horror and rip all of the changes
out again.  (Or he's stuck with maintaining them until 2.2.)

Problem: 2.1alpha1 is due on Friday.

So, what to do?  If I know there's going to be an alpha2, that's
probably fine; Greg should have resurfaced by then, and the patch can
go in for alpha2.  

Or, I can check in the changes before Friday, and if they're
unacceptable, they can be fixed for alpha2/beta1, or simply backed
out.  

Or, I can leave Distutils alone and make setup.py a tissue of hacks
and workarounds.  For example, it might insert new versions of various
functions into the distutils.sysconfig module.  Icky and fragile, but
cleaning it up for beta1 would then be a priority.

Suggestions?  Pronouncements?

--amk


From guido@python.org  Wed Jan 17 01:39:35 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 20:39:35 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: Your message of "Tue, 16 Jan 2001 20:46:47 EST."
 <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com>
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com>
Message-ID: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>

I expect that there will be an alpha2, but I still recommend that you
check in *something* that works for alpha1, to get maximal testing
coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
with our big patches, respectively nested scopes and rich comparisons,
that we really want to have in alpha1).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jan 17 02:04:53 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 16 Jan 2001 21:04:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>

[Guido]
> Good idea [using string.encode()]!  This could also be used to
> "hexify" a string, for which currently one of the quickest ways
> is still the hack
>
>     "%02x"*len(s) % tuple(s)

Note that as of 2.0, a far quicker way is to use binascii.b2a_hex(), or its
absurdist (read "Barry" <wink>) synonym binascii.hexlify().

I'm wary of using string.encode() for this, because one normally hexlifies
binary data (e.g., like sha checksums), and 4 days of 7 we're more than not
in favor of moving away from strings to carry binary data.

Of course we can change our minds about this across releases, and have
even-numbered releases deprecate the function forms while odd-numbered ones
abjure methods.  Works for me <wink>.
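
For comparison, here is a small sketch of both approaches in present-day
Python.  It assumes Python 3, where iterating over a bytes object yields
integers, so the formatting hack applies directly; with a 2.x str each
character would first need ord().  The function names are made up for this
example.

```python
import binascii


def hexify_format(data):
    """Hex-encode bytes with the repeated-format-string hack."""
    return "%02x" * len(data) % tuple(data)


def hexify_binascii(data):
    """Hex-encode bytes with binascii.hexlify (alias of b2a_hex)."""
    return binascii.hexlify(data).decode("ascii")


sample = b"\x00\xffA"
via_format = hexify_format(sample)
via_binascii = hexify_binascii(sample)
```

Both calls produce the same text; the binascii version does the work in C
and avoids building a tuple with one element per byte.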



From nas@arctrix.com  Tue Jan 16 21:08:23 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 16 Jan 2001 13:08:23 -0800
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
Message-ID: <20010116130823.C9640@glacier.fnational.com>

This message was on the debian-python list.  Does anyone know why
the patch is needed?

  Neil

----- Forwarded message from Danie Roux <droux@tuks.co.za> -----

Date: Tue, 16 Jan 2001 11:44:48 +0200
From: Danie Roux <droux@tuks.co.za>
Subject: Our application doesn't work with Debian packaged Python
To: Debian Python <debian-python@lists.debian.org>

Good day all,

Our program is an archiver for gnome that uses gnome-python with one
widget written in C.

I converted our program to autoconf and automake so anyone can (and please
do!) compile it and see what I mean.

Everything compiles fine. But when it runs it just throws a weird
exception.

The funny thing is, if I alien RedHat 6.2's python package, and install
that, it works! I need to change nothing else. Only the python package.

I then went and looked at the source rpm. They have this patch in there:

--- Python-1.5.2/Python/importdl.c.global	Sat Jul 17 16:52:26 1999
+++ Python-1.5.2/Python/importdl.c	Sat Jul 17 16:53:19 1999
@@ -441,13 +441,13 @@
 #ifdef RTLD_NOW
 		/* RTLD_NOW: resolve externals now
 		   (i.e. core dump now if some are missing) */
-		void *handle = dlopen(pathname, RTLD_NOW);
+		void *handle = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL);
 #else
 		void *handle;
 		if (Py_VerboseFlag)
 			printf("dlopen(\"%s\", %d);\n", pathname,
-			       RTLD_LAZY);
-		handle = dlopen(pathname, RTLD_LAZY);
+			       RTLD_LAZY | RTLD_GLOBAL);
+		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);
 #endif /* RTLD_NOW */
 		if (handle == NULL) {
 			PyErr_SetString(PyExc_ImportError, dlerror());

Sure enough this fixes my problem. The thing is that this means our
program only works on Redhat (and whoever else patched python 1.5.2 with this).

So what can I do now? How can I get this patch into debian-python? How can
I change my program to not need the patch?

btw the program is garchiver, it will be hosted at sourceforge as soon as
they get back to me, in the mean time I will mail anyone a copy of the
sources.

-- 
Danie Roux *shuffle* Adore Unix


-- 
To UNSUBSCRIBE, email to debian-python-request@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org


----- End forwarded message -----


From guido@python.org  Wed Jan 17 04:16:48 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:16:48 -0500
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
In-Reply-To: Your message of "Tue, 16 Jan 2001 13:08:23 PST."
 <20010116130823.C9640@glacier.fnational.com>
References: <20010116130823.C9640@glacier.fnational.com>
Message-ID: <200101170416.XAA20515@cj20424-a.reston1.va.home.com>

> This message was on the debian-python list.  Does anyone know why
> the patch is needed?

> -		handle = dlopen(pathname, RTLD_LAZY);

> +		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);

This comes back every once in a while.  It means that they have a
module whose shared library implementation exports symbols that are
needed by another shared library (probably another module).

IMO this approach is evil, because RTLD_GLOBAL means that *all*
external symbols defined by any module are exported to all other
shared libraries, and this will cause conflicts if the same symbol is
exported by two different modules -- which can happen quite easily.
(I don't know what happens on conflicts -- maybe you get an error,
maybe it links to the wrong symbol.)

The proper solution would be to put the needed entry points beside the
init<module> entry point in a separate shared library.  But that's
often not how quick-and-dirty extension modules are designed...
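
Later Python versions let an application opt in to RTLD_GLOBAL for a single
import instead of patching the interpreter: sys.setdlopenflags() arrived in
Python 2.2, and the os.RTLD_GLOBAL constant in 3.3.  The sketch below
assumes Python 3 on a dlopen() platform; the helper name
import_with_global_symbols is made up, and "json" merely stands in for the
extension module that needs its symbols exported.

```python
import importlib
import os
import sys


def import_with_global_symbols(module_name):
    """Import module_name with RTLD_GLOBAL added to the dlopen() flags."""
    if not hasattr(sys, "setdlopenflags"):
        # No dlopen() on this platform (e.g. Windows): plain import.
        return importlib.import_module(module_name)
    saved = sys.getdlopenflags()
    sys.setdlopenflags(saved | os.RTLD_GLOBAL)
    try:
        return importlib.import_module(module_name)
    finally:
        # Restore the flags so later imports keep their symbols private.
        sys.setdlopenflags(saved)


before = sys.getdlopenflags() if hasattr(sys, "getdlopenflags") else None
module = import_with_global_symbols("json")
```

Restoring the flags afterwards limits the symbol-clash risk described above
to the one module that actually needs its symbols shared.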

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@python.org  Wed Jan 17 04:22:54 2001
From: guido@python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:22:54 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
Message-ID: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>

I've got a working version of the rich comparisons ready for preview.

The patch is here:

  http://www.python.org/~guido/richdiff.txt

It's also referenced at sourceforge:

  http://sourceforge.net/patch/?func=detailpatch&patch_id=103283&group_id=5470

Here's a summary:

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).  No other implications are made; in
  particular, Python does not assume that == is the inverse of !=, or
  that < is the inverse of >=.  This makes it possible to define types
  with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.

  XXX TO DO for this feature:

  - the test "test_compare" fails, because of the changed semantics
    for complex number comparisons (1j<2j raises an error now)
  - tuple, dict should implement EQ/NE so containers containing
    complex numbers can be compared for equality (list is already
    done) -- or complex numbers should be reverted to old behavior
  - list.sort() should use rich comparison
  - check for memory leaks
  - int, long, float contain new-style-cmp functions that aren't used
    to their full potential any more (the new-style-cmp functions
    introduced by Neil's coercion work are gone again)
  - decide on unresolved issues from PEP 207
  - documentation
  - more testing
  - compare performance to 2.0 (microbench?)

Please give this a good spin -- I'm hoping to check this in and
make it part of the alpha 1 release Friday...
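
To make the non-Boolean case concrete, here is a minimal sketch of an
elementwise comparison.  It assumes Python 3, where the truth-value hook is
spelled __bool__ rather than __nonzero__; the Vec class is invented for
this example and is not part of the patch.

```python
class Vec:
    """Tiny vector whose < operator compares element by element."""

    def __init__(self, *values):
        self.values = list(values)

    def __lt__(self, other):
        if not isinstance(other, Vec):
            return NotImplemented
        # The result is another Vec of 0/1 flags, not a single Boolean.
        return Vec(*(int(x < y) for x, y in zip(self.values, other.values)))

    def __bool__(self):
        # Refuse to guess what "if vec:" should mean.
        raise TypeError("truth value of a Vec is ambiguous")


mask = Vec(1, 5, 3) < Vec(2, 4, 9)
```

Here mask.values holds one flag per position, and using mask in an if
statement raises TypeError instead of silently picking an answer.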

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Wed Jan 17 04:50:25 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 16 Jan 2001 23:50:25 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>
Message-ID: <14949.9361.591610.684695@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one@home.com> writes:

    TP> Note that as of 2.0, a far quicker way is to use
    TP> binascii.b2a_hex(), or its absurdist (read "Barry" <wink>)
    TP> synonym binascii.hexlify().

Thanks for the compliment Tim, but I can't take credit for that name.
If it was me I'd have called it wudduptify() (and its inverse,
notmuchlify()).  I stole the name from Emacs's hexlify-buffer function
which kind of does the same thing.

would-converting-to-octal-digits-be-called-octopuslify-ly y'rs,
-Barry


From fredrik@effbot.org  Wed Jan 17 08:12:32 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 09:12:32 +0100
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
References: <14948.44221.876681.838046@buffalo.fnal.gov>
Message-ID: <00fe01c0805d$432d4cd0$e46940d5@hagrid>

Charles G Waldman wrote:
> Can you explain why you made this (seemingly arbitrary) change? 
> 
> I think that since "self" was created via:
> 
>  self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);
> 
> which calls PyObjectINIT, which in turn calls _Py_NewReference, which
> increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
> to de-allocate it -- won't this screw up the value of _Py_RefTotal?

and what do you think will happen if you call the destructor before
you've initialized all pointer fields in the object?

(according to the docs, the NEW/New functions return uninitialized
memory.  in this case, we're bailing out before the object has been
fully initialized.  pattern_dealloc definitely isn't prepared to deal with
random pointer values...)

> Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
> defined - I just wanted to check to make sure my understanding of
> reference counting w.r.t. memory allocation and deallocation is
> correct - if the above is in error, I'd appreciate any corrections...

same here.  I don't doubt it's working as you say it does, but I find it
strange that you shouldn't be able to DEL an object you just created
with NEW...  maybe DEL should be fixed?

Cheers /F



From thomas@xs4all.net  Wed Jan 17 09:48:12 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 10:48:12 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules Setup.config.in,1.7,1.8 Setup.dist,1.7,1.8
In-Reply-To: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>; from esr@users.sourceforge.net on Wed, Jan 17, 2001 at 12:25:13AM -0800
References: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117104812.F2776@xs4all.nl>

On Wed, Jan 17, 2001 at 12:25:13AM -0800, Eric S. Raymond wrote:

> + # ndbm(3) may require -lndbm or similar
> + @USE_NDBM_MODULE@ndbm ndbmmodule.c @HAVE_LIBNDBM@

This is an interesting module... It's not in the Modules/ directory :-) Did
you mean 'dbmmodule.c' with a different library argument ? 

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From skip@mojam.com (Skip Montanaro)  Wed Jan 17 15:17:39 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 09:17:39 -0600 (CST)
Subject: [Python-Dev] Rich comparison confusion
Message-ID: <14949.46995.259157.871323@beluga.mojam.com>

I'm a bit confused about Guido's rich comparison stuff.  In the description
he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
From a boolean standpoint this just can't be so.  Guido mentions partial
orderings, but I'm still confused.  Consider this example: Objects of type A
implement rich comparisons.  Objects of type B don't.  If my code looks like

    a = A()
    b = B()
    ...
    if b < a:
        ...

My interpretation of the rich comparison stuff is that either

    1. Since b doesn't implement rich comparisons, the interpreter falls
       back to old fashioned comparisons which may or may not allow the
       comparison of B objects and A objects.

    or

    2. The sense of the inequality is switched (a > b) and the rich
       comparison code in A's implementation is called.

That's my reading of it.  It has to be wrong.  The inverse comparison should
be a >= b, not a > b, but the described pairing of comparison functions
would imply otherwise.

I'm sure I'm missing something obvious or revealing some fundamental failure
of my grade school education.  Please explain...

Skip



From akuchlin@mems-exchange.org  Wed Jan 17 15:42:13 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 10:42:13 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:39:35PM -0500
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> <200101170139.UAA17954@cj20424-a.reston1.va.home.com>
Message-ID: <20010117104213.B490@kronos.cnri.reston.va.us>

On Tue, Jan 16, 2001 at 08:39:35PM -0500, Guido van Rossum wrote:
>I expect that there will be an alpha2, but I still recommend that you
>check in *something* that works for alpha1, to get maximal testing
>coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
>with our big patches, respectively nested scopes and rich comparisons,
>that we really want to have in alpha1).

OK; thanks for the pronouncement!

I've checked in all the smaller changes that shouldn't break anything.
All that's left now is to actually enable the new feature, which
requires the nasty changes:

     * In the top-level Makefile.in, the "sharedmods" target simply
       runs "./python setup.py build", and "sharedinstall" runs
       "./python setup.py install".  The "clobber" target also deletes
       the build/ subdirectory where Distutils puts its output.

     * Rip stuff out of the Setup files.  Modules/Setup.config.in only
       contains entries for the gc and thread modules; the readline,
       curses, and db modules are removed because it's now setup.py's
       job to handle them.
 
     * Modules/Setup.dist now contains entries for only 3 modules --
       _sre, posix, and strop.

Guido and Jeremy are rushing to finish their patches in time for the
alpha release, though Guido seems to be checking in the rich
comparison stuff now.  I don't want to impede them by making them stop
to debug build problems, so I can either wait until they've landed
their changes (at which point there's nothing major left, I think), or
they can simply not do a 'cvs update' after the serious changes go in.
Thoughts?

--amk


From barry@digicool.com  Wed Jan 17 15:54:06 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 10:54:06 -0500
Subject: [Python-Dev] Breakage in latest CVS
Message-ID: <14949.49182.636526.292265@anthem.wooz.org>

Looks like the latest CVS (updated just minutes ago) is broken.  I'm
trying to fix some of these complaints, but thought I'd at least
report what I've found...

-Barry

...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c floatobject.c -o floatobject.o
floatobject.c:675: warning: excess elements in struct initializer after `float_as_number'
floatobject.c:700: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
floatobject.c:700: initializer element for `PyFloat_Type.tp_flags' is not constant
...
intobject.c:800: warning: excess elements in struct initializer after `int_as_number'
intobject.c:825: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
intobject.c:825: initializer element for `PyInt_Type.tp_flags' is not constant
make[1]: *** [intobject.o] Error 1
...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c longobject.c -o longobject.o
longobject.c:1865: warning: excess elements in struct initializer after `long_as_number'
longobject.c:1890: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
longobject.c:1890: initializer element for `PyLong_Type.tp_flags' is not constant
make[1]: *** [longobject.o] Error 1


From guido@python.org  Wed Jan 17 16:09:27 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 11:09:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Wed, 17 Jan 2001 09:17:39 CST."
 <14949.46995.259157.871323@beluga.mojam.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <200101171609.LAA04102@cj20424-a.reston1.va.home.com>

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

> From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.

It's case 2.

> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.

We're trying very hard *not* to make any connections between a<b and
a>=b.  You've learned in grade school that these are each other's
Boolean inverse (a<b is true iff a>=b is false).  However, for partial
orderings this may not be true: for unordered a and b, none of a<b,
a<=b, a>b, a>=b, a==b may be true.

On the other hand, even for partially ordered types, a<b and b>a
(note: swapped arguments *and* swapped sense of comparison) always
give the same outcome!
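
A small sketch of such a partial ordering, using divisibility on positive
integers ("a <= b" means "a divides b").  The Divisor class is invented for
this example and assumes Python 3 semantics for the comparison methods.

```python
class Divisor:
    """Positive integer ordered by divisibility (a partial order)."""

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return self.n == other.n

    def __ne__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return self.n != other.n

    def __le__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return other.n % self.n == 0

    def __lt__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return self.n != other.n and other.n % self.n == 0

    def __ge__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return self.n % other.n == 0

    def __gt__(self, other):
        if not isinstance(other, Divisor):
            return NotImplemented
        return self.n != other.n and self.n % other.n == 0


two, four, six = Divisor(2), Divisor(4), Divisor(6)
# 4 and 6 are unordered: neither divides the other.
unordered = [four < six, four <= six, four > six, four >= six, four == six]
# Swapping both the arguments and the operator never changes the answer.
swapped_agree = (two < six) == (six > two) and (four < six) == (six > four)
```

Every entry of unordered is False, so a<b cannot be derived by negating
a>=b, while swapped_agree is True.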

> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

I think what threw you off was the ambiguity of "inverse".  This means
Boolean negation.  I'm not relying on Boolean negation here -- I'm
relying on the more fundamental property that a<b and b>a have the
same outcome.
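
Skip's case 2 can be watched directly.  The sketch below assumes Python 3,
where the swapped-argument fallback works as described here; classes A and
B follow his example, and the calls list exists only to record which
method ran.

```python
calls = []


class A:
    """Implements only __gt__."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        calls.append("A.__gt__")
        return self.value > other.value


class B:
    """Implements no comparison methods at all."""

    def __init__(self, value):
        self.value = value


a, b = A(5), B(3)
result = b < a  # B has no __lt__, so Python calls a.__gt__(b)

try:
    b >= a  # would need b.__ge__ or a.__le__; nothing negates __gt__
except TypeError:
    ge_supported = False
else:
    ge_supported = True
```

The first comparison succeeds through A.__gt__; the second fails because
no answer is ever inferred from a different operator.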

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh21@cam.ac.uk  Wed Jan 17 16:13:32 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 17 Jan 2001 16:13:32 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Skip Montanaro's message of "Wed, 17 Jan 2001 09:17:39 -0600 (CST)"
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>

Skip Montanaro <skip@mojam.com> writes:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.
> 
> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.
> 
> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

For a total order:

a < b if and only if b > a.
This is what the rich comparison code does.

a < b if and only if not (a >= b).
This is what the rich comparison code doesn't do.

Does this make sense?

Cheers,
M.

-- 
  Presumably pronging in the wrong place zogs it.
                                        -- Aldabra Stoddart, ucam.chat



From moshez@zadka.site.co.il  Thu Jan 18 00:08:06 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Thu, 18 Jan 2001 02:08:06 +0200 (IST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <14949.46995.259157.871323@beluga.mojam.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>

On Wed, 17 Jan 2001 09:17:39 -0600 (CST), Skip Montanaro <skip@mojam.com> wrote:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

I think that you're confused between two meanings of inverses.

You think:
op is an inverse of op' if for every a,b  (a op b) = not (a op' b)

Guido meant (and I hope, implemented):
op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

And a<b iff b>a 
a<=b iff b>=a

Sounds sane.

Unless I'm the one confused....
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!


From fredrik@effbot.org  Wed Jan 17 16:47:29 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 17:47:29 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>
Message-ID: <012901c080a5$306023a0$e46940d5@hagrid>

tim wrote:
> > Should I check it in?
> 
> Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
> days to deal with surprises before the alpha release.

as it turned out, the source I had didn't build, and the table-
building python script generated something that wasn't quite
compatible with the C code.  bit rot.

I've almost sorted it all out.  will check it in later tonight (local
time).

</F>



From tim.one@home.com  Wed Jan 17 18:27:11 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 17 Jan 2001 13:27:11 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Tools/idle CallTipWindow.py,1.2,1.3 CallTips.py,1.7,1.8 ClassBrowser.py,1.11,1.12 Debugger.py,1.14,1.15 Delegator.py,1.2,1.3 FileList.py,1.7,1.8 FormatParagraph.py,1.8,1.9 IdleConf.py,1.5,1.6 IdleHistory.py,1.3,1
In-Reply-To: <200101171358.IAA27661@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBIJAA.tim.one@home.com>

[an anonymous developer panics, after Tim "reindent"s the IDLE dir]

> Oh no!
>
> I have a whole slew of changes to IDLE sitting in my work directory.
> If I do an update half of these will turn into merge conflicts. :-(
>
> Don't worry, I'll get over it.

I imagine this will pop up from time to time until everything is normalized.
If it's about to burn you, run reindent.py on the affected directory
*before* you update ("python reindent.py -v .").  That will make all the
same changes to your local versions as were checked in, modulo the rare
hand-edit (of which there were none in the IDLE directory).



From akuchlin@mems-exchange.org  Wed Jan 17 19:04:04 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 14:04:04 -0500
Subject: [Python-Dev] PEP 229 checked in
Message-ID: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>

I've checked in the last bit of the PEP 229 changes.  Be sure to
rename your Modules/Setup file (or do a 'make distclean') before
rebuilding.  Squeal if you run into trouble, or file bugs on SF.

--am"Aieee!"k


From jeremy@alum.mit.edu  Wed Jan 17 19:12:47 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 14:12:47 -0500 (EST)
Subject: [Python-Dev] unexpected consequence of function attributes
Message-ID: <14949.61103.258714.325465@localhost.localdomain>

I have found one place in the library that depended on 
hasattr(func, '__dict__') to return false -- dis.dis.  You might want
to check and see if there is any other code that doesn't expect
functions to have extra attributes.  I expect that only introspective
code would be affected.
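
A short sketch of the behaviour in question, run on a current Python 3
(where func_code is spelled __code__).  The function and attribute names
are made up for illustration.

```python
def func():
    return 42

# Functions carry a (normally empty) attribute dictionary.
has_dict = hasattr(func, "__dict__")

# Arbitrary attributes land in that dictionary.
func.author = "jeremy"
attrs = dict(func.__dict__)

# Introspective code is safer testing for what it actually needs,
# here the code object, than inferring the type from __dict__.
has_code = hasattr(func, "__code__")

print(has_dict, attrs, has_code)
```

Code that used `hasattr(x, '__dict__')` to mean "x is a class or module"
will now also match plain functions.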

Jeremy


From barry@wooz.org  Wed Jan 17 19:46:36 2001
From: barry@wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:46:36 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63132.583025.303677@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

I guess we need a test_dis.py in the regression test suite, eh? :)
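
Something along these lines might serve as a starting point for such a
test.  It is only a sketch for a current Python 3, not the test that was
eventually checked in; `sample`, `Holder` and `disassembly_of` are
hypothetical names.

```python
import contextlib
import dis
import io

def sample():
    return 1

# Give the function an extra attribute, so that it has a non-empty
# __dict__, the situation that confused dis.dis().
sample.note = "function attribute"

class Holder:
    def method(self):
        return 2

def disassembly_of(obj):
    """Return what dis.dis(obj) prints, as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        dis.dis(obj)
    return buffer.getvalue()

function_listing = disassembly_of(sample)
method_listing = disassembly_of(Holder().method)
print(function_listing)
```

The exact opcode names differ between interpreter versions, so a test
should check only that some bytecode listing is produced.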

Here's an extremely quick and dirty fix to dis.py.
-Barry

-------------------- snip snip --------------------
Index: dis.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/dis.py,v
retrieving revision 1.28
diff -u -r1.28 dis.py
--- dis.py	2001/01/14 23:36:05	1.28
+++ dis.py	2001/01/17 19:45:40
@@ -15,6 +15,10 @@
         return
     if type(x) is types.InstanceType:
         x = x.__class__
+    if hasattr(x, 'func_code'):
+        x = x.func_code
+    if hasattr(x, 'im_func'):
+        x = x.im_func
     if hasattr(x, '__dict__'):
         items = x.__dict__.items()
         items.sort()
@@ -28,17 +32,12 @@
                 except TypeError, msg:
                     print "Sorry:", msg
                 print
+    elif hasattr(x, 'co_code'):
+        disassemble(x)
     else:
-        if hasattr(x, 'im_func'):
-            x = x.im_func
-        if hasattr(x, 'func_code'):
-            x = x.func_code
-        if hasattr(x, 'co_code'):
-            disassemble(x)
-        else:
-            raise TypeError, \
-                  "don't know how to disassemble %s objects" % \
-                  type(x).__name__
+        raise TypeError, \
+              "don't know how to disassemble %s objects" % \
+              type(x).__name__
 
 def distb(tb=None):
     """Disassemble a traceback (default: last traceback)."""


From barry@wooz.org  Wed Jan 17 19:49:51 2001
From: barry@wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:49:51 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63327.22745.359978@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy@alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

Patch #103303

http://sourceforge.net/patch/?func=detailpatch&patch_id=103303&group_id=5470


From tim.one@home.com  Wed Jan 17 20:51:57 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 17 Jan 2001 15:51:57 -0500
Subject: [Python-Dev] Windows Python totally hosed
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>

Failures range from

test test_winsound skipped --  Module use of python20.dll
    conflicts with this version of Python.

to

test test_tokenize crashed -- exceptions.AttributeError: 're' module
    has no attribute 'compile'

I suspect the latter is really a disguised version of

C:\Code\python\dist\src\PCbuild>python
Python 2.1a1 (#8, Jan 17 2001, 13:15:23) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import re
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\code\python\dist\src\lib\re.py", line 28, in ?
    from sre import *
  File "c:\code\python\dist\src\lib\sre.py", line 17, in ?
    import sre_compile
  File "c:\code\python\dist\src\lib\sre_compile.py", line 11, in ?
    import _sre
ImportError: Module use of python20.dll conflicts with this version of
Python.
>>>

Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
it out, but if anyone knows the cure off the top of their head, don't be
shy!



From akuchlin@mems-exchange.org  Wed Jan 17 21:00:56 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 16:00:56 -0500
Subject: [Python-Dev] Re: 'Setup' buglet
In-Reply-To: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 17, 2001 at 02:28:36PM -0500
References: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>
Message-ID: <20010117160056.A20603@kronos.cnri.reston.va.us>

[Taking this bug public]

On Wed, Jan 17, 2001 at 02:28:36PM -0500, Guido van Rossum wrote:
>One problem seems to be that the creation
>of the (minimal) Modules/Setup file doesn't seem to be doing the right
>thing.  When I delete Modules/Setup, the next "make" doesn't create
>it; it used to be copied from Setup.dist if it doesn't exist.

This seems to have been removed from Modules/Makefile.pre.in in
revision 1.69 by Fred; instead the configure script now copies
Setup.dist to Setup, so you have to rerun configure in order to create
Modules/Setup after deleting it.  

--amk


From mal@lemburg.com  Wed Jan 17 21:04:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 22:04:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
Message-ID: <3A6608DD.E12A2422@lemburg.com>

I've just checked in a patch which removes all uses of the
assert statement in the regression tests. This makes the
tests compatible with the -O mode of Python and also allows
centralizing error reporting (many tests already provide their
own little test function for this purpose).

I urge you to only check in tests which use the new API
verify() to verify a certain condition. The API is defined
in the regression tools module test_support.
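
For readers who have not looked at the helper yet, a minimal sketch of
what such a function can look like.  The real definition is the one in
test_support; the exception name and the default message here are
assumptions for illustration.

```python
class TestFailed(Exception):
    """Raised when a regression test condition does not hold."""

def verify(condition, reason="test failed"):
    """Raise TestFailed unless condition is true.

    Unlike an assert statement, this check is an ordinary function
    call, so it still runs when Python is started with -O.
    """
    if not condition:
        raise TestFailed(reason)

# Usage, in place of "assert 1 + 1 == 2":
verify(1 + 1 == 2, "integer addition is broken")

try:
    verify([] == [0], "lists unexpectedly differ")
except TestFailed as exc:
    outcome = str(exc)
print(outcome)
```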

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fredrik@effbot.org  Wed Jan 17 21:21:56 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:21:56 +0100
Subject: [Python-Dev] Windows Python totally hosed
References: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>
Message-ID: <028801c080cb$86658350$e46940d5@hagrid>

tim wrote:
> Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
> it out, but if anyone knows the cure off the top of their head, don't be
> shy!

text.replace("python20", "python21") for all files in
the PCBuild directory, plus PC/config.h
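
Spelled out as a script, the recipe might look like the sketch below
(current Python 3).  `replace_in_files` and `build_files` are
hypothetical helpers, and the directory names are taken from the paths
quoted in this thread; try it on a scratch copy of the tree first.

```python
from pathlib import Path

def replace_in_files(paths, old, new):
    """Rewrite each file containing `old`; return the files changed."""
    old_bytes = old.encode("ascii")
    new_bytes = new.encode("ascii")
    changed = []
    for path in paths:
        path = Path(path)
        # Work on bytes so that CRLF line endings in the project
        # files are left exactly as they were.
        data = path.read_bytes()
        if old_bytes in data:
            path.write_bytes(data.replace(old_bytes, new_bytes))
            changed.append(path)
    return changed

def build_files(src_root):
    """Regular files in PCbuild, plus PC/config.h, under src_root."""
    root = Path(src_root)
    files = [p for p in sorted((root / "PCbuild").iterdir()) if p.is_file()]
    files.append(root / "PC" / "config.h")
    return files

# Example call, assuming the current directory is dist/src:
# replace_in_files(build_files("."), "python20", "python21")
```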

</F>



From tim.one@home.com  Wed Jan 17 21:42:13 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 17 Jan 2001 16:42:13 -0500
Subject: [Python-Dev] Windows Python totally hosed
In-Reply-To: <028801c080cb$86658350$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEJIJAA.tim.one@home.com>

[/F]
> text.replace("python20", "python21") for all files in
> the PCBuild directory, plus PC/config.h

Brrrr.  It strikes me as insane to have the core Python files in an MS
project file *named* after the release number (python20.dsp).  So I'm going
to change that to core.dsp so that at least that much never needs to be
changed again.

gratefully y'rs  - tim



From fredrik@effbot.org  Wed Jan 17 21:47:28 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:47:28 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com>
Message-ID: <02b401c080cf$1a3a5530$e46940d5@hagrid>

mal wrote:
> I urge you to only check in tests which use the new API
> verify() to verify a certain condition. The API is defined
> in the regression tools module test_support.

did you run the test yourself after applying that patch?

(a patch to the patch is on the way in.  please check
that the test suite still runs on non-Windows boxes...)

</F>



From gstein@lyra.org  Wed Jan 17 21:45:44 2001
From: gstein@lyra.org (Greg Stein)
Date: Wed, 17 Jan 2001 13:45:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Wed, Jan 17, 2001 at 01:27:04PM -0800
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117134544.H7731@lyra.org>

On Wed, Jan 17, 2001 at 01:27:04PM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv14991
> 
> Modified Files:
> 	object.c 
> Log Message:
> Deal properly (?) with comparing recursive datastructures.
>...
> - Change the in-progress code to use static variables instead of
>   globals (both the nesting level and the key for the thread dict were
>   globals but have no reason to be globals; the key can even be a
>   function-static variable in get_inprogress_dict()).

The "compare_nesting" variable is a bit troublesome long-term -- it will
cause threading issues in a free-threaded implementation. The solution is to
put the value into the thread-state.
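
A Python-level analogy of the idea, for illustration only: the real
change would be in C, inside the interpreter's thread state.  Here
threading.local stands in for that per-thread storage, and
`nesting_level` and `nested_compare` are hypothetical names.

```python
import threading

_state = threading.local()      # one independent namespace per thread

def nesting_level():
    """Current comparison depth for the calling thread."""
    return getattr(_state, "depth", 0)

class nested_compare:
    """Context manager that tracks depth for the current thread only."""
    def __enter__(self):
        _state.depth = nesting_level() + 1
        return _state.depth

    def __exit__(self, exc_type, exc, tb):
        _state.depth -= 1
        return False

seen_in_worker = []

with nested_compare():
    with nested_compare() as depth_here:
        # A second thread starts from zero, unaffected by this one.
        worker = threading.Thread(
            target=lambda: seen_in_worker.append(nesting_level()))
        worker.start()
        worker.join()

print(depth_here, seen_in_worker, nesting_level())
```

With a single shared counter, the worker would have seen the main
thread's depth instead of its own.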

[ not sure if it matters right now, but just bringing it up ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From fdrake@acm.org  Wed Jan 17 21:55:02 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 17 Jan 2001 16:55:02 -0500 (EST)
Subject: [Python-Dev] [PEP 205] weak references patch
Message-ID: <14950.5302.356566.778486@cj42289-a.reston1.va.home.com>

  I've updated the patch that implements PEP 205:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  The actual patch is too big for SF:

http://starship.python.net/crew/fdrake/patches/weakref.patch-5

  One thing about this is that it changes some of the low-level object
creation macros, so you'll need to do a "make clean" before "make"
when testing it.
  Have fun!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mal@lemburg.com  Wed Jan 17 22:16:29 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:16:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com> <02b401c080cf$1a3a5530$e46940d5@hagrid>
Message-ID: <3A6619BD.2AC8F6D3@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > I urge you to only check in tests which use the new API
> > verify() to verify a certain condition. The API is defined
> > in the regression tools module test_support.
> 
> did you run the test yourself after applying that patch?

Yes, but as I wrote in the SF patch message: I can only
test it on Linux and there not all tests are run due
to missing extensions. The alpha testing will hopefully catch all
possible bugs this patch introduced.
 
> (a patch to the patch is on the way in.  please check
> that the test suite still runs on non-Windows boxes...)

I'll have to leave that to the Windows wizards, sorry.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Wed Jan 17 22:49:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 23:49:25 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 02:04:04PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
Message-ID: <20010117234925.A17392@xs4all.nl>

On Wed, Jan 17, 2001 at 02:04:04PM -0500, Andrew Kuchling wrote:
> I've checked in the last bit of the PEP 229 changes.  Be sure to
> rename your Modules/Setup file (or do a 'make distclean') before
> rebuilding.

make distclean doesn't remove Modules/Setup anymore :) Also, I couldn't get
it to work with an old tree, even after several make distclean/reconfigures.
I got tired looking for it, so I just grabbed a new tree.

> Squeal if you run into trouble, or file bugs on SF.

I have a couple of questions: what to do when setup.py doesn't work ? Is
there a way to make it bypass a module ? What about specifying include dirs
manually, for some modules (for instance, when you have readline source in a
separate directory, and want to link it statically.)

Here are some specific squeals. See the bottom for the most important
one :)

On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
setup.py. Also, SSL support for the socket module was not enabled, though
OpenSSL is installed, in the default path.

On Debian GNU/Linux' 'woody', the 'testing' (soon 'stable') branch, I can't
compile dbmmodule:

building 'dbm' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.1/dbmmodule.o
/home/thomas/python/python/dist/src/Modules/dbmmodule.c:24: #error "No ndbm.h available!"
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

(ndbm.h does exist, as /usr/include/db1/ndbm.h. There is also
/usr/include/gdbm-ndbm.h, but I'm not sure if that's the same.)

Nor can I build the _tkinter module there:

building '_tkinter' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -DWITH_APPINIT=1 -I/usr/X11R6/include -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/_tkinter.c -o build/temp.linux-i686-2.1/_tkinter.o
/home/thomas/python/python/dist/src/Modules/_tkinter.c:44: tcl.h: No such file or directory
In file included from /home/thomas/python/python/dist/src/Modules/_tkinter.c:45:/usr/include/tk.h:66: tcl.h: No such file or directory
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
which I personally like a lot, though it's probably a bitch to autodetect.
(I tried, using autoconf ;-P)
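
For what it's worth, a versioned subdirectory can be found with a glob.
The sketch below (current Python 3) is not what setup.py does;
`find_header` and its parameters are hypothetical, and the directory
list is only an example.

```python
import glob
import os

def find_header(name, include_dirs, subdir_pattern=None):
    """Return the first directory that contains `name`, or None.

    Each entry of include_dirs is tried directly; if subdir_pattern
    is given, its matching subdirectories are tried as well.
    """
    for base in include_dirs:
        candidates = [base]
        if subdir_pattern:
            candidates += sorted(glob.glob(os.path.join(base, subdir_pattern)))
        for directory in candidates:
            if os.path.isfile(os.path.join(directory, name)):
                return directory
    return None

# Hypothetical call for the Debian layout described above:
tcl_dir = find_header("tcl.h", ["/usr/include", "/usr/local/include"],
                      subdir_pattern="tcl*")
print(tcl_dir)
```

The sorted() call makes the choice deterministic when several versions
are installed, but it picks the lowest-sorting directory name, which is
not necessarily the version you want.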

On Debian GNU/Linux 'sid', the current unstable branch, I can't compile
Python at all, now:

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
mv python ../python
make[1]: Leaving directory `/home/thomas/python/python-write/dist/src/Modules'
./python ./setup.py build
running build
running build_ext
Traceback (most recent call last):
  File "./setup.py", line 460, in ?
    main()
  File "./setup.py", line 455, in main
    ext_modules=[Extension('struct', ['structmodule.c'])]
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/core.py", line 138, in setup
    dist.run_commands()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 871, in run_commands
    self.run_command(cmd)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build.py", line 106, in run
    self.run_command(cmd_name)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/cmd.py", line 328, in run_command
    self.distribution.run_command(command)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build_ext.py", line 202, in run
    customize_compiler(self.compiler)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 121, in customize_compiler
    (cc, opt, ccshared, ldshared, so_ext) = \
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 389, in get_config_vars
    func()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 302, in _init_posix
    raise DistutilsPlatformError, my_msg
distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)
make: *** [sharedmods] Error 1

For the record, I don't have a /usr/lib/python2.1 directory on the other
machines either.

I haven't been able to test FreeBSD yet, will get to that later tonight.

And most importantly(!), on all these machines, 'make test' stops
functioning. In fact, after setup.py started building, you can't run 'make'
without 'make clean' anymore. You get a lot of undefined-symbol warnings
(see below.) If you run 'make clean;make test' it also doesn't work, because
the build directory is not in the Python library path, and regrtest.py
requires (at least) the time module.

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python 
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
../libpython2.1.a(myreadline.o): In function `my_fgets':
/home/thomas/python/python/dist/src/Parser/myreadline.c:41: undefined reference to `PyOS_InterruptOccurred'
/home/thomas/python/python/dist/src/Parser/myreadline.c:35: undefined reference to `PyOS_InterruptOccurred'
../libpython2.1.a(errors.o): In function `PyErr_SetFromErrnoWithFilename':
/home/thomas/python/python/dist/src/Python/errors.c:260: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(pythonrun.o): In function `Py_Finalize':
/home/thomas/python/python/dist/src/Python/pythonrun.c:193: undefined reference to `PyOS_FiniInterrupts'
../libpython2.1.a(pythonrun.o): In function `initsigs':
/home/thomas/python/python/dist/src/Python/pythonrun.c:1161: undefined reference to `PyOS_InitInterrupts'
../libpython2.1.a(traceback.o): In function `tb_printinternal':
/home/thomas/python/python/dist/src/Python/traceback.c:213: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(fileobject.o): In function `get_line':
/home/thomas/python/python/dist/src/Objects/fileobject.c:883: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_format':
/home/thomas/python/python/dist/src/Objects/longobject.c:644: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `x_divrem':
/home/thomas/python/python/dist/src/Objects/longobject.c:855: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_mul':
/home/thomas/python/python/dist/src/Objects/longobject.c:1193: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(object.o):/home/thomas/python/python/dist/src/Objects/object.c:174: more undefined references to `PyErr_CheckSignals' follow
../libpython2.1.a(posixmodule.o): In function `posix_fork':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1666: undefined reference to `PyOS_AfterFork'
../libpython2.1.a(posixmodule.o): In function `posix_forkpty':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1733: undefined reference to `PyOS_AfterFork'
collect2: ld returned 1 exit status
make[1]: *** [link] Error 1
make[1]: Leaving directory `/home/thomas/python/python/dist/src/Modules'
make: *** [python] Error 2

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Wed Jan 17 22:56:58 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:56:58 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <3A66233A.A6AE07BD@lemburg.com>

I'm currently busy building new versions of my mx packages. While
trying to convert all of them to distutils I found that there
seems to be no standard for installing documentation or other
data files of Python extensions. I also noted that for Windows
the standard extension installation defaults to \Python instead
of some \Python\Site-Packages. So the general question is:

Where should Python extensions install themselves and their docs ?

(On Linux the typical place for docs is /usr/doc/packages,
for Python code it is /usr/local/lib/pythonX.X/site-packages,
BTW)

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Wed Jan 17 23:04:09 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 18:04:09 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
In-Reply-To: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 11:22:54PM -0500
References: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>
Message-ID: <20010117180409.A17897@thyrsus.com>

Guido van Rossum <guido@python.org>:
>   This makes it possible to define types with partial orderings.

Guido's time machine is working again, and seems now to have been
augmented by telepathy.  I was just thinking about bugging him about
this...

I will definitely check this out with my set() class -- it was waiting on
rich comparisons so I could do partial-orderings properly.  If it works,
we'll have set algebra for the standard library.  Coolness.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Under democracy one party always devotes its chief energies
to trying to prove that the other party is unfit to rule--and
both commonly succeed, and are right... The United States
has never developed an aristocracy really disinterested or an
intelligentsia really intelligent. Its history is simply a record
of vacillations between two gangs of frauds. 
	--- H. L. Mencken


From akuchlin@mems-exchange.org  Wed Jan 17 23:09:47 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 18:09:47 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010117180947.E9384@kronos.cnri.reston.va.us>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>I have a couple of questions: what to do when setup.py doesn't work ? Is
>there a way to make it bypass a module ? What about specifying include dirs

There's a 'disabled_module_list' global in the code, but no way to set
it from the command-line yet, since I couldn't figure out how to do
that in time.

>On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>setup.py. Also, SSL support for the socket module was not enabled, though
>OpenSSL is installed, in the default path.

Can you take a look at the detection code in setup.py and see what's
going wrong?  I believe it should be found if OpenSSL is in
/usr/local/, but /usr/contrib isn't checked currently.

>The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
>which I personally like a lot, though it's probably a bitch to autodetect.
>(I tried, using autoconf ;-P)

There's code to handle Debian, though I have no way of testing it, and
it worked on Neil's Debian box for some reason.  Search for
debian_tcl_include in setup.py, and see if you can fix it.

>distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

Are you sure setup.py is up to date?  Do a 'cvs update setup.py' to check.
You might get a "setup.py is in the way; remove it" message if you
downloaded the first setup.py script manually.

>without 'make clean' anymore. You get a lot of undefined-symbol warnings
>(see below.) If you run 'make clean;make test' it also doesn't work, because
>the build directory is not in the Python library path, and regrtest.py
>requires (at least) the time module.

Again, be sure the tree is up to date; I think this stems from
attempting to compile the signal module as shared, which doesn't work.
I know that "make test" doesn't work, but am not sure how to fix it
yet.

--amk


From tim.one@home.com  Wed Jan 17 23:42:24 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 17 Jan 2001 18:42:24 -0500
Subject: [Python-Dev] Windows Python totally rad
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>

Windows Python runs normally again, modulo four test failures I figure are
due to the "get rid of assert" patch.

Note that the python20 DevStudio subproject is gone.  It's been replaced by
a new subproject named pythoncore.



From thomas@xs4all.net  Wed Jan 17 23:44:00 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 00:44:00 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010118004400.B17392@xs4all.nl>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:

I got around to testing on FreeBSD now, and it actually went pretty smooth!
However, some small points:

> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> setup.py. Also, SSL support for the socket module was not enabled, though
> OpenSSL is installed, in the default path.

Curiously enough, FreeBSD, with OpenSSL installed in /usr/include/openssl,
*did* get the socketmodule compiled with SSL support, but without the
necessary -I directive, so the compile failed. 

> And most importantly(!), on all these machines, 'make test' stops
> functioning. In fact, after setup.py started building, you can't run 'make'
> without 'make clean' anymore. You get a lot of undefined-symbol warnings

Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
or 'make test' after 'make' just fine. 'make test' still doesn't work
because of the incorrect library path, but it doesn't barf like the other
systems (BSDI and Debian Linux)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From esr@thyrsus.com  Thu Jan 18 00:32:53 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 19:32:53 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 18, 2001 at 02:08:06AM +0200
References: <14949.46995.259157.871323@beluga.mojam.com> <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>
Message-ID: <20010117193253.A18565@thyrsus.com>

Moshe Zadka <moshez@zadka.site.co.il>:
> I think that you're confused between two meanings of inverses.
> 
> You think:
> op is an inverse of op' if for every a,b  (a op b) = not (a op' b)
> 
> Guido meant (and I hope, implemented):
> op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

I thought the same.

<pedantic role="defrocked mathematician">

if (a op1 b) <=> (b op2 a), op2 is properly described as the "reflection"
of op1, and vice-versa.

</pedantic>
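
A minimal sketch of reflection in this sense, assuming a hypothetical
Version class that defines only the '<' operation (the class and its
attributes are illustrative and not taken from the rich comparison
patch).  Evaluating b > a ends up being answered by a < b:

```python
class Version:
    """Hypothetical example class; it defines only the '<' operation."""
    calls = []  # records which special method actually ran

    def __init__(self, n):
        self.n = n

    def __lt__(self, other):
        Version.calls.append("__lt__")
        return self.n < other.n


a, b = Version(1), Version(2)
# Version has no '>' method of its own, so b > a is answered by the
# reflected operation, a.__lt__(b).
result = b > a
print(result, Version.calls)
```

Note that this is a swap of operands, not a logical negation, which is
the distinction Moshe draws above.
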
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"


From greg@cosc.canterbury.ac.nz  Thu Jan 18 00:22:11 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jan 2001 13:22:11 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21@cam.ac.uk>:

> a < b if and only if b > a.
> This is what the rich comparison code does.

Someone is bound to come up with a use for comparison
operator overloading in which this isn't true, just
to be difficult!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@python.org  Thu Jan 18 03:40:31 2001
From: guido@python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 22:40:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: Your message of "Wed, 17 Jan 2001 13:45:44 PST."
 <20010117134544.H7731@lyra.org>
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>
 <20010117134544.H7731@lyra.org>
Message-ID: <200101180340.WAA00655@cj20424-a.reston1.va.home.com>

> > - Change the in-progress code to use static variables instead of
> >   globals (both the nesting level and the key for the thread dict were
> >   globals but have no reason to be globals; the key can even be a
> >   function-static variable in get_inprogress_dict()).
> 
> The "compare_nesting" variable is a bit troublesome long-term -- it will
> cause threading issues in a free-threaded implementation. The solution is to
> put the value into the thread-state.
> 
> [ not sure if it matters right now, but just bringing it up ]

Good point -- especially since the in-progress-dict is already part of
the thread state.  Jeremy explained to me that the compare_nesting
variable is mostly an optimization (avoiding the work with the
in-progress-dict when we don't know for sure that it's worth it) but
yes, mixing nesting levels (even if the dicts are separate) could
cause coupling or interference between threads...
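
A rough Python-level sketch of the idea, assuming the nesting counter is
kept per thread.  The names enter_compare and leave_compare are
hypothetical, and threading.local only stands in for the C thread state;
this is not the object.c code:

```python
import threading

_state = threading.local()  # stand-in for per-thread interpreter state


def enter_compare():
    """Bump this thread's nesting level and return the new value."""
    _state.nesting = getattr(_state, "nesting", 0) + 1
    return _state.nesting


def leave_compare():
    """Undo one enter_compare() call for this thread."""
    _state.nesting -= 1


results = {}


def worker(name, depth):
    level = 0
    for _ in range(depth):
        level = enter_compare()
    results[name] = level  # deepest level this thread observed
    for _ in range(depth):
        leave_compare()


threads = [threading.Thread(target=worker, args=("a", 3)),
           threading.Thread(target=worker, args=("b", 5))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

Each thread sees only its own depth, so one thread's deep comparison
cannot push another thread over a nesting threshold.
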

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Thu Jan 18 04:20:30 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 22:20:30 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
Message-ID: <14950.28430.572215.10643@beluga.mojam.com>

I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
repeated parameters properly.  If I call

    urllib.urlencode({"performers": ("U2","Lawrence Martin")})

instead of getting

    performers=U2&performers=Lawrence+Martin

I get a quoted stringified tuple:

    performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29

Obviously, fixing this will change the function's current semantics, but I
think it's worth treating lists and tuples (actually, any sequence) as
repeated values.  If the existing semantics are deemed valuable enough, a
third default parameter could be added to switch on the new behavior when
desired.

If others agree I'd be happy to whip up a patch.  I think it's a bug.
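
Roughly what I have in mind, as a sketch only.  The function name
urlencode_repeated and the flag name expand_sequences are hypothetical,
and the code assumes a Python 3 layout where quote_plus lives in
urllib.parse:

```python
from urllib.parse import quote_plus


def urlencode_repeated(query, expand_sequences=False):
    """Hypothetical helper: encode a dict, optionally repeating keys."""
    pairs = []
    for key, value in query.items():
        if expand_sequences and isinstance(value, (list, tuple)):
            # One key=value pair per element of the sequence.
            for item in value:
                pairs.append(quote_plus(str(key)) + "=" + quote_plus(str(item)))
        else:
            # Existing behaviour: the value is stringified as a whole.
            pairs.append(quote_plus(str(key)) + "=" + quote_plus(str(value)))
    return "&".join(pairs)


query = {"performers": ("U2", "Lawrence Martin")}
expanded = urlencode_repeated(query, expand_sequences=True)
stringified = urlencode_repeated(query)
print(expanded)
print(stringified)
```

Leaving the flag off by default keeps the current semantics for callers
that rely on them.
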

Skip


From jeremy@alum.mit.edu  Thu Jan 18 02:58:19 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 21:58:19 -0500 (EST)
Subject: [Python-Dev] bug in grammar
Message-ID: <14950.23499.275398.963621@localhost.localdomain>

As part of the implementation of PEP 227 (and in an attempt to reach
some low-hanging fruit Guido mentioned on the types-sig long ago), I
have been working on a compiler pass that generates a module-level
symbol table.  I recently discovered a bug in the handling of list
comprehensions that was giving me headaches.

I realize now that the problem is with the current grammar and/or
compiler.  Here's a simple demonstration; try it in your friendly
python 2.0 interpreter.

>>> [i for i in range(10)] = (1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: unpack list of wrong size

The generated bytecode is:

          0 SET_LINENO               0

          3 SET_LINENO               1
          6 LOAD_CONST               0 (1)
          9 LOAD_CONST               1 (2)
         12 LOAD_CONST               2 (3)
         15 BUILD_TUPLE              3
         18 UNPACK_SEQUENCE          1
         21 STORE_NAME               0 (i)
         24 LOAD_CONST               3 (None)
         27 RETURN_VALUE        

I assume this isn't intended :-).  The compiler is ignoring everything
after the initial atom in the list comprehension.  It's basically
compiling the code as if it were:

[i] = (1, 2, 3)

I'm not sure how to try and fix this.  Should the grammar allow one to
construct the example statement above?  If not, I'm not sure how to
fix the grammar.  If not, I suppose the compiler should detect that
the list comp is misplaced.  This seems fairly messy, since there are
about 10 nodes between the expr_stmt and the list_for.

Or is this a cool way to use list comprehensions to generate
ValueErrors?

Jeremy


From akuchlin@mems-exchange.org  Thu Jan 18 05:19:31 2001
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Thu, 18 Jan 2001 00:19:31 -0500
Subject: [Python-Dev] Embedded language discussion
Message-ID: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com>

http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280

The poster is on a project that's trying to use Python, but they're
encountering unspecified problems (perhaps because of the global
interpreter lock).

--amk


From mal@lemburg.com  Thu Jan 18 09:32:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 10:32:54 +0100
Subject: [Python-Dev] Windows Python totally rad
References: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>
Message-ID: <3A66B846.3D24B959@lemburg.com>

Tim Peters wrote:
> 
> Windows Python runs normally again, modulo four test failures I figure are
> due to the "get rid of assert" patch.

Could you tell me which these are ? The tests tested all passed
just fine, so I guess these must be Windows-related problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fredrik@effbot.org  Thu Jan 18 06:48:41 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 18 Jan 2001 07:48:41 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>
Message-ID: <008701c0811a$b3371c00$e46940d5@hagrid>

I wrote:
> I've almost sorted it all out.  will check it in later tonight (local
> time).

python build problems and real life got in the way.

will 2.1a1 be released according to plan?  will there
be a 2.1a2 release?  maybe I should postpone this?

</F>



From esr@thyrsus.com  Thu Jan 18 07:23:21 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 18 Jan 2001 02:23:21 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <20010118022321.A9021@thyrsus.com>

So I'm writing a module that needs to generate unique cookies.  The
module will run inside one of two environments: (1) a trivial test wrapper,
not threaded, and (2) a long-running multithreaded server.

Because Python garbage-collects, hash() of a just-created object isn't
good enough.  Because we may be threading, millisecond time isn't
good enough.  Because we may *not* be threading, thread ID isn't good
either.  

On the other hand, I'm on Linux getting millisecond time resolution.
And it's not hard to notice that an object hash is a memory address.

So, how about `time.time()` + hex(hash([]))?

It looks to me like this will remain unique forever, because another thread
would have to create an object at the same memory address during the same
millisecond to collide.

Furthermore, it looks to me like this hack might be portable to any OS
with a clock tick shorter than its timeslice.
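
For comparison, a sketch that swaps in a different technique: a
lock-protected counter combined with the time, instead of a hash.  The
name make_cookie and the cookie format are assumptions made for
illustration:

```python
import itertools
import threading
import time

_counter = itertools.count()
_lock = threading.Lock()


def make_cookie():
    """Hypothetical helper: millisecond time plus a per-process serial."""
    with _lock:  # the serial number is handed out under a lock
        serial = next(_counter)
    return "%d-%x" % (int(time.time() * 1000), serial)


cookies = [make_cookie() for _ in range(1000)]
print(cookies[0])
```

Uniqueness within one process then rests on the counter alone and does
not depend on clock resolution or memory addresses.
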

Comments?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Good intentions will always be pleaded for every assumption of
authority. It is hardly too strong to say that the Constitution was
made to guard the people against the dangers of good intentions. There
are men in all ages who mean to govern well, but they mean to
govern. They promise to be good masters, but they mean to be masters.
	-- Daniel Webster


From ping@lfw.org  Thu Jan 18 09:29:13 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 01:29:13 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Guido van Rossum wrote:
> You mean the tp_print and tp_str function slots in type objects,
> right?  tp_print *should* always render exactly the same as tp_str.
> tp_print is used by the print statement, not by value display at the
> interactive prompt.

Uh, i hate to disagree with you about your own interpreter, but:

    com_expr_stmt in Python/compile.c
        inserts a PRINT_EXPR opcode if c_interactive is true;
    eval_code2 in Python/ceval.c
        handles PRINT_EXPR by calling displayhook;
    sys_displayhook in Python/sysmodule.c
        prints the object by calling PyFile_WriteObject on sys.stdout;
    PyFile_WriteObject in Objects/fileobject.c
        calls PyObject_Print if the file is really a PyFileObject;
    PyObject_Print in Objects/object.c
        calls op->ob_type->tp_print if it's not NULL.

The print statement produces a PRINT_ITEM opcode, which invokes
PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
flag is propagated down to PyObject_Print and into string_print,
where it causes the string to fwrite itself directly without quoting.
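
The two paths can be observed from Python itself.  This is only a
sketch, written against a current Python 3 (where print is a function
and sys.__displayhook__ is the default display hook), not the 2.x C code
traced above:

```python
import io
import sys
from contextlib import redirect_stdout

s = "tab\there"
buf = io.StringIO()
with redirect_stdout(buf):
    sys.__displayhook__(s)  # what the interactive prompt does with a value
    print(s)                # the print path: raw, no quoting
shown, printed = buf.getvalue().splitlines()
```

Here shown holds the quoted, escaped form and printed holds the raw
characters.
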

> So, string_print most definitely should *not* be changed -- only
> string_repr!

I had to change them both before i actually saw the change in the
interactive interpreter.  Actually, your statement above (that the
two should always render the same) seems to imply that if i change
one, i must also change the other.


-- ?!ng



From sjoerd@oratrix.nl  Thu Jan 18 10:11:09 2001
From: sjoerd@oratrix.nl (Sjoerd Mullender)
Date: Thu, 18 Jan 2001 11:11:09 +0100
Subject: [Python-Dev] distutils in Python 2.1 not ready for prime time
Message-ID: <20010118101110.6D29C31E1B8@bireme.oratrix.nl>

I just updated my copy of python with the current CVS version and I am
not happy.

The current version uses distutils for configuring and compiling most
modules that are written in C.  That is a nice idea in theory, but in
practice it's not ready for prime time yet.  The major advantage of
using a Setup file is that you can add your own -I and -L compiler
flags on a module-by-module basis.  I *need* those flags since not all
libraries and include files are in standard places (e.g. I need
-I/usr/local/include and -L/usr/local/lib for some modules which my
compiler doesn't provide by itself).  There seems to be no way to tell
distutils to supply those flags.  The documentation (only on the web
site, also not great, but I assume more documentation (at least an
up-to-date README) will be provided in the final release) says that
that has not yet been implemented.

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From ping@lfw.org  Thu Jan 18 10:14:19 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:14:19 -0800 (PST)
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <3A66BCCC.14997FE3@lemburg.com>
Message-ID: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>

I hope you don't mind that i'm taking this over to python-dev,
because it led me to discover a more general issue (see below).

For the others on python-dev, here's the background: MAL was
about to check in the unistr() function, described as follows:

> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.
> 
> The patch also adds a new object level C API PyObject_Unicode()
> which complements PyObject_Str().

I responded:
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

MAL responded:
> unistr() is meant to complement str() very closely. unicode()
> works as constructor for Unicode objects which can also take
> care of decoding encoded data. str() and unistr() don't provide
> this capability but instead always assume the default encoding.
> 
> There's also a subtle difference in that str() and unistr() 
> try the tp_str slot which unicode() doesn't. unicode()
> supports any character buffer which str() and unistr() don't.

Okay, given this explanation, i still feel fairly confident
that unicode() should subsume unistr().  Many of the other
type-named functions try various slots:

    int() looks for __int__
    float() looks for __float__
    long() looks for __long__
    str() looks for __str__

In testing this i also discovered the following:

    >>> class Foo:
    ...     def __int__(self):
    ...         return 3
    ... 
    >>> f = Foo()
    >>> int(f)
    3
    >>> long(f) 
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__long__'
    >>> float(f)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__float__'

This is kind of surprising.  How about:

    int() looks for __int__
    float() looks for __float__, then tries __int__
    long() looks for __long__, then tries __int__
    str() looks for __str__
    unicode() looks for __unicode__, then tries __str__

The extra parameter to unicode() is very similar to the extra
parameter to int(), so i think there is a natural parallel here.
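
A sketch of the proposed lookup order, using a hypothetical helper named
convert_with_fallback.  It only models the fallback chain in Python; it
is not how the built-ins are implemented:

```python
def convert_with_fallback(obj, method_names):
    """Hypothetical helper: call the first conversion method obj's class has."""
    for name in method_names:
        method = getattr(type(obj), name, None)
        if method is not None:
            return method(obj)
    raise TypeError("no conversion method among %r" % (method_names,))


class Foo:
    def __int__(self):
        return 3


f = Foo()
as_int = convert_with_fallback(f, ("__int__",))
# float() proposal: look for __float__ first, then fall back to __int__.
as_float = float(convert_with_fallback(f, ("__float__", "__int__")))
print(as_int, as_float)
```
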

Hmm... what about the other types?

Wow!!  __complex__ can produce a segfault!

    >>> complex
    <built-in function complex>
    >>> class Foo:
    ...   def __complex__(self): return 3
    ... 
    >>> Foo()
    <__main__.Foo instance at 0x81e8684>
    >>> f = _
    >>> complex(f)
    Segmentation fault (core dumped)

This happens because builtin_complex first retrieves and saves
the PyNumberMethods of the argument (in this case, from the
instance), then tries to call __complex__ (in this case, returning 3),
and THEN coerces the result using nbr->nb_float if the result is
not complex!  (This calls the instance's nb_float method on the
integer object 3!!)

I think complex() should probably look for __complex__, then
__float__, then __int__.
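
The safe ordering can be modelled in a few lines of Python.  The name
checked_complex is hypothetical; the point of the sketch is that the
result is converted through its own type, never through methods saved
from the original object:

```python
class Foo:
    def __complex__(self):
        return 3  # deliberately not a complex number


def checked_complex(obj):
    """Hypothetical model: call __complex__, then coerce the *result* safely."""
    result = type(obj).__complex__(obj)
    if isinstance(result, complex):
        return result
    if isinstance(result, (int, float)):
        # Convert via the result's own type, not the original object's slots.
        return complex(float(result), 0.0)
    raise TypeError("__complex__ returned %s" % type(result).__name__)


value = checked_complex(Foo())
print(value)
```
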

One could argue for __list__, __tuple__, or __dict__, but that
seems much weaker; the Pythonic way has always been to implement
__getitem__ instead.  There is no built-in dict(); if it existed
i suppose it would do the opposite of x.items(); again a weak
argument, though i might have found such a function useful once
or twice.

And that about covers the built-in types for data.


-- ?!ng



From ping@lfw.org  Thu Jan 18 10:16:42 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:16:42 -0800 (PST)
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>

On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
>     str() looks for __str__

Oops.  I forgot that

      str() looks for __str__, then tries __repr__

So, presumably,

      unicode() should look for __unicode__, then __str__, then __repr__


-- ?!ng



From mal@lemburg.com  Thu Jan 18 10:51:46 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 11:51:46 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A66CAC2.74FC894@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> >     str() looks for __str__
> 
> Oops.  I forgot that
> 
>       str() looks for __str__, then tries __repr__
> 
> So, presumably,
> 
>       unicode() should look for __unicode__, then __str__, then __repr__

Not quite... str() does this:

1. strings are passed back as-is
2. the type slot tp_str is tried
3. the method __str__ is tried
4. Unicode returns are converted to strings
5. anything other than a string return value is rejected

unistr() does the same, but makes sure that the return
value is a Unicode object.

unicode() does the following:

1. for instances, __str__ is called
2. Unicode objects are returned as-is
3. string objects or character buffers are used as basis for decoding
4. decoding is applied to the character buffer and the results
   are returned

I think we should perhaps merge the two approaches into one
which then applies all of the above in unicode() (and then
forget about unistr()). This might hide some type errors,
but since all other generic constructors behave more or less
in the same way, I think unicode() should too.
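
A sketch of what the merged behaviour could look like.  It is written in
Python 3 terms, which is an assumption on top of this thread: there str
is the Unicode type and bytes plays the role of the 8-bit string.  The
name to_text and its defaults are hypothetical:

```python
def to_text(obj, encoding="ascii", errors="strict"):
    """Hypothetical merged constructor: always returns a Unicode string."""
    if isinstance(obj, str):
        return obj  # Unicode objects are returned as-is
    if isinstance(obj, (bytes, bytearray, memoryview)):
        # Strings and character buffers are decoded.
        return bytes(obj).decode(encoding, errors)
    return str(obj)  # everything else goes through its __str__ conversion


samples = [to_text("abc"), to_text(b"abc"), to_text(3)]
print(samples)
```
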

Thoughts ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From martin@mira.cs.tu-berlin.de  Thu Jan 18 10:48:30 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:48:30 +0100
Subject: [Python-Dev] Having extensions builtin
Message-ID: <200101181048.f0IAmU210251@mira.informatik.hu-berlin.de>

With the new distutils configuration scheme, it appears to be
difficult to build modules in a non-shared way. Building modules
non-shared is desirable when freezing is attempted, and also to reduce
the startup time and memory consumption.

It is still possible to add modules to Setup or Setup.local, so that
they will be built into the interpreter. However, setup.py will still
build them in a shared way afterwards. I propose that setup.py builds
only those modules that are not builtin.
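
A sketch of the filter I have in mind.  sys.builtin_module_names is the
real list of modules compiled into the running interpreter; the function
name modules_to_build and the module name used below are hypothetical:

```python
import sys


def modules_to_build(candidates):
    """Return the candidate module names that are not already built in."""
    return [name for name in candidates
            if name not in sys.builtin_module_names]


# 'sys' is always built in; 'hypothetical_ext' is a made-up extension name.
remaining = modules_to_build(["sys", "hypothetical_ext"])
print(remaining)
```
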

Regards,
Martin



From martin@mira.cs.tu-berlin.de  Thu Jan 18 12:20:06 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:20:06 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>

> Where should Python extensions install themselves and their docs?

I feel that extensions should not need to care. For extensions,
distutils will pick a location, and the system administrator
configuring the package can choose a different location.

Unfortunately, distutils does not support the installation of
documentation, which I think it should.

Now switching sides, as an administrator, I'd wish distutils to follow
the system conventions by default. 

That means on Linux, documentation should go into the system's <doc>
directory, which is /usr/share/doc according to latest
standards. Distributions vary, so distutils should find out - e.g. by
querying the location from rpm. In addition, when building RPMs,
distutils should declare these files as %doc in the spec file, so RPM
will install it following the system conventions.

On Windows, the convention apparently is to put the documentation
"nearby" the software, so it should probably go into Doc or a
subdirectory thereof.

On Unix, there appears to be no standard location, unless the
documentation consists of man pages or perhaps info files. So
<prefix>/share/doc is probably a place as good as any other.

Regards,
Martin


From martin@mira.cs.tu-berlin.de  Thu Jan 18 10:39:30 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:39:30 +0100
Subject: [Python-Dev] SSL detection problem
Message-ID: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>

The distutils-based configuration fails to build on my system (SuSE
7.0) with the error

/usr/src/python/Modules/socketmodule.c:159: rsa.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:160: crypto.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:161: x509.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:162: pem.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:163: ssl.h: No such file or directory

The problem is that these header files are in /usr/include/openssl,
which is not in the standard include search path.

So the obvious request is: could this be fixed? I guess when setup.py
finds the openssl library, it should also try to find ssl.h, in some
obvious locations.
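
Something along these lines would do, as a sketch.  The helper name
find_header_dir is hypothetical, and the candidate list is only an
example and not what setup.py currently checks:

```python
import os


def find_header_dir(header, candidate_dirs):
    """Return the first directory that contains header, or None."""
    for directory in candidate_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


# Example candidates only; a real list would be platform dependent.
ssl_include = find_header_dir("ssl.h", ["/usr/include",
                                        "/usr/include/openssl",
                                        "/usr/local/ssl/include/openssl"])
print(ssl_include)
```

Whatever directory comes back could then be passed on as an include
directory for the extension; None would mean SSL support is skipped.
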

The not-so-obvious question: How can one work-around such a problem
with the new setup scheme? In the old scheme, I could have chosen to
either provide the right -I option in Modules/Setup, to disable SSL
support, or to disable the _socket module altogether. How can I
achieve either configuration with the new scheme?

Regards,
Martin

P.S. As a quick hack, I added a custom include_dirs parameter to the
SSL extension.


From martin@mira.cs.tu-berlin.de  Thu Jan 18 12:39:54 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:39:54 +0100
Subject: [Python-Dev] bug in grammar
Message-ID: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>

> Should the grammar allow one to construct the example statement
> above?

It should not. Please note that the grammar allows a number of other
things, e.g.

  a+b = c

(pass this to parser.suite to see details)

> If not, I'm not sure how to fix the grammar.

The central problem is that it allows testlist on the LHS of an
augassign or '=', whereas the language only allows a small subset in
that position. It is not possible to restrict the grammar in itself,
as that will necessarily produce a conflict - you only know that the
'+' was incorrect when you see the '='.

> I suppose the compiler should detect that the list comp is misplaced

I think there should be a well-formedness pass in-between. I.e. after
the AST has been built, a single pass should descend through the tree,
looking for an expr_statement with more than a single testlist. Once
it finds one, it should confirm that this really is a well-formed
lvalue (in C speak). In this case, the test should be that each term
is an atom without factors.

If the parser itself performs such checks, the compiler could be
simplified in many places, I guess.
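
A sketch of such a check.  It assumes a tree of the kind produced by the
ast module of much later Python versions, not the parser module's
concrete tree, and the function names are hypothetical:

```python
import ast


def is_assignable(node):
    """True if node is a name, attribute, subscript, or a list/tuple of such."""
    if isinstance(node, (ast.Name, ast.Attribute, ast.Subscript)):
        return True
    if isinstance(node, (ast.Tuple, ast.List)):
        return all(is_assignable(element) for element in node.elts)
    return False


def lhs_is_well_formed(source):
    """Parse source as an expression and test it as an assignment target."""
    return is_assignable(ast.parse(source, mode="eval").body)


for text in ("[i]", "a.b[0], c", "a+b", "[i for i in range(10)]"):
    print(text, lhs_is_well_formed(text))
```

The last two candidates are rejected because their top node is a binary
operation and a list comprehension respectively.
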

Regards,
Martin


From thomas@xs4all.net  Thu Jan 18 09:53:14 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 10:53:14 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117180947.E9384@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 06:09:47PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010117180947.E9384@kronos.cnri.reston.va.us>
Message-ID: <20010118105314.D17392@xs4all.nl>

On Wed, Jan 17, 2001 at 06:09:47PM -0500, Andrew Kuchling wrote:

> >On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >setup.py. Also, SSL support for the socket module was not enabled, though
> >OpenSSL is installed, in the default path.
> 
> Can you take a look at the detection code in setup.py and see what's
> going wrong.  I believe it should be found if OpenSSL is in
> /usr/local/, but /usr/contrib isn't checked currently.

Well, OpenSSL rests in the default location, which is
/usr/local/ssl/include/openssl. Haven't the time to look into it right now,
sorry.

> >The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
> >which I personally like a lot, though it's probably a bitch to autodetect.
> >(I tried, using autoconf ;-P)

> There's code to handle Debian, though I have no way of testing it, and
> it worked on Neil's Debian box for some reason.  Search for
> debian_tcl_include in setup.py, and see if you can fix it.

Ah, yes. The problem in my case is that the *library* files are just in
/usr/lib, but the include files are not. I re-indented the code to pull the
debian-specific code out of the 'if prefix + os.sep + 'lib' not in
lib_dirs' block, and it works now. Haven't tested it on other code yet, but
I think it should work regardless.

> >distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

> Are you sure setup.py is up to date; do a 'cvs update setup.py' to check.  
> You might get a "setup.py is in the way; remove it" message if you
> downloaded the first setup.py script manually.

D'oh, I guess not. I thought I did (I did on all other platforms :) but I
guess I didn't, 'cause it works now. Thanx.

> >without 'make clean' anymore. You get a lot of undefined-symbol warnings
> >(see below.) If you run 'make clean;make test' it also doesn't work, because
> >the build directory is not in the Python library path, and regrtest.py
> >requires (at least) the time module.

> Again, be sure the tree is up to date; I think this stems from
> attempting to compile the signal module as shared, which doesn't work.

This happened even with completely fresh, newly checked out trees, on all
but FreeBSD (three different trees: Debian woody, BSDI 4.0 and BSDI 4.1) so
I'm pretty sure that's not it.

It works now, though, so I guess the move from a dynamic signalmodule to a
static one does the trick ;) I got 'make test' working by applying the
following patch to Makefile{,.in}, and running 'make PYTHONPATH=.:<builddir>
test' (determining builddir by hand, for now.):

***************
*** 216,223 ****
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall
--- 216,223 ----
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall

And because of that, I also noticed something funny: BSDI calls itself
'BSD/OS <version>', so distutils actually makes a directory called 'lib.bsd'
and 'temp.bsd', with inside those a directory 'os-<version>-i386-2.1'. Is
that a distutils bug, a setup.py bug, or intentional behaviour of one of the
two ?
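
The effect is easy to reproduce in isolation.  In this sketch the
function build_dir and the tag value (including its version number) are
made up for illustration, and stripping the slash is just one possible
remedy, not necessarily what distutils should do:

```python
import posixpath


def build_dir(prefix, platform_tag):
    """Join a build directory name the way a naive setup script might."""
    return posixpath.join("build", prefix + "." + platform_tag)


raw_tag = "bsd/os-4.1-i386-2.1"  # illustrative tag; the version is assumed
nested = build_dir("lib", raw_tag)
flat = build_dir("lib", raw_tag.replace("/", ""))
print(nested.split("/"))
print(flat.split("/"))
```

The slash in the OS name is treated as a path separator, so the raw tag
yields one directory level more than intended.
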


-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From nas@arctrix.com  Thu Jan 18 07:59:22 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 17 Jan 2001 23:59:22 -0800
Subject: [Python-Dev] new Makefile.in
Message-ID: <20010117235922.A12356@glacier.fnational.com>

Spurred on by comments made by Andrew, I spent some time last
night overhauling the Python Makefiles.  I now have a toplevel
non-recursive Makefile.in that seems to work fairly well.  I'm
pretty sure it still should be portable.  It doesn't use includes
or any special GNU make features.  It is half the size of the old
Makefiles.  The build is faster and it's now easier to follow if
something goes wrong.

A question: is it possible to break the Python static library up?
For example, instead of having libpython<version>.a have
Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
would still only be one shared library.  This would speed up
incremental builds and also help Andrew with PEP 229.  I'm
thinking that the Makefile could do something like this:

    all: python$(EXE)

    PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a

    python$(EXE): $(PYLIBS)
        $(LINKCC) -o python$(EXE) $(PYLIBS) ...

    Modules/modules.a: minpython$(EXE)
        ./minpython$(EXE) setup.py


AFAICT, the only thing affected by splitting up the static library
is Misc/Makefile.pre.in.  Is this correct?

  Neil


From guido@digicool.com  Thu Jan 18 14:52:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 09:52:23 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:22:11 +1300."
 <200101180022.NAA00898@s454.cosc.canterbury.ac.nz>
References: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz>
Message-ID: <200101181452.JAA06899@cj20424-a.reston1.va.home.com>

> > a < b if and only if b > a.
> > This is what the rich comparison code does.
> 
> Someone is bound to come up with a use for comparison
> operator overloading in which this isn't true, just
> to be difficult!

They'll get what they deserve -- this will be clearly documented!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Thu Jan 18 15:15:25 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 10:15:25 -0500 (EST)
Subject: [Python-Dev] Re: bug in grammar
In-Reply-To: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
Message-ID: <14951.2189.14393.52725@localhost.localdomain>

If I summarize your suggestion, I think you've said that ideally the
grammar should not allow assignment to list comprehensions (or a
variety of other constructs) -- but it doesn't so the compiler has to
deal with it.

This morning it seemed a lot easier to fix the bug than it did last
night :-).  com_assign() already has a number of checks for syntax
errors in assignments.  A test for list comprehensions belongs at the
same place as tests for assignment to [] and augmented assignments
applied to lists.

I'll include a fix for assignment to list comprehensions in my big
compiler patch.

Jeremy



From akuchlin@mems-exchange.org  Thu Jan 18 15:28:19 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:28:19 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>; from esr@thyrsus.com on Thu, Jan 18, 2001 at 02:23:21AM -0500
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <20010118102819.A21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 02:23:21AM -0500, Eric S. Raymond wrote:
>And it's not hard to notice that an object hash is a memory address.

Unless the object defines __hash__()!  If you want the memory address, 
use id() instead.
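
A short sketch of the difference, assuming Python 3 syntax.  The class
Pinned is hypothetical and exists only to show a hash that has nothing
to do with where the object lives:

```python
class Pinned:
    """Hypothetical class whose hash is unrelated to its address."""

    def __hash__(self):
        return 42


p, q = Pinned(), Pinned()

same_hash = hash(p) == hash(q)   # both objects hash to 42
distinct_ids = id(p) != id(q)    # two live objects never share an id()
```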

--amk


From akuchlin@mems-exchange.org  Thu Jan 18 15:30:36 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:30:36 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118004400.B17392@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 18, 2001 at 12:44:00AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl>
Message-ID: <20010118103036.B21503@kronos.cnri.reston.va.us>

>On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>> setup.py. Also, SSL support for the socket module was not enabled, though
>> OpenSSL is installed, in the default path.

What does the layout of /usr/contrib look like?  Is it
/usr/contrib/openssl/include/, /usr/contrib/include/, or something
else?

>Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
>or 'make test' after 'make' just fine. 'make test' still doesn't work
>because of the incorrect library path, but it doesn't barf like the other
>systems (BSDI and Debian Linux)

Have you already run "make install"?  Perhaps it's picking up the
already-installed modules when running "make test", because it really
shouldn't be working.

--amk



From gward@cnri.reston.va.us  Thu Jan 18 15:42:51 2001
From: gward@cnri.reston.va.us (Greg Ward)
Date: Thu, 18 Jan 2001 10:42:51 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:35:51AM +0100
References: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <20010118104250.A27049@thrak.cnri.reston.va.us>

On 15 January 2001, M.-A. Lemburg said:
> He seems to be offline and the people on the distutils list have some
> patches and other things which would be nice to have in distutils 
> for 2.1.

Tim was right -- I'm *really* close to being back online.  Just have to
figure out why qmail's not answering port 25 and why LILO doesn't like my
newly repartitioned hard drive, and all will be well.  Oh yeah, and getting
insurance, and a credit card, and unpacking all these cardboard boxes, and
getting some furniture, ...

(If anyone is considering it, I do *not* recommend buying a new computer,
moving internationally, and getting a high speed home Internet connection
all at the same time.)

BTW I quite approve of Andrew being temporary Distutils dictator.  Should
have done it in December, but I didn't think I'd be out of commission for so
long.  Sigh.

        Greg


From moshez@zadka.site.co.il  Fri Jan 19 00:19:45 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 02:19:45 +0200 (IST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
Message-ID: <20010119001945.80DC8A83E@darjeeling.zadka.site.co.il>

On Thu, 18 Jan 2001 01:29:13 -0800 (PST), Ka-Ping Yee <ping@lfw.org> wrote:
> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.
> 
> 
> -- ?!ng
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From guido@digicool.com  Thu Jan 18 16:23:19 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:23:19 -0500
Subject: [Python-Dev] unistr() vs. unicode()
Message-ID: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>

Ping wrote in response to a SourceForge mail about MAL's unistr()
checking:

------- Forwarded Message

Date:    Wed, 17 Jan 2001 23:51:48 -0800
From:    Ka-Ping Yee <ping@lfw.org>
To:      noreply@sourceforge.net
cc:      mal@lemburg.com, guido@python.org, patches@python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin +
	  PyObject_Unicode() C API

On Wed, 17 Jan 2001 noreply@sourceforge.net wrote:
> Comment:
> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.

Sorry for barging in, but i have an issue/question:

Why are unistr() and unicode() two separate functions?

str() performs one task: convert to string.  It can convert anything,
including strings or Unicode strings, numbers, instances, etc.

The other type-named functions e.g. int(), long(), float(), list(),
tuple() are similar in intent.

Why have unicode() just for converting strings to Unicode strings,
and unistr() for converting everything else to a Unicode string?
What does unistr(x) do differently from unicode(x) if x is a string?


- -- ?!ng

------- End of Forwarded Message

(And no, Tim, this did *not* end up in the patches list because I made
Barry remove the reply-to.  SourceForge mails never had reply-to to
begin with.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jan 18 16:28:12 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:28:12 -0500
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: Your message of "Wed, 17 Jan 2001 22:20:30 CST."
 <14950.28430.572215.10643@beluga.mojam.com>
References: <14950.28430.572215.10643@beluga.mojam.com>
Message-ID: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>

> I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
> repeated parameters properly.  If I call
> 
>     urllib.urlencode({"performers": ("U2","Lawrence Martin")})
> 
> instead of getting
> 
>     performers=U2&performers=Lawrence+Martin
> 
> I get a quoted stringified tuple:
> 
>     performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29
> 
> Obviously, fixing this will change the function's current semantics, but I
> think it's worth treating lists and tuples (actually, any sequence) as
> repeated values.  If the existing semantics are deemed valuable enough, a
> third default parameter could be added to switch on the new behavior when
> desired.
> 
> If others agree I'd be happy to whip up a patch.  I think it's a bug.

Agreed.  If you can come up with something that supports all sequence
types, and treats singleton sequences the same as their one and only
item, it would even be the inverse of cgi.parse_qs()!
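
For illustration only: in present-day Python 3 the two functions live
in urllib.parse, and urlencode() takes a doseq flag that gives the
repeated-value behavior described above.  This is a sketch of the
round trip under that assumption, not the patch being discussed:

```python
from urllib.parse import parse_qs, urlencode

params = {"performers": ("U2", "Lawrence Martin")}

# doseq=True emits one key=value pair per item of a sequence value.
encoded = urlencode(params, doseq=True)

# parse_qs maps each key back to a list of its values.
decoded = parse_qs(encoded)
```

Note that parse_qs() always returns lists, so a true inverse has to
treat a one-item sequence the same as its single item.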

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Thu Jan 18 16:43:49 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 17:43:49 +0100
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Thu, Jan 18, 2001 at 02:14:19AM -0800
References: <3A66BCCC.14997FE3@lemburg.com> <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010118174349.E17392@xs4all.nl>

On Thu, Jan 18, 2001 at 02:14:19AM -0800, Ka-Ping Yee wrote:

> Wow!!  __complex__ can produce a segfault!

>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)

> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

I've noticed that lurking bug in the coercion code when I added augmented
assignment, though I don't recall whether I fixed it then, nor do I know if
that part's been "touched" by the recent coercion changes. If none of the
coercion champions speak up, I'll look at this sometime this weekend.
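
As a point of comparison, and assuming a present-day Python 3
interpreter, the same experiment is reported as an ordinary exception
instead of a crash.  The class mirrors the one in the quoted session:

```python
class Foo:
    """Mirrors the class in the quoted session."""

    def __complex__(self):
        return 3   # deliberately not a complex


try:
    complex(Foo())
except TypeError:
    outcome = "TypeError"
else:
    outcome = "no error"
```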

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From akuchlin@mems-exchange.org  Thu Jan 18 16:50:28 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 11:50:28 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 11:39:30AM +0100
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>
Message-ID: <20010118115028.D21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
>The problem is that these header files are in /usr/include/openssl,
>which is not in the standard include search path.

I have an improved version of setup.py (not checked in yet) that tries
to do better, checking for both header and library files.  One point:
the OpenSSL docs imply that the headers should be loaded as
<openssl/rsa.h>, not as <rsa.h>; the header files themselves use the
openssl/*.h form, which means you'd need two -I directives.  I'll
patch the socket module accordingly.
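
A rough sketch of the header half of such a check.  The helper name
find_include_root and the candidate directory list are hypothetical
and are not taken from setup.py.  Searching for the relative path
"openssl/rsa.h" yields the one directory to hand to -I when the source
includes <openssl/rsa.h>:

```python
import os


def find_include_root(header, candidate_dirs):
    """Return the first directory under which `header` exists, else None.

    Hypothetical helper.  `header` is a relative path such as
    "openssl/rsa.h", so the returned directory is the one to pass to -I.
    """
    for d in candidate_dirs:
        if os.path.isfile(os.path.join(d, *header.split("/"))):
            return d
    return None


# The directories below are assumptions; real systems vary.
ssl_root = find_include_root("openssl/rsa.h",
                             ["/usr/local/include", "/usr/include"])
```

A matching search over library directories would be needed before
enabling the module.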

>The not-so-obvious question: How can one work-around such a problem
>with the new setup scheme? In the old scheme, I could have chosen to
>either provide the right -I option in Modules/Setup, to disable SSL
>support, or to disable the _socket module altogether. How can I
>achieve either configuration with the new scheme?

I still need to implement command-line options to specify such
overrides, but that couldn't possibly get done in time for alpha1.  I
was thinking of something like --<modulename>-libs="foo bar",
--<modulename>-includes="/usr/include/blah/", and so forth.
Suggestions for a better interface welcomed...

--amk


From guido@digicool.com  Thu Jan 18 16:55:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:55:39 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Wed, 17 Jan 2001 21:58:19 EST."
 <14950.23499.275398.963621@localhost.localdomain>
References: <14950.23499.275398.963621@localhost.localdomain>
Message-ID: <200101181655.LAA08001@cj20424-a.reston1.va.home.com>

> As part of the implementation of PEP 227 (and in an attempt to reach
> some low-hanging fruit Guido mentioned on the types-sig long ago), I
> have been working on a compiler pass that generates a module-level
> symbol table.  I recently discovered a bug in the handling of list
> comprehensions that was giving me headaches.
> 
> I realize now that the problem is with the current grammar and/or
> compiler.  Here's a simple demonstration; try it in your friendly
> python 2.0 interpreter.
> 
> >>> [i for i in range(10)] = (1, 2, 3)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: unpack list of wrong size
> 
> The generated bytecode is:
> 
>           0 SET_LINENO               0
> 
>           3 SET_LINENO               1
>           6 LOAD_CONST               0 (1)
>           9 LOAD_CONST               1 (2)
>          12 LOAD_CONST               2 (3)
>          15 BUILD_TUPLE              3
>          18 UNPACK_SEQUENCE          1
>          21 STORE_NAME               0 (i)
>          24 LOAD_CONST               3 (None)
>          27 RETURN_VALUE        
> 
> I assume this isn't intended :-).  The compiler is ignoring everything
> after the initial atom in the list comprehension.  It's basically
> compiling the code as if it were:
> 
> [i] = (1, 2, 3)
> 
> I'm not sure how to try and fix this.  Should the grammar allow one to
> construct the example statement above?  If not, I'm not sure how to
> fix the grammar.  If not, I suppose the compiler should detect that
> the list comp is misplaced.  This seems fairly messy, since there are
> about 10 nodes between the expr_stmt and the list_for.
> 
> Or is this a cool way to use list comprehensions to generate
> ValueErrors?

Good catch!  Not everything cool deserves to be preserved.

It looks like this happens because the code that traverses lists on
the left-hand side of an assignment was never told about list
comprehensions.  You're right that the grammar can't be fixed; it's
for the same reason that it can't be fixed to disallow "f() = 1".

The solution is to add a test for this to the compiler that flags this
as an error.
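
A small sketch of what that looks like from the outside, assuming a
present-day Python 3 interpreter, where both forms are rejected at
compile time.  The helper name is_rejected is made up for this
example:

```python
def is_rejected(source):
    """Return True if compiling `source` raises SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return True
    return False


listcomp_rejected = is_rejected("[i for i in range(10)] = (1, 2, 3)")
call_rejected = is_rejected("f() = 1")
plain_list_ok = not is_rejected("[i] = (1,)")
```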

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jan 18 17:01:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:01:02 -0500
Subject: [Python-Dev] Embedded language discussion
In-Reply-To: Your message of "Thu, 18 Jan 2001 00:19:31 EST."
 <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com>
References: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com>
Message-ID: <200101181701.MAA08046@cj20424-a.reston1.va.home.com>

> http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280
> 
> The poster is on a project that's trying to use Python, but they're
> encountering unspecified problems (perhaps because of the global
> interpreter lock).

I've sent the poster an email asking to be more specific about his
questions; probably doing the right dance when calling Python from a
thread created in C++ should do the trick.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Thu Jan 18 17:04:43 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:04:43 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Thu, 18 Jan 2001 01:29:13 PST."
 <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
Message-ID: <200101181704.MAA08074@cj20424-a.reston1.va.home.com>

> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.

Oops.  I'm so grateful that we have a collective memory! :-)

You're right: tp_print() can be invoked in two modes: with or without
Py_PRINT_RAW flag.  In raw mode, it should behave exactly like str();
in cooked mode exactly like repr().
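
Seen from the Python level, and assuming Python 3 syntax, the two
modes correspond to this difference:

```python
s = "a\nb"

raw = str(s)      # raw mode: the text itself, with a real newline
cooked = repr(s)  # cooked mode: quoted, with the newline escaped
```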

--Guido van Rossum (home page: http://www.python.org/~guido/)


From martin@mira.cs.tu-berlin.de  Thu Jan 18 19:31:29 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:31:29 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <200101181931.f0IJVTc00932@mira.informatik.hu-berlin.de>

> Comments?

Yes, three of them:

1. To guarantee uniqueness at least within the process, the easiest
   solution would be

   import time
   if using_threads:
     import thread
     lock=thread.allocate_lock()
     _acquire = lock.acquire_lock
     _release = lock.release_lock
   else:
     _acquire = _release = lambda:None
     
   _cookie = time.time()
   def getCookie():
     global _cookie
     _acquire()
     _cookie+=1
     result = _cookie
     _release()
     return result

2. Invoking [] repeatedly likely returns an object with the same
   id() when called twice in a row (i.e. with no intermediate objects
   allocated in-between).

3. Why did you send this question to python-dev? python-list is more
   appropriate.

Regards,
Martin



From tim.one@home.com  Thu Jan 18 19:49:12 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 14:49:12 -0500
Subject: [Python-Dev] Windows Python totally rad
In-Reply-To: <3A66B846.3D24B959@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGKIJAA.tim.one@home.com>

[MAL]
> Could you tell me which these are [new test failures on Windows]?
> The tests tested all passed just fine, so I guess these must be
> Windows-related problems.

Not to worry, all the tests pass now.  Don't want to spend time
backtracking, as I'm not the one who fixed them and don't know who did.
FWIW, they "smelled like" shallow failures (== easy to diagnose & fix).

onward!-ly y'rs  - tim



From martin@mira.cs.tu-berlin.de  Thu Jan 18 19:37:04 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:37:04 +0100
Subject: [Python-Dev] new Makefile.in
Message-ID: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?

Please, no. It was that way in Python 1.4 (libModules, libObjects, and
I forgot which the others were :-). We had that all documented in our
book, then Guido tried to build an extension module for the first
time, saw that these many libraries were terrible, and combined them
into a single one. That was a good thing, and we have it documented in
our book. I'm not at all looking forward to answering all the
questions why the build infrastructure of Python changed yet again...

Regards,
Martin



From fdrake@acm.org  Thu Jan 18 20:22:30 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Thu, 18 Jan 2001 15:22:30 -0500 (EST)
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <14951.20614.176140.672447@cj42289-a.reston1.va.home.com>

  I'd like to put the weak references patch into the alpha, but
haven't received any feedback on the latest patch.  I have some
comments from Martin von Löwis on the PEP that need to be addressed,
and that could change the implementation a bit, but the basic
machinery seems to be pretty reasonable and works for me.
  Does anyone have any objections to it going into the alpha?  I'd
like to enable more wide-spread testing.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mal@lemburg.com  Thu Jan 18 17:10:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:10:14 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A672376.4B951848@lemburg.com>

"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

A combination of time.time(), process id and counter should
work in all cases. Make sure you use a lock around the counter,
though.
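
One way this could look, as a sketch in Python 3 syntax.  The name
make_cookie and the cookie layout are assumptions made for the
example:

```python
import itertools
import os
import threading
import time

_counter = itertools.count(1)
_lock = threading.Lock()


def make_cookie():
    """Return a cookie string that is unique within this process.

    The time and process id parts are there to keep concurrent
    processes apart; that holds only as long as a (time, pid) pair
    is never reused.
    """
    with _lock:                      # serialize access to the counter
        n = next(_counter)
    return "%d-%d-%d" % (int(time.time()), os.getpid(), n)
```

The counter alone guarantees uniqueness inside one process; the other
two fields only matter across processes and restarts.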

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Thu Jan 18 17:30:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:30:52 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <3A67284C.B6C617A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Where should Python extensions install themselves and their docs?
> 
> I feel that extensions should not need to care. For extensions,
> distutils will pick a location, and the system administrator
> configuring the package can choose a different location.
> 
> Unfortunately, distutils does not support the installation of
> documentation, which I think it should.

Right.
 
> Now switching sides, as an administrator, I'd wish distutils to follow
> the system conventions by default.
> 
> That means on Linux, documentation should go into the system's <doc>
> directory, which is /usr/share/doc according to latest
> standards. Distributions vary, so distutils should find out - e.g. by
> querying the location from rpm. In addition, when building RPMs,
> distutils should declare these files as %doc in the spec file, so RPM
> will install it following the system conventions.

You currently have to do this by hand (e.g. in setup.cfg or
using the doc_files option). It should be fairly easy to add
a command similar to install_data though which then applies
all the necessary magic to the paths.

Is there a common landmark to look for on Unix (e.g. in case the
system does not use RPM) ?

Which paths should distutils check ?

(/usr/share/doc/packages, /usr/share/doc, /usr/doc/packages,
/usr/doc in that order ?)
 
> On Windows, the convention apparently is to put the documentation
> "nearby" the software, so it should probably go into Doc or a
> subdirectory thereof.

Na, I'd rather have \Python\Site-Packages and \Python\Site-Docs
for that purpose.
 
> On Unix, there appears to be no standard location, unless the
> documentation consists of man pages or perhaps info files. So
> <prefix>/share/doc is probably a place as good as any other.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From skip@mojam.com (Skip Montanaro)  Thu Jan 18 17:45:29 2001
From: skip@mojam.com (Skip Montanaro)
Date: Thu, 18 Jan 2001 11:45:29 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>
References: <14950.28430.572215.10643@beluga.mojam.com>
 <200101181628.LAA07406@cj20424-a.reston1.va.home.com>
Message-ID: <14951.11193.150232.564700@beluga.mojam.com>

    >> If others agree I'd be happy to whip up a patch.  I think it's a bug.

    Guido> Agreed.

Patch #103314:

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103314&group_id=5470

I assigned it to Fred for doc review.

Skip




From akuchlin@mems-exchange.org  Thu Jan 18 18:56:40 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 13:56:40 -0500
Subject: [Python-Dev] Standard install locations for Python ?
In-Reply-To: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 01:20:06PM +0100
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <20010118135640.G21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
>On Unix, there appears to be no standard location, unless the
>documentation consists of man pages or perhaps info files. So
><prefix>/share/doc is probably a place as good as any other.

This seems like a good suggestion.  Should docs go in
<prefix>/share/doc/python<version>/, then?  Perhaps with
subdirectories for different extensions?

--amk




From tismer@tismer.com  Thu Jan 18 21:39:18 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:39:18 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
Message-ID: <3A676286.C33823B4@tismer.com>


Guido van Rossum wrote:
> 
> > I'm a bit confused about Guido's rich comparison stuff.  In the description
> > he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> 
> Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
> A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

...

> I think what threw you off was the ambiguity of "inverse".  This means
> Boolean negation.  I'm not relying on Boolean negation here -- I'm
> relying on the more fundamental property that a<b and b>a have the
> same outcome.

Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
Like the other right-side operators __radd__, is it correct to
think of

   __ge__  == __rle__

if __rle__ was written in the same fashion as __radd__ ?
It looks semantically the same, although the reason for a
call might be different.

And if my above view is right, would it perhaps be less
confusing to use in fact __rle__ and __rlt__,
or would it be more confusing, since __rlt__ would also be
invoked left-to-right, implementing ">".

Not sure if I added even more confusion.

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From tim.one@home.com  Thu Jan 18 21:53:44 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 16:53:44 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGPIJAA.tim.one@home.com>

[Eric S. Raymond, in search of uniqueness]
> ...
> So, how about `time.time()` + hex(hash([]))?
>
> It looks to me like this will remain unique forever, because
> another thread would have to create an object at the same memory
> address during the same millisecond to collide.

I'm afraid it's much more vulnerable than that:  Python's thread granularity
is at the bytecode level, not the statement level.  It's very easy for
thread A and B to see the same `time.time()` value, and after that
arbitrarily long amounts of time may pass before they get around to doing
the hash([]) business.  When hash() completes, the storage for [] is
immediately reclaimed under CPython, and it's again very easy for another
thread to reuse the storage.

I'm attaching an executable test case.  It uses time.clock() because that
has much higher resolution than time.time() on Windows (better than
microsecond), but rounds it back to three decimal places to simulate
millisecond resolution.  The first three runs:

    saw 14600 unique in 30000 total
    saw 14597 unique in 30000 total
    saw 14645 unique in 30000 total

So it sucks bigtime on my box.

Better idea:  borrow the _ThreadSafeCounter class from the tail end of the
current CVS tempfile.py.  The code works whether or not threads are
available.  Then

    `time.time()` + str(_counter.get_next())

is thread-safe.  For that matter, plain old

    str(_counter.get_next())

will always be unique within a single run.  However, in either case you're
still not safe against concurrent *processes* generating the same cookies.

tempfile.py has to worry about that too, of course, so the *best* idea is to
call tempfile.mktemp() and leave it at that.  It wastes some time checking
the filesystem for a file of the same name (which, btw, goes much quicker on
Linux than on Windows).

From time to time, somebody suggests adding a uuid generator to Python.  Not
a bad idea, but nobody wants to do all the x-platform work.
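
For readers on a later Python: a uuid module did eventually join the
standard library (from Python 2.5 on).  It was not available when this
was written, so the following is only a sketch of how the same cookie
problem could be handled there:

```python
import uuid

cookie = uuid.uuid4().hex   # 32 hex digits from a random 128-bit UUID
```

Random UUIDs make collisions extremely unlikely, not impossible, and
they need no coordination between threads or processes.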

like-capturing-snowflakes-ly y'rs  - tim

from threading import Thread
import time

N = 1000
NTHREADS = 30

class Worker(Thread):
    def __init__(self):
        Thread.__init__(self)

    def run(self):
        self.generated = [`round(time.clock(), 3)` + hex(hash([]))
                          for i in range(N)]

threads = []
for i in range(NTHREADS):
    threads.append(Worker())

for t in threads:
    t.start()

d = {}
total = 0
for t in threads:
    t.join()
    total += len(t.generated)
    for g in t.generated:
        d[g] = 1

print "saw", len(d), "unique in", total, "total"



From tismer@tismer.com  Thu Jan 18 21:56:08 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:56:08 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A676678.7E4AF278@tismer.com>


"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.

What do you mean by "unique"? Unique regarding your long-running server?

If so, then I wonder why one should do
> 
> So, how about `time.time()` + hex(hash([]))?
>
instead of using a single, simple counter for all sessions?

> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

If I'm not overlooking something fundamental, the counter approach
seems to be simpler and most portable. :-)

but-sometimes-my-brain-malfunctions-badly-ly y'rs  - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From nas@arctrix.com  Thu Jan 18 15:07:13 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 07:07:13 -0800
Subject: [Python-Dev] Re: new Makefile.in
In-Reply-To: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 08:37:04PM +0100
References: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>
Message-ID: <20010118070713.A13581@glacier.fnational.com>

On Thu, Jan 18, 2001 at 08:37:04PM +0100, Martin v. Loewis wrote:
> > A question: is it possible to break the Python static library up?
> > For example, instead of having libpython<version>.a have
> > Parser/parser<version>.a, Objects/objects<version>.a, etc?
> 
> Please, no.

Okay.

> I'm not at all looking forward to answering all the questions
> why the build infrastructure of Python changed yet again...

My Makefile patch shouldn't change the way you build extensions.

  Neil


From tim.one@home.com  Fri Jan 19 01:45:42 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 20:45:42 -0500
Subject: [Python-Dev] unistr() vs. unicode()
In-Reply-To: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHLIJAA.tim.one@home.com>

[Guido]
> (And no, Tim, this did *not* end up in the patches list because I made
> Barry remove the reply-to.  SourceForge mails never had reply-to to
> begin with.)

Aha!  Another thing to blame Barry for <wink>.



From tim.one@home.com  Thu Jan 18 22:11:23 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:11:23 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <008701c0811a$b3371c00$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGPIJAA.tim.one@home.com>

[/F]
> python build problems and real life got in the way.
>
> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Depends on how confident you are.  Since this is purely an optimization, I
don't think it *needs* to get into a1 in order to make the final release;
postponing a few days would be better than pushing too hard on something
that's proved hairier than anticipated.

do-the-right-thing-whatever-that-is<wink>-ly y'rs  - tim



From guido@digicool.com  Fri Jan 19 02:17:36 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 21:17:36 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:14:19 PST."
 <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <200101190217.VAA01497@cj20424-a.reston1.va.home.com>

> I hope you don't mind that i'm taking this over to python-dev,
> because it led me to discover a more general issue (see below).

No -- in fact I wanted to see this here!  (My mail backlog seems to be
clearing -- or maybe it was only a temporary unclogging... :-)

> For the others on python-dev, here's the background: MAL was
> about to check in the unistr() function, described as follows:
> 
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> > 
> > The patch also adds a new object level C API PyObject_Unicode()
> > which complements PyObject_Str().
> 
> I responded:
> > Why are unistr() and unicode() two separate functions?
> > 
> > str() performs one task: convert to string.  It can convert anything,
> > including strings or Unicode strings, numbers, instances, etc.
> > 
> > The other type-named functions e.g. int(), long(), float(), list(),
> > tuple() are similar in intent.
> > 
> > Why have unicode() just for converting strings to Unicode strings,
> > and unistr() for converting everything else to a Unicode string?
> > What does unistr(x) do differently from unicode(x) if x is a string?
> 
> MAL responded:
> > unistr() is meant to complement str() very closely. unicode()
> > works as constructor for Unicode objects which can also take
> > care of decoding encoded data. str() and unistr() don't provide
> > this capability but instead always assume the default encoding.
> > 
> > There's also a subtle difference in that str() and unistr() 
> > try the tp_str slot which unicode() doesn't. unicode()
> > supports any character buffer which str() and unistr() don't.
> 
> Okay, given this explanation, i still feel fairly confident
> that unicode() should subsume unistr().  Many of the other
> type-named functions try various slots:
> 
>     int() looks for __int__
>     float() looks for __float__
>     long() looks for __long__
>     str() looks for __str__
> 
> In testing this i also discovered the following:
> 
>     >>> class Foo:
>     ...     def __int__(self):
>     ...         return 3
>     ... 
>     >>> f = Foo()
>     >>> int(f)
>     3
>     >>> long(f) 
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__long__'
>     >>> float(f)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__float__'
> 
> This is kind of surprising.  How about:
> 
>     int() looks for __int__
>     float() looks for __float__, then tries __int__
>     long() looks for __long__, then tries __int__
>     str() looks for __str__
>     unicode() looks for __unicode__, then tries __str__

For the numeric types this could perhaps be done by calling
PyNumber_Long() from PyNumber_Float(), calling PyNumber_Int() from
PyNumber_Long().  Complex is a bit of an exception -- there's no
PyNumber_Complex(), just because I felt that nobody would need it. :-)
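
The fallback chain proposed above can be sketched in pure Python.  This is an illustration in present-day Python 3, not the C change being discussed: `to_float` is a hypothetical helper, and the class `Foo` mirrors the one in the quoted session.

```python
class Foo:
    def __int__(self):
        return 3

def to_float(obj):
    """Look for __float__ first, then fall back to __int__."""
    method = getattr(type(obj), "__float__", None)
    if method is not None:
        return float(method(obj))
    method = getattr(type(obj), "__int__", None)
    if method is not None:
        return float(method(obj))
    raise TypeError("cannot convert %s to float" % type(obj).__name__)

print(to_float(Foo()))
print(to_float(2.5))
```

Chaining the lookups this way means a class only has to define the narrowest conversion it supports.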

> The extra parameter to unicode() is very similar to the extra
> parameter to int(), so i think there is a natural parallel here.

Makes sense.

> Hmm... what about the other types?
> 
> Wow!!  __complex__ can produce a segfault!
> 
>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)
> 
> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

Thanks!  Fixed now in CVS.

> I think __complex__ should probably look for __complex__, then
> __float__, then __int__.

I make it call PyNumber_Float(), which could be made smarter as
explained above.
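
The bug and the fix come down to which object's conversion gets applied to the value returned by __complex__.  A sketch in present-day Python 3, where `to_complex` is a hypothetical helper and not the builtin_complex code:

```python
class Foo:
    def __complex__(self):
        return 3          # deliberately not a complex

def to_complex(obj):
    result = type(obj).__complex__(obj)
    if isinstance(result, complex):
        return result
    # Coerce via the *result's* own float conversion.  The bug was applying
    # the original argument's conversion slot to the result instead.
    return complex(float(result))

print(to_complex(Foo()))
```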

> One could argue for __list__, __tuple__, or __dict__, but that
> seems much weaker; the Pythonic way has always been to implement
> __getitem__ instead.

Yes -- since __list__ etc. aren't used, let's not add them.

> There is no built-in dict(); if it existed
> i suppose it would do the opposite of x.items(); again a weak
> argument, though i might have found such a function useful once
> or twice.

Yeah, it's not very common.  Dict comprehensions anyone?

    d = {k:v for k,v in zip(range(10), range(10))}    # :-)

> And that about covers the built-in types for data.

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Thu Jan 18 22:13:14 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:13:14 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>

BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised

    TypeError: unhashable type

Did someone change this deliberately?



From tim.one@home.com  Thu Jan 18 22:58:22 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:58:22 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIJAA.tim.one@home.com>

[Tim whined]
> BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised
>
>     TypeError: unhashable type
>
> Did someone change this deliberately?

Answer:  it's an unintended consequence of the rich-comparison changes.
Guido knows how to fix it and probably <wink> will.  The list type grew a
tp_richcompare slot but lost its non-NULL tp_compare pointer.  PyObject_Hash
wasn't changed accordingly (it now believes lists support neither direct
hashing nor comparison, so does them a favor and hashes their memory
addresses).  Something trickier is probably going wrong elsewhere too, but I
won't try to remember what that is unless Guido gets hit by a bus tonight.
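
The intended behavior is easy to observe from Python.  A sketch in present-day Python 3 (the class name `Plain` is illustrative): lists refuse to hash, while instances of a plain class that defines neither __eq__ nor __hash__ still get an identity-based default hash.

```python
try:
    hash([])
    list_hashable = True
except TypeError as exc:
    list_hashable = False
    print("TypeError:", exc)

class Plain:
    pass

p = Plain()
# The default hash is derived from the object's identity, so it is stable
# for the lifetime of the object.
print(hash(p) == hash(p))
```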

in-which-case-we-can-push-off-the-funeral-until-after-the-release-ly
    y'rs  - tim



From thomas@xs4all.net  Thu Jan 18 23:02:09 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:02:09 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118103036.B21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 10:30:36AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>
Message-ID: <20010119000209.F17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:30:36AM -0500, Andrew Kuchling wrote:
> >On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
> >> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >> setup.py. Also, SSL support for the socket module was not enabled, though
> >> OpenSSL is installed, in the default path.

> What does the layout of /usr/contrib look like?  Is it
> /usr/contrib/openssl/include/, /usr/contrib/include/, or something
> else?

Actually, it's /usr/local, not /usr/contrib. I've never installed OpenSSL in
/usr/contrib, though I could, and maybe BSDI will, in the future. (BSDI
installs its own software in /usr, and optional free, pre-compiled software
in /usr/contrib.) OpenSSL installs into
/usr/local/ssl/include/openssl by default, and installing into /usr/contrib
would make it /usr/contrib/ssl/include/openssl.

> >Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
> >or 'make test' after 'make' just fine. 'make test' still doesn't work
> >because of the incorrect library path, but it doesn't barf like the other
> >systems (BSDI and Debian Linux)

> Have you already run "make install"?  Perhaps it's picking up the
> already-installed modules when running "make test", because it really
> shouldn't be working.

Hm, I think you misread my statement. 'make test' *doesn't* work. But it
doesn't barf on the signal module being built dynamically either. You fixed
that for every platform now, I was just pointing out that this was not a
problem for FreeBSD for some reason.

'make test' still doesn't work, but I can make it work by specifying a
hand-tweaked PYTHONPATH that includes the OS/arch-dependent build directory.

This brings me to another point: how can 'make test' work at all ? Does
python always check for './Lib' (and './Modules') for modules ? If that's
specific for 'make test' and running python in the source distribution, that
sounds like a bit of a weird hack. I can't find any such hackery in the
source, but I also can't figure out how else it's working :)

More-later--Meteor-((c)-1979)-is-on-ly y'rs
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From martin@mira.cs.tu-berlin.de  Thu Jan 18 23:14:05 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 19 Jan 2001 00:14:05 +0100
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <200101182314.f0INE5B00338@mira.informatik.hu-berlin.de>

> Does anyone have any objections to it going into the alpha? 

I'd like to request that the .clear() method is removed from the patch
for this alpha, and also that the weak dictionaries are removed until
their semantics is clarified.

It's always easier to add stuff later than to remove it.

Regards,
Martin



From nas@arctrix.com  Thu Jan 18 16:31:09 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 08:31:09 -0800
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <20010118115028.D21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 11:50:28AM -0500
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> <20010118115028.D21503@kronos.cnri.reston.va.us>
Message-ID: <20010118083109.A13972@glacier.fnational.com>

On Thu, Jan 18, 2001 at 11:50:28AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
> >The not-so-obvious question: How can one work-around such a problem
> >with the new setup scheme?
> 
> I still need to implement command-line options to specify such
> overrides, but that couldn't possibly get done in time for alpha1.

My non-recursive makefile patch allows you to use both Setup and
setup.py.  It's not quite ready for prime time but it's getting
close.

I would be interested if someone could point me to the source for
some crappy makes.  I've tried GNU make, BSD 4.4 pmake and
whatever comes with SunOS 5.6.  Searching for "make" doesn't work
too well. :-(

  Neil


From thomas@xs4all.net  Thu Jan 18 23:45:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:45:32 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 08:46:54AM -0800
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119004532.G17392@xs4all.nl>

On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:

> filename = '/tmp/delete_me'

This reminds me: we need a portable way to handle test-files :)
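
One portable option is to let the tempfile module pick the location.  A sketch in present-day Python 3; TemporaryDirectory did not exist in 2001, and the file name delete_me is borrowed from the quoted line:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Build the path from the platform's temp location, not a fixed /tmp.
    filename = os.path.join(tmpdir, "delete_me")
    with open(filename, "w") as f:
        f.write("scratch data\n")
    existed = os.path.exists(filename)

# The directory and everything in it are removed when the with block exits.
print(existed, os.path.exists(filename))
```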

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Thu Jan 18 23:56:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 18:56:04 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Wed, 17 Jan 2001 23:59:22 PST."
 <20010117235922.A12356@glacier.fnational.com>
References: <20010117235922.A12356@glacier.fnational.com>
Message-ID: <200101182356.SAA19616@cj20424-a.reston1.va.home.com>

Hi Neil,

My mail suffers delays of 12-24 hours while mail.python.org is working
on some enormous backlog.  So I just saw your message about a new
Makefile...

> Spurred on by comments made by Andrew, I spent some time last
> night overhauling the Python Makefiles.  I now have a toplevel
> non-recursive Makefile.in that seems to work fairly well.  I'm
> pretty sure it still should be portable.  It doesn't use includes
> or any special GNU make features.  It is half the size of the old
> Makefiles.  The build is faster and it's now easier to follow if
> something goes wrong.

I'd like to see this!

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
> would still only be one shared library.  This would speed up
> incremental builds and also help Andrew with PEP 229.  I'm
> thinking that the Makefile do something like this:
> 
>     all: python$(EXE)
> 
>     PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a
> 
>     python$(EXE): $(PYLIBS)
>         $(LINKCC) -o python$(EXE) $(PYLIBS) ...
> 
>     Modules/modules.a: minpython$(EXE)
>         ./minpython$(EXE) setup.py

Sounds cool to me.  (Where's the patch for a shared libpython???)

> AFAICT, the only thing affected by splitting up the static library
> is Misc/Makefile.pre.in.  Is this correct?

Yeah, and that should be phased out in favor of distutils anyway.  Now
would be a great time!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 00:34:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:34:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
Message-ID: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>

Through no fault of my own, email to guido@python.org (which includes
the python-dev list) is currently suffering delays of 12-24 hours.  I
have a feeling this is probably true for all mail going through
python.org, so checkin messages and python-dev discussion have been
greatly frustrated, with about 1 day to go until the planned 2.1a1
release date!

On top of that, the SourceForge bug manager has developed a problem:
all references to http://sourceforge.net/bugs/?group_id=5470/ come
back with this error:

  An error occured in the logger. ERROR: pg_atoi: error in "5470/":
  can't parse "/"

I'm still hoping to release Python 2.1a1 tomorrow, unless Jeremy tells
me that he needs more time for his nested scopes patch.

In the mean time, please everybody, do check out the latest CVS
version and give it a good workout!  Andrew's setup.py still has some
rough edges, I believe that in order to run it from the build
directory you still have to point PYTHONPATH to the build/lib*
directory, where he hides the shared libraries for all modules.
Andrew, are you planning to fix this?

If there's anything that you need me to know about, please mail to
guido@digicool.com -- that address suffers no delays.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Fri Jan 19 00:51:19 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 19:51:19 -0500
Subject: [Python-Dev] RE: [Pycabal] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHGIJAA.tim.one@home.com>

[Guido notes current woes w/ python.org email, and SourceForge]

Note too that, over the past two days, it's not possible to follow
Python-Dev email via

http://mail.python.org/pipermail/python-dev/2001-January/date.html

either, as (unlike during previous occurrences of python.org email delays)
msgs aren't showing up there in a timely fashion either (for example, the
msg of Guido's to which I'm replying isn't there).

good-thing-guido's-so-easy-to-channel<wink>-ly y'rs  - tim



From guido@digicool.com  Fri Jan 19 00:52:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:52:02 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:23:21 EST."
 <20010118022321.A9021@thyrsus.com>
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <200101190052.TAA26849@cj20424-a.reston1.va.home.com>

> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.  
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.

Argh!  hash([]) should raise TypeError, since lists are not hashable
objects -- mutable objects can't be allowed as dictionary keys.  This
(hash([])) accidentally returned a value for a brief period after I
checked in the rich comparisons -- I've fixed that now.

But not to worry: instead of using hash([]), you can use hex(id([])).
Same thing.

On the other hand, remember how much you can do in a millisecond!
(E.g. I can call tempfile.mktemp() 5 times in that time.)  And when
you create an object and immediately delete it, the next object
created is very likely to have the same address.

But what's wrong with this:

    try:
        from thread import get_ident as unique_id
    except ImportError:
        def unique_id(): return id([])

--Guido van Rossum (home page: http://www.python.org/~guido/)


From billtut@microsoft.com  Fri Jan 19 00:53:15 2001
From: billtut@microsoft.com (Bill Tutt)
Date: Thu, 18 Jan 2001 16:53:15 -0800
Subject: [Python-Dev] MS CRT crashing:
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com>

>From the internal support squad:
Turns out the C standard explicitly says you can't have an input follow
output on a stream without doing fflush or fseek in-between, to make sure
the stdio buffer is cleared.  So this program is illegal.

They've gone and resolved it by design.
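
The same discipline carries over to Python code that mixes writes and reads on one file object: reposition between the two.  A sketch in present-day Python 3, whose io layer does its own buffering, so this illustrates the rule and does not reproduce the CRT crash (the file name is illustrative):

```python
import os
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "stream_demo.txt")

with open(path, "w+") as f:
    f.write("hello\n")      # output ...
    f.seek(0)               # ... then reposition before switching to input
    data = f.read()

print(repr(data))

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(tmpdir)
```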

FYI,
Bill

 -----Original Message-----
From: 	Bill Tutt  
Sent:	Wednesday, January 10, 2001 1:09 AM
To:	'Tim Peters'
Cc:	'Mark Hammond'
Subject:	RE: [Python-Dev] xreadlines : readlines :: xrange : range

Heh. I've tossed this code to internal support. I'll give a yell if I hear
anything interesting.

Thanks for the C test case,
Bill


From guido@digicool.com  Fri Jan 19 00:53:13 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:53:13 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: Your message of "Thu, 18 Jan 2001 07:48:41 +0100."
 <008701c0811a$b3371c00$e46940d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>
 <008701c0811a$b3371c00$e46940d5@hagrid>
Message-ID: <200101190053.TAA26862@cj20424-a.reston1.va.home.com>

> I wrote:
> > I've almost sorted it all out.  will check it in later tonight (local
> > time).
> 
> python build problems and real life got in the way.

What?  You've got a real life?  Can't be allowed, not when we're
working on a release!

> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Please check it in, there's still time (2.1a1 won't go out before
Friday night, possibly it'll be delayed until Monday).

And yes, there will be a 2.1a2.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 00:55:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:55:15 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:39:30 +0100."
 <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>
Message-ID: <200101190055.TAA26905@cj20424-a.reston1.va.home.com>

> The distutils-based configuration fails to build on my system (SuSE
> 7.0) with the error
> 
> /usr/src/python/Modules/socketmodule.c:159: rsa.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:160: crypto.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:161: x509.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:162: pem.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:163: ssl.h: Datei oder Verzeichnis nicht gefunden

The same happened to Fred on Mandrake 7.0 (except for the German
messages :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 00:58:16 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:58:16 -0500
Subject: [Python-Dev] Re: unistr() vs. unicode()
Message-ID: <200101190058.TAA26931@cj20424-a.reston1.va.home.com>

MAL's reply to Ping in this thread.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Thu, 18 Jan 2001 10:52:12 +0100
From:    "M.-A. Lemburg" <mal@lemburg.com>
To:      Ka-Ping Yee <ping@lfw.org>
cc:      guido@python.org, patches@python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin + PyObject_Unicode() C API

Ka-Ping Yee wrote:
> 
> On Wed, 17 Jan 2001 noreply@sourceforge.net wrote:
> > Comment:
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> 
> Sorry for barging in, but i have an issue/question:
> 
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

unistr() is meant to complement str() very closely. unicode()
works as constructor for Unicode objects which can also take
care of decoding encoded data. str() and unistr() don't provide
this capability but instead always assume the default encoding.

There's also a subtle difference in that str() and unistr() 
try the tp_str slot which unicode() doesn't. unicode()
supports any character buffer which str() and unistr() don't.

Perhaps you are right though in that we should make all three
APIs behave in the same way with respect to coercing their
arguments. This could hide some errors... still in the long
run, I agree that the existing setup probably causes more confusion
than good.

Guido ?

- -- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

_______________________________________________
Patches mailing list
Patches@python.org
http://mail.python.org/mailman/listinfo/patches

------- End of Forwarded Message



From guido@digicool.com  Fri Jan 19 01:04:22 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:04:22 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:51:46 +0100."
 <3A66CAC2.74FC894@lemburg.com>
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>
 <3A66CAC2.74FC894@lemburg.com>
Message-ID: <200101190104.UAA27056@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > >     str() looks for __str__
> > 
> > Oops.  I forgot that
> > 
> >       str() looks for __str__, then tries __repr__
> > 
> > So, presumably,
> > 
> >       unicode() should look for __unicode__, then __str__, then __repr__
> 
> Not quite... str() does this:
> 
> 1. strings are passed back as-is
> 2. the type slot tp_str is tried
> 3. the method __str__ is tried
> 4. Unicode returns are converted to strings
> 5. anything other than a string return value is rejected
> 
> unistr() does the same, but makes sure that the return
> value is a Unicode object.
> 
> unicode() does the following:
> 
> 1. for instances, __str__ is called
> 2. Unicode objects are returned as-is
> 3. string objects or character buffers are used as basis for decoding
> 4. decoding is applied to the character buffer and the results
>    are returned
> 
> I think we should perhaps merge the two approaches into one
> which then applies all of the above in unicode() (and then
> forget about unistr()). This might hide some type errors,
> but since all other generic constructors behave more or less
> in the same way, I think unicode() should too.

Yes, I would like to see these merged.  I noticed that e.g. there is
special code to compare Unicode strings in the comparison code (I
think I *could* get rid of this now we have rich comparisons, but I
decided to put that off), and when I looked at it it uses the same set
of conversions as unicode().  Some of these seem questionable to me --
why do you try so many ways to get a string out of an object?  (On the
other hand the merge of unicode() and unistr() might have this effect
anyway...)
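
A rough sketch of what a single merged constructor might look like, in present-day Python 3 terms, where str is the Unicode type and bytes plays the role of the old 8-bit string.  `to_text` and its default encoding are hypothetical, not the API that was eventually checked in:

```python
def to_text(obj, encoding="utf-8", errors="strict"):
    # 1. Text is passed back as-is.
    if isinstance(obj, str):
        return obj
    # 2. Encoded data (a bytes-like buffer) is decoded.
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding, errors)
    # 3. Everything else goes through __str__, which itself falls back
    #    to __repr__ when a class defines no __str__.
    return str(obj)

class Greeting:
    def __str__(self):
        return "hi"

print(to_text("abc"), to_text(b"caf\xc3\xa9"), to_text(Greeting()), to_text(42))
```

Folding all three paths into one function is what trades away some type errors: a caller who passes the wrong kind of object gets a string back where an exception might have been more useful.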

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido@digicool.com  Fri Jan 19 01:06:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:06:23 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:39:54 +0100."
 <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
Message-ID: <200101190106.UAA27073@cj20424-a.reston1.va.home.com>

> I think there should be a well-formedness pass in-between. I.e. after
> the AST has been built, a single pass should descend through the tree,
> looking for an expr_statement with more than a single testlist. Once
> it finds one, it should confirm that this really is a well-formed
> lvalue (in C speak). In this case, the test should be that each term
> is an atom without factors.

Good idea.

> If the parser itself performs such checks, the compiler could be
> simplified in many places, I guess.

Not sure that in practice it makes much of a difference: there aren't
that many of these kinds of checks, and writing a separate pass is
expensive.  On the other hand, Jeremy is just writing a separate pass
anyway, to collect name usage information for the nested scopes.
Maybe it could be folded into that pass...
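
A minimal sketch of such a well-formedness pass, written against the
modern Python ast module (an assumption: nothing like it existed in the
2.1 tree, and the names ALLOWED_TARGETS and find_bad_targets are made up
for illustration).  It walks a finished tree once and reports assignment
targets that are not well-formed lvalues:

```python
import ast

# Node types accepted as assignment targets in this sketch (hypothetical list).
ALLOWED_TARGETS = (ast.Name, ast.Attribute, ast.Subscript)


def _check_target(node, problems):
    """Record the node's type name unless it is a well-formed lvalue."""
    if isinstance(node, (ast.Tuple, ast.List)):
        # Unpacking targets: every element must itself be an lvalue.
        for element in node.elts:
            _check_target(element, problems)
    elif isinstance(node, ast.Starred):
        _check_target(node.value, problems)
    elif not isinstance(node, ALLOWED_TARGETS):
        problems.append(type(node).__name__)


def find_bad_targets(tree):
    """Single pass over the tree; returns the type names of bad targets."""
    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            for target in node.targets:
                _check_target(target, problems)
    return problems


# The modern parser already rejects "f() = 1", so build the bad tree by hand.
bad_assign = ast.Assign(
    targets=[ast.Call(func=ast.Name(id="f", ctx=ast.Load()), args=[], keywords=[])],
    value=ast.Constant(value=1),
)
bad_tree = ast.Module(body=[bad_assign], type_ignores=[])
```

Keeping the check in its own function is what would make it easy to fold
into another tree walk, such as the name-usage pass mentioned above.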

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Fri Jan 19 03:20:08 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:20:08 -0500 (EST)
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
Message-ID: <14951.45672.806978.600944@localhost.localdomain>

There are several modules in the standard library that use the regex
module.  When they are imported, they print a warning about using a
deprecated module.  I think this is bad form.  Either the modules that
depend on regex should be updated to use re or they should be
deprecated themselves.  

I discovered the following offenders:
asynchat
knee
poplib
reconvert

I would suggest fixing asynchat and poplib and deprecating knee.  The
reconvert module may be a special case.

Jeremy


From jeremy@alum.mit.edu  Fri Jan 19 03:31:02 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:31:02 -0500 (EST)
Subject: [Python-Dev] setup.py and build subdirectories
Message-ID: <14951.46326.743921.988828@localhost.localdomain>

I have a bunch of build directories under the source tree, e.g.
src/python/dist/src/build
src/python/dist/src/build-pg
src/python/dist/src/build-O3
...

The new setup.py did not successfully build in these directories.  I
hacked distutils a tiny bit and had some success.  Patch below.  I'm
not sure if the approach is kosher, but it allows me to build
successfully.

I also have a problem running 'make test' from these build
directories.  The reference to the distutils build directory has '..'
prepended to it that shouldn't exist.

Jeremy


Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.8
diff -c -r1.8 setup.py
*** setup.py	2001/01/18 20:39:34	1.8
--- setup.py	2001/01/19 03:26:55
***************
*** 536,540 ****
            
  # --install-platlib
  if __name__ == '__main__':
!     sysconfig.set_python_build()
      main()
--- 536,541 ----
            
  # --install-platlib
  if __name__ == '__main__':
!     path, file = os.path.split(sys.argv[0])
!     sysconfig.set_python_build(path)
      main()
Index: Lib/distutils/sysconfig.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/sysconfig.py,v
retrieving revision 1.31
diff -c -r1.31 sysconfig.py
*** Lib/distutils/sysconfig.py	2001/01/17 15:16:52	1.31
--- Lib/distutils/sysconfig.py	2001/01/19 03:27:01
***************
*** 24,37 ****
  
  python_build = 0
  
! def set_python_build():
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = 1
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
--- 24,37 ----
  
  python_build = 0
  
! def set_python_build(loc):
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = loc + "/"
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
***************
*** 48,54 ****
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
--- 48,54 ----
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return python_build + "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
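
One detail of the patch above: when the script is started as plain
"python setup.py", os.path.split(sys.argv[0]) returns an empty directory
part, so loc + "/" becomes "/" and the include directory becomes
"/Include/".  A hedged sketch of an alternative in modern Python (the
helper name build_include_dir is hypothetical, not part of distutils)
that joins paths with os.path.join and falls back to the current
directory:

```python
import os


def build_include_dir(script_path):
    """Return the Include directory that sits next to the given setup.py."""
    # dirname() is "" for a bare file name; use the current directory then.
    build_dir = os.path.dirname(script_path) or os.curdir
    return os.path.join(build_dir, "Include")
```

Called with "../setup.py" from a build subdirectory this gives
"../Include" on POSIX, which is the case the patch is aiming at.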




From tim.one@home.com  Fri Jan 19 03:46:16 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 18 Jan 2001 22:46:16 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIBIJAA.tim.one@home.com>

[attribution lost]
> There is no built-in dict(); if it existed i suppose it would do
> the opposite of x.items(); again a weak argument, though i might
> have found such a function useful once or twice.

[Guido]
> Yeah, it's not very common.  Dict comprehensions anyone?
>
>    d = {k:v for k,v in zip(range(10), range(10))}    # :-)

It's very common in Perl code, but is in no sense the inverse of .items()
there:   when you build a dict from a list L in Perl, it acts like Python

   {L[0]: L[1],
    L[2]: L[3],
    L[4]: L[5],
    ...
   }

That's what seems most practical most often; e.g., when crunching over text
files with records of the form

    key value

(e.g., mail headers are of this form; simple contact databases; to-do lists
segregated by date; etc), whatever fancy re.split() is used to break things
apart naturally returns a flat list.  A list of two-tuples is natural only
if it was obtained from another dict's .items() <0.9 wink>.
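
A minimal sketch of the Perl-style construction in Python, assuming a
flat list of alternating keys and values.  The helper name
dict_from_flat is hypothetical, and the sketch leans on the dict()
constructor, which postdates this message:

```python
def dict_from_flat(flat):
    """Build {flat[0]: flat[1], flat[2]: flat[3], ...} from a flat list."""
    if len(flat) % 2:
        raise ValueError("flat list must hold an even number of items")
    # Even positions are keys, odd positions are the matching values.
    return dict(zip(flat[0::2], flat[1::2]))


# A record of the "key value" form, already split into a flat list.
fields = "From tim Subject dicts".split()
headers = dict_from_flat(fields)
```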

pushing-the-limits-of-"practicality-beats-purity"?-ly y'rs  - tim



From tim.one@home.com  Fri Jan 19 06:00:27 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:00:27 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIGIJAA.tim.one@home.com>

test test_urllib crashed 
    -- exceptions.AssertionError: urllib.quote problem



From tim.one@home.com  Fri Jan 19 06:39:30 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:39:30 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIIIJAA.tim.one@home.com>

[some MS internal support group]
> Turns out the C standard explicitly says you can't have an input
> follow output on a stream without doing fflush or fseek in-between,
> to make sure the stdio buffer is cleared.  So this program is illegal.

It's undefined (there are no "illegal" programs -- that word doesn't appear
in the std; "undefined" does and has a precise technical meaning).

In the presence of threads-- which the C std doesn't mention --you have to
address issues the std doesn't touch.  To date, MS's is the only C runtime
we've seen that corrupts itself in this situation.  It can do anything it
likes short of blowing up and still be considered a good threaded
implementation.  As is, it has to be considered sub-standard, in the
ordinary sense of displaying worse behavior than other threaded C stdio
implementations.  It falls short there on other counts too (like the lack of
getc_unlocked() & friends), but internal corruption is a particularly
egregious failing.

and-that's-the-end-of-it-for-me-ly y'rs  - tim



From mwh21@cam.ac.uk  Fri Jan 19 08:31:18 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 08:31:18 +0000
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Thomas Wouters's message of "Fri, 19 Jan 2001 00:02:09 +0100"
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <m3vgrc0wq1.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas@xs4all.net> writes:

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ? If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's in Modules/getpath.c

Cheers,
M.

-- 
  I really hope there's a catastrophic bug in some future e-mail
  program where if you try and send an attachment it cancels your
  ISP account, deletes your harddrive, and pisses in your coffee
                                                         -- Adam Rixey



From gstein@lyra.org  Fri Jan 19 08:38:54 2001
From: gstein@lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 00:38:54 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 04:28:10PM -0800
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119003854.F7731@lyra.org>

On Thu, Jan 18, 2001 at 04:28:10PM -0800, Guido van Rossum wrote:
>...
>   PyTypeObject PyCursesWindow_Type = {
> ! 	PyObject_HEAD_INIT(NULL)
>   	0,			/*ob_size*/
>   	"curses window",	/*tp_name*/
>...
> --- 2432,2443 ----
>   /* Initialization function for the module */
>   
> ! DL_EXPORT(void)
>   init_curses(void)
>   {
>   	PyObject *m, *d, *v, *c_api_object;
>   	static void *PyCurses_API[PyCurses_API_pointers];
> + 
> + 	/* Initialize object type */
> + 	PyCursesWindow_Type.ob_type = &PyType_Type;
>   
>   	/* Initialize the C API pointer array */


I've never truly understood this. Is it because Windows cannot initialize
(at load-time) a pointer to a data structure that is located in a different
DLL?

It is a bit painful to keep moving inits from load-time to run-time.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From tim.one@home.com  Fri Jan 19 09:01:22 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 19 Jan 2001 04:01:22 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEINIJAA.tim.one@home.com>

Bet it was failing everywhere; it's fixed now.



From moshez@zadka.site.co.il  Fri Jan 19 17:53:36 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 19:53:36 +0200 (IST)
Subject: [Python-Dev] Dbm failure
Message-ID: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>

test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey

Did it happen to anyone else? 
Anything else you need to know?

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From mal@lemburg.com  Fri Jan 19 09:58:08 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 10:58:08 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs.
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>
 <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>
Message-ID: <3A680FB0.AED2DB55@lemburg.com>

Guido van Rossum wrote:
> 
> > Ka-Ping Yee wrote:
> > >
> > > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > > >     str() looks for __str__
> > >
> > > Oops.  I forgot that
> > >
> > >       str() looks for __str__, then tries __repr__
> > >
> > > So, presumably,
> > >
> > >       unicode() should look for __unicode__, then __str__, then __repr__
> >
> > Not quite... str() does this:
> >
> > 1. strings are passed back as-is
> > 2. the type slot tp_str is tried
> > 3. the method __str__ is tried
> > 4. Unicode returns are converted to strings
> > 5. anything other than a string return value is rejected
> >
> > unistr() does the same, but makes sure that the return
> > value is a Unicode object.
> >
> > unicode() does the following:
> >
> > 1. for instances, __str__ is called
> > 2. Unicode objects are returned as-is
> > 3. string objects or character buffers are used as basis for decoding
> > 4. decoding is applied to the character buffer and the results
> >    are returned
> >
> > I think we should perhaps merge the two approaches into one
> > which then applies all of the above in unicode() (and then
> > forget about unistr()). This might hide some type errors,
> > but since all other generic constructors behave more or less
> > in the same way, I think unicode() should too.
> 
> Yes, I would like to see these merged.  I noticed that e.g. there is
> special code to compare Unicode strings in the comparison code (I
> think I *could* get rid of this now we have rich comparisons, but I
> decided to put that off), and when I looked at it it uses the same set
> of conversions as unicode().  Some of these seem questionable to me --
> why do you try so many ways to get a string out of an object?  (On the
> other hand the merge of unicode() and unistr() might have this effect
> anyway...)

... because there are so many ways to get at string
representations of objects in Python at C level.

If we agree to merge the semantics of the two APIs, then str()
would have to change too: is this desirable ? (IMHO, yes)

Here's what we could do:

a) merge the semantics of unistr() into unicode()
b) apply the same semantics in str()
c) remove unistr() -- how's that for a short-living builtin ;)

About the semantics:

These should be backward compatible with str() in that everything
that worked before should continue to work after the merge.

A strawman for processing str() and unicode():

1. strings/Unicode is passed back as-is
2. tp_str is tried
3. the method __str__ is tried
4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
5. for str(): Unicode return values are converted to strings using
              the default encoding
   for unicode(): Unicode return values are passed back as-is;
              string return values are decoded according to the
              encoding parameter
6. the return object is type-checked: str() will always return
   a string object, unicode() always a Unicode object

Note that passing back Unicode is only allowed in case no encoding
was given. Otherwise an exception is raised: you can't decode
Unicode.
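
The strawman for unicode() can be modelled roughly as follows.  This is
only a sketch in modern Python under loud assumptions: bytes stands in
for the 8-bit string type and str for Unicode, tp_str and __str__ are
folded into a single lookup, the buffer interface is approximated with
memoryview, and DEFAULT_ENCODING and strawman_unicode are made-up names:

```python
DEFAULT_ENCODING = "ascii"  # assumption: stand-in for the default encoding


def strawman_unicode(obj, encoding=None, errors="strict"):
    """Model of the proposed unicode(); bytes = 'string', str = 'Unicode'."""
    # Step 1: Unicode is passed back as-is, but only without an encoding.
    if isinstance(obj, str):
        if encoding is not None:
            raise TypeError("decoding Unicode is not supported")
        return obj
    if isinstance(obj, (bytes, bytearray)):
        result = bytes(obj)
    else:
        # Steps 2 and 3: ask the object itself (tp_str / __str__).
        method = getattr(type(obj), "__str__", None)
        if method is not None and method is not object.__str__:
            result = method(obj)
        else:
            # Step 4: fall back to the character buffer interface.
            try:
                result = bytes(memoryview(obj))
            except TypeError:
                raise TypeError("cannot convert %s to Unicode"
                                % type(obj).__name__)
    # Step 5: string results are decoded; Unicode results take no encoding.
    if isinstance(result, bytes):
        result = result.decode(encoding or DEFAULT_ENCODING, errors)
    elif isinstance(result, str) and encoding is not None:
        raise TypeError("decoding Unicode is not supported")
    # Step 6: the return value is type-checked.
    if not isinstance(result, str):
        raise TypeError("conversion did not produce Unicode")
    return result
```

The point of the model is the ordering: the object's own conversion is
tried before the buffer fallback, and an explicit encoding combined with
a Unicode value is an error rather than a silent no-op.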

As an extension we could add encoding and error parameters to str()
as well. The result would be either an encoding of Unicode objects
passed back by tp_str or __str__ or a recoding of string objects
returned by checks 2, 3 or 4.

If we agree to take this approach, then we should remove the
unistr() Python API before the alpha ships.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fredrik@effbot.org  Fri Jan 19 10:19:06 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 11:19:06 +0100
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net> <20010119003854.F7731@lyra.org>
Message-ID: <010c01c08201$4b0ec050$e46940d5@hagrid>

greg wrote:
> I've never truly understood this. Is it because Windows cannot initialize
> (at load-time) a pointer to a data structure that is located in a different
> DLL?

Windows can do it (via DLL initialization code), but the compiler
doesn't generate initialization code for C programs.

you can compile the module as C++, but that's also a bit painful...

</F>



From jack@oratrix.nl  Fri Jan 19 11:02:00 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Fri, 19 Jan 2001 12:02:00 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
Message-ID: <20010119110200.9E455373C95@snelboot.oratrix.nl>

I get the impression that I'm currently seeing a non-NULL third argument in my 
(C) methods even though the method is called without keyword arguments.

Is this new semantics that I missed the discussion about, or is this a bug?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++




From thomas@xs4all.net  Fri Jan 19 12:22:06 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:22:06 +0100
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: <14951.45672.806978.600944@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 18, 2001 at 10:20:08PM -0500
References: <14951.45672.806978.600944@localhost.localdomain>
Message-ID: <20010119132206.H17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:20:08PM -0500, Jeremy Hylton wrote:

> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Can't reconvert just disable the warning before importing regex ? That would
seem the sane thing to do, at least to me.
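
A minimal sketch of that idea in modern Python.  The regex module is
long gone, so fake_regex_import is a hypothetical stand-in for "import
regex", and warnings.catch_warnings is a later addition to the warnings
module; at the time this would have been a plain
warnings.filterwarnings("ignore", ...) call placed before the import:

```python
import warnings


def import_quietly(importer):
    """Run importer() with DeprecationWarning suppressed, then restore."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return importer()


def fake_regex_import():
    # Stand-in for "import regex", which warns when it is imported.
    warnings.warn("the regex module is deprecated", DeprecationWarning)
    return "regex"


module_name = import_quietly(fake_regex_import)
```

Restoring the filters afterwards keeps the suppression local to the one
import, so users of other deprecated modules still see their warnings.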

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Fri Jan 19 12:26:31 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:26:31 +0100
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 18, 2001 at 07:34:02PM -0500
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <20010119132631.I17392@xs4all.nl>

On Thu, Jan 18, 2001 at 07:34:02PM -0500, Guido van Rossum wrote:

> Through no fault of my own, email to guido@python.org (which includes
> the python-dev list) is currently suffering delays of 12-24 hours.  I
> have a feeling this is probably true for all mail going through
> python.org, so checkin messages and python-dev discussion have been
> greatly frustrated, with about 1 day to go until the planned 2.1a1
> release date!

I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
talked with Barry about it, too. It looks like it's clearing up a bit, now,
but it's confusing as hell, for sure ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Fri Jan 19 12:33:47 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:33:47 +0100
Subject: [Python-Dev] Dbm failure
In-Reply-To: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 19, 2001 at 07:53:36PM +0200
References: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010119133347.J17392@xs4all.nl>

On Fri, Jan 19, 2001 at 07:53:36PM +0200, Moshe Zadka wrote:
> test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey
> Did it happen to anyone else? 

Yes, to me. You're suffering from the same thing I did: GNU sucks. Okay,
okay, not as much as MS products or most other UNIX software, but still ;)
The problem is a conflict between gdbm and glibc.

gdbm (1.7.3, which is what woody currently carries, not sure why it isn't
updated) offers a dbm interface/replacement, which includes a libdbm.(so|a)
and /usr/include/gdbm-ndbm.h. Glibc (or at least the debian package) *also*
offers a dbm interface/replacement, which consists of libdb1.(so|a) and
/usr/include/db1/ndbm.h (which needs /usr/include/db1/*.h). If you add
/usr/include/db1 to your include path, and -ldbm to the dbmmodule, you end
up with the wrong versions. You need either to include /usr/include/db1 in
your includepath and use -ldb1, or fix up dbmmodule.c so it includes
gdbm-ndbm.h and uses -ldbm.

I only figured this out yesterday, and sent Andrew a mail about that... I'm
not sure what the Right(tm) way to fix this is :( I've always loathed these
library/version mismatches :P

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mal@lemburg.com  Fri Jan 19 13:07:00 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 14:07:00 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de> <20010118135640.G21503@kronos.cnri.reston.va.us>
Message-ID: <3A683BF4.BD74A979@lemburg.com>

Andrew Kuchling wrote:
> 
> On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
> >On Unix, there appears to be no standard location, unless the
> >documentation consists of man pages or perhaps info files. So
> ><prefix>/share/doc is probably a place as good as any other.
> 
> This seems like a good suggestion.  Should docs go in
> <prefix>/share/doc/python<version>/, then?  Perhaps with
> subdirectories for different extensions?

Hmm, I guess it's better to follow bdist_rpm here: put
the docs into a subdir under .../doc/ using the package
name and version.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy@alum.mit.edu  Fri Jan 19 14:39:13 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:39:13 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <20010119110200.9E455373C95@snelboot.oratrix.nl>
References: <20010119110200.9E455373C95@snelboot.oratrix.nl>
Message-ID: <14952.20881.848489.869512@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack@oratrix.nl> writes:

  JJ> I get the impression that I'm currently seeing a non-NULL third
  JJ> argument in my (C) methods even though the method is called
  JJ> without keyword arguments.

  JJ> Is this new semantics that I missed the discussion about, or is
  JJ> this a bug? 

This is a bug in the changes I made to the call function
implementation.  I wasn't sure what was supposed to happen to a
function that expected a kw argument but was called without one.  I
thought I saw some crashes when I passed NULL, so I changed the
implementation to pass an empty dictionary.

(Is the correct behavior documented anywhere?)

If a NULL value is correct, I'll update the implementation and see if
I can rediscover those crashes.

Jeremy


From nas@arctrix.com  Fri Jan 19 07:39:50 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 23:39:50 -0800
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119000209.F17392@xs4all.nl>; from thomas@xs4all.net on Fri, Jan 19, 2001 at 12:02:09AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <20010118233950.A15636@glacier.fnational.com>

On Fri, Jan 19, 2001 at 12:02:09AM +0100, Thomas Wouters wrote:
> I can't find any such hackery in the source, but I also can't
> figure out how else it's working :)

I think you want to look at getpath.c.

  Neil


From jeremy@alum.mit.edu  Fri Jan 19 14:44:50 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:44:50 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.107,2.108
In-Reply-To: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
References: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14952.21218.416551.695660@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <gvanrossum@users.sourceforge.net> writes:

  GvR> Log Message: Changes to recursive-object comparisons, having to
  GvR> do with a test case I found where rich comparison of unequal
  GvR> recursive objects gave unintuitive results.  In a discussion
  GvR> with Tim, where we discovered that our intuition on when a<=b
  GvR> should be true was failing, we decided to outlaw ordering
  GvR> comparisons on recursive objects.  (Once we have fixed our
  GvR> intuition and designed a matching algorithm that's practical
  GvR> and reasonable to implement, we can allow such orderings
  GvR> again.)

Sounds sensible to me!  I was quite puzzled about what <= should
return for recursive objects.

  GvR> - Changed the nesting limit to a more reasonable small 20; this
  GvR>   only slows down comparisons of very deeply nested objects
  GvR>   (unlikely to occur in practice), while speeding up
  GvR>   comparisons of recursive objects (previously, this would
  GvR>   first waste time and space on 500 nested comparisons before
  GvR>   it would start detecting recursion).

After we talked through this code yesterday, I was also thinking that
the limit was too high :-).

Jeremy


From guido@digicool.com  Fri Jan 19 15:49:54 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:49:54 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Thu, 18 Jan 2001 18:56:04 EST."
 <200101182356.SAA19616@cj20424-a.reston1.va.home.com>
References: <20010117235922.A12356@glacier.fnational.com>
 <200101182356.SAA19616@cj20424-a.reston1.va.home.com>
Message-ID: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>

[Neil]
> > A question: is it possible to break the Python static library up?

[me]
> Sounds cool to me.

Of course after Martin's response I agree with him -- let's keep it
one library.  (Although I expect that the combined effect of setup.py
and Neil's flat Makefile will still affect the infrastructure to build
extensions... :-( )

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 15:56:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:56:58 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: Your message of "Thu, 18 Jan 2001 16:53:15 PST."
 <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com>
References: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com>
Message-ID: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>

Bill Tutt writes:
> From the internal support squad:
> Turns out the C standard explicitly says you can't have an input follow
> output on a stream without doing fflush or fseek in-between, to make sure
> the stdio buffer is cleared.  So this program is illegal.
> 
> They've gone and resolved it by design.

I'd just like to note for the record that this is exactly what I had
predicted.

I'd also like to note that I *agree*.  Tim seems to think there's a
race condition in the threading code, but it's really much simpler
than that: the same bug can easily be provoked with a single-threaded
program: just randomly read and write alternatingly.  So obviously the
people who wrote the threading code aren't interested in the bug,
because it's not in their code -- and the people who wrote the code
that doesn't behave well when abused are protected by the C standard...
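
The call pattern the C standard asks for can be shown with a Python file
object opened for update; a minimal sketch, assuming a scratch temporary
file.  Modern Python's io layer does its own buffering rather than
relying on C stdio, so this illustrates only the discipline (a
positioning call between output and input), not the MS CRT failure:

```python
import tempfile

with tempfile.TemporaryFile("w+b") as stream:
    stream.write(b"spam and eggs")  # output
    stream.seek(0)                  # reposition before switching to input
    data = stream.read()            # input
```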

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 16:00:30 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:00:30 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:39:18 +0100."
 <3A676286.C33823B4@tismer.com>
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
 <3A676286.C33823B4@tismer.com>
Message-ID: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>

> Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> Like the other right-side operators __radd__, is it correct to
> think of
> 
>    __ge__  == __rle__
> 
> if __rle__ was written in the same fashion like __radd__ ?
> It looks semantically the same, although the reason for a
> call might be different.

Yes, it's semantically the same, and the reason for the call is the
same too ("the left argument doesn't support the operator so let's try
if the right one knows").

> And if my above view is right, would it perhaps be less
> confusing to use in fact __rle__ and __rlt__,
> or would it be more confusing, since __rlt__ would also be
> invoked left-to-right, implementing ">".

I prefer 6 new operators over 12 any day.  I can see no valid reason
why someone would want to overload a>b different than b<a, while
there are plenty of reasons why a+b and b+a should be different:
e.g. string concatenation.
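
A small sketch of the reflection rule as it behaves in modern Python
(the Version class is hypothetical): a class that defines only __lt__
still supports ">", because the interpreter swaps the operands and calls
__lt__ on the right-hand one.

```python
class Version:
    """Defines only __lt__; '>' is served by the reflected call."""

    def __init__(self, number):
        self.number = number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number


old, new = Version(1), Version(2)
newer = new > old  # no __gt__ is defined, so old.__lt__(new) is tried
```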

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Fri Jan 19 16:14:55 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Fri, 19 Jan 2001 11:14:55 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 10:49:54AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com>
Message-ID: <20010119111455.C25056@kronos.cnri.reston.va.us>

On Fri, Jan 19, 2001 at 10:49:54AM -0500, Guido van Rossum wrote:
>Of course after Martin's response I agree with him -- let's keep it
>one library.  (Although I expect that the combined effect of setup.py
>and Neil's flat Makefile will still affect the infrastructure to build
>extensions... :-( )

Which reminds me... there should really be a way to ignore the
setup.py stuff and use the old method.  How should that be done?  A
--use-makesetup flag to configure, maybe?

--amk



From guido@digicool.com  Fri Jan 19 16:14:20 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:14:20 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Thu, 18 Jan 2001 21:59:23 PST."
 <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net>
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <200101191614.LAA28881@cj20424-a.reston1.va.home.com>

>       if not condition:
> !         raise AssertionError(reason)

Wouldn't it be better if this raised TestFailed rather than
AssertionError?  Or is there code that catches the AssertionError?

[...grep...]

Yes, there's code that catches AssertionError:

(1) in Marc-Andre's own test_unicode.py;

(2) in test_re, which catches AssertionError and raises TestFailed
    instead.

Proposal:

(1) change verify() to raise TestFailed;

(2) change test_unicode.py to catch TestFailed instead.
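
A sketch of what proposal (1) amounts to.  The TestFailed class here is
a local stand-in for the one in test_support, and the default reason
string is an assumption:

```python
class TestFailed(Exception):
    """Raised when a regression test fails (stand-in for test_support's)."""


def verify(condition, reason="test failed"):
    """Raise TestFailed with the given reason unless condition is true."""
    if not condition:
        raise TestFailed(reason)
```

A test that used to catch AssertionError around verify() calls would
then catch TestFailed instead, which is what proposal (2) covers.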

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tismer@tismer.com  Fri Jan 19 16:17:06 2001
From: tismer@tismer.com (Christian Tismer)
Date: Fri, 19 Jan 2001 17:17:06 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
 <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <3A686882.F78C1268@tismer.com>


Guido van Rossum wrote:
> 
> > Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> > Like the other right-side operators __radd__, is it correct to
> > think of
> >
> >    __ge__  == __rle__
> >
> > if __rle__ was written in the same fashion like __radd__ ?
> > It looks semantically the same, although the reason for a
> > call might be different.
> 
> Yes, it's semantically the same, and the reason for the call is the
> same too ("the left argument doesn't support the operator so let's try
> if the right one knows").
> 
> > And if my above view is right, would it perhaps be less
> > confusing to use in fact __rle__ and __rlt__,
> > or would it be more confusing, since __rlt__ would also be
> > invoked left-to-right, implementing ">".
> 
> I prefer 6 new operators over 12 any day.  I can see no valid reason
> why someone would want to overload a>b different than b<a, while
> there are plenty of reasons why a+b and b+a should be different:
> e.g. string concatenation.

Sure, I didn't want to introduce new operators, but use the
"r" versions for three of the six new operators. But I should have
read your proposal before. The confusion is not due to you,
but Skip had a read error, since you don't talk about inverses
at all:

Skip=="""
In the description
he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
"""

Truth=="""
There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).
"""
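
A minimal sketch of the "each other's reverse" rule quoted above, written
for a current Python 3 interpreter rather than the 2.1 alpha under
discussion. The Version class is hypothetical; it defines only __lt__ and
__le__, so the > and >= operators have to be served by swapping the
operands.

```python
class Version:
    """Hypothetical example class; defines only __lt__ and __le__."""

    def __init__(self, number):
        self.number = number

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number < other.number

    def __le__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number <= other.number


old, new = Version(1), Version(2)

# No __gt__/__ge__ are defined here, so the interpreter swaps the operands
# and calls the reflected method: new > old ends up running old.__lt__(new).
newer = new > old
at_least = new >= old
older = old > new
```

No __rlt__ or __rle__ is needed for this to work, which is the point of
having 6 methods rather than 12.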

No reason for confusion at all > python-dev/null - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From thomas@xs4all.net  Fri Jan 19 16:20:56 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:20:56 +0100
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <20010119172056.K17392@xs4all.nl>

I'm currently seeing a failure in test_ucn:

test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
error: Illegal Unicode character

It looks like one of the unicode literals in test_ucn is invalid, but it's
damned hard to pin down which:

Python 2.1a1 (#7, Jan 19 2001, 17:06:32) 
[GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import test.test_ucn
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: Unicode-Escape decoding error: Illegal Unicode character
>>> 

I get the same crashes on FreeBSD and (Debian) Linux.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Fri Jan 19 16:26:34 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:26:34 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:02:09 +0100."
 <20010119000209.F17392@xs4all.nl>
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>
 <20010119000209.F17392@xs4all.nl>
Message-ID: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ?

Look at the logic in Modules/getpath.c, which calculates the initial
(default) sys.path.  It detects that it's running from the build tree
and then modifies the default path a bit to include Lib and Modules
relative to where the python executable was found.

> If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's not just for 'make test' -- it's to make life easy for developers
in general (and me in particular :-) who want to try out their hacks
without going through 'make install'.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Jan 19 16:34:58 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 17:34:58 +0100
Subject: [Python-Dev] Re: test_support.py
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>
Message-ID: <3A686CB2.C75D184D@lemburg.com>

Guido van Rossum wrote:
> 
> >       if not condition:
> > !         raise AssertionError(reason)
> 
> Wouldn't it be better if this raised TestFailed rather than
> AssertionError?  Or is there code that catches the AssertionError?
> 
> [...grep...]
> 
> Yes, there's code that catches AssertionError:
> 
> (1) in Marc-Andre's own test_unicode.py;
> 
> (2) in test_re, which catches AssertionError and raises TestFailed
>     instead.
> 
> Proposal:
> 
> (1) change verify() to raise TestFailed;
> 
> (2) change test_unicode.py to catch TestFailed instead.

+1

Why not simply make TestFailed a subclass of AssertionError ?
Then we wouldn't have to fear about breaking test code...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Fri Jan 19 16:34:15 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:34:15 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 11:26:34AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>
Message-ID: <20010119173415.M17295@xs4all.nl>

On Fri, Jan 19, 2001 at 11:26:34AM -0500, Guido van Rossum wrote:
> > This brings me to another point: how can 'make test' work at all ? Does
> > python always check for './Lib' (and './Modules') for modules ?

> Look at the logic in Modules/getpath.c, which calculates the initial
> (default) sys.path.  It detects that it's running from the build tree
> and then modifies the default path a bit to include Lib and Modules
> relative to where the python executable was found.

Aye, I found it now.

> > If that's
> > specific for 'make test' and running python in the source distribution, that
> > sounds like a bit of a weird hack. I can't find any such hackery in the
> > source, but I also can't figure out how else it's working :)

> It's not just for 'make test' -- it's to make life easy for developers
> in general (and me in particular :-) who want to try out their hacks
> without going through 'make install'.

Well, after some old SF movies & some sleep, I realized that :) But it is
going to have to change: you now have to include the build tree as well, and
that is quite a bit more difficult to figure out. I'd suggest a 'make run'
that calls python with the appropriate PYTHONPATH environment variable, but
that doesn't cover test-scripts (which I use a lot myself.)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Fri Jan 19 16:34:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:34:45 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:02:00 +0100."
 <20010119110200.9E455373C95@snelboot.oratrix.nl>
References: <20010119110200.9E455373C95@snelboot.oratrix.nl>
Message-ID: <200101191634.LAA29239@cj20424-a.reston1.va.home.com>

> I get the impression that I'm currently seeing a non-NULL third
> argument in my (C) methods even though the method is called without
> keyword arguments.

> Is this new semantics that I missed the discussion about, or is this a bug?

Can't tell without spending more time looking at the code and
experimenting than I can afford today; but Jeremy refactored the
calling code, and it could be that you're seeing an empty dictionary
instead of a NULL.

Do you really need the NULL?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 16:41:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:41:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: Your message of "Fri, 19 Jan 2001 13:26:31 +0100."
 <20010119132631.I17392@xs4all.nl>
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
 <20010119132631.I17392@xs4all.nl>
Message-ID: <200101191641.LAA29324@cj20424-a.reston1.va.home.com>

> I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
> talked with Barry about it, too. It looks like it's clearing up a bit, now,
> but it's confusing as hell, for sure ;)

It's worse for me though than for most people: for others, only mail
sent through mailman at mail.python.org is affected.  For me, mail
sent directly to guido@python.org is affected too (which is why I've
changed my From address again to that old standby,
guido@digicool.com).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 16:53:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:53:39 -0500
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:20:08 EST."
 <14951.45672.806978.600944@localhost.localdomain>
References: <14951.45672.806978.600944@localhost.localdomain>
Message-ID: <200101191653.LAA29774@cj20424-a.reston1.va.home.com>

> There are several modules in the standard library that use the regex
> module.  When they are imported, they print a warning about using a
> deprecated module.  I think this is bad form.  Either the modules that
> depend on regex should be updated to use re or they should be
> deprecated themselves.  
> 
> I discovered the following offenders:
> asynchat
> knee
> poplib
> reconvert
> 
> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Agreed.  There's an idiom to disable the warning, which you can find
in regsub.py:

    import warnings
    warnings.filterwarnings("ignore", "", DeprecationWarning, __name__)

(The "" should be replaced by the specific warning message though.)
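
A minimal sketch of the idiom with a specific message filled in, written
for a current Python 3 interpreter. The message text is a hypothetical
stand-in, not necessarily the wording the regex module uses, and the
module argument from the snippet above is left out so the example does not
depend on where it is run.

```python
import warnings

# Hypothetical message text, standing in for the one the regex module emits.
MESSAGE = "the regex module is deprecated; please use the re module"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The second argument is a regular expression matched against the
    # start of the warning text, so only this one warning is silenced.
    warnings.filterwarnings("ignore", "the regex module is deprecated",
                            DeprecationWarning)
    warnings.warn(MESSAGE, DeprecationWarning)           # suppressed
    warnings.warn("something else", DeprecationWarning)  # still recorded

messages = [str(w.message) for w in caught]
```

Matching on the message keeps unrelated DeprecationWarnings visible, which
an empty pattern would hide as well.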

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Fri Jan 19 17:21:28 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 12:21:28 -0500
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:20:56 +0100."
 <20010119172056.K17392@xs4all.nl>
References: <20010119172056.K17392@xs4all.nl>
Message-ID: <200101191721.MAA31937@cj20424-a.reston1.va.home.com>

> I'm currently seeing a failure in test_ucn:
> 
> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character
> 
> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

Feels to me like there's a bug in the string literal processing that
makes *any* string literal containing \N{...} fail during code
generation.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Fri Jan 19 17:37:41 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 18:37:41 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>
Message-ID: <023801c0823e$86fcedc0$e46940d5@hagrid>

> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character

Make sure you rebuild Objects/unicodeobject.o and the
ucnhash extension.  If they build without warnings, run
the following script.

import ucnhash
count = 0
for code in range(65536):
    try:
        name = ucnhash.getname(code)
        if ucnhash.getcode(name) != code:
            print name
        count += 1
    except ValueError:
        pass
print count

if it prints anything but "10538", let me know.
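
For readers without a ucnhash module, a rough equivalent of the same
round-trip check can be sketched with the standard unicodedata module on a
current Python 3 interpreter. This is an assumption-laden translation, not
the script above: the count it produces depends on the Unicode database
version shipped with the interpreter, so no expected number is given.

```python
import unicodedata

count = 0
mismatches = []
for code in range(0x10000):
    char = chr(code)
    try:
        name = unicodedata.name(char)
    except ValueError:
        continue  # code point has no name in this Unicode database
    # Looking the name up again should give back the same character.
    if unicodedata.lookup(name) != char:
        mismatches.append(name)
    count += 1
```

An empty mismatches list means every named code point in the range
survives the name -> code -> name round trip.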

> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

If the ucnhash extension cannot be found, the script won't
even compile...  shouldn't be too hard to fix.

</F>



From Barrett@stsci.edu  Fri Jan 19 17:32:26 2001
From: Barrett@stsci.edu (Paul Barrett)
Date: Fri, 19 Jan 2001 12:32:26 -0500 (EST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
 <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
 <3A676286.C33823B4@tismer.com>
 <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <14952.30800.112503.123675@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > 
 > ... I can see no valid reason why someone would want to overload
 > a>b different than b<a, ... 
 > 

I agree.  But this assumes that the result of A<B and B>A is a
collection of Booleans.  In the Interactive Data Language (IDL) these
operators are essentially mapped to ceiling and floor functions which
are not commutative.  I personally find this silly, but IDL users
coming to Python may be surprised when the comparison of two Numeric
arrays returns a Boolean-like result.

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218


From nas@arctrix.com  Fri Jan 19 10:43:12 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 19 Jan 2001 02:43:12 -0800
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <20010119111455.C25056@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Jan 19, 2001 at 11:14:55AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com> <20010119111455.C25056@kronos.cnri.reston.va.us>
Message-ID: <20010119024312.A16179@glacier.fnational.com>

On Fri, Jan 19, 2001 at 11:14:55AM -0500, Andrew Kuchling wrote:
> Which reminds me... there should really be a way to ignore the
> setup.py stuff and use the old method.  How should that be done.  A
> --use-makesetup flag to configure, maybe?

A different target for make would be easy.

  Neil


From fredrik@effbot.org  Fri Jan 19 18:13:15 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:13:15 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03a201c08243$7fa62af0$e46940d5@hagrid>

thomas wrote:
> > I'm currently seeing a failure in test_ucn:
> > 
> > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > error: Illegal Unicode character
> > 
> > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > damned hard to pin down which:
> 
> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

I took another look at the error message: the only explanation
I can see here is that the lookup succeeds, but the call to ucn-
hash returns a value larger than 0x10ffff.

What is Py_UCS4 set to under gcc?

Confusing /F



From guido@digicool.com  Fri Jan 19 18:11:21 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:11:21 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:58 +0100."
 <3A686CB2.C75D184D@lemburg.com>
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>
 <3A686CB2.C75D184D@lemburg.com>
Message-ID: <200101191811.NAA32539@cj20424-a.reston1.va.home.com>

> > Proposal:
> > 
> > (1) change verify() to raise TestFailed;
> > 
> > (2) change test_unicode.py to catch TestFailed instead.
> 
> +1
> 
> Why not simply make TestFailed a subclass of AssertionError ?
> Then we wouldn't have to fear about breaking test code...

No, I'd rather see the two separated.  There can be assert statements
in the modules we're testing, and I'd prefer not to see those caught
by test code that is trying to catch TestFailed.
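
A minimal sketch of why the separation matters, for a current Python 3
interpreter. TestFailed and verify() follow the proposal above;
buggy_library_function and run are hypothetical names made up for the
illustration.

```python
class TestFailed(Exception):
    """Raised by verify(); deliberately not a subclass of AssertionError."""


def verify(condition, reason="test failed"):
    if not condition:
        raise TestFailed(reason)


def buggy_library_function():
    # Hypothetical module under test with an internal sanity check.
    # An explicit raise is used so the sketch behaves the same under -O.
    raise AssertionError("internal invariant violated")


def run(test):
    """Report expected test failures; let anything else propagate."""
    try:
        test()
    except TestFailed as exc:
        return "failed: %s" % exc
    return "ok"


outcome = run(lambda: verify(1 + 1 == 3, "arithmetic is broken"))

try:
    run(buggy_library_function)
    escaped = False
except AssertionError:
    escaped = True  # not mistaken for an expected test failure
```

Had TestFailed been a subclass of AssertionError, code catching
AssertionError would have treated both cases alike.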

I'll check this in momentarily.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Fri Jan 19 18:19:37 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:19:37 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03b301c08244$627f22a0$e46940d5@hagrid>

> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

umm.  can anyone explain how this can happen:

python ../lib/test/regrtest.py test_ucn
test_ucn
1 test OK.

python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

how can a test that works under regrtest.py fail when
it's run separately?  what am I missing here?

</F>



From mal@lemburg.com  Fri Jan 19 18:48:53 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 19:48:53 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03a201c08243$7fa62af0$e46940d5@hagrid>
Message-ID: <3A688C15.8C9CFF46@lemburg.com>

Fredrik Lundh wrote:
> 
> thomas wrote:
> > > I'm currently seeing a failure in test_ucn:
> > >
> > > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > > error: Illegal Unicode character
> > >
> > > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > > damned hard to pin down which:
> >
> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> I took another look at the error message: the only explanation
> I can see here is that the lookup succeeds, but the call to ucn-
> hash returns a value larger than 0x10ffff.
> 
> What is Py_UCS4 set to under gcc?

Should be "unsigned int" on all modern Intel platforms.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Fri Jan 19 18:48:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:48:45 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:32:26 EST."
 <14952.30800.112503.123675@nem-srvr.stsci.edu>
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com> <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
 <14952.30800.112503.123675@nem-srvr.stsci.edu>
Message-ID: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>

>  > ... I can see no valid reason why someone would want to overload
>  > a>b different than b<a, ... 
>  > 
> 
> I agree.  But this assumes that the result of A<B and B>A is a
> collection of Booleans.  In the Interactive Data Language (IDL) these
> operators are essentially mapped to ceiling and floor functions which
> are not commutative.  I personally find this silly, but IDL users
> coming to Python may be surprised when the comparison of two Numeric
> arrays returns a Boolean-like result.

This means that Python can't be used to emulate this part of IDL.  I
don't understand how these can be not commutative unless they have a
side effect on the left argument, and that's not possible in Python
anyway.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Fri Jan 19 19:18:04 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 19 Jan 2001 14:18:04 -0500
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <LNBBLJKPBEHFEDALKOLCGELEIJAA.tim.one@home.com>

[/F]
> umm.  can anyone explain how this can happen:
>
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> test OK.
>
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name
>
> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Dunno, but add to the pile of mysteries that you're unique.  Here on
Win98SE:

python ../lib/test/regrtest.py test_ucn
test_ucn
test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
error: Invalid Unicode Character Name
1 test failed: test_ucn


python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name


I suggest you reformat your hard drive, and reinstall Windows <wink>.



From mwh21@cam.ac.uk  Fri Jan 19 19:25:03 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 19:25:03 +0000
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: "Fredrik Lundh"'s message of "Fri, 19 Jan 2001 19:19:37 +0100"
References: <20010119172056.K17392@xs4all.nl> <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03b301c08244$627f22a0$e46940d5@hagrid>
Message-ID: <m3n1cn1h0w.fsf@atrus.jesus.cam.ac.uk>

"Fredrik Lundh" <fredrik@effbot.org> writes:

> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> umm.  can anyone explain how this can happen:
> 
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> 1 test OK.

This will run the .pyc if present?
 
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

This won't?  

Note: no traceback -> (in effect, if not design) compile time error.

> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Well, this is just my guess.
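
The "compile time error" reading can be illustrated with a small sketch on
a current Python 3 interpreter, where a bad \N{...} name in a literal is
reported as a SyntaxError rather than the UnicodeError seen in this
thread. The character name below is deliberately invalid and the file name
"<demo>" is arbitrary.

```python
# Source text of a tiny module whose first literal uses an unknown name.
source = 'name = "\\N{NO SUCH CHARACTER NAME}"\nprint("never reached")\n'

try:
    # The escape is resolved while compiling, before any statement runs.
    compile(source, "<demo>", "exec")
    error = None
except SyntaxError as exc:
    error = exc
```

Since nothing in the module has started executing when the error is
raised, there is no frame from the module to show in a traceback; a
previously written .pyc would skip this step entirely.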

Cheers,
M.

-- 
  Well, you pretty much need Microsoft stuff to get misbehaviours bad
  enough to actually tear the time-space continuum.  Luckily for you,
  MS Internet Explorer is available for Solaris.
                              -- Calle Dybedahl, alt.sysadmin.recovery



From skip@mojam.com (Skip Montanaro)  Fri Jan 19 19:55:29 2001
From: skip@mojam.com (Skip Montanaro)
Date: Fri, 19 Jan 2001 13:55:29 -0600 (CST)
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119173415.M17295@xs4all.nl>
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
 <20010117234925.A17392@xs4all.nl>
 <20010118004400.B17392@xs4all.nl>
 <20010118103036.B21503@kronos.cnri.reston.va.us>
 <20010119000209.F17392@xs4all.nl>
 <200101191626.LAA29165@cj20424-a.reston1.va.home.com>
 <20010119173415.M17295@xs4all.nl>
Message-ID: <14952.39857.83065.24889@beluga.mojam.com>

    Thomas> But it is going to have to change: you now have to include the
    Thomas> build tree as well, and that is quite a bit more difficult to
    Thomas> figure out. I'd suggest a 'make run' that calls python with the
    Thomas> appropriate PYTHONPATH environment variable, but that doesn't
    Thomas> cover test-scripts (which I use a lot myself.)

Doesn't Andrew's new "platform" target in the top-level Makefile do the
right thing?  It *should* generate a platform-specific path to the correct
build subdirectory.

Skip


From MarkH@ActiveState.com  Fri Jan 19 20:11:02 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Fri, 19 Jan 2001 12:11:02 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <010c01c08201$4b0ec050$e46940d5@hagrid>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>

> you can compile the module as C++, but that's also a bit painful...

My understanding is that the C std doesn't guarantee the order of static
object initialization, whereas C++ does provide these semantics.  At least
that is the excuse I found when digging into this some years ago.

Can't-believe-I-mentioned-the-C-standard-while-Tim-is-listening ly,

Mark.



From guido@digicool.com  Fri Jan 19 20:44:53 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 15:44:53 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Fri, 19 Jan 2001 10:58:08 +0100."
 <3A680FB0.AED2DB55@lemburg.com>
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>
 <3A680FB0.AED2DB55@lemburg.com>
Message-ID: <200101192044.PAA04154@cj20424-a.reston1.va.home.com>

> If we agree to merge the semantics of the two APIs, then str()
> would have to change too: is this desirable ? (IMHO, yes)

Not clear.  Which is why I'm backing off from my initial support for
merging the two.

I believe unicode() (which is really just an interface to
PyUnicode_FromEncodedObject()) currently already does too much.  In
particular this whole business with calling __str__ on instances seems
to me to be unnecessary.  I think it should *only* bother to look for
something that supports the buffer interface (checking for regular
strings only as a tiny optimization), or existing unicode objects.

> Here's what we could do:
> 
> a) merge the semantics of unistr() into unicode()
> b) apply the same semantics in str()
> c) remove unistr() -- how's that for a short-living builtin ;)
> 
> About the semantics:
> 
> These should be backward compatible to str() in that everything
> that worked before should continue to work after the merge.
> 
> A strawman for processing str() and unicode():
> 
> 1. strings/Unicode is passed back as-is

I hope you mean str() passes 8-bit strings back as-is, unicode()
passes Unicode strings back as-is, right?

> 2. tp_str is tried
> 3. the method __str__ is tried

Shouldn't have to -- instances should define tp_str and all the magic
for calling __str__ should be there.  I don't understand why it's not
done that way, probably just for historical reasons.  I also don't
think __str__ should be tried for non-instance types.

But, more seriously, I believe tp_str or __str__ shouldn't be tried at
all by unicode().

> 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> 5. for str(): Unicode return values are converted to strings using
>               the default encoding
>    for unicode(): Unicode return values are passed back as-is;
>               string return values are decoded according to the
>               encoding parameter
> 6. the return object is type-checked: str() will always return
>    a string object, unicode() always a Unicode object
> 
> Note that passing back Unicode is only allowed in case no encoding
> was given. Otherwise an exception is raised: you can't decode
> Unicode.
> 
> As extension we could add encoding and error parameters to str()
> as well. The result would be either an encoding of Unicode objects
> passed back by tp_str or __str__ or a recoding of string objects
> returned by checks 2, 3 or 4.

Naaaah!
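
For comparison only, a sketch of how the built-in str() of a current
Python 3 interpreter treats the cases the strawman lists. This is an
observation about a much later interpreter, not something decided in this
thread; Greeting and Broken are hypothetical classes.

```python
class Greeting:
    def __str__(self):
        return "hello"


class Broken:
    def __str__(self):
        return b"not text"  # wrong return type on purpose


as_is = str("abc")                      # text passes through unchanged
via_method = str(Greeting())            # __str__ (the tp_str slot) is used
decoded = str(b"caf\xc3\xa9", "utf-8")  # buffer contents are decoded

try:
    str("abc", "utf-8")                 # text cannot be decoded again
    decode_text_error = False
except TypeError:
    decode_text_error = True

try:
    str(Broken())                       # the return type is checked
    wrong_type_error = False
except TypeError:
    wrong_type_error = True
```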

> If we agree to take this approach, then we should remove the
> unistr() Python API before the alpha ships.

Frankly, I believe we need more time to sort this out, and therefore I
propose to remove the unistr() built-in before the release.  Marc,
would you do the honors?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Fri Jan 19 20:55:53 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 21:55:53 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <14952.39857.83065.24889@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 19, 2001 at 01:55:29PM -0600
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <14952.39857.83065.24889@beluga.mojam.com>
Message-ID: <20010119215552.O17295@xs4all.nl>

On Fri, Jan 19, 2001 at 01:55:29PM -0600, Skip Montanaro wrote:
> 
>     Thomas> But it is going to have to change: you now have to include the
>     Thomas> build tree as well, and that is quite a bit more difficult to
>     Thomas> figure out. I'd suggest a 'make run' that calls python with the
>     Thomas> appropriate PYTHONPATH environment variable, but that doesn't
>     Thomas> cover test-scripts (which I use a lot myself.)

> Doesn't Andrew's new "platform" target in the top-level Makefile do the
> right thing?  It *should* generate a platform-specific path to the correct
> build subdirectory.

Yes, it does, that's what I meant with 'make run'. But that isn't quite as
user-friendly as the current method. How would you run a script with the
current python ? 'make SCRIPT=./spamtest.py runscript' ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Fri Jan 19 22:06:03 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:06:03 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:15 +0100."
 <20010119173415.M17295@xs4all.nl>
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>
 <20010119173415.M17295@xs4all.nl>
Message-ID: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>

I finally figured out the best way to fix sys.path to find shared modules
built by setup.py.  At first I thought I had to add it to getpath.c,
but the problem is that the name is calculated by calling
distutils.util.get_platform(), and that requires a working Python
interpreter, so we'd end up with a chicken-or-egg situation.

So instead I added 5 lines to site.py, which tests for
os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
test only succeeds when running from the build directory.  Then it
calls distutils.util.get_platform() and uses the result to calculate
the correct directory name, which is then appended to sys.path.
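
The logic described above can be sketched roughly as follows, for a
current Python 3 interpreter. This is not the code that was checked in:
build_lib_dir is a hypothetical helper, the "build/lib.<platform>-<version>"
naming scheme is an assumption, and sysconfig.get_platform() stands in for
distutils.util.get_platform().

```python
import posixpath
import sys
import sysconfig


def build_lib_dir(path, os_name, platform, version):
    """Return the extra directory to add, or None outside a build tree."""
    if os_name != "posix" or not path:
        return None
    # Running from the build tree leaves .../Modules as the last entry.
    if posixpath.basename(path[-1]) != "Modules":
        return None
    # Assumed naming scheme: build/lib.<platform>-<version>
    name = "build/lib.%s-%s" % (platform, version)
    return posixpath.join(posixpath.dirname(path[-1]), name)


extra = build_lib_dir(
    ["/src/python/Lib", "/src/python/Modules"],  # hypothetical sys.path
    "posix",
    sysconfig.get_platform(),
    "%d.%d" % sys.version_info[:2],
)
```

A caller would append the result to sys.path when it is not None. Passing
the platform string in as an argument keeps the path logic testable
without importing anything heavy.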

Yes, this slows down startup (it imports a large portion of the
distutils package), but I don't care -- after all this is mostly for
me so I can play with the interpreter right after I've built it,
right?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Fri Jan 19 21:32:34 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:32:34 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs.
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>
 <3A680FB0.AED2DB55@lemburg.com> <200101192044.PAA04154@cj20424-a.reston1.va.home.com>
Message-ID: <3A68B272.BBBAECD1@lemburg.com>

Guido van Rossum wrote:
> 
> > If we agree to merge the semantics of the two APIs, then str()
> > would have to change too: is this desirable ? (IMHO, yes)
> 
> Not clear.  Which is why I'm backing off from my initial support for
> merging the two.
> 
> I believe unicode() (which is really just an interface to
> PyUnicode_FromEncodedObject()) currently already does too much.  In
> particular this whole business with calling __str__ on instances seems
> to me to be unnecessary.  I think it should *only* bother to look for
> something that supports the buffer interface (checking for regular
> strings only as a tiny optimization), or existing unicode objects.

Hmm, unicode() should (just like str()) take an object and
convert it to a Unicode string. Since many objects don't
support the tp_str slot (instances don't for some reason -- just
like they don't support tp_call), I had to add some special cases to
make Python instances compatible to Unicode in the same way
str() does.

What I think is really needed is a concept for "stringification"
in Python. We currently have these schemes:

1. tp_str
2. method __str__ (not only of Python instances, but any object)
3. character buffer interface

These three could easily be unified into the tp_str slot:
e.g. tp_str could do the necessary magic to call __str__
or the buffer interface.

Note that the same is true for e.g. tp_call -- the special
cases we have in ceval.c for the different builtin callable
objects would not be necessary if they would implement tp_call.

> > Here's what we could do:
> >
> > a) merge the semantics of unistr() into unicode()
> > b) apply the same semantics in str()
> > c) remove unistr() -- how's that for a short-living builtin ;)
> >
> > About the semantics:
> >
> > These should be backward compatible to str() in that everything
> > that worked before should continue to work after the merge.
> >
> > A strawman for processing str() and unicode():
> >
> > 1. strings/Unicode is passed back as-is
> 
> I hope you mean str() passes 8-bit strings back as-is, unicode()
> passes Unicode strings back as-is, right?

Right.
 
> > 2. tp_str is tried
> > 3. the method __str__ is tried
> 
> Shouldn't have to -- instances should define tp_str and all the magic
> for calling __str__ should be there.  I don't understand why it's not
> done that way, probably just for historical reasons.  I also don't
> think __str__ should be tried for non-instance types.

Ok.
 
> But, more seriously, I believe tp_str or __str__ shouldn't be tried at
> all by unicode().

Hmm, but how would you implement generic conversion to Unicode 
then ? 

We'll need some way for instances (and other types) to
provide a conversion to Unicode. Some time ago we discussed this
issue and came to the conclusion that tp_str should be allowed
to return Unicode data instead of inventing a new tp_unicode
slot for this purpose.

> > 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> > 5. for str(): Unicode return values are converted to strings using
> >               the default encoding
> >    for unicode(): Unicode return values are passed back as-is;
> >               string return values are decoded according to the
> >               encoding parameter
> > 6. the return object is type-checked: str() will always return
> >    a string object, unicode() always a Unicode object
> >
> > Note that passing back Unicode is only allowed in case no encoding
> > was given. Otherwise an exception is raised: you can't decode
> > Unicode.
> >
> > As extension we could add encoding and error parameters to str()
> > as well. The result would be either an encoding of Unicode objects
> > passed back by tp_str or __str__ or a recoding of string objects
> > returned by checks 2, 3 or 4.
> 
> Naaaah!

Would be nice for symmetry and useful in the light of making
Unicode the only string type in Py4k ;-)
 
> > If we agree to take this approach, then we should remove the
> > unistr() Python API before the alpha ships.
> 
> Frankly, I believe we need more time to sort this out, and therefore I
> propose to remove the unistr() built-in before the release.  Marc,
> would you do the honors?

Ok. 

I'll remove the builtin and the docs, but will leave the
PyObject_Unicode() API enabled.
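
The strawman quoted above can be modelled roughly as follows.  This is
only a sketch in present-day Python, where str plays the role of the
Unicode type and bytes the role of the 8-bit string; the function names
and DEFAULT_ENCODING are hypothetical, and the buffer check is done
before the __str__ call purely for convenience (the cases are disjoint
here):

```python
DEFAULT_ENCODING = "ascii"  # assumption: stands in for the default encoding


def _stringify(obj):
    """Steps 1-4: obtain a text (str) or 8-bit (bytes) value from obj."""
    if isinstance(obj, (str, bytes)):
        return obj                  # step 1: strings/Unicode pass through
    if isinstance(obj, (bytearray, memoryview)):
        return bytes(obj)           # step 4: stand-in for the char buffer API
    # Steps 2-3: calling the method directly skips the interpreter's own
    # type check, so __str__ may hand back either bytes or str.
    return type(obj).__str__(obj)


def strawman_unicode(obj, encoding=None, errors="strict"):
    """Model of unicode(): always returns text."""
    result = _stringify(obj)
    if isinstance(result, str):
        if encoding is not None:
            raise TypeError("can't decode Unicode")
        return result               # step 5: Unicode passed back as-is
    if isinstance(result, bytes):
        return result.decode(encoding or DEFAULT_ENCODING, errors)
    raise TypeError("conversion did not produce a string")  # step 6


def strawman_str(obj):
    """Model of str(): always returns an 8-bit string."""
    result = _stringify(obj)
    if isinstance(result, str):
        return result.encode(DEFAULT_ENCODING)  # step 5: default encoding
    if isinstance(result, bytes):
        return result
    raise TypeError("conversion did not produce a string")  # step 6
```

Both entry points share the lookup and differ only in the final
conversion, which is the symmetry the strawman is after.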

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From uche.ogbuji@fourthought.com  Fri Jan 19 21:42:40 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 19 Jan 2001 14:42:40 -0700
Subject: [Python-Dev] Extension doc bugs
Message-ID: <200101192142.OAA29168@localhost.localdomain>

I'm using the bleeding-edge documentation at 

http://python.sourceforge.net/devel-docs/api/api.html

I know that it won't be complete until someone has the time to finish it, but I've 
run into a few places where it's completely wrong.

For instance, from the object protocol docs: 

"""
int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
      Compare the values of o1 and o2 using a routine provided by o1, if one   
       exists, otherwise with a routine provided by o2. The result of the
      comparison is returned in result. Returns -1 on failure. This is the     
       equivalent of the Python statement "result = cmp(o1, o2)".
"""

After getting weird behavior implementing this, and then squinting at the 
relevant Python 2.0 code, it appears that in actuality the Cmp function is to 
return the direct comparison results (-1, 0, 1 based on ordering of the 
parameters); furthermore, there is no such "result" argument.

4Suite has a lot of C extension code developed by squinting at Python sources 
and long gdb sessions and I have a feeling that in many cases we're taking up 
hacks that would get us into trouble across versions, and all that; but the 
"official" interfaces and behaviors are not documented (or only poorly 
documented).  In general, the C API docs are in a rather sorry state and 
though I doubt I could do a great deal about fixing it, I'd be interested in 
discussion of the matter, and perhaps making what contribution I can.

Is the doc-sig the best place for this?  My experience there wouldn't seem to 
encourage this conclusion (most of the discussion is of docstring syntax and 
neat-o automagic document generators).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From mal@lemburg.com  Fri Jan 19 21:46:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:46:24 +0100
Subject: [Python-Dev] readline and setup.py
Message-ID: <3A68B5B0.771412F7@lemburg.com>

The new setup.py procedure for Python causes readline not to
be built on my machine. Instead I get a linker error telling
me that termcap is not found.

Looking at my old Setup file, I have this line:

readline readline.c \
	 -I/usr/include/readline -L/usr/lib/termcap \
	 -lreadline -lterm

I guess setup.py should be modified to include additional
library search paths -- shouldn't hurt on platforms which
don't need them.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Jan 19 21:50:53 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:50:53 +0100
Subject: [Python-Dev] _tkinter and setup.py
Message-ID: <3A68B6BD.BAD038D6@lemburg.com>

Why does setup.py stop with an error in case _tkinter cannot
be built (due to an old Tk/Tcl version in my case) ?

I think the policy in setup.py should be to output warnings,
but continue building the rest of the Python modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Fri Jan 19 22:38:22 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:38:22 -0500
Subject: [Python-Dev] 2.1 alpha 1 release schedule
Message-ID: <200101192238.RAA12413@cj20424-a.reston1.va.home.com>

Practicality beats purity: we're very close to a release, but I've
decided to hold off to give Jeremy a chance to finish the nested
scopes, to give Fred a chance to revise the weak references according
to Martin's wishes, and in general for things to settle.

Most likely we'll be able to release Monday night (Jan 22).

Unfortunately email through python.org seems to be wedged again (I
swear, it seems like it starts getting wedged every afternoon between
3 and 4!) so I don't have a clear view of what the latest checkins
were; but from cvs update it seems that the following things happened
this afternoon:

- Barry fixed a core dump in function attribute assignments

- Marc-Andre withdrew unistr(), pending more discussion

- Fredrik fixed the ucnhash problem

- I fixed two path problems in the new build process that only
  occurred when you were building in a subdirectory of the source tree

Good work, crew!  I'm taking the weekend off.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jack@oratrix.nl  Fri Jan 19 23:23:18 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Sat, 20 Jan 2001 00:23:18 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Message by Guido van Rossum <guido@digicool.com> ,
 Fri, 19 Jan 2001 11:34:45 -0500 , <200101191634.LAA29239@cj20424-a.reston1.va.home.com>
Message-ID: <20010119232323.70B03116392@oratrix.oratrix.nl>

Recently, Guido van Rossum <guido@digicool.com> said:
> > I get the impression that I'm currently seeing a non-NULL third
> > argument in my (C) methods even though the method is called without
> > keyword arguments.
> 
> > Is this new semantics that I missed the discussion about, or is this a bug?
> 
> [...] 
> Do you really need the NULL?

The places that I know I was counting on the NULL now have "if ( kw && 
PyObject_IsTrue(kw))", so I'll just have to hope there aren't any more 
lingering in there.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 


From tim.one@home.com  Sat Jan 20 00:04:10 2001
From: tim.one@home.com (Tim Peters)
Date: Fri, 19 Jan 2001 19:04:10 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENLIJAA.tim.one@home.com>

[Guido]
> I'd just like to note for the record that this is exactly what I had
> predicted.

I would have hoped you'd be content to let the record speak for itself
<wink>.

> I'd also like to note that I *agree*.

With what?  That the program is undefined by the C std was never in dispute.

> Tim seems to think there's a race condition in the threading code,
> but it's really much simpler than that: the same bug can easily be
> provoked with a single-threaded program: just randomly read and
> write alternatingly.

And this is a point in their favor?!  "It's OK that the MT library corrupts
itself, because even the single-threaded library does"?

> So obviously the people who wrote the threading code aren't interested
> in the bug,

I don't know that it ever got as far as the people who wrote the threading
code, but I sure doubt it:  when the reply starts "Turns out the C standard
explicitly says  ...", it strongly suggests it was written by someone who
didn't already know what the C std says, and went looking for an excuse to
get it off their plate without further effort.  Par for the course, if so.

> because it's not in their code -- and the people who wrote the code
> that doesn't behave well when abused are protected by the C standard...

The behavior of things designated "undefined" and "implementation-defined"
by the std falls under "quality of implementation".  In the real world, the
latter is what vendors compete on; meeting the letter of the std is a bare
minimum for playing the game at all.

The plain fact is that their library is less robust than others in this
case.  I worked on a multithreaded stdio implementation at KSR, and that
sure couldn't corrupt itself.  Looks like no flavor of Linux does either.
It's not *reasonable* for a library to corrupt itself in this case, although
it's certainly reasonable for its behavior to vary from run to run.  There's
nothing in the C std that says a conforming implementation can't *crash* on
the program

void main() {int i = 1;}

either <wink>.

a-std-is-a-floor-on-acceptable-behavior-not-a-ceiling-ly y'rs  - tim



From gstein@lyra.org  Sat Jan 20 01:21:56 2001
From: gstein@lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 17:21:56 -0800
Subject: [Python-Dev] initializing ob_type
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Fri, Jan 19, 2001 at 12:11:02PM -0800
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>
Message-ID: <20010119172156.Y7731@lyra.org>

On Fri, Jan 19, 2001 at 12:11:02PM -0800, Mark Hammond wrote:
> > you can compile the module as C++, but that's also a bit painful...
> 
> My understanding is that the C std doesn't guarantee the order of static
> object initialization, whereas C++ does provide these semantics.  At least
> that is the excuse I found when digging into this some years ago.

True, but when PyWhatever_Type is initialized, &PyType_Type ought to be
ready (even if it isn't initialized). Heck, &PyType_Type points into the
Python core which is *definitely* loaded by that point.

Now, if "initialization" also means "relocation to a specific address" then
I can understand.

Hrm... I've just spent some time with the Windows SDK docs, and I can't find
anything that really discusses the problem and resolution. There certainly
isn't any warning about "don't do this." It all talks about how fixups are
stored with the DLL, how you can optionally use BIND to pre-bind the values,
blah blah blah. But nothing saying "it doesn't work."

It would be interesting to know more about the actual symptoms that appear
when the ob_type init is performed by the structure (rather than at
runtime). What happens? Bad address? NULL value? Failure to resolve and
load? Is PyType_Type not exported correctly or something?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/


From guido@digicool.com  Sat Jan 20 02:05:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 21:05:39 -0500
Subject: [Python-Dev] How to get setup.py to build expat?
Message-ID: <200101200205.VAA13299@cj20424-a.reston1.va.home.com>

The setup.py script does not build the expat module for me.

I have expat installed in /usr/local, at least I believe so: I have
/usr/local/include/xmlparse.h and /usr/local/lib/libexpat.a -- do I
need more?

How can I get setup.py to spit out what it tries, and why it fails?
setup.py -v build doesn't give any extra output.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From fredrik@effbot.org  Sat Jan 20 02:41:43 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Sat, 20 Jan 2001 03:41:43 +0100
Subject: [Python-Dev] initializing ob_type
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com> <20010119172156.Y7731@lyra.org>
Message-ID: <00f001c0828a$bc903900$e46940d5@hagrid>

greg wrote:

> It would be interesting to know more about the actual symptoms that appear
> when the ob_type init is performed by the structure (rather than at runtime).
> What happens?

    http://www.python.org/doc/FAQ.html#3.24
    "3.24. "Initializer not a constant" while building DLL
    on MS-Windows

    "Static type object initializers in extension modules
    may cause compiles to fail with an error message
    like "initializer not a constant"

Cheers /F



From uche.ogbuji@fourthought.com  Sat Jan 20 05:29:23 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Fri, 19 Jan 2001 22:29:23 -0700
Subject: [Python-Dev] Extension doc bugs
In-Reply-To: Message from uche.ogbuji@fourthought.com
 of "Fri, 19 Jan 2001 14:42:40 MST." <200101192142.OAA29168@localhost.localdomain>
Message-ID: <200101200529.WAA30349@localhost.localdomain>

> For instance, from the object protocol docs: 
> 
> """
> int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
>       Compare the values of o1 and o2 using a routine provided by o1, if one   
>        exists, otherwise with a routine provided by o2. The result of the
>       comparison is returned in result. Returns -1 on failure. This is the     
>        equivalent of the Python statement "result = cmp(o1, o2)".
> """
> 
> After getting weird behavior implementing this, and then squinting at the 
> relevant Python 2.0 code, it appears that in actuality the Cmp function is to 
> return the direct comparison results (-1, 0, 1 based on ordering of the 
> parameters); furthermore, there is no such "result" argument.

Bother.  I didn't squint hard enough.  I mistook the tp_compare slot for the 
PyObject_Cmp equivalent.  I have indeed run into what I'm sure are nits in the 
Python/C API but given that my greatest alarm was false, I'll be more careful 
before bringing up the others.

I'm still curious as to the best forum for this.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From tim.one@home.com  Sat Jan 20 05:36:12 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 00:36:12 -0500
Subject: [Python-Dev] Extension doc bugs
In-Reply-To: <200101192142.OAA29168@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPKIJAA.tim.one@home.com>

[uche.ogbuji@fourthought.com]
> ...
> In general, the C API docs are in a rather sorry state and though
> I doubt I could do a great deal about fixing it, I'd be interested in
> discussion of the matter, and perhaps making what contribution I can.
>
> Is the doc-sig the best place for this?

Nope!  Discussing it won't do any good, there or anywhere else.  What it
needs is for people to send better docs to python-docs@python.org or upload
LaTeX patches to SourceForge, and to report doc bugs on SourceForge (which
is where the start of this msg should have gone!).  Most days we just work
on whatever is backed up at SourceForge; if doc bugs don't show up there,
they won't get repaired.

the-docs-are-only-10x-better-than-the-sum-of-the-individual-
    contributions<wink>-ly y'rs  - tim



From tim.one@home.com  Sat Jan 20 06:17:04 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:17:04 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Objects object.c,2.109,2.110
In-Reply-To: <E14JrC9-00056U-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPNIJAA.tim.one@home.com>

[Barry]
> Modified Files:
> 	object.c
> Log Message:
> default_3way_compare(): When comparing the pointers, they must be cast
> to integer types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).
> ANSI specifies that pointer compares other than == and != to
> non-related structures are undefined.  This quiets an Insure
> portability warning.

Barry, that comment belongs in the code, not in the checkin msg.  The code
*used* to do this correctly (as you well know, since you & I went thru
considerable pain to fix this the first time).  However, because the
*reason* for the convolution wasn't recorded in the code as a comment,
somebody threw it all away the first time it got reworked.
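
For readers without the C source at hand, the idea can be sketched in
Python, where id() already yields an integer (in CPython, the object's
address).  The ordering rule below (type names first, addresses second)
is an assumption made for illustration, not a transcription of
object.c, and the function names are hypothetical:

```python
def three_way(a, b):
    """Return -1, 0 or 1 for a < b, a == b, a > b."""
    return (a > b) - (a < b)


def fallback_compare(v, w):
    """Order two arbitrary objects without asking them to compare themselves."""
    if type(v) is type(w):
        # Same type: compare the addresses as plain integers.  This mirrors
        # casting pointers to an integer type before using < in C.
        return three_way(id(v), id(w))
    vname, wname = type(v).__name__, type(w).__name__
    if vname != wname:
        return three_way(vname, wname)
    # Distinct types that happen to share a name: fall back to the
    # addresses of the type objects themselves.
    return three_way(id(type(v)), id(type(w)))
```

The comment inside the same-type branch is the kind of note that
belongs next to the cast in the C code.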

c-code-isn't-often-self-explanatory-ly y'rs  - tim



From tim.one@home.com  Sat Jan 20 06:30:42 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:30:42 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com>

I had a huge string and wanted to put a double-quote on each end.  The
boring:

    '"' + huge + '"'

does the job, but is inefficient <snort>.  Then this transparent variation
sprang unbidden from my hoary brow:

    huge.join('""')

*That* should put to rest the argument over whether .join() is more properly
a method of the separator or the sequence -- '""'.join(huge) instead would
look plain silly <wink>.
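
For anyone who wants to check the trick: join() iterates over its
argument, here the two-character string '""', and places the receiver
between the items.  A short demonstration with a stand-in value for
huge (the variable names are only illustrative):

```python
huge = "spam" * 5

# The boring spelling.
quoted_plain = '"' + huge + '"'

# The trick: the "sequence" is the two quote characters, and huge is
# the separator placed between them.
quoted_join = huge.join('""')

# The reversed spelling inserts two quotes between every character.
silly = '""'.join("ab")

print(quoted_plain == quoted_join)
print(silly)
```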

not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim




From tim.one@home.com  Sat Jan 20 09:28:18 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 04:28:18 -0500
Subject: [Python-Dev] Comparison of recursive objects
In-Reply-To: <14952.21218.416551.695660@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>

This is a multi-part message in MIME format.

------=_NextPart_000_0000_01C08299.69A67E20
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 7bit

[Guido's checkin msg]
> ...
> In a discussion with Tim, where we discovered that our intuition
> on when a<=b should be true was failing, we decided to outlaw
> ordering comparisons on recursive objects.  (Once we have fixed our
> intuition and designed a matching algorithm that's practical and
> reasonable to implement, we can allow such orderings again.)

[Jeremy]
> Sounds sensible to me!  I was quite puzzled about what <= should
> return for recursive objects.

That's easy:  x <= y for recursive objects should return true if and only if
x < y or x == y return true <0.9 wink>.

x == y isn't a problem, although Python gives a remarkable answer:
recursive objects in Python are instances of rooted, ordered, directed,
finite, node-labeled graphs, and "x == y" in Python answers whether their
graphs are isomorphic.

Viewed that way (which is the correct way <0.5 wink>), the *natural* meaning
for "x <= y" is "y contains a subgraph isomorphic to x".  And that has
*almost* all the nice properties we like:

    x <= x is true
    (x <= y and y <= z) implies x <= z
    (x <= y and y <= x) if and only if x == y

However,

1. That's much harder to compute.
2. It implies, e.g., [2] <= [1, 2], and that's not what we *want*
   non-recursive sequence comparison to mean.
3. It's a partial ordering:  given arbitrary x and y, it may be that
   neither contains an isomorphic image of the other.
4. We've again given up on avoiding surprises in *simple* comparisons
   among builtin types, like (under current CVS):

>>> 1 < [1] < 0L < 1
1
>>> 1 < 1
0
>>>

   so it's hard to see why we should do any work at all to avoid
   violating "intuition" when comparing recursive objects:  we're
   already scrubbing the face of intuition with steel wool,
   setting it on fire, then putting it out with an axe <wink>.

Now let's look at Guido's example (or one of them, anyway):

>>> a = []
>>> a.append(a)
>>> a.append("x")
>>> b = []
>>> b.append(b)
>>> b.append("y")
>>> a
[[...], 'x']
>>> b
[[...], 'y']
>>>

I think it's a trick of *typography* that caused my first thought to be
"well, clearly, a < b".  That is, the *display* shows me two 2-element
lists, each with the same "blob" as the first element, and where a[1] is
obviously less than b[1].  Since "the blobs" are the same, the second
elements control the outcome.

But those "blobs" aren't really the same:  a[0] is a, and b[0] is b, so
asking whether a < b by looking first at their first elements just leads
back to the original question:  asking whether a[0] < b[0] is again asking
whether a < b, and that makes no progress.  Saying that a is less than b by
fiat is *consistent* with the rules for lexicographic ordering, but so is
insisting that a is greater than b.  There's no basis for picking one over
the other, and so no clear hope of coming up with a generally consistent
scheme.  Well, one clear hope:  if recursive comparison says "not equal", it
could resolve the dilemma by comparing object id instead.  That would be
consistent (I mostly think at the moment ...), but if you run the program
above multiple times it may say a < b on some runs and b < a on others.

WRT "the right way", it should be clear from the attached picture that
neither a nor b contains an isomorphic image of the other, so from that POV
they're not comparable (a != b, but neither a <= b nor b <= a holds).

So this is what Guido made Python do:

>>> a == b  # still cool:  they're not isomorphic and Python knows it
0
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values
>>> a <= b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values

In light of that, I still find these mildly surprising:

>>> a < a
0
>>> a <= a
1
>>>

I guess some recursive values are more orderable than others <wink -- but
that's true!  the ones Python can prove are equal are indeed "more
orderable">.

>>> import copy
>>> c = copy.deepcopy(a)
>>> c
[[...], 'x']
>>> a == c
1
>>> a <= c
1
>>> a < c
0
>>>

BTW, this kind of construction appears to give equality-testing that's at
best(!) exponential-time in the size of the dicts:

def timeeq(x, y):
    from time import clock
    import sys
    s = clock()
    result = x == y
    f = clock()
    print x, result, round(f-s, 1), "seconds"
    sys.stdout.flush()

d = {}
e = {}
timeeq(d, e)
d[0] = d
e[0] = e
timeeq(d, e)
d[1] = d
e[1] = e
timeeq(d, e)
d[2] = d
e[2] = e
timeeq(d, e)

Output:

{} 1 0.0 seconds
{0: {...}} 1 0.0 seconds
{1: {...}, 0: {...}} 1 6.5 seconds

After more than 15 minutes, the 3-element dict comparison still hasn't
completed (yikes!).
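
One way to keep such a comparison from blowing up is to remember every
pair of containers already under comparison and to treat a repeated
pair as equal.  The sketch below is written in present-day Python and
is not the algorithm in object.c; rec_eq is a hypothetical name and
only lists and dicts are handled:

```python
import copy


def rec_eq(x, y, _seen=None):
    """Equality for possibly recursive lists and dicts."""
    if _seen is None:
        _seen = set()
    if isinstance(x, list) and isinstance(y, list):
        pair = (id(x), id(y))
        if pair in _seen:
            return True         # already being compared: assume equal
        _seen.add(pair)
        return len(x) == len(y) and all(
            rec_eq(p, q, _seen) for p, q in zip(x, y))
    if isinstance(x, dict) and isinstance(y, dict):
        pair = (id(x), id(y))
        if pair in _seen:
            return True
        _seen.add(pair)
        return x.keys() == y.keys() and all(
            rec_eq(x[k], y[k], _seen) for k in x)
    return x == y               # leaves: ordinary comparison


# The lists from the sessions above.
a = []
a.append(a)
a.append("x")
b = []
b.append(b)
b.append("y")
c = copy.deepcopy(a)

# The dicts from the timing test.
d = {}
e = {}
for k in range(3):
    d[k] = d
    e[k] = e

print(rec_eq(a, b), rec_eq(a, c), rec_eq(d, e))
```

Because each pair of containers is expanded at most once, the work
grows with the number of distinct pairs rather than with the number of
paths through the structure.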

ackermann's-function-eat-your-heart-out-ly y'rs  - tim

------=_NextPart_000_0000_01C08299.69A67E20
Content-Type: image/jpeg;
	name="loopy.jpg"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
	filename="loopy.jpg"

/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAYEBQYFBAYGBQYHBwYIChAKCgkJChQODwwQFxQYGBcU
FhYaHSUfGhsjHBYWICwgIyYnKSopGR8tMC0oMCUoKSj/2wBDAQcHBwoIChMKChMoGhYaKCgoKCgo
KCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCgoKCj/wAARCAGsAdIDASIA
AhEBAxEB/8QAHAABAQEBAQEBAQEAAAAAAAAAAAYHCAUDBAIJ/8QASBAAAQIFAQMGCQgJBQADAQAA
AAECAwQFBhESBxMhFBYiMVZ1CBc3QZOVs9LUFTI2UVSktdMYI0JVYWZxluMkM4GRlCVEsaH/xAAb
AQEAAwEBAQEAAAAAAAAAAAAAAwQFBgIBB//EADkRAQACAQIDBAYIBQUBAAAAAAABAgMEEQUhQQYS
MVETYXGBkaEVIlJT0dLh8BQyQpLBFiNUscIz/9oADAMBAAIRAxEAPwDqkAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAEBTaFAuC6LyiVKerf8ApanCl4EOWrM3LQ4cPkMq/CMhRWtTpRHrnGVVynq8w6R9
suT+46j+eBVAleYdI+2XJ/cdR/PHMOkfbLk/uOo/ngVQJXmHSPtlyf3HUfzxzDpH2y5P7jqP54FU
CV5h0j7Zcn9x1H88cw6R9suT+46j+eBVAleYdI+2XJ/cdR/PHMOkfbLk/uOo/ngVQJXmHSPtlyf3
HUfzxzDpH2y5P7jqP54FUCV5h0j7Zcn9x1H88/Jaki2kX9cFNlpupRpJtMp8w2HO1CPN6Ij4s41y
tWM9ytykNmURcdFALUAAAAAAAAE1tEm52TttrqZORZGZjVCQlOUQmMc+GyNOQYT1aj2ubnS92MtU
+XNer9vLk9BTvhQKoErzXq/by5PQU74Uc16v28uT0FO+FAqgSvNer9vLk9BTvhRzXq/by5PQU74U
CqBK816v28uT0FO+FHNer9vLk9BTvhQKoErzXq/by5PQU74Uc16v28uT0FO+FAqgSvNer9vLk9BT
vhRzXq/by5PQU74UCqBK816v28uT0FO+FPEvimV6hWXcFXk75uB8zT6fMTcJsWXp6sc+HDc5EciS
qLjKJnCoBooAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAlbM+kV+d9Q/w6SKolbM+kV+d9Q/w6SK
oAAAAAAAAAAAAAAErTfKncXctM9vPlUStN8qdxdy0z28+BVAAAAAAAAldpf0dk++qT+Iy5VErtL+
jsn31SfxGXKoAAAAAAAAAAAAAAErtZ8ll5dyznsHlUSu1nyWXl3LOeweBVAAAAAAAAAAAAAAAAAA
AAAAAAA5AuC7bv2keEDM2bTbqqVtU2DOzUnL8ieqaNzDdqc7QsNX63QVVEc5dOtURfroPBE2lXDc
VbqdtXFPzNUhQZLlktMTMRHxIWmKiParlTU/VvW8XOXSjEREwvAOnwAAAAEXKbS7embyjW0kaNCn
YcR0BIsViNhRIqLhYbVznVnKJlERVTCKuUzaGRbVNkcKvR5msW65kvU3NV8WVVESHNPz1oucMcqZ
yvU5cZx0nLPWHtaqNBmn0faBCnV0aGsjxIOmPA4NTEVq4VzdPS1cXdfzspijGptivNM8bRPhPR1N
uCYNfpo1HCrd61Yjv0n+bfzjzj9xz5RqlmfSK/O+of4dJFURWzapSdXqN6T9NmGTEnHq8J8OIzqV
Pk+T/wCUVFyiovFFRUXiWpdiYmN4cxatqWmto2mAAH15AAAAAAAAAAAJWm+VO4u5aZ7efKolab5U
7i7lpnt58CqAAAAAACPqW0Sh0+9YFsTCzXL4rmQ1iNhZhMe9MsYq5zlct4oip0kyqcceL3rTnadk
+n0ubUzNcNZtMRMzt5R4y+20v6OyffVJ/EZcqiV2l/R2T76pP4jLlUe0AAAAAAAAAAAAAAErtZ8l
l5dyznsHlUSu1nyWXl3LOeweBVAAAAAAAAAAAAAAAAAAAAAAAA41uCt3Dta28zNgVmuzMjbbanNS
vJZJqMasKBqd0k/bc7cIqK/UjXOVUTHRPy+BR5U6r3LF9vAN6u/wf7Jum7nXBPQp6XjxntiTMtKR
Ww4Ey9Fy5z00q5Fd1OVrm56/nKrl/Xsu2J25s4r8xV6HO1ePMx5V0o5s5FhuYjFexyqiNhtXOWJ5
/rA+O2GjX3UqtIRLPmZpsg2ArXw5WcSXckTUuXOy5upFTSicVxpdwTPHP+au177RWvXLfzTpUFPJ
oq5LTabTz9bpNF2mzaPBXBXDjmK9ZrO/v5w5q5q7XvtFa9ct/NHNXa99orXrlv5p0qDx9H0+1b4/
os/6w1H3GL+2fzOauau177RWvXLfzSHvaQuCRqTGXZMvjT6N0aY08yZiw2p0kR2HuVqdLKIuM5VU
850NtrvSq2fSZF1GlmK+cdEhum4jFc2AqN4IidWpcqqZynQXgvmk9mmyOLMx4FwXq58WNFcsf5Pj
IrnPcqoqOjOVcqqrlVYqfVqX5zSlm0sWv6LHMzPXeeUOl4bx62LTfSGtrjpSd4rFaz3rTHj1nb3/
AC5b/wAeDPRJyFGrFWiTD4UsmmTdJua5querIcZsRUXCY0RG6V45SIvV592JWzPpFfnfUP8ADpIq
jWwYYw0ikPz3inEcnE9TbU5I2mekdIjw9vtAATM8AAAAAAAAAAAzmsXZRrV2oVd9fnFlGTVGp6QX
bmI9HqyPO6k6LV6tbev6zRjK7tsen3vtNqMKrTM7ChyNHkXQklnMblYkacR2dTVz/tt//pHlnJFf
9vbf1rvD66S2eI1szFOf8u2/q8fwez43LI/ff3SP7g8blkfvv7pH9w8HxDWx9vrXpoX5Y8Q1sfb6
16aF+WVO9rPs1/fvdB6Ds397l+X5XveNyyP3390j+4PG5ZH77+6R/cPB8Q1sfb616aF+WPENbH2+
temhfljvaz7Nf37z0HZv73L8vyve8blkfvv7pH9wyXbXVLPuXc1a3qnBWrQ+hMQ+SxmOmWcEaupW
o3U3+PWi9fRai3niGtj7fWvTQvyzyLp2T2RbFGjVOr1WtQ4EPg1qRYSvivXqYxN3xcuP/wBVcIiq
kOeupyY5rkiu379bR4Vk4Jo9VTLpMmWb+ERtE779Jju89/18Xiyu1iVqdo0mj1yHMMqEtUKfEiTn
GJDfCgTUGK6I9cq/WrYbsoiLlePDOE6KOMZSiQq3XXNocCehUTlktLOmJjTEfAbHjMgsc/Glqqrn
oulOOM8V0qp2HR5CFSqTJU+Xc90GUgMl2OeqK5WsajUVcIiZwn1Eugy5clZ7/hHhKj2t0Gg0eWv8
LyvbebV8t+cezry/x4/rABoOPDmCseEDdMlXtosjCp9EWFbu95IroMXU/TPwZdN5+s49CI5eGOKJ
5uC9Pk/Fs6hRZm5I8SRzFuKCyXqjt9E/1ENsNYbU+d0cMcqZbheOeviBnWzHbpQ67Q7dg3VOwpC6
Ku9YcOUgyMy2FEV0d8KHocrXNVF0oirqVEcjkVUwqJ9ts+22jWTIVinUqelo14ye53chMysd0Nda
scuXNRrf9tyu+f148/AzTwvLRg29SrLuG3U5B8laKRDfDjxN9DaxqvltC5XGjRF6WUdlzevzefsF
o8ltR2wX7dFWZFnKSrI8OHCmoz2x2smlfDY3orjCQGxYeNXRy3T1IqB7dH8IG6Z2vbOpGLT6IkK4
t1ytWwYupmqfjS67v9Zw6ENq8c8VXzcE6fJ+FZ1ChTNtx4cjiLbsF8vS3b6J/p4boaQ3J87pZY1E
y7K8M9fEoAAAAErtZ8ll5dyznsHlUSu1nyWXl3LOeweBVAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAACVsz6RX531D/DpIqjnrZnfFUqu1yYgy809tIq87HmokB8GGjnI2Bph5VEVUVGQYScF/Z8+VVe
hSHBnrnrNq9J2afFOFZuF5a4c8xvMRblv13jnvEc+QACZmAAAAAAAAAAAErTfKncXctM9vPlUStN
8qdxdy0z28+BVAAAAABzrAtO9No13ql6NmpCQknOR6rC3bGNVy5ZATqeq4xr6XBEVVd0UXooEGbB
GbaLTyjp5tXhnFsnDYvbDWO/aNotMc6+e3t3/wCkFdVCplu2XJSFFlGSsqlcpT9DVVVc5ajL5VXK

------=_NextPart_000_0000_01C08299.69A67E20--



From thomas@xs4all.net  Sat Jan 20 14:30:26 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 15:30:26 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 05:06:03PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <200101192206.RAA12072@cj20424-a.reston1.va.home.com>
Message-ID: <20010120153026.L17392@xs4all.nl>

On Fri, Jan 19, 2001 at 05:06:03PM -0500, Guido van Rossum wrote:

> So instead I added 5 lines to site.py, which tests for
> os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
> test only succeeds when running from the build directory.  Then it
> calls distutils.util.get_platform() and uses the result to calculate
> the correct directory name, which is then appended to sys.path.

> Yes, this slows down startup (it imports a large portion of the
> distutils package), but I don't care -- after all this is mostly for
> me so I can play with the interpreter right after I've built it,
> right?

Right. The only downside (as far as I can tell) is that 'python -S' no
longer works, in the build tree. I don't think that's that big a deal, but
it should be documented somewhere, so we don't end up being boggled by it
once we forget about it :)
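
A rough sketch of the check Guido describes, written against a modern
Python rather than copied from the actual site.py change (which is not
shown in this thread).  The function name, the
"build/lib.<platform>-<version>" layout, and the use of
sysconfig.get_platform() in place of distutils.util.get_platform() are
all assumptions made for illustration:

```python
import os
import sys
import sysconfig


def build_tree_extension_dir(path_entries, os_name=os.name):
    """Return the directory to append to sys.path, or None.

    Hypothetical helper: it mirrors the two tests described above
    (POSIX, and the last path entry ending in '/Modules').
    """
    if os_name != "posix" or not path_entries:
        return None
    last = path_entries[-1]
    if not last.endswith("/Modules"):
        return None
    # Assumed layout: <build root>/build/lib.<platform>-<major>.<minor>
    root = os.path.dirname(last)
    tag = "lib.%s-%d.%d" % ((sysconfig.get_platform(),) + sys.version_info[:2])
    return os.path.join(root, "build", tag)


extra = build_tree_extension_dir(sys.path)
if extra is not None:
    sys.path.append(extra)
```

Since "python -S" skips importing site altogether, any path tweak that
lives in site.py never runs, which is why -S stops working in the build
tree.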

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Sat Jan 20 16:18:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:18:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:45:32 +0100."
 <20010119004532.G17392@xs4all.nl>
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
 <20010119004532.G17392@xs4all.nl>
Message-ID: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>

> On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> 
> > filename = '/tmp/delete_me'
> 
> This reminds me: we need a portable way to handle test-files :)

Yeah, I noticed that this test failed on Windows -- fixed now.

The test_support module exports TESTFN; there's also tempfile.mktemp()
which should generate temporary files on all platforms.

Is that enough?
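
For what it's worth, a minimal sketch of a portable test file in a
modern Python.  It assumes tempfile.mkstemp(), which arrived after this
thread and creates the file rather than just picking a name (later
documentation marks mktemp() as deprecated for that reason).  The helper
name and the "delete_me_" prefix are made up for the example:

```python
import os
import tempfile


def make_test_file(data=b""):
    """Create a temporary file portably and return its path."""
    # mkstemp() creates the file in the platform's temp directory
    # and hands back an open descriptor plus the chosen path.
    fd, path = tempfile.mkstemp(prefix="delete_me_")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


path = make_test_file(b"hello")
try:
    with open(path, "rb") as f:
        content = f.read()
finally:
    # Tests should clean up after themselves on every platform.
    os.remove(path)
```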

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Sat Jan 20 16:36:05 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 17:36:05 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 11:18:39AM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>
Message-ID: <20010120173605.P17295@xs4all.nl>

On Sat, Jan 20, 2001 at 11:18:39AM -0500, Guido van Rossum wrote:
> > On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> > 
> > > filename = '/tmp/delete_me'
> > 
> > This reminds me: we need a portable way to handle test-files :)
> Yeah, I noticed that this test failed on Windows -- fixed now.

> The test_support module exports TESTFN; there's also tempfile.mktemp()
> which should generate temporary files on all platforms.
> Is that enough?

Well, there is one more issue, which we can't fix terribly easily: test_fcntl
tries to flock() the file. flock() doesn't work on all filesystems (like
NFS) :P If we cared a lot, we could try several alternatives (current dir,
/tmp, /var/tmp) in the specific case of flock, but personally I don't want to
bother, and real sysadmins (who should care about the test failure) are more
likely to build Python on a local disk than in their NFS-mounted
homedirectory. At least that's how we do it :-) 
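
If we did care, the "try several alternatives" idea could look roughly
like this.  It is only a sketch under assumptions: the helper name is
invented, the candidate list is the one mentioned above, and any OSError
is treated as "this directory won't do" without telling a missing
directory apart from a filesystem that can't lock:

```python
import os
import tempfile

try:
    import fcntl  # Unix only
except ImportError:
    fcntl = None


def find_flockable_dir(candidates):
    """Return the first directory in which flock() succeeds, else None."""
    if fcntl is None:
        return None
    for directory in candidates:
        try:
            with tempfile.TemporaryFile(dir=directory) as f:
                # Non-blocking exclusive lock, released straight away.
                fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
                fcntl.flock(f.fileno(), fcntl.LOCK_UN)
            return directory
        except OSError:
            # Missing directory, no write permission, or no lock support.
            continue
    return None


chosen = find_flockable_dir([os.curdir, "/tmp", "/var/tmp"])
```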

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Sat Jan 20 16:43:49 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:43:49 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: Your message of "Sat, 20 Jan 2001 01:30:42 EST."
 <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com>
Message-ID: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>

> I had a huge string and wanted to put a double-quote on each end.  The
> boring:
> 
>     '"' + huge + '"'
> 
> does the job, but is inefficient <snort>.  Then this transparent variation
> sprang unbidden from my hoary brow:
> 
>     huge.join('""')

Points off for obscurity though!  My favorite for this is:

    '"%s"' % huge

Worth a microbenchmark?
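
For the record, a small check that the three spellings agree.  The join
version works because '""' is treated as a two-item sequence of
one-character strings, with huge used as the separator between them.
The test value here is arbitrary:

```python
huge = "x" * 1000

concatenated = '"' + huge + '"'
joined = huge.join('""')    # '""' is iterated as the items '"' and '"'
formatted = '"%s"' % huge

same = (concatenated == joined == formatted)
```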

> *That* should put to rest the argument over whether .join() is more properly
> a method of the separator or the sequence -- '""'.join(huge) instead would
> look plain silly <wink>.
> 
> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

Give up the channeling for a while -- there's too much interference in
the air from the Microsoft threaded stdio debate still. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Sat Jan 20 16:47:44 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 20 Jan 2001 10:47:44 -0600 (CST)
Subject: [Python-Dev] how to test my __all__ lists?
Message-ID: <14953.49456.654121.987189@beluga.mojam.com>

How do I test the __all__ lists I'm building?  I'm worried about a couple
things:

    1. I may have typos
    2. I may leave something out of a list that should be imported by
       from-module-import-*.

Thoughts?

Skip


From guido@digicool.com  Sat Jan 20 17:00:05 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:00:05 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Sat, 20 Jan 2001 17:36:05 +0100."
 <20010120173605.P17295@xs4all.nl>
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>
 <20010120173605.P17295@xs4all.nl>
Message-ID: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>

> > > > filename = '/tmp/delete_me'
> > > 
> > > This reminds me: we need a portable way to handle test-files :)
> > Yeah, I noticed that this test failed on Windows -- fixed now.
> 
> > The test_support module exports TESTFN; there's also tempfile.mktemp()
> > which should generate temporary files on all platforms.
> > Is that enough?
> 
> Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> tries to flock() the file. flock() doesn't work on all filesystems (like
> NFS) :P If we cared a lot, we could try several alternatives (current dir,
> /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> bother, and real sysadmins (who should care about the test failure) are more
> likely to build Python on a local disk than in their NFS-mounted
> homedirectory. At least that's how we do it :-) 

These days, I would think that it's a pretty sure bet that the
system's tmp directory is not on NFS.  Then we could just use
tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
on Linux (which AFAIK is a RAM disk implemented in virtual memory so
it uses swap space when it runs out of RAM) not support locking?

I don't particularly care about fixing this -- I haven't seen bug
reports about this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sat Jan 20 17:38:38 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:38:38 -0500
Subject: [Python-Dev] how to test my __all__ lists?
In-Reply-To: Your message of "Sat, 20 Jan 2001 10:47:44 CST."
 <14953.49456.654121.987189@beluga.mojam.com>
References: <14953.49456.654121.987189@beluga.mojam.com>
Message-ID: <200101201738.MAA16636@cj20424-a.reston1.va.home.com>

> How do I test the __all__ lists I'm building?  I'm worried about a couple
> things:
> 
>     1. I may have typos

Do "from M import *" -- this will raise an AttributeError if there's
something in __all__ that's not defined in the module.

>     2. I may leave something out of a list that should be imported by
>        from-module-import-*.

That's what alpha-testing's for.
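
Both worries can also be checked mechanically.  A sketch, assuming
nothing beyond the standard library; the two helper names and the demo
module are hypothetical.  The second helper only lists candidates, since
whether a public name belongs in __all__ is still a judgment call:

```python
import types


def missing_from_all(module):
    """Names listed in module.__all__ that the module does not define."""
    return [name for name in getattr(module, "__all__", [])
            if not hasattr(module, name)]


def unlisted_public_names(module):
    """Public-looking names that are defined but absent from __all__."""
    listed = set(getattr(module, "__all__", []))
    return sorted(name for name in vars(module)
                  if not name.startswith("_") and name not in listed)


# Hypothetical module with a typo in __all__ ("spma" instead of "spam").
demo = types.ModuleType("demo")
demo.spam = lambda: None
demo.eggs = 42
demo.__all__ = ["spma", "eggs"]

typos = missing_from_all(demo)
left_out = unlisted_public_names(demo)
```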

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@netaxs.com  Sat Jan 20 17:49:43 2001
From: esr@netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 12:49:43 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <3A672376.4B951848@lemburg.com>; from M.-A. Lemburg on Thu, Jan 18, 2001 at 06:10:14PM +0100
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>
Message-ID: <20010120124943.C6073@unix3.netaxs.com>

> A combination of time.time(), process id and counter should
> work in all cases. Make sure you use a lock around the counter,
> though.

Yes, but...this hack has to work in a multithreaded environment,
so process ID isn't good enough.  And I don't want to keep a counter
around if I don't have to.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>


From guido@digicool.com  Sat Jan 20 18:01:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 13:01:04 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Sat, 20 Jan 2001 12:49:43 EST."
 <20010120124943.C6073@unix3.netaxs.com>
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>
 <20010120124943.C6073@unix3.netaxs.com>
Message-ID: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>

> > A combination of time.time(), process id and counter should
> > work in all cases. Make sure you use a lock around the counter,
> > though.
> 
> Yes, but...this hack has to work in a multithreaded environment,
> so process ID isn't good enough.  And I don't want to keep a counter
> around if I don't have to.

Sorry Eric, this just doesn't make sense.  Keeping a counter around in
your module (protected by a semaphore) is obviously the right
solution.  Why are you fighting it?
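
Something along these lines, say.  This is a sketch, not Eric's code:
the names are invented, a threading.Lock stands in for the semaphore,
and the id format is arbitrary:

```python
import itertools
import os
import threading
import time

_counter = itertools.count()
_counter_lock = threading.Lock()


def unique_id():
    """Return a string that is unique across threads of this process."""
    # The lock makes fetching the next counter value atomic.
    with _counter_lock:
        n = next(_counter)
    return "%d.%d.%d" % (int(time.time()), os.getpid(), n)
```

The counter alone guarantees uniqueness within one process; the time and
process id are there to keep separate processes from colliding.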

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@netaxs.com  Sat Jan 20 18:20:26 2001
From: esr@netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 13:20:26 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>; from Guido van Rossum on Sat, Jan 20, 2001 at 01:01:04PM -0500
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com> <20010120124943.C6073@unix3.netaxs.com> <200101201801.NAA16880@cj20424-a.reston1.va.home.com>
Message-ID: <20010120132026.E6073@unix3.netaxs.com>

On Sat, Jan 20, 2001 at 01:01:04PM -0500, Guido van Rossum wrote:
> > Yes, but...this hack has to work in a multithreaded environment,
> > so process ID isn't good enough.  And I don't want to keep a counter
> > around if I don't have to.
> 
> Sorry Eric, this just doesn't make sense.  Keeping a counter around in
> your module (protected by a semaphore) is obviously the right
> solution.  Why are you fighting it?

Actually, I'm not fighting it any more.  I changed my mind a few minutes
after shipping that response.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>


From thomas@xs4all.net  Sat Jan 20 18:37:10 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 19:37:10 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 12:00:05PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <20010120193710.Q17295@xs4all.nl>

On Sat, Jan 20, 2001 at 12:00:05PM -0500, Guido van Rossum wrote:

> > Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> > tries to flock() the file. flock() doesn't work on all filesystems (like
> > NFS) :P If we cared a lot, we could try several alternatives (current dir,
> > /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> > bother, and real sysadmins (who should care about the test failure) are
> > more likely to build Python on a local disk than in their NFS-mounted
> > homedirectory. At least that's how we do it :-)

> These days, I would think that it's a pretty sure bet that the
> system's tmp directory is not on NFS.  Then we could just use
> tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
> on Linux (which AFAIK is a RAM disk implemented in virtual memory so
> it uses swap space when it runs out of RAM) not support locking?

Actually, most Linux distributions don't care enough about /tmp to make it a
RAM-based filesystem. At least Debian and RedHat don't :) (There's a good
reason for that: Linux's disk-data cache rocks if you have enough RAM, so
there's no real gain in using a ramdisk) BSDI does (optionally) have such a
/tmp, and probably the other BSD derived systems as well. But that doesn't
mean it doesn't support locking, so that's not a real excuse.

But like I said, I don't care enough to worry about it. I'll look at it
before alpha2.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Sat Jan 20 20:10:51 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 15:10:51 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>

[Tim]
> ...
> 4. We've again given up on avoiding surprises in *simple* comparisons
>    among builtin types, like (under current CVS):
>
> >>> 1 < [1] < 0L < 1
> 1
> >>> 1 < 1
> 0
> >>>

I really dislike that.  Here's a consequence at a higher level:

N = 5
x = [1 for i in range(N)] + \
    [[1] for i in range(N)] + \
    [0L for i in range(N)]

x.sort()
print x

from random import shuffle
tries = failures = 0
while failures < 5:
    tries += 1
    y = x[:]
    shuffle(y)
    y.sort()
    if x != y:
        print "oops, on try number", tries
        print y
        failures += 1

and here's a typical run (2.1a1):

[1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L, 0L]
oops, on try number 3
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 5
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 6
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 7
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 8
[0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L]

I've often used list.sort() on a heterogeneous list simply to bring the
elements of the same type next to each other.  But as "try number 5" shows,
I can no longer rely on even getting all the lists together.  Indeed,
heterogeneous list.sort() has become a very bad (biased and slow)
implementation of random.shuffle() <wink>.

Under 2.0, the program never prints "oops", because the only violations of
transitivity in 2.0's ordering of builtin types were bugs in the
implementation (none of which show up in this simple test case); 2.0's
.sort() *always* produces

[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]

The base trick in 2.0 was sound:  when falling back to the "compare by name
of the type" last resort, treat all numeric types as if they had the same
name.

While Python can't enforce that any user-defined __cmp__ is consistent, I
think it should continue to set a good example in the way it implements its
own comparisons.
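
To make the "same name for all numeric types" trick concrete, here is a
sketch as a sort key in a modern Python.  It is an illustration, not the
2.0 implementation: the key function is hypothetical, the empty-string
group name is an assumption, and a float stands in for 0L because later
Pythons have no separate long type and refuse to order ints against
lists directly:

```python
import numbers
import random


def type_group_key(obj):
    """Sort key: all numbers share one group; other types use their name."""
    group = "" if isinstance(obj, numbers.Number) else type(obj).__name__
    return (group, obj)


N = 5
x = [1] * N + [[1] for _ in range(N)] + [0.0] * N
expected = sorted(x, key=type_group_key)

# Shuffle and re-sort repeatedly; a consistent ordering never disagrees.
consistent = True
for _ in range(100):
    y = x[:]
    random.shuffle(y)
    if sorted(y, key=type_group_key) != expected:
        consistent = False
```

Because every number lands in one group and is compared numerically
within it, the ordering is transitive and the shuffle never matters.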

grumblingly y'rs  - tim



From skip@mojam.com (Skip Montanaro)  Sat Jan 20 20:42:27 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 20 Jan 2001 14:42:27 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
Message-ID: <14953.63539.629197.232848@beluga.mojam.com>

A bit late for 2.1alpha1, but it just occurred to me that perhaps there
should be an annotation in the documentation that indicates whether or not a
module is thread-safe.  For example, many functions in fileinput rely on a
module global called _state.  It strikes me that this module is not likely
to be thread-safe, yet the documentation doesn't appear to mention this,
certainly not in an obvious fashion.
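
As a sketch of what such an annotation would be steering people toward:
the module-level functions share one global, but fileinput also offers a
FileInput class, and giving each thread its own instance sidesteps the
shared state.  The example assumes a modern Python (FileInput as a
context manager); the helper name and file contents are made up:

```python
import fileinput
import os
import tempfile
import threading


def read_lines(path, results, key):
    # A private FileInput instance; no module-level state is touched.
    with fileinput.FileInput(files=[path]) as stream:
        results[key] = [line.rstrip("\n") for line in stream]


# Two small scratch files, one per thread.
paths = []
for text in ("a1\na2\n", "b1\nb2\n"):
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write(text)
    paths.append(path)

results = {}
threads = [threading.Thread(target=read_lines, args=(p, results, i))
           for i, p in enumerate(paths)]
for t in threads:
    t.start()
for t in threads:
    t.join()
for p in paths:
    os.remove(p)
```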

Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
LaTeX macros in Fred's arsenal?  This would make documenting these
properties both easy and consistent across modules.

Skip



From tim.one@home.com  Sat Jan 20 21:13:41 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 16:13:41 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBEIKAA.tim.one@home.com>

[Tim]
>     huge.join('""')

[Guido]
> Points off for obscurity though!

The Subject line was "Stupid Python Tricks" for a reason <wink>.  Those who
don't know the language inside-out should be tickled by figuring out why it
even *works* (hint for the baffled:  you have to view '""' as a sequence
rather than as an atomic string).

> My favorite for this is:
>
>     '"%s"' % huge
>
> Worth a microbenchmark?

Absolutely!  I get:

     obvious  15.574
     obscure   8.165
     sprintf   8.133

after running:

ITERS = 1000
indices = [0] * ITERS

def obvious(huge):
    for i in indices:  '"' + huge + '"'

def obscure(huge):
    for i in indices:  huge.join('""')

def sprintf(huge):
    for i in indices:  '"%s"' % huge

def runtimes(huge):
    from time import clock
    for f in obvious, obscure, sprintf:
        start = clock()
        f(huge)
        finish = clock()
        print "%12s %7.3f" % (f.__name__, finish - start)

runtimes("x" * 1000000)

under current 2.1a1.  Not a dead-quiet machine, but the difference is too
small to care.  Speed up huge.join attr lookup, and it would probably be
faster <wink>.  Hmm:  if I boost ITERS high enough and cut back the size of
huge, "obscure" eventually becomes *slower* than "obvious", and even if the
"huge.join" lookup is floated out of the loop.  I guess that points to the
relative burden of calling a bound method.  So, in real life, the huge.join
approach may well be the slowest!
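
Anyone wanting to repeat the comparison on a later Python could use
something like the following.  It is a sketch built on the timeit
module, which is not what was used above; the sizes and repeat counts
are arbitrary, and the numbers will differ from the ones quoted:

```python
import timeit


def time_variants(huge, number=1000):
    """Return {name: seconds} for the three quoting spellings."""
    variants = {
        "obvious": lambda: '"' + huge + '"',
        "obscure": lambda: huge.join('""'),
        "sprintf": lambda: '"%s"' % huge,
    }
    return {name: timeit.timeit(fn, number=number)
            for name, fn in variants.items()}


timings = time_variants("x" * 100000, number=200)
for name, seconds in timings.items():
    print("%12s %7.3f" % (name, seconds))
```

Shrinking the string and raising the repeat count shifts the weight from
copying bytes to call overhead, which is the effect described above.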

>> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

> Give up the channeling for a while -- there's too much interference in
> the air from the Microsoft threaded stdio debate still. :-)

What debate?  You need two arguably valid points of view for a debate to
even start <wink>.

gloating-in-victory-vicious-in-defeat-but-simply-unbearable-in-
    ambiguity-ly y'rs  - tim



From fdrake@acm.org  Sat Jan 20 21:23:58 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 16:23:58 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
 <20010119004532.G17392@xs4all.nl>
 <200101201618.LAA15675@cj20424-a.reston1.va.home.com>
 <20010120173605.P17295@xs4all.nl>
 <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
 > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
 > it uses swap space when it runs out of RAM) not support locking?

  I thought it was Solaris that used available+virtual memory for
/tmp; that was what we ran into at CNRI.  (Which doesn't preclude
Linux from doing the same, I just don't recall that we've encountered
that.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From fdrake@acm.org  Sat Jan 20 22:05:27 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 17:05:27 -0500 (EST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
 > should be an annotation in the documentation that indicates whether or not a
 > module is thread-safe.  For example, many functions in fileinput rely on a

  If you can create a list of the known thread safe and known thread
unsafe modules, I'll come up with appropriate annotations for the
documentation.

 > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
 > LaTeX macros in Fred's arsenal?  This would make documenting these
 > properties both easy and consistent across modules.

  Not sure that this is exactly the right approach to the markup; I'll
think about this one.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From skip@mojam.com (Skip Montanaro)  Sat Jan 20 22:31:52 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Sat, 20 Jan 2001 16:31:52 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
 <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
Message-ID: <14954.4568.460875.662560@beluga.mojam.com>

    Fred> If you can create a list of the known thread safe and known thread
    Fred> unsafe modules, I'll come up with appropriate annotations for the
    Fred> documentation.

I think that's going to be a significant undertaking, requiring examination
of a lot of Python and C code.  I'd rather approach it incrementally, which
was why I suggested the LaTeX macros.  As modules are determined to be safe
or unsafe, the appropriate safety macro could just be inserted into the
correct lib*.tex file.  It would (in my mind) expand to a stock bit of text
inserted at a standard place in the file.

Skip


From tim.one@home.com  Sat Jan 20 22:52:09 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 20 Jan 2001 17:52:09 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBNIKAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to
> the litany of LaTeX macros in Fred's arsenal?  This would make
> documenting these properties both easy and consistent across
> modules.

When a module is *not* threadsafe, that's usually considered "a bug" in the
module.  So we should just point out modules that aren't threadsafe by
design.  Alas, that's A Project.



From nas@arctrix.com  Sat Jan 20 15:59:14 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 07:59:14 -0800
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 20, 2001 at 03:10:51PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>
Message-ID: <20010120075914.B18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 03:10:51PM -0500, Tim Peters wrote:
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think the 2.0 behavior should be fairly easy to restore.  I'll
leave it up to Guido though since he's "Mr. Comparison" now and I
haven't looked at the code since I checked in the coercion patch.

  Neil


From nas@arctrix.com  Sat Jan 20 16:03:36 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 08:03:36 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Sat, Jan 20, 2001 at 04:23:58PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com> <14954.494.223724.705495@cj42289-a.reston1.va.home.com>
Message-ID: <20010120080336.C18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 04:23:58PM -0500, Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
>  > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
>  > it uses swap space when it runs out of RAM) not support locking?
> 
>   I thought it was Solaris that used available+virtual memory for
> /tmp; that was what we ran into at CNRI.  (Which doesn't preclude
> Linux from doing the same, I just don't recall that we've encountered
> that.)

I don't know of any Linux system that uses a RAM based /tmp.  The
Linux implementation of ext2 is so fast it doesn't make any sense.
If you have enough memory all the data is stored in the buffer,
page, and inode caches anyhow.


  Neil


From trentm@ActiveState.com  Sat Jan 20 23:35:56 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Sat, 20 Jan 2001 15:35:56 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
Message-ID: <20010120153556.C18375@ActiveState.com>

... or am I missing something?

With Python 2.0 on Windows 2000, when playing with sys.exit() and sys.argv
I get some unexpected results.

First here is a simple case that shows what I expect. I run "caller_good.py"
which calls "callee_good.py" and prints its return value. "callee_good.py"
returns 42 so "42" is printed:
    ----------------- caller_good.py --------------------
    import os
    retval = os.system("python callee_good.py")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_good.py --------------------
    import sys
    sys.exit(42)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_good.py
    caller: the retval is 42


Now here is what I didn't expect. I changed "caller_bad.py" to pass, as an
argument, the value that "callee_bad.py" should return.

    ----------------- caller_bad.py ---------------------
    import os
    retval = os.system("python callee_bad.py 42")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_bad.py ---------------------
    import sys
    firstarg = sys.argv[1]
    print "callee_bad: firstarg is", firstarg
    sys.exit(firstarg)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_bad.py
    callee_bad: firstarg is 42
    42                             # <---- where did *this* print come from?
    caller: the retval is 1        # <---- and this retval is incorrect


Any ideas? I have not tried to track this down yet nor have I tried the
latest Python-CVS state.

Trent

-- 
Trent Mick
TrentM@ActiveState.com


From moshez@zadka.site.co.il  Sun Jan 21 12:37:57 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sun, 21 Jan 2001 14:37:57 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010121123757.D897BA83E@darjeeling.zadka.site.co.il>

Yay! I can change to python-dev manually!
(hear sounds of the timbot's teeth grinding)

On Sat, 20 Jan 2001, Skip Montanaro <montanaro@users.sourceforge.net> wrote:
> def check_all(_modname):
>     exec "import %s" % _modname
>     verify(hasattr(sys.modules[_modname],"__all__"),
>            "%s has no __all__ attribute" % _modname)
>     exec "del %s" % _modname
>     exec "from %s import *" % _modname
>     
>     _keys = locals().keys()
....

Wouldn't it be better to use

d = {}
exec "foo", d

And verify "d" instead?
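A minimal sketch of that idea, written in current Python 3 syntax where exec is a function rather than a statement. The choice of the string module is only an illustration and is not taken from Skip's test:

```python
import string

# Run the star-import inside a private namespace instead of the caller's locals.
namespace = {}
exec("from string import *", namespace)

# exec() adds "__builtins__" to the dict it runs in; drop it before comparing.
imported = set(namespace) - {"__builtins__"}

# With __all__ defined, "import *" should bring in exactly those names.
matches_all = imported == set(string.__all__)
```

Because the names land in a throwaway dict, nothing has to be deleted from the test's own namespace afterwards.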

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From guido@digicool.com  Sun Jan 21 16:51:45 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 11:51:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sat, 20 Jan 2001 15:10:51 EST."
 <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>
Message-ID: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>

[Tim, complaining that numerical types are no longer lumped together
in default comparisons:]
> I've often used list.sort() on a heterogeneous list simply to bring the
> elements of the same type next to each other.  But as "try number 5" shows,
> I can no longer rely on even getting all the lists together.  Indeed,
> heterogeneous list.sort() has become a very bad (biased and slow)
> implementation of random.shuffle() <wink>.
> 
> Under 2.0, the program never prints "oops", because the only violations of
> transitivity in 2.0's ordering of builtin types were bugs in the
> implementation (none of which show up in this simple test case); 2.0's
> .sort() *always* produces
> 
> [0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
> 
> The base trick in 2.0 was sound:  when falling back to the "compare by name
> of the type" last resort, treat all numeric types as if they had the same
> name.
> 
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think I can put this behavior back.  (I believe that before I
reorganized the comparison code, it seemed really tricky to do this,
but after refactoring the code, it's quite easy to do.)

My only concern is that under the old scheme, two different numeric
extension types that somehow can't be compared will end up being
*equal*.  To fix this, I propose that if the names compare equal, as a
last resort we compare the type pointers -- this should be consistent
too.
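The ordering rule described above can be modelled roughly as follows. This is a Python 3 sketch under stated assumptions, not the C implementation: the function names are hypothetical, numbers.Number stands in for "has tp_as_number", and id(type(x)) stands in for the type pointer:

```python
import numbers
from functools import cmp_to_key

def last_resort_compare(v, w):
    """Model of the fallback: numeric types share the name "", then
    ties between distinct types are broken by the type's identity."""
    vname = "" if isinstance(v, numbers.Number) else type(v).__name__
    wname = "" if isinstance(w, numbers.Number) else type(w).__name__
    if vname != wname:
        return -1 if vname < wname else 1
    # Same name (or both numeric): fall back to the type objects themselves.
    tv, tw = id(type(v)), id(type(w))
    return (tv > tw) - (tv < tw)

def compare(v, w):
    """Use the natural ordering when one exists, else the fallback."""
    try:
        return (v > w) - (v < w)
    except TypeError:
        return last_resort_compare(v, w)

mixed = [[1], 2.5, "a", 1, [0], 0]
ordered = sorted(mixed, key=cmp_to_key(compare))
```

In this model all numbers sort before every other type (the empty name compares lowest), and the remaining objects are grouped by type name, which is the "bring elements of the same type next to each other" property Tim was relying on.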

Here's a patch that stops your test program from reporting failures:

*** object.c	2001/01/21 16:25:18	2.112
--- object.c	2001/01/21 16:50:16
***************
*** 522,527 ****
--- 522,528 ----
  default_3way_compare(PyObject *v, PyObject *w)
  {
  	int c;
+ 	char *vname, *wname;
  
  	if (v->ob_type == w->ob_type) {
  		/* When comparing these pointers, they must be cast to
***************
*** 550,557 ****
  	}
  
  	/* different type: compare type names */
! 	c = strcmp(v->ob_type->tp_name, w->ob_type->tp_name);
! 	return (c < 0) ? -1 : (c > 0) ? 1 : 0;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)
--- 551,571 ----
  	}
  
  	/* different type: compare type names */
! 	if (v->ob_type->tp_as_number)
! 		vname = "";
! 	else
! 		vname = v->ob_type->tp_name;
! 	if (w->ob_type->tp_as_number)
! 		wname = "";
! 	else
! 		wname = w->ob_type->tp_name;
! 	c = strcmp(vname, wname);
! 	if (c < 0)
! 		return -1;
! 	if (c > 0)
! 		return 1;
! 	/* Same type name, or (more likely) incomparable numeric types */
! 	return (v->ob_type < w->ob_type) ? -1 : 1;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)

Let me know if you agree with this.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sun Jan 21 17:00:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 12:00:02 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: Your message of "Sat, 20 Jan 2001 14:42:27 CST."
 <14953.63539.629197.232848@beluga.mojam.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <200101211700.MAA25479@cj20424-a.reston1.va.home.com>

> A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> should be an annotation in the documentation that indicates whether or not a
> module is thread-safe.  For example, many functions in fileinput rely on a
> module global called _state.  It strikes me that this module is not likely
> to be thread-safe, yet the documentation doesn't appear to mention this,
> certainly not in an obvious fashion.
> 
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> LaTeX macros in Fred's arsenal?  This would make documenting these
> properties both easy and consistent across modules.

It's hard to say whether a *whole module* is threadsafe.  E.g. in the
fileinput example, there's the clear implication that if you use this
in multiple threads, you should instantiate your own FileInput
instances, and then you're totally thread-safe.  Clearly the semantics
of the module-global functions are thread-unsafe though.
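A small sketch of the per-thread-instance usage, in current Python 3. The file names and the count_lines helper are made up for the illustration; only fileinput.FileInput itself comes from the module under discussion:

```python
import fileinput
import os
import tempfile
import threading

def count_lines(path, results):
    # Each thread builds its own FileInput, so the module-global
    # state used by fileinput.input() is never touched.
    with fileinput.FileInput(files=[path]) as reader:
        results[path] = sum(1 for _ in reader)

# Two scratch files of known length (hypothetical names).
tmpdir = tempfile.mkdtemp()
paths = []
for name, nlines in (("a.txt", 3), ("b.txt", 5)):
    path = os.path.join(tmpdir, name)
    with open(path, "w") as handle:
        handle.write("line\n" * nlines)
    paths.append(path)

results = {}
threads = [threading.Thread(target=count_lines, args=(p, results))
           for p in paths]
for t in threads:
    t.start()
for t in threads:
    t.join()

line_counts = [results[p] for p in paths]
```

The module-level functions are the part that stays unsafe, since they all read and write the single shared instance.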

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Sun Jan 21 18:45:07 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 13:45:07 -0500
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDGIKAA.tim.one@home.com>

test test_sax crashed -- 
    exceptions.SystemError: 'finally' pops bad exception

Sometimes it crashes (some flavor of memory fault) instead.

Elsewhere?



From nas@arctrix.com  Sun Jan 21 12:28:35 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 04:28:35 -0800
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <20010121042835.A19774@glacier.fnational.com>

I've been working a bit on the build process lately.  I came
across this in the autoconf documentation:


    If a software package has optional compile-time features, the
    user can give `configure' command line options to specify
    whether to compile them. The options have one of these forms:

        --enable-FEATURE[=ARG]
        --disable-FEATURE

    Some packages require, or can optionally use, other software
    packages which are already installed.  The user can give
    `configure' command line options to specify which such
    external software to use.  The options have one of these
    forms:

        --with-package[=ARG]
        --without-package


Is it worth fixing the Python configure script to comply with
these definitions?  It looks like with-cycle-gc and maybe
with-pydebug would have to be changed.

  Neil

    AC_ARG_ENABLE

    


From tim.one@home.com  Sun Jan 21 19:44:38 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 14:44:38 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>

[Guido, on again lumping numbers together]
> I think I can put this behavior back.  (I believe that before I
> reorganized the comparison code, it seemed really tricky to do this,
> but after refactoring the code, it's quite easy to do.)

I can believe that; and I believe the "bugs" in 2.0 ended up somewhere in or
around the bowels of the xxxHalfBinOp-like routines (which were really
tricky to my eyes -- the interactions among coercions and comparisons were
hard to keep straight).

> My only concern is that under the old scheme, two different numeric
> extension types that somehow can't be compared will end up being
> *equal*.  To fix this, I propose that if the names compare equal, as a
> last resort we compare the type pointers -- this should be consistent
> too.

Agreed, and sounds fine!  Save Barry a little work, though:

> ! 	/* Same type name, or (more likely) incomparable numeric types */
> ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

That's non-std C in a way Insure complains about elsewhere; change to

	return ((Py_uintptr_t)v->ob_type <
		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
     compile-at-all<wink>-ly y'rs  - tim



From trentm@ActiveState.com  Sun Jan 21 20:01:44 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Sun, 21 Jan 2001 12:01:44 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010120153556.C18375@ActiveState.com>; from trentm@ActiveState.com on Sat, Jan 20, 2001 at 03:35:56PM -0800
References: <20010120153556.C18375@ActiveState.com>
Message-ID: <20010121120144.B28643@ActiveState.com>

On Sat, Jan 20, 2001 at 03:35:56PM -0800, Trent Mick wrote:
> 
> ... or am I missing something?

Ignore me. RTFM (sys.exit), Trent.

Sorry,
Trent


-- 
Trent Mick
TrentM@ActiveState.com


From tim.one@home.com  Sun Jan 21 20:13:02 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 15:13:02 -0500
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010121120144.B28643@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDKIKAA.tim.one@home.com>

[Trent, quoting Trent]
>>
>> ... or am I missing something?

[and back to Trent]
> Ignore me. RTFM (sys.exit), Trent.

Nobody wants to ignore *you*, Trent!  If it's not the case that you wanted
to code

sys.exit(int(firstarg))

instead, holler, cuz if that wasn't the problem I'm still baffled.
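For the record, the difference can be reproduced with a short script. This is a sketch in current Python 3 (subprocess.run stands in for Trent's os.system call, and the run_child helper is hypothetical): a non-integer argument to sys.exit is printed to stderr and the exit status becomes 1, while an integer becomes the exit status itself.

```python
import subprocess
import sys

def run_child(code, *args):
    """Run `code` in a fresh interpreter; return (exit status, stderr text)."""
    proc = subprocess.run([sys.executable, "-c", code, *args],
                          capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()

# Passing the raw string: it is printed to stderr and the status is 1.
STRING_EXIT = "import sys; sys.exit(sys.argv[1])"
# Converting first: the integer is used as the exit status.
INT_EXIT = "import sys; sys.exit(int(sys.argv[1]))"

string_result = run_child(STRING_EXIT, "42")
int_result = run_child(INT_EXIT, "42")
```

That accounts for both the stray "42" line and the retval of 1 in the callee_bad.py run.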

or-if-it-was-it-caught-you-because-sys.exit's-tricks-aren't-
    really-pythonic-ly y'rs  - tim



From loewis@informatik.hu-berlin.de  Sun Jan 21 21:21:24 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:21:24 +0100 (MET)
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>

> Elsewhere?

Not for me, on either Solaris or Linux. What expat version?

Regards,
Martin


From loewis@informatik.hu-berlin.de  Sun Jan 21 21:22:44 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:22:44 +0100 (MET)
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>

> It looks like with-cycle-gc and maybe with-pydebug would have to be
> changed.

I'm in favour of changing it.

Regards,
Martin


From loewis@informatik.hu-berlin.de  Sun Jan 21 21:34:08 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:34:08 +0100 (MET)
Subject: [Python-Dev] test___all__ fails with no bsddb
Message-ID: <200101212134.WAA16446@pandora.informatik.hu-berlin.de>

On my Solaris 2.6 installation, with no bsddb module, I get

test test___all__ failed -- dbhash has no __all__ attribute

This is caused by anydbm importing dbhash first. After that fails,
dbhash is still in sys.modules, and the next import of dbhash silently
loads an incomplete module.
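A defensive pattern for that situation, sketched in current Python 3. The module name half_loaded_demo and the try_import helper are hypothetical. As far as I know, later Python versions remove a module from sys.modules themselves when its import fails, so the explicit pop below is belt and braces there; under the 2.0 behaviour described above it would be the step that matters:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module that fails half way through its own import.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "half_loaded_demo.py"), "w") as handle:
    handle.write("VALUE = 1\n"
                 "import no_such_module_for_demo\n"
                 "__all__ = ['VALUE']\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

def try_import(name):
    """Import `name`; on failure make sure no partial module stays cached."""
    try:
        return importlib.import_module(name)
    except ImportError:
        sys.modules.pop(name, None)
        return None

first = try_import("half_loaded_demo")
second = try_import("half_loaded_demo")   # must fail again, not half-succeed
still_cached = "half_loaded_demo" in sys.modules
```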

Regards,
Martin


From tim.one@home.com  Sun Jan 21 21:38:11 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 16:38:11 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>

[Martin von Loewis]
> Not for me, on either Solaris or Linux. What expat version?

Tell me how to answer the question, and I'll be happy to (I have no idea
what any of this stuff is or does).

My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in particular
is revision 2.33.

xmltok.dll and xmlparse.dll were obtained from

    ftp://ftp.jclark.com/pub/xml/expat.zip

for the 2.0 release.

Is any of that relevant?

The tests passed in the wee hours (EST; UTC -0500) this morning.  They began
failing after I updated around 1pm EST today.



From thomas@xs4all.net  Sun Jan 21 21:54:05 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 22:54:05 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 02:44:38PM -0500
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>
Message-ID: <20010121225405.M17392@xs4all.nl>

On Sun, Jan 21, 2001 at 02:44:38PM -0500, Tim Peters wrote:

> > ! 	/* Same type name, or (more likely) incomparable numeric types */
> > ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

> That's non-std C in a way Insure complains about elsewhere; change to

> 	return ((Py_uintptr_t)v->ob_type <
> 		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

Why is comparing v->ob_type with w->ob_type illegal ? They're both pointers
to the same type, aren't they ?

> if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
>      compile-at-all<wink>-ly y'rs  - tim

That's easy to check, gcc has these nice (and from a user's point of view,
fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
'-ansi' disables some GCC-specific features, -pedantic turns gcc into a
whiny pedant I'm sure you'd get along with just fine <wink>, and
-pedantic-errors turns those whines into errors.

Doing a quick check I see one error I added myself (but haven't committed) in
the continue-inside-try patch (a trailing comma in an enumerator
definition), and one error in configure (it mis-detects the arguments to
setpgrp() in strict-ANSI mode, for some reason.) I don't see any errors in
the core Python. I see an error in the nis module (missing function
prototype, and broken system-include file) and a *lot* of errors in
linuxaudiodev, but nothing else in the set of modules I can compile. Not
bad!

Note that this was tested in a current tree. I couldn't find either Guido's
'broken' code or your proposed 'good' code, so I don't know if you checked
in a fix yet. If you didn't, don't bother, it's not broken :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From loewis@informatik.hu-berlin.de  Sun Jan 21 22:00:47 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 23:00:47 +0100 (MET)
Subject: [Python-Dev] Re: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>

> [Martin von Loewis]
> > Not for me, on either Solaris or Linux. What expat version?
> 
> Tell me how to answer the question, and I'll be happy to (I have no idea
> what any of this stuff is or does).
>
> My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in
> particular is revision 2.33.

That's good; mine too.

> xmltok.dll and xmlparse.dll were obtained from
> 
>     ftp://ftp.jclark.com/pub/xml/expat.zip
> 
> for the 2.0 release.
> 
> Is any of that relevant?

That gives some clue, yes. Unfortunately, that URL itself is a symlink
that was expat1_1.zip (157936 bytes) at some point, and now is
expat1_2.zip (153591 bytes). The files themselves are not
self-identifying, it's hard to tell once unzipped...

Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
works for me. I never tested 1.95.x (which is also not available from
jclark.com).

> The tests passed in the wee hours (EST; UTC -0500) this morning.
> They began failing after I updated around 1pm EST today.

I just merged pyexpat changes from PyXML into Python 2 so that could
be the cause. However, this very code has been used for some time by
PyXML users; why it crashes for you is a mystery to me.

Any chance of producing a C backtrace?

Regards,
Martin


From tim.one@home.com  Sun Jan 21 22:09:30 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:09:30 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>

FYI, under the debug-build Python, running test_sax.py under the debugger
dies like so:

Passed test_attrs_empty
Passed test_attrs_wattr
Passed test_escape_all
Passed test_escape_basic
Passed test_escape_extra
Passed test_expat_attrs_empty
Passed test_expat_attrs_wattr
Passed test_expat_dtdhandler
Passed test_expat_entityresolver
Passed test_expat_file
Traceback (most recent call last):
  File "../lib/test/test_sax.py", line 603, in ?
    confirm(value(), name)
  File "../lib/test/test_sax.py", line 435, in test_expat_incomplete
    parser.parse(StringIO("<foo>"))
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 42, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\code\python\dist\src\lib\xml\sax\xmlreader.py", line 122, in parse
    self.close()
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 91, in close
    self.feed("", isFinal = 1)
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 82, in feed
    except expat.error:
SystemError: 'finally' pops bad exception

Running it from a command line instead produces the same output up to but
not including the traceback, and Python crashes with a memory fault then.
Attaching to the process with a debugger at that point shows it trying to do
_Py_Dealloc on an op whose op->ob_type member is NULL.  Here's the call
stack at that point:

_Py_Dealloc(_object * 0x007af100) line 1304 + 6 bytes
insertdict(dictobject * 0x007637ec, _object * 0x007a8270,
           long -1601350627, _object * 0x1e1eff18 __Py_NoneStruct)
           line 364 + 48 bytes
PyDict_SetItem(_object * 0x007637ec, _object * 0x007a8270,
          _object * 0x1e1eff18 __Py_NoneStruct) line 498 + 21 bytes
PyDict_SetItemString(_object * 0x007637ec, char * 0x1e1d84fc,
          _object * 0x1e1eff18 __Py_NoneStruct) line 1272 + 17 bytes
PySys_SetObject(char * 0x1e1d84fc, _object * 0x1e1eff18 __Py_NoneStruct)
          line 67 + 17 bytes
reset_exc_info(_ts * 0x00760630) line 2207 + 17 bytes
eval_code2(PyCodeObject * 0x00993df0, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a9d28, int 2,
          _object * * 0x007a9d30, int 1, _object * * 0x009a0b60,
          int 1) line 2125 + 9 bytes
fast_function(_object * 0x009a4f6c, _object * * * 0x0063f5a0, int 4,
          int 2, int 1) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x00993910, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a05e8, int 1,
          _object * * 0x007a05ec, int 0, _object * * 0x00000000,
         int 0) line 1860 + 37 bytes
fast_function(_object * 0x009a549c, _object * * * 0x0063f738, int 1,
         int 1, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x007b35e0, _object * 0x0098110c,
          _object * 0x00000000, _object * * 0x009beb10, int 2,
          _object * * 0x00000000, int 0, _object * * 0x00000000,
          int 0) line 1860 + 37 bytes
call_eval_code2(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2765 + 57 bytes
call_object(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2594 + 17 bytes
call_method(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2717 + 17 bytes
call_object(_object * 0x007e125c, _object * 0x009beafc,
         _object * 0x00000000) line 2592 + 17 bytes
do_call(_object * 0x007e125c, _object * * * 0x0063f96c, int 2,
        int 0) line 2915 + 17 bytes
eval_code2(PyCodeObject * 0x00991560, _object * 0x0098794c,
        _object * 0x00000000, _object * * 0x009bce98, int 2,
        _object * * 0x009bcea0, int 0, _object * * 0x00000000,
        int 0) line 1863 + 30 bytes
fast_function(_object * 0x009a7dfc, _object * * * 0x0063fb04, int 2,
        int 2, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f7e00, _object * 0x0076f14c,
       _object * 0x00000000, _object * * 0x00775904, int 0,
       _object * * 0x00775904, int 0, _object * * 0x00000000,
       int 0) line 1860 + 37 bytes
fast_function(_object * 0x009bc8ac, _object * * * 0x0063fc9c, int 0,
       int 0, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c, _object * * 0x00000000, int 0,
      _object * * 0x00000000, int 0, _object * * 0x00000000,
      int 0) line 1860 + 37 bytes
PyEval_EvalCode(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c) line 338 + 29 bytes
run_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 919 + 17 bytes
run_err_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 907 + 21 bytes
PyRun_FileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 257,
     _object * 0x0076f14c, _object * 0x0076f14c, int 1) line 899 + 21 bytes
PyRun_SimpleFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 612 + 30 bytes
PyRun_AnyFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 466 + 17 bytes
Py_Main(int 2, char * * 0x00760da0) line 295 + 44 bytes
main(int 2, char * * 0x00760da0) line 10 + 13 bytes

insertdict is doing

    Py_DECREF(old_value);

reset_exc_info is doing

    PySys_SetObject("exc_type", frame->f_exc_type);

Bet that's as helpful to you as it was to me <wink>.



From thomas@xs4all.net  Sun Jan 21 22:13:02 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 23:13:02 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>; from thomas@xs4all.net on Sun, Jan 21, 2001 at 10:54:05PM +0100
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <20010121225405.M17392@xs4all.nl>
Message-ID: <20010121231302.N17392@xs4all.nl>

On Sun, Jan 21, 2001 at 10:54:05PM +0100, Thomas Wouters wrote:
> I see an error in the nis module (missing function prototype, and broken
> system-include file) and a *lot* of errors in linuxaudiodev

The errors in linuxaudiodev are only errors because for some reason, in
-ansi -pedantic-errors mode, gcc doesn't define the 'linux' symbol. IMHO,
not worth fixing. The nismodule is 'broken' because of this:

static
nismaplist *
nis_maplist (void)
{
        nisresp_maplist *list;
        char *dom;
        CLIENT *cl, *clnt_create();

clnt_create() should be declared by the system include files. Anyone have
objections to me moving it to pyport.h, inside the '#if 0' ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Sun Jan 21 22:28:45 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:28:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>

[Thomas Wouters]
> Why is comparing v->ob_type with w->ob_type illegal ? They're
> both pointers to the same type, aren't they ?

Non-equality comparison of pointers is defined if and only if the pointers
are both addresses in the same contiguous structure (think struct or array);
an exception is made for a pointer "one beyond the end" of an array, i.e. if

    sometype a[N];

then &a[0] < &a[N] == 1 is guaranteed despite that &a[N] is outside the
bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
e.g., it's OK if they compare equal, or if the comparison causes a hardware
fault, or ...).

> That's easy to check, gcc has these nice (and from a user's point of view,
> fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
> '-ansi' disables some GCC-specific features, -pedantic turns gcc into a
> whiny pedant I'm sure you'd get along with just fine <wink>, and
> -pedantic-errors turns those whines into errors.

Your faith in gcc is as charming as it is naive <wink>:  the most
interesting cases of undefined behavior can't be checked no-way, no-how at
compile-time.  That's why Barry keeps talking employers into dumping
thousands of dollars into a single Insure++ license.  Insure++ actually tags
every pointer at runtime with its source, and gripes if non-equality
comparisons are done on a pair not derived from the same array or malloc
etc.  Since Python type objects are individually allocated (not taken from a
preallocated contiguous vector), Insure++ should complain about that
compare.

> ...
> Note that this was tested in a current tree. I couldn't find
> either Guido's 'broken' code or your proposed 'good' code, so I
> don't know if you checked in a fix yet. If you didn't, don't bother,
> it's not broken :-)

Guido hasn't checked it in yet, but gcc isn't smart enough to detect *this*
breakage anyway.




From fredrik@effbot.org  Sun Jan 21 23:02:10 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 00:02:10 +0100
Subject: [Python-Dev] more unicode database changes
Message-ID: <030501c083fe$2fe7dbf0$e46940d5@hagrid>

Just checked in another unicode database patch, which
saves another ~60k.  On my Windows box, the Unicode
tables are now about 200k (down from 600k in 2.0).

After this change, Modules/unicodedatabase.[ch] are no
longer used.

Since I'm on a Windows box with MSVC 5.0, I don't really
want to try removing them from the official build files.
Instead, I've checked in empty versions of the files.

Can anyone help me get rid of all references to them from
the build files (and CVS)?

</F>

PS. btw, if my changes broke the build somewhere, let me
know asap!



From tim.one@home.com  Sun Jan 21 23:07:14 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:07:14 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
> ...
> That gives some clue, yes. Unfortunately, that URL itself is a symlink
> that was expat1_1.zip (157936 bytes) at some point,

That's the one I've been using.

> and now is expat1_2.zip (153591 bytes).

I'm assuming you're recommending that one!  Based on that assumption, I've
downloaded a new one and will put that in the 2.1a1 Windows release.  Scream
if that's not what you want.

> ...
> Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
> works for me. I never tested 1.95.x (which is also not available from
> jclark.com).

If you do and love it, let me know where to get it and I'll ship that
instead.

>> The tests passed in the wee hours (EST; UTC -0500) this morning.
>> They began failing after I updated around 1pm EST today.

> I just merged pyexpat changes from PyXML into Python 2 so that could
> be the cause. However, this very code has been used for some time by
> PyXML users, why it crashes for you is a mystery to me.

Perhaps gc, perhaps uninitialized vars, ..., hard to say.  Unfortunately,
it's not unusual for flawed code to display different behavior across
platforms; or, from the long-term QA perspective, it's *great* that flawed
code doesn't always appear to work on all platforms <wink>.

> Any chance of producing a C backtrace?

Sent that before; doesn't look like much help; we're seeing a NULL type
pointer, but at that stage there's no telling when or where or why it
*became* NULL.

I'm going to rebuild the world from scratch, and use the new DLLs.  You
should assume that didn't help unless I say otherwise within 15 minutes.



From tim.one@home.com  Sun Jan 21 23:09:51 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:09:51 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEEIKAA.tim.one@home.com>

[/F]
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Yay!  I take it CNRI wasn't paying you by the byte <wink>.

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.
>
> Since I'm on a Windows box with MSVC 5.0, I don't really
> want to try removing them from the official build files.
> Instead, I've checked in empty versions of the files.

That's fine.

> Can anyone help me get rid of all references to them from
> the build files (and CVS)?
>
> </F>
>
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

I'll take care of the MS project files -- and I was just about to rebuild
the world from scratch anyway.



From tim.one@home.com  Sun Jan 21 23:20:03 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:20:03 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEFIKAA.tim.one@home.com>

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.

Not so:  unicodedata.c still #includes unicodedatabase.h.



From tim.one@home.com  Sun Jan 21 23:53:13 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:53:13 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIKAA.tim.one@home.com>

[/F]
> ...
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

The Windows build is fine now and changes checked-in.  You can remove

    Modules/unicodedatabase.[ch]

from the project without hurting it (although I imagine the Unixish builds
still need to learn about this!).



From tim.one@home.com  Mon Jan 22 00:12:21 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:12:21 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>

More FYI:  With the new expat1_2.zip (153591 bytes) DLLs, all tests pass on
Windows except for test_sax.  No change in symptoms.  The failure modes for
test_sax depend on all of:

+ Whether run in release or debug builds.

+ Whether test_sax.py is run directly or via regrtest.py.

+ Whether I delete all .pyc/.pyo files first, or use precompiled ones.

+ In debug builds, whether the test is started from within the
  debugger, or I start it via cmdline and attach to the process after
  it crashes (with a memory fault).

Here's a new failure mode:

test test_sax crashed -- XMLParserType: no element found: line 1, column 5

So this smells to high heaven of either a nasty gc problem or referencing
uninitialized memory.  Symptoms don't change if I stick

    import gc
    gc.disable()

at the start of test_sax.py.

Barry, can you try running test_sax under Insure?  I've got little chance of
making enough time tonight to figure this out the hard way ...



From nas@arctrix.com  Sun Jan 21 17:28:52 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 09:28:52 -0800
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 07:12:21PM -0500
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de> <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>
Message-ID: <20010121092852.A24605@glacier.fnational.com>

On Sun, Jan 21, 2001 at 07:12:21PM -0500, Tim Peters wrote:
> So this smells to high heaven of either a nasty gc problem or referencing
> uninitialized memory.  Symptoms don't change if I stick
> 
>     import gc
>     gc.disable()
> 
> at the start of test_sax.py.

Can you try it with WITH_CYCLE_GC undefined?

  Neil


From greg@cosc.canterbury.ac.nz  Mon Jan 22 00:25:08 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:25:08 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>

Suppose I have a class which checks whether it knows
how to do a comparison, and if not, wants to pass it
on to the other operand in case it knows:

  class Foo:

    def __lt__(self, other):
      if I_know_about(other):
        # do the comparison
      else:
        return other.__gt__(self)

If the other operand has a __gt__ method which is
doing similar tricks, infinite recursion could result.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Mon Jan 22 00:36:51 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>
Message-ID: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>

Guido:

> I don't understand how these can be not commutative unless they have a
> side effect on the left argument

I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
ceil(a,b), then clearly a<b != b>a.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From mwh21@cam.ac.uk  Mon Jan 22 00:48:16 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 22 Jan 2001 00:48:16 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Greg Ewing's message of "Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)"
References: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>
Message-ID: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>

Greg Ewing <greg@cosc.canterbury.ac.nz> writes:

> Guido:
> 
> > I don't understand how these can be not commutative unless they have a
> > side effect on the left argument
> 
> I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
> ceil(a,b), then clearly a<b != b>a.

What's floor of two arguments?  In common lisp, (floor a b) is the
largest integer n such that (<= n (/ a b)), in Python it's a type
error...  if you meant min(a,b), then I think the programmer who
thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
deal with (if min has a symbol it's /\, but never mind that).

More generally, people who define their comparison operators in
non-intuitive ways shouldn't really expect intuitive behaviour.  I
thought Guido threatened to document this fact in large letters
somewhere...

Cheers,
M.

-- 
  Premature optimization is the root of all evil in programming.  
                                                       -- C.A.R. Hoare



From greg@cosc.canterbury.ac.nz  Mon Jan 22 00:52:25 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:52:25 +1300 (NZDT)
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure

I'm not sure that the proposed alternative (casting both
pointers to ints and comparing the ints) is any better.
Does the C std define the result of doing that to two
unrelated pointers?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Mon Jan 22 00:56:16 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:56:16 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <20010121092852.A24605@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEJIKAA.tim.one@home.com>

[Neil Schemenauer]
> Can you try it with WITH_CYCLE_GC undefined?

Good idea -- for someone with an infinite amount of free time <wink>.

But being a good sport, I did as you asked with giddy cheer.  Alas, it
didn't help (all the same bizarre context-dependent test_sax failure modes).
I'm sure I disabled WITH_CYCLE_GC correctly, because "import gc" now fails
with ImportError in both release and debug builds.

BTW, a refcount-too-low problem is another good candidate.



From greg@cosc.canterbury.ac.nz  Mon Jan 22 01:00:46 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 14:00:46 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101220100.OAA01820@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21@cam.ac.uk>:

> if you meant min(a,b),

Yes, sorry, that's what I meant. Or at least that's what
I thought the original poster meant - if he didn't, then
I'm confused, too!

Anyway, I agree that it's a silly thing to want to make
a>b mean, and I'm not all that disappointed that it won't
be possible.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Mon Jan 22 01:11:52 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:11:52 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEKIKAA.tim.one@home.com>

[Greg Ewing]
> I'm not sure that the proposed alternative (casting both
> pointers to ints and comparing the ints) is any better.
> Does the C std define the result of doing that to two
> unrelated pointers?

C99 guarantees that, if the type exists, casting a pointer to type uintptr_t
won't blow up, and also guarantees that comparisons between (at least) ints
of the same type won't blow up.  Beyond that, we don't care what it returns.
Mostly we're trying to eliminate warnings Barry has to wade thru from
Insure++ -- same reason we have a "no compiler warnings!" build policy.
Doing the cast is obviously "better" when viewed through Barry's 4AM eyes.

You can find out *why* C has this rule (which was in C89, not new in C99) by
reading the C FAQ.



From tim.one@home.com  Mon Jan 22 01:23:27 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:23:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEELIKAA.tim.one@home.com>

[Michael Hudson]
> ...
> if you meant min(a,b), then I think the programmer who
> thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
> deal with (if min has a symbol it's /\, but never mind that).

Curiously, in the Icon language, if a is less than b then

   a < b

returns b while

   b > a

returns a.

In this way they get the same effect as Python's chained comparisons

   a < b < c < d

via purely binary operators (if a is *not* less than b, a < b in Icon
"fails", which is a silent event that causes the expression's context to
backtrack -- but we won't go into that here <wink>).

Anyway, that accounts for this curious Icon idiom:

   a <:= b

which is short for

   a := a < b

and binds a to max(a, b) (if a is smaller, a < b returns b and the
assignment proceeds; but if a is not smaller, a < b fails and that
propagates into its context, which here has no other possibilities to
backtrack into, so the stmt just ends leaving a alone).
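
For anyone who doesn't read Icon, here is a rough Python sketch of the
behaviour described above.  It is only an approximation:  Icon's "failure"
is modelled with an exception, and the names Failure, icon_lt, chain_lt and
augmented_lt are made up for illustration (nothing here is real Icon or a
real Python API).

```python
class Failure(Exception):
    """Stands in for Icon's silent expression failure (an assumption of this sketch)."""


def icon_lt(a, b):
    """Icon-style a < b: produce the right operand on success, fail otherwise."""
    if a < b:
        return b
    raise Failure


def chain_lt(*values):
    """Fold icon_lt left to right, the way Icon evaluates a < b < c < d."""
    result = values[0]
    try:
        for nxt in values[1:]:
            result = icon_lt(result, nxt)
    except Failure:
        return False
    return True


def augmented_lt(a, b):
    """Icon's a <:= b: rebind a to b if a < b, otherwise leave a alone."""
    try:
        return icon_lt(a, b)
    except Failure:
        return a
```

With these definitions chain_lt(1, 2, 3, 4) is true, chain_lt(1, 3, 2) is
false, and augmented_lt(a, b) gives the same answer as max(a, b) for
ordinary numbers.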

"<"-and-">"-are-just-bags-of-pixels-ly y'rs  - tim



From uche.ogbuji@fourthought.com  Mon Jan 22 01:24:46 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Sun, 21 Jan 2001 18:24:46 -0700
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: Message from Guido van Rossum <guido@digicool.com>
 of "Sun, 21 Jan 2001 12:00:02 EST." <200101211700.MAA25479@cj20424-a.reston1.va.home.com>
Message-ID: <200101220124.SAA08868@localhost.localdomain>

> > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> > should be an annotation in the documentation that indicates whether or not a
> > module is thread-safe.  For example, many functions in fileinput rely on a
> > module global called _state.  It strikes me that this module is not likely
> > to be thread-safe, yet the documentation doesn't appear to mention this,
> > certainly not in an obvious fashion.
> > 
> > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> > LaTeX macros in Fred's arsenal?  This would make documenting these
> > properties both easy and consistent across modules.
> 
> It's hard to say whether a *whole module* is threadsafe.  E.g. in the
> fileinput example, there's the clear implication that if you use this
> in multiple threads, you should instantiate your own FileInput
> instances, and then you're totally thread-safe.  Clearly the semantics
> of the module-global functions are thread-unsafe though.

Perhaps what is needed rather is a prose annotation for thread-safety issues.

My TeX is rusty, but in Docbook, with the use of role attributes, one could 
have, taking your FileInput example

<sect1 role="thread-safety"><para>
  The module-global functions are not safe, but if you instantiate your own 
FileInput instances, they will be totally thread-safe.
</para></sect1>

That way the MT issues could be styled differently on rendering, gathered into 
separate documentation, stripped by those who don't care, etc.  I imagine this 
is also possible in TeX.
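
As a concrete illustration of the FileInput point, here is a minimal sketch
in which each thread builds its own FileInput instance rather than going
through the module-global functions.  The helper names (count_lines, demo)
and the temporary files are invented for the example, and it assumes a
Python recent enough that FileInput can be used in a "with" statement.

```python
import fileinput
import os
import tempfile
import threading


def count_lines(path, results, key):
    """Count lines with a private FileInput instead of the module-global state."""
    with fileinput.FileInput(files=[path]) as reader:
        results[key] = sum(1 for _ in reader)


def demo():
    """Create two small files and count their lines from two threads."""
    paths = []
    for n in (3, 5):
        handle, path = tempfile.mkstemp(text=True)
        with os.fdopen(handle, "w") as out:
            out.write("line\n" * n)
        paths.append(path)

    results = {}
    threads = [
        threading.Thread(target=count_lines, args=(path, results, index))
        for index, path in enumerate(paths)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    for path in paths:
        os.remove(path)
    return results
```

Because no thread touches fileinput's shared module-level state, the counts
do not interfere with one another.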


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From tim.one@home.com  Mon Jan 22 01:32:30 2001
From: tim.one@home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:32:30 -0500
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>

[Greg Ewing]
> Suppose I have a class which checks whether it knows
> how to do a comparison, and if not, wants to pass it
> on to the other operand in case it knows:
>
>   class Foo:
>
>     def __lt__(self, other):
>       if I_know_about(other):
>         # do the comparison
>       else:
>         return other.__gt__(self)
>
> If the other operand has a __gt__ method which is
> doing similar tricks, infinite recursion could result.

Does this have something to do with comparisons?  That is, wouldn't the same
be true if you coded two methods named "spam" and "eggs" in this way?

whatever = 0

class Foo:
    def spam(self, other):
       if whatever:
           return 1
       else:
           return other.eggs(self)

class Bar:
    def eggs(self, other):
       if whatever:
           return 1
       else:
           return other.spam(self)

Foo().spam(Bar())  # RuntimeError: Maximum recursion depth exceeded

If that's all there is to it, you got what you asked for.



From greg@cosc.canterbury.ac.nz  Mon Jan 22 03:31:41 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 16:31:41 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>
Message-ID: <200101220331.QAA01833@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one@home.com>:

> Does this have something to do with comparisons?  That is, wouldn't the same
> be true if you coded two methods named "spam" and "eggs" in this
> way?

Yes, but Guido hasn't decreed that a.spam(b) and b.eggs(a) are
to have a reflective relationship with each other.

But don't worry - I've belatedly realised that the correct way
to do what I was talking about is to return NotImplemented and
let the interpreter take care of calling the reflected method.
So I withdraw my objection.
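
A minimal sketch of that approach, with two made-up classes (Metres and
Feet are purely illustrative):  __lt__ returns NotImplemented when it
doesn't know the other operand, and the interpreter then tries the
reflected __gt__ on the right-hand operand by itself.

```python
class Metres:
    """Hypothetical class that only knows how to compare with its own kind."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if isinstance(other, Metres):
            return self.value < other.value
        # Don't call other.__gt__(self) by hand; let the interpreter try it.
        return NotImplemented


class Feet:
    """Hypothetical class that knows about Metres as well as itself."""

    def __init__(self, value):
        self.value = value

    def _as_metres(self):
        return self.value * 0.3048

    def __gt__(self, other):
        if isinstance(other, Metres):
            return self._as_metres() > other.value
        if isinstance(other, Feet):
            return self.value > other.value
        return NotImplemented
```

Here Metres(1) < Feet(10) ends up in Feet.__gt__ and no method ever calls
the other one directly, so the mutual recursion can't arise.  What happens
when *both* sides return NotImplemented depends on the Python version; in
current Python 3 the comparison raises TypeError.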

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Mon Jan 22 07:54:32 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 02:54:32 -0500
Subject: [Python-Dev] Worse news
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>

I still don't have a clue about test_sax, but have stumbled into more
failure modes.  Most of them seem related to the SystemError ("'finally'
pops bad exception").  Around that part of ceval.c, sometimes the v popped
off the stack has a NULL type pointer, other times it's a pointer to a
damaged PyTuple_Type (for example, with a tp_dealloc field of 0x61, which
leads to an illegal instruction exception).

The MS debug heap routines fill all newly malloc'ed memory with 0Xcd ("clean
landfill"), fill free'ed memory with 0Xdd ("dead landfill"), and *pad*
malloc'ed memory with some number of 0xfd bytes on both sides ("no-man's
land").  The clean landfill and no-man's land patterns are showing up more
often than they should "by chance", and especially in high-order bytes.  Just
more evidence of the obvious:  something is really screwed up <wink>.
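
For eyeballing suspicious pointer values, something like the following
sketch can help.  It only encodes the three fill bytes named above; the
helper name fill_bytes and the default 4-byte word size are assumptions
made for the example, not part of any debugger or CRT API.

```python
# Fill bytes used by the MS debug heap, as described above.
FILL_PATTERNS = {
    0xCD: "clean landfill (fresh malloc)",
    0xDD: "dead landfill (freed)",
    0xFD: "no-man's land (guard padding)",
}


def fill_bytes(value, width=4):
    """Return (byte_index, label) pairs for bytes of `value` matching a fill pattern.

    byte_index 0 is the low-order byte; `width` is the word size in bytes.
    """
    hits = []
    for index in range(width):
        byte = (value >> (8 * index)) & 0xFF
        if byte in FILL_PATTERNS:
            hits.append((index, FILL_PATTERNS[byte]))
    return hits
```

For instance, fill_bytes(0xCDCD0061) reports clean landfill in byte
positions 2 and 3 (the high-order half), while fill_bytes(0x61) reports
nothing.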

I cannot get the subtest that test_sax is calling (test_expat_incomplete) to
fail in isolation.

Next headache:  If I delete all .pyc files from Lib/ and Lib/test/, and then
run:

python ../lib/test/regrtest.py -x test_sax

by hand, all the 98 tests that *should* run on Windows (excluding, of
course, test_sax, which is no longer tried) pass.  If I immediately run them
again (without deleting .pyc) by hand:

python ../lib/test/regrtest.py -x test_sax

then they again pass.  However, if I do

rt -x test_sax

which does exactly the steps (delete .pyc, run regrtest excluding test_sax,
run regrtest again) via the little MS batch file rt.bat, then on the second
time thru regrtest, and 5 times out of 5, it died in test_extcall with an
"illegal operation", while executing

		if (TYPE(c) == DOUBLESTAR) {

near the end of symtable_params in compile.c.  This is an optimized build,
and the debugger has no idea what's in c at this point; to judge from the
offending machine instruction and register contents, though, c is a bad
pointer.

Have not been able to get test_extcall to fail in isolation.

Have also been unable to get test_extcall to fail in the debug build.


So there's evidence of Deep Rot beyond test_sax, but test_sax remains the
only test that fails every time and under both build types.

Running regrtest with -r (randomize test order) is also "interesting":
first time I tried that, test_cpickle failed (truncated output) as well as
test_sax.

I doubt anyone has run the tests more often than me over the last week, so
I'm not surprised I'm seeing the most problems.  However, since *nobody* is
seeing anything on Linux, I'd at least like to get *someone* else to run the
tests on Windows.  While I'm not having any unusual problems with my box,
it's certainly possible that I've got a corrupted file or a flaky memory
chip etc, or that MSVC is generating bad code for some recent change
(although that's unlikely since the debug build generates *really*
straightforward code).

Deleting my entire PCbuild subtree and refetching it from CVS didn't make
any difference.



From esr@thyrsus.com  Mon Jan 22 08:01:27 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 03:01:27 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Sun, Jan 21, 2001 at 10:22:44PM +0100
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>
Message-ID: <20010122030127.C20804@thyrsus.com>

Martin von Loewis <loewis@informatik.hu-berlin.de>:
> > It looks like with-cycle-gc and maybe with-pydebug would have to be
> > changed.
> 
> I'm in favour of changing it.

Likewise.  Let's be good neighbors.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Where rights secured by the Constitution are involved, there can be no
rule making or legislation which would abrogate them.
        -- Miranda vs. Arizona, 384 US 436 p. 491


From loewis@informatik.hu-berlin.de  Mon Jan 22 08:26:15 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 09:26:15 +0100 (MET)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
Message-ID: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>

> Running it from a command line instead produces the same output up to but
> not including the traceback, and Python crashes with a memory fault then.
> Attaching to the process with a debugger at that point shows it trying to do
> _Py_Dealloc on an op whose op->op_type member is NULL.
[...]
> Bet that's as helpful to you as it was to me <wink>.

Well, it was at least motivating enough to try it out on my Whistler
installation. Purify would probably find this rather quickly; the code
writes into the 257th element of a 256-elements array. I've committed
a fix.

Depending on the exact organization of globals, this could have easily
gone unnoticed. MSVC packs variables more than gcc does, so the write
would overwrite one byte in ErrorObject, which would then not point to
a PyObject anymore.
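
The effect can be modelled in a few lines of Python.  This is a toy
simulation, not the pyexpat code:  it assumes a 32-bit little-endian
pointer stored right after a 256-byte table, and the address used is made
up.  The "padding" parameter stands in for a layout where the stray byte
lands in unused space instead.

```python
import struct

TABLE_SIZE = 256
POINTER_FORMAT = "<I"  # assumption: 32-bit little-endian pointer (x86 Windows)


def make_globals(pointer_value, padding=0):
    """Model a 256-byte table, optional padding, then a pointer variable."""
    return (bytearray(TABLE_SIZE) + bytearray(padding)
            + bytearray(struct.pack(POINTER_FORMAT, pointer_value)))


def pointer_after_overrun(pointer_value, padding=0):
    """Write one byte past the table and return the pointer as it reads afterwards."""
    memory = make_globals(pointer_value, padding)
    memory[TABLE_SIZE] = 0x00  # the stray write to the "257th element"
    return struct.unpack_from(POINTER_FORMAT, memory, TABLE_SIZE + padding)[0]


ADDRESS = 0x1E0A2B40  # made-up address standing in for ErrorObject
print(hex(pointer_after_overrun(ADDRESS)))             # tightly packed layout
print(hex(pointer_after_overrun(ADDRESS, padding=4)))  # looser layout
```

In the packed case the low byte of the pointer is clobbered (0x1e0a2b00
instead of 0x1e0a2b40); with padding in between, the pointer survives and
the bug stays invisible.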

Thanks for your patience,
Martin


From tim.one@home.com  Mon Jan 22 09:18:04 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 04:18:04 -0500
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing (Windows))
In-Reply-To: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>

[Martin]
> Well, it was at least motivating enough to try it out on my Whistler
> installation. Purify would probably find this rather quickly; the code
> writes into the 257th element of a 256-elements array.

Ah!  You shouldn't do that <wink>.

> I've committed a fix.

But you should do that.  Thank you!

Here's where I am now:

=========================================================================
All test_sax failures have gone away (yay!).
=========================================================================
Running

    rt -x test_sax

on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
up:

    using the debug build; or
    if test_sax is *not* excluded; or
    in the 1st pass; or
    when running test_extcall in isolation; or
    if the steps rt performs are done by hand
=========================================================================
Running

    rt -r

on Windows still sees test_cpickle fail in the first pass (with truncated
output), but succeed in the second pass.  First-pass failure is always like
so (modulo line breaks I'm inserting by hand):

test test_cpickle failed -- Tail of expected stdout unseen:
'dumps()\012
loads()\012
ok\012
loads() DATA\012
ok\012
dumps() binary\012
loads() binary\012
ok\012
loads() BINDATA\012
ok\012
dumps() RECURSIVE\012
ok\012'

I've also seen it fail at least once when doing the same thing by hand:

    del ..\lib\*.pyc
    del ..\lib\test\*.pyc
    python ../lib/test/regrtest.py -r

else-i-would-have-asked-martin-to-look-for-a-digit-to-change-in-
    command.com<wink>-ly y'rs  - tim



From mal@lemburg.com  Mon Jan 22 10:19:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:19:18 +0100
Subject: [Python-Dev] more unicode database changes
References: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <3A6C0926.D0A004E4@lemburg.com>

Fredrik Lundh wrote:
> 
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Great work, Fredrik :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 22 10:42:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:42:52 +0100
Subject: [Python-Dev] readline and setup.py
References: <3A68B5B0.771412F7@lemburg.com>
Message-ID: <3A6C0EAC.7D322174@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> The new setup.py procedure for Python causes readline not to
> be built on my machine. Instead I get a linker error telling
> me that termcap is not found.
> 
> Looking at my old Setup file, I have this line:
> 
> readline readline.c \
>          -I/usr/include/readline -L/usr/lib/termcap \
>          -lreadline -lterm
> 
> I guess, setup.py should be modified to include additional
> library search paths -- shouldn't hurt on platforms which
> don't need them.

Here's a patch which works for me:

projects/Python> diff CVS-Python/setup.py Dev-Python/
--- CVS-Python/setup.py Mon Jan 22 11:36:56 2001
+++ Dev-Python/setup.py Mon Jan 22 11:40:15 2001
@@ -216,10 +216,11 @@ class PyBuildExt(build_ext):
             exts.append( Extension('rgbimg', ['rgbimgmodule.c']) )
 
         # readline
         if (self.compiler.find_library_file(lib_dirs, 'readline')):
             exts.append( Extension('readline', ['readline.c'],
+                                   library_dirs=['/usr/lib/termcap'],
                                    libraries=['readline', 'termcap']) )
 
         # The crypt module is now disabled by default because it breaks builds
         # on many systems (where -lcrypt is needed), e.g. Linux (I believe).
 


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 22 10:52:17 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:52:17 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com>
Message-ID: <3A6C10E1.EF890356@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Why does setup.py stop with an error in case _tkinter cannot
> be built (due to an old Tk/Tcl version in my case) ?
> 
> I think the policy in setup.py should be to output warnings,
> but continue building the rest of the Python modules.

I haven't heard anything from the powers that be... what should the
policy be for auto-detected and -configured modules ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Mon Jan 22 12:37:04 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 13:37:04 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C10E1.EF890356@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 11:52:17AM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com>
Message-ID: <20010122133704.O17392@xs4all.nl>

On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> "M.-A. Lemburg" wrote:

> > I think the policy in setup.py should be to output warnings,
> > but continue building the rest of the Python modules.

> I haven't heard anything from the powers that be... what should the
> policy be for auto-detected and -configured modules ?

I think Andrew is still working on a way to disable modules from the command
line somehow. (I think moving setup.py to setup.py.in, and using autoconf
--options would be easiest on both developer and user, but that's just me.)
I also think everyone agrees with you that a module that can't be built
shouldn't stop the entire process in the final release (and possibly the
betas) but that it's definitely a good way to debug setup.py in the alphas.
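
Roughly, the policy could look like the sketch below.  This is not
setup.py's actual code or the distutils API:  build_all, its "modules"
list and the build_one callable are hypothetical stand-ins, and the
"strict" flag is just one way to keep the stop-on-error behaviour for
alphas.

```python
import warnings


def build_all(modules, build_one, strict=False):
    """Try each module; warn on failure instead of aborting the whole run.

    `modules` and `build_one` are hypothetical stand-ins, not setup.py's API.
    With strict=True the first failure is re-raised (useful during alphas).
    """
    built, skipped = [], []
    for name in modules:
        try:
            build_one(name)
        except Exception as exc:
            if strict:
                raise
            warnings.warn("skipping %s: %s" % (name, exc))
            skipped.append(name)
        else:
            built.append(name)
    return built, skipped
```

A release build would call build_all(modules, build_one) and report the
skipped list at the end; an alpha build would pass strict=True.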

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tismer@tismer.com  Mon Jan 22 13:13:46 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 14:13:46 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>
Message-ID: <3A6C320A.37CBB4E5@tismer.com>

Maybe I can help.

Tim Peters wrote:
...
> Here's where I am now:
> 
> =========================================================================
> All test_sax failures have gone away (yay!).
> =========================================================================
> Running
> 
>     rt -x test_sax
> 
> on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
> up:
> 
>     using the debug build; or
>     if test_sax is *not* excluded; or
>     in the 1st pass; or
>     when running test_extcall in isolation; or
>     if the steps rt performs are done by hand
...

I got problems with XML as well. I'm not using SAX, but plain
expat for speed. The following error happens after parsing
thousands of small XML files:

from_my_log_window="""
\\bned-s1\tismer\pxml\sdf\mdl\DisplayRGB\1
\\bned-s1\tismer\pxml\sdf\mdl\DisplayVideo\1
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 149, in getall
    res.append(p.parse())
  File "D:\crml_doc\pxml\clean.py", line 81, in parse
    self.parsers[0].Parse(self.txt1, 1)
  File "D:\crml_doc\pxml\clean.py", line 53, in endElementMaster
    if self.txt2: self.parsers[1].Parse(self.txt2, 1)
  File "D:\crml_doc\pxml\clean.py", line 46, in startElementOther
    if name <> "MASTER":
UnicodeError: UTF-8 decoding error: invalid data
"""

The good news: The error is reproducible, happens the same under
PythonWin and DOS Python, and I can reduce it to a single XML file.
That indicates to me that I am close to the cause of the bug,
not looking at late, indirect effects.
It also *might* be related to Unicode.

I will now try to create a minimized script and XML data that
produces the above again.

back in an hour - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From thomas@xs4all.net  Mon Jan 22 13:52:44 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 14:52:44 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 05:28:45PM -0500
References: <20010121225405.M17392@xs4all.nl> <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <20010122145244.Y17295@xs4all.nl>

On Sun, Jan 21, 2001 at 05:28:45PM -0500, Tim Peters wrote:
> [Thomas Wouters]
> > Why is comparing v->ob_type with w->ob_type illegal ? They're
> > both pointers to the same type, aren't they ?

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure (think struct or array);
> an exception is made for a pointer "one beyond the end" of an array, i.e. if

>     sometype a[N];

> then &a[0] < &a[N] == 1 is guaranteed despite that &a[N] is outside the
> bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
> e.g., it's OK if they compare equal, or if the comparison causes a hardware
> fault, or ...).

Ok, I guess I stand corrected. I was confused by the name of Py_uintptr_t: I
thought it was a pointer-to-int, not an int large enough to hold a pointer.
I'm also positively appalled by the fact the standard refuses to define sane
behaviour for out-of-bounds access on an array, but attaches some weird
significance to what pointers are pointing *to*, when comparing the values
of those pointers, regardless of what type of object they are stored in. But
I guess I don't have to whine about that to you, Tim :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tismer@tismer.com  Mon Jan 22 14:03:25 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 15:03:25 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com>
Message-ID: <3A6C3DAD.522CE623@tismer.com>


Christian Tismer wrote:
> 
> Maybe I can help.

...

...
> I will now try to create a minimized script and XML data that
> produces the above again.
> 
> back in an hour - chris

Here we go.
The following session produces the mentioned UTF8 error:

>>> txt = "<master desc='blah\325weird' />"
>>> def startelt(name, dic):
... 	print name, dic
... 	
>>> p=expat.ParserCreate()
>>> p.StartElementHandler = startelt
>>> p.Parse(txt)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

The behavior depends on the character code.
From code 128 (0200) to 191 (0277) the parser gives a
"not well-formed" exception, as it should.

The codes from 192 to 236, 238-243 produce
"UTF-8 decoding error: invalid data",
the rest gives "not well-formed".

I would like to know if this happens with your (Tim) modified
version as well. I'm using plain vanilla BeOpen Python 2.0 .

cheers - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From jeremy@alum.mit.edu  Mon Jan 22 14:19:34 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 09:19:34 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.16758.68050.257212@localhost.localdomain>

Tim,

Funny (strange or haha?) that test_extcall is failing since the two
pieces of code I've modified most recently are compile.c and the
section of ceval.c that handles extended call syntax.  I just got
through my mail this morning and I'll see what I can reproduce on
Linux.

As for the test_sax failure, is any of the Python code being executed
conditional on platform?  The compiler may be generating bad bytecode
for a code path that is only executed on Windows.

Jeremy



From mal@lemburg.com  Mon Jan 22 14:27:38 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:27:38 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com>
Message-ID: <3A6C4359.BCB06252@lemburg.com>

Christian Tismer wrote:
> 
> Christian Tismer wrote:
> >
> > Maybe I can help.
> 
> ...
> 
> ...
> > I will now try to create a minimized script and XML data that
> > produces the above again.
> >
> > back in an hour - chris
> 
> Here we go.
> The following session produces the mentioned UTF8 error:
> 
> >>> txt = "<master desc='blah\325weird' />"
> >>> def startelt(name, dic):
> ...     print name, dic
> ...
> >>> p=expat.ParserCreate()
> >>> p.StartElementHandler = startelt
> >>> p.Parse(txt)
> Traceback (innermost last):
>   File "<interactive input>", line 1, in ?
> UnicodeError: UTF-8 decoding error: invalid data
> 
> The behavior depends on the character code.
> From code 128 (0200) to 191 (0277) the parser gives a
> "not well-formed" exception, as it should.
> 
> The codes from 192 to 236, 238-243 produce
> "UTF-8 decoding error: invalid data",
> the rest gives "not well-formed".
> 
> I would like to know if this happens with your (Tim) modified
> version as well. I'm using plain vanilla BeOpen Python 2.0 .

This has nothing to do with Python. UTF-8 marks the codes
from 128-191 as illegal prefixes. See Objects/unicodeobject.c:

static 
char utf8_code_length[256] = {
    /* Map UTF-8 encoded prefix byte to sequence length.  zero means
       illegal prefix.  see RFC 2279 for details */
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
};

Perhaps the parser should catch the UnicodeError and
instead return a not-wellformed exception ?!
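
The table above can be restated as a short check. This is only a sketch in
present-day Python 3, not the code from unicodeobject.c; the names
utf8_lead_length and first_bad_sequence are made up for illustration. It
shows why \325 (decimal 213) trips the decoder: it announces a two-byte
sequence, but the next byte, "w", is not a continuation byte.

```python
def utf8_lead_length(byte):
    """Return the sequence length implied by a UTF-8 lead byte (0 = illegal prefix)."""
    if byte < 0x80:
        return 1      # 0-127: plain ASCII, one byte
    if byte < 0xC0:
        return 0      # 128-191: continuation bytes, never a valid prefix
    if byte < 0xE0:
        return 2      # 192-223
    if byte < 0xF0:
        return 3      # 224-239
    if byte < 0xF8:
        return 4      # 240-247
    if byte < 0xFC:
        return 5      # 248-251 (RFC 2279 only)
    if byte < 0xFE:
        return 6      # 252-253 (RFC 2279 only)
    return 0          # 254-255: never valid


def first_bad_sequence(data):
    """Return (offset, lead_byte) of the first malformed sequence, or None.

    Only the lead byte and the 10xxxxxx shape of the continuation bytes are
    checked; overlong forms and surrogates are deliberately ignored here.
    """
    i = 0
    while i < len(data):
        length = utf8_lead_length(data[i])
        tail = data[i + 1:i + length]
        if (length == 0 or len(tail) != length - 1
                or any((b & 0xC0) != 0x80 for b in tail)):
            return i, data[i]
        i += length
    return None
```

Applied to Christian's test string as bytes, first_bad_sequence() points at
the \325 byte, which is where the decoding error originates.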
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 22 14:38:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:38:14 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl>
Message-ID: <3A6C45D5.9A6FA25C@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> > "M.-A. Lemburg" wrote:
> 
> > > I think the policy in setup.py should be to output warnings,
> > > but continue building the rest of the Python modules.
> 
> > I haven't heard anything from the powers that be... what should the
> > policy be for auto-detected and -configured modules ?
> 
> I think Andrew is still working on a way to disable modules from the command
> line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> --options would be easiest on both developer and user, but that's just me.)

This is fairly simple to do: distutils allows great flexibility
when it comes to adding user options, e.g. we could have

python setup.py --enable-tkinter --disable-readline

or more generic

python setup.py --enable-package tkinter --disable-package readline

The options could then be edited in setup.cfg.
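
To make the idea concrete, here is a rough sketch of how such flags could be
separated from the rest of the command line. It is hand-rolled Python 3, not
a distutils API, and the function name split_module_flags is hypothetical;
how setup.py would actually consume the result is left open.

```python
def split_module_flags(argv):
    """Split --enable-NAME / --disable-NAME flags from the remaining arguments."""
    enabled, disabled, rest = [], [], []
    for arg in argv:
        if arg.startswith("--enable-"):
            enabled.append(arg[len("--enable-"):])
        elif arg.startswith("--disable-"):
            disabled.append(arg[len("--disable-"):])
        else:
            rest.append(arg)  # anything else is passed through untouched
    return enabled, disabled, rest
```

The disabled names could then feed whatever list setup.py uses to skip
modules, with the enabled names overriding auto-detection.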

> I also think everyone agrees with you that a module that can't be built
> shouldn't stop the entire process in the final release (and possibly the
> betas) but that it's definitely a good way to debug setup.py in the alphas.

True... but currently the only way to get Python to compile is
to hand-edit setup.py and this is not easy for people with no 
prior distutils experience.

BTW, in my case, setup.py did find the TK-libs for 8.0, but for
a beta version -- as a result, _tkinter.c's version #error line 
triggered and the build failed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Mon Jan 22 14:38:30 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 09:38:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:37:57 +0200."
 <20010121123757.D897BA83E@darjeeling.zadka.site.co.il>
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
 <20010121123757.D897BA83E@darjeeling.zadka.site.co.il>
Message-ID: <200101221438.JAA29303@cj20424-a.reston1.va.home.com>

> Wouldn't it be better to use the
> 
> d = {}
> exec "foo", d

Surely you meant

    exec "foo" in d
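
For readers on a current interpreter, a sketch of the same idea in Python 3,
where exec is a function rather than a statement. It assumes the goal is to
run a star-import in a scratch dictionary and inspect what landed there; the
string module is just a convenient example.

```python
import string

namespace = {}
# Python 3 spelling; the statement form in this thread was: exec "..." in namespace
exec("from string import *", namespace)

# exec adds __builtins__ to the dict, so ignore dunder names.
imported = {name for name in namespace if not name.startswith("__")}
missing = set(string.__all__) - imported
```

An empty `missing` set means every name listed in __all__ was actually
exported into the scratch namespace.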

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Mon Jan 22 14:43:42 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 15:43:42 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C45D5.9A6FA25C@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 03:38:14PM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com>
Message-ID: <20010122154342.B17295@xs4all.nl>

On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:

> > I think Andrew is still working on a way to disable modules from the command
> > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > --options would be easiest on both developer and user, but that's just me.)

> This is fairly simple to do: distutils allows great flexibility
> when it comes to adding user options, e.g. we could have
> 
> python setup.py --enable-tkinter --disable-readline
> 
> or more generic
> 
> python setup.py --enable-package tkinter --disable-package readline
> 
> The options could then be edited in setup.cfg.

Note that the 'user' only has 'configure' and 'make' to run, so optimally,
the options would have to be given to one of those (preferably to
'configure', to keep it similar to 90% of the packages out there.)

> but currently the only way to get Python to compile is
> to hand-edit setup.py and this is not easy for people with no 
> prior distutils experience.

You only have to edit the 'disabled_module_list' variable... not too hard
even if you don't have distutils experience (though you do need some python
experience.) I don't think it's wrong to expect people who compile alpha
versions to have at least that much knowledge (though it should be noted in
the README somewhere.)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From loewis@informatik.hu-berlin.de  Mon Jan 22 14:46:39 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 15:46:39 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
In-Reply-To: <3A6C4359.BCB06252@lemburg.com> (mal@lemburg.com)
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <200101221446.PAA05164@pandora.informatik.hu-berlin.de>

> This has nothing to do with Python. UTF-8 marks the codes 
> from 128-191 as illegal prefix. 
[...]
> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

Right on both counts. If no encoding is specified, and if the
document appears not to be UTF-16 in any endianness, an XML processor
shall assume it is UTF-8. As Marc-Andre explains, your document is not
proper UTF-8, hence the error.

The confusing thing is that expat itself does not care about it not
being UTF-8; that is only detected when the callback is invoked in
pyexpat, and therefore conversion to a Unicode object is attempted.

The right solution probably would be to change expat so that it
determines correctness of the encoding for each string it gets as part
of the wellformedness analysis, and produces illformedness exceptions
when an encoding error occurs. Patches are welcome, although they
probably should go to sourceforge.net/projects/expat.
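
Until expat does that itself, a caller can approximate the behaviour by
checking the bytes first. The following is a minimal sketch in present-day
Python 3, assuming a document with no encoding declaration; NotWellFormed
and parse_checked are hypothetical names, not part of pyexpat.

```python
from xml.parsers import expat


class NotWellFormed(Exception):
    """Hypothetical stand-in for a well-formedness error."""


def parse_checked(data, start_handler):
    """Parse `data` (bytes), reporting bad UTF-8 as a well-formedness problem.

    Assumes the document carries no encoding declaration and is not UTF-16,
    so UTF-8 is the encoding an XML processor has to assume.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise NotWellFormed("invalid UTF-8 at byte %d" % exc.start) from exc
    parser = expat.ParserCreate()
    parser.StartElementHandler = start_handler
    parser.Parse(data, True)
```

This keeps the encoding failure out of the handler callback, so the caller
sees one kind of error for one kind of broken input.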

Regards,
Martin


From jack@oratrix.nl  Mon Jan 22 14:57:33 2001
From: jack@oratrix.nl (Jack Jansen)
Date: Mon, 22 Jan 2001 15:57:33 +0100
Subject: [Python-Dev] test_sax and site-python
Message-ID: <20010122145733.85E51373C95@snelboot.oratrix.nl>

I'm not sure whether this is really a bug, but I had the problem that there 
was something wrong with the xml package I had installed into my 
Lib/site-python, and this caused test_sax to complain.

If the test stuff is expected to test only the core functionality maybe 
sys.path should be edited so that it only contains directories that are part 
of the core distribution?
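
A sketch of what such filtering might look like, in Python 3. The directory
markers are an assumption about how site-specific directories are named, and
core_only_path is a made-up helper, not something regrtest provides.

```python
def core_only_path(path):
    """Drop entries that point into site-specific package directories."""
    site_markers = ("site-packages", "site-python")
    return [entry for entry in path
            if not any(marker in entry for marker in site_markers)]
```

A test driver could apply it with sys.path[:] = core_only_path(sys.path)
before importing the modules under test.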
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen@oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++




From tismer@tismer.com  Mon Jan 22 15:05:24 2001
From: tismer@tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 16:05:24 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <3A6C4C34.4D1252C9@tismer.com>


"M.-A. Lemburg" wrote:
...
> > The codes from 192 to 236, 238-243 produce
> > "UTF-8 decoding error: invalid data",
> > the rest gives "not well-formed".
> >
> > I would like to know if this happens with your (Tim) modified
> > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> 
> This has nothing to do with Python. UTF-8 marks the codes
> from 128-191 as illegal prefixes. See Objects/unicodeobject.c:
...

Too bad.

> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

I believe that would be better.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@digicool.com  Mon Jan 22 15:06:06 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:06:06 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: Your message of "Sun, 21 Jan 2001 15:34:14 PST."
 <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net>
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>

> Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> supposed to be declared in system include files (with a proper prototype.)
> Should be moved to a platform-specific block if anyone finds out which
> broken platforms need it :-)

[The following is inside #if 0]
> + /* From Modules/nismodule.c */
> + CLIENT *clnt_create();
> + 

Thomas, I'm not sure if this particular declaration belongs in
pyport.h, even inside #if 0.

CLIENT is declared in a NIS-specific header file that's not included by
pyport.h, but which *is* included by nismodule.c.

I think you did the right thing to nismodule.c; the pyport.h patch is
redundant in my eyes.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From mal@lemburg.com  Mon Jan 22 15:12:49 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 16:12:49 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com> <20010122154342.B17295@xs4all.nl>
Message-ID: <3A6C4DF1.F71AA631@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:
> 
> > > I think Andrew is still working on a way to disable modules from the command
> > > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > > --options would be easiest on both developer and user, but that's just me.)
> 
> > This is fairly simple to do: distutils allows great flexibility
> > when it comes to adding user options, e.g. we could have
> >
> > python setup.py --enable-tkinter --disable-readline
> >
> > or more generic
> >
> > python setup.py --enable-package tkinter --disable-package readline
> >
> > The options could then be edited in setup.cfg.
> 
> Note that the 'user' only has 'configure' and 'make' to run, so optimally,
> the options would have to be given to one of those (preferably to
> 'configure', to keep it similar to 90% of the packages out there.)

Hmm, but then you'll have to hack autoconf again... (even if only
to pass the options to setup.py somehow, e.g. via your proposed
setup.cfg.in trick).
 
> > but currently the only way to get Python to compile is
> > to hand-edit setup.py and this is not easy for people with no
> > prior distutils experience.
> 
> You only have to edit the 'disabled_module_list' variable... not too hard
> even if you don't have distutils experience (though you do need some python
> experience.) I don't think it's wrong to expect people who compile alpha
> versions to have at least that much knowledge (though it should be noted in
> the README somewhere.)

Oops, you're right; must have overlooked that one in setup.py.


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From thomas@xs4all.net  Mon Jan 22 15:14:02 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:14:02 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:06:06AM -0500
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> <200101221506.KAA29773@cj20424-a.reston1.va.home.com>
Message-ID: <20010122161402.D17295@xs4all.nl>

On Mon, Jan 22, 2001 at 10:06:06AM -0500, Guido van Rossum wrote:
> > Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> > supposed to be declared in system include files (with a proper prototype.)
> > Should be moved to a platform-specific block if anyone finds out which
> > broken platforms need it :-)
> 
> [The following is inside #if 0]
> > + /* From Modules/nismodule.c */
> > + CLIENT *clnt_create();
> > + 
> 
> Thomas, I'm not sure if this particular declaration belongs in
> pyport.h, even inside #if 0.
> 
> CLIENT is declared in a NIS-specific header file that's not included by
> pyport.h, but which *is* included by nismodule.c.
> 
> I think you did the right thing to nismodule.c; the pyport.h patch is
> redundant in my eyes.

The same goes for most prototypes inside that '#if 0'. I see it more as an
easy list to see what prototypes were removed than as proper examples of the
prototype. You're right about CLIENT being defined in system-specific
include files, I just wasn't worried about it because it was inside an '#if 0'
that will never be turned into an '#if 1'. If a specific platform needs that
prototype, we'll figure out how to arrange the prototype then :)

But if you want me to remove it, that's fine.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jan 22 15:22:29 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:22:29 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: Your message of "Mon, 22 Jan 2001 03:01:27 EST."
 <20010122030127.C20804@thyrsus.com>
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>
 <20010122030127.C20804@thyrsus.com>
Message-ID: <200101221522.KAA30287@cj20424-a.reston1.va.home.com>

> I've been working a bit on the build process lately.  I came
> across this in the autoconf documentation:
> 
> 
>     If a software package has optional compile-time features, the
>     user can give `configure' command line options to specify
>     whether to compile them. The options have one of these forms:
> 
>         --enable-FEATURE[=ARG]
>         --disable-FEATURE
> 
>     Some packages require, or can optionally use, other software
>     packages which are already installed.  The user can give
>     `configure' command line options to specify which such
>     external software to use.  The options have one of these
>     forms:
> 
>         --with-package[=ARG]
>         --without-package
> 
> 
> Is it worth fixing the Python configure script to comply with
> these definitions?  It looks like with-cycle-gc and maybe
> with-pydebug would have to be changed.

OK, but please add explicit checks for the old --with[out]-cycle-gc
and --with[out]-pydebug flags that cause errors (not just warnings)
when these forms are used.  It's bad enough that configure doesn't
flag typos in such options as errors; if we change the option names,
we really owe users who were using the old forms a clear error.

(Is this stupid autoconf behavior changeable?  Does it also apply to
enable/disable?)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake@acm.org  Mon Jan 22 15:19:49 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 10:19:49 -0500 (EST)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
 <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
Message-ID: <14956.20373.104748.573294@cj42289-a.reston1.va.home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
 > Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
 > works for me. I never tested 1.95.x (which is also not available from
 > jclark.com).

Tim Peters writes:
 > If you do and love it, let me know where to get it and I'll ship that
 > instead.

  I'll recommend not updating to 1.95.1; let's wait at least until
1.95.2 is out.  These are really just pre-2.0 releases to shake things
out.  I have been using the current Expat CVS lightly, but need to do
more testing before I can be confident in it and our bindings (not
yet checked in anywhere; should be in PyXML soon).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From jeremy@alum.mit.edu  Mon Jan 22 15:44:41 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:44:41 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.21865.943601.735426@localhost.localdomain>

On Linux, I am also seeing test_cpickle failures.  I have not been
able to reproduce failures in test_extcall or test_sax.

I ran 'regrtest.py -r -x test_thread test_unicodedata test_signal
test_select test_poll' 10 times and test_cpickle failed five times.
(I did the peculiar run because excluding those five tests shaves two
minutes off the running time of the test suite.)

No more time to look into this...

Jeremy


From jeremy@alum.mit.edu  Mon Jan 22 15:26:27 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:26:27 -0500 (EST)
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <14956.20771.447958.389724@localhost.localdomain>

The pyexpat module uses functions named getcode() and
call_with_frame() for handlers of some sort.  I can make this much out
from the code, but the rest is a bit of a mystery.  I was trying to
read this code because of the errors Tim is seeing with test_sax on
Windows.  A few comments to explain this highly stylized and
macro-laden code would be appreciated.

The module appears to be creating empty code objects and calling
them.  I say they appear to be empty, because when they are created 
they don't appear to have anything initialized except name, filename,
and firstlineno.

    getcode(EndNamespaceDecl, 419)
    <code at 0x81b73c0
        co_name = 'EndNamespaceDecl'
        co_filename = 'pyexpat.c'
        co_firstlineno = 419
        co_argcount = 0
        co_nlocals = 0
        co_stacksize = 0
        co_flags = 0
        co_consts = ()
        co_names = ()
        co_varnames = ()
        co_freevars = ()
        co_cellvars = ()
        co_code = ''
    >

(The freevars and cellvars entries are part of the support for nested
scopes.  They can be safely ignored for the moment.) 

I simply don't understand what's going on -- and I'm deeply suspicious
that it is the source of whatever problems Tim is seeing with
test_sax.

Jeremy


From thomas@xs4all.net  Mon Jan 22 15:55:35 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:55:35 +0100
Subject: [Python-Dev] 'make distclean' broken.
Message-ID: <20010122165535.P17392@xs4all.nl>

'make distclean' seems broken, at least on non-GNU make's:

[snip]
clobbering subdirectory Modules
rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
rm -f add2lib hassignal
rm -f *.a tags TAGS config.c Makefile.pre
rm -f *.so *.sl so_locations
make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
"./Makefile.in", line 134: Need an operator
make: fatal errors encountered -- cannot continue
*** Error code 1 (ignored)
rm -f config.status config.log config.cache config.h Makefile
rm -f buildno platform
rm -f Modules/Makefile
[snip]

(This is using FreeBSD's 'make'.)

Looking at line 134, I'm not sure why it works with GNU make other than that
it avoids complaining about syntax errors it doesn't run into (which could
be both bad and good :) or that it avoids complaining about obvious GNU
autoconf tricks. But I don't know enough about make to say for sure, nor to
fix the above problem.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jan 22 15:55:42 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:55:42 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 17:28:45 EST."
 <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>

> Your faith in gcc is as charming as it is naive <wink>:  the most
> interesting cases of undefined behavior can't be checked no-way, no-how at
> compile-time.  That's why Barry keeps talking employers into dumping
> thousands of dollars into a single Insure++ license.  Insure++ actually tags
> every pointer at runtime with its source, and gripes if non-equality
> comparisons are done on a pair not derived from the same array or malloc
> etc.  Since Python type objects are individually allocated (not taken from a
> preallocated contiguous vector), Insure++ should complain about that
> compare.

IMHO, *this* *particular* gripe of Insure++ is just a pain in the
butt, and I wish there was a way to turn it off in Insure++ without
having to fix the code.

IMHO, this was included in the standard to allow segmented-memory
implementations of C.  Think certain DOS or Windows 3.1 memory models
where a pointer is a segment plus an offset.  This is not current
practice even on Palmpilots!

The standard may say that such comparisons are undefined, but I don't
care about this particular undefinedness, and I'm annoyed by the
required patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan 22 16:02:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 11:02:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:44:38 EST."
 <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>
Message-ID: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>

> > My only concern is that under the old scheme, two different numeric
> > extension types that somehow can't be compared will end up being
> > *equal*.  To fix this, I propose that if the names compare equal, as a
> > last resort we compare the type pointers -- this should be consistent
> > too.
> 
> Agreed, and sounds fine!

Checked in now.

While fixing the test_b1 code again, which depends on this behavior, I
thought of a refinement: it wouldn't be hard to make None compare
smaller than *anything* (including numbers).

Is this worth it?

diff -c -r2.113 object.c
*** object.c	2001/01/22 15:59:32	2.113
--- object.c	2001/01/22 16:03:38
***************
*** 550,555 ****
--- 550,561 ----
  		PyErr_Clear();
  	}
  
+ 	/* None is smaller than anything */
+ 	if (v == Py_None)
+ 		return -1;
+ 	if (w == Py_None)
+ 		return 1;
+ 
  	/* different type: compare type names */
  	if (v->ob_type->tp_as_number)
  		vname = "";
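
The resulting order of fallbacks can be modelled in a few lines. This is a
sketch in present-day Python 3, not a translation of object.c: the
numbers.Number test stands in for the tp_as_number check, id() of the type
stands in for the type pointer, and both function names are invented.

```python
import numbers
from functools import cmp_to_key


def fallback_compare(v, w):
    """Order two objects of *different* types; returns -1 or 1."""
    if v is None:
        return -1                     # None is smaller than anything
    if w is None:
        return 1
    # Numbers get the empty type name, so they sort before other types.
    vname = "" if isinstance(v, numbers.Number) else type(v).__name__
    wname = "" if isinstance(w, numbers.Number) else type(w).__name__
    if vname != wname:
        return -1 if vname < wname else 1
    # Last resort: same name, different types; compare the type identities.
    return -1 if id(type(v)) < id(type(w)) else 1


def mixed_compare(v, w):
    """Use the ordinary ordering where it exists, the fallback otherwise."""
    try:
        return (v > w) - (v < w)
    except TypeError:
        return fallback_compare(v, w)
```

With this model, sorting a mixed list via cmp_to_key(mixed_compare) puts
None first, then numbers, then everything else grouped by type name.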


--Guido van Rossum (home page: http://www.python.org/~guido/)


From mwh21@cam.ac.uk  Mon Jan 22 16:12:47 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: Mon, 22 Jan 2001 16:12:47 +0000 (GMT)
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.21865.943601.735426@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>

On Mon, 22 Jan 2001, Jeremy Hylton wrote:

> On Linux, I am also seeing test_cpickle failures.  I have not been
> able to reproduce failures in test_extcall or test_sax.

Hmm - my machine's done 28 exemplary "make clean; make test" runs this
morning.  I last updated yesterday afternoon my time (~1700 GMT).

Of course, I don't build pyexpat...

> No more time to look into this...

Don't you just love memory corruption bugs?

Cheers,
M.



From akuchlin@mems-exchange.org  Mon Jan 22 16:28:59 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Mon, 22 Jan 2001 11:28:59 -0500
Subject: [Python-Dev] Python 2.1 article
Message-ID: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>

I've put together an almost-complete first draft of a "What's New in
2.1" article.  The only missing piece is a section on the Nested
Scopes PEP, which obviously has to wait for the changes to get checked
in.  http://www.amk.ca/python/2.1/ ; as usual, nitpicking comments are
welcomed.

--amk



From nas@arctrix.com  Mon Jan 22 10:00:43 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:00:43 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>; from mwh21@cam.ac.uk on Mon, Jan 22, 2001 at 04:12:47PM +0000
References: <14956.21865.943601.735426@localhost.localdomain> <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <20010122020043.A25687@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:12:47PM +0000, Michael Hudson wrote:
> Don't you just love memory corruption bugs?

Great fun.

I've played around with efence and debauch on the weekend.  I
even went as far as merging an updated fmalloc from the XFree
source tree into debauch and writing a reporting script in
Python.

I probably would have caught the pyexpat overrun if I would have
used efence with EF_ALIGNMENT=0 and compiled with -fpack-struct.
I'll have to try it tonight.  Maybe something else will turn up.

  Neil


From guido@digicool.com  Mon Jan 22 17:12:29 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 12:12:29 -0500
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:55:35 +0100."
 <20010122165535.P17392@xs4all.nl>
References: <20010122165535.P17392@xs4all.nl>
Message-ID: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>

> 'make distclean' seems broken, at least on non-GNU make's:
> 
> [snip]
> clobbering subdirectory Modules
> rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
> rm -f add2lib hassignal
> rm -f *.a tags TAGS config.c Makefile.pre
> rm -f *.so *.sl so_locations
> make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
> "./Makefile.in", line 134: Need an operator
> make: fatal errors encountered -- cannot continue
> *** Error code 1 (ignored)
> rm -f config.status config.log config.cache config.h Makefile
> rm -f buildno platform
> rm -f Modules/Makefile
> [snip]
> 
> (This is using FreeBSD's 'make'.)
> 
> Looking at line 134, I'm not sure why it works with GNU make other than that
> it avoids complaining about syntax errors it doesn't run into (which could
> be both bad and good :) or that it avoids complaining about obvious GNU
> autoconf tricks. But I don't know enough about make to say for sure, nor to
> fix the above problem.

There's one line in Makefile.in that Make trips over (mine also
complains about it):

    @SET_DLLLIBRARY@

Looking at the code in configure.in that generates this macro:

    AC_SUBST(SET_DLLLIBRARY)
    LDLIBRARY=''
    SET_DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  SET_DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

I don't see why we couldn't change this so that Makefile.in just
contains

    DLLLIBRARY=		@DLLLIBRARY@

and then configure.in could be changed to

    AC_SUBST(DLLLIBRARY)
    LDLIBRARY=''
    DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

Or am I missing something?

Does this fix the problem?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Mon Jan 22 17:21:09 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:21:09 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 11:02:15AM -0500
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <20010122122109.A14952@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
> 
> Is this worth it?

I think so, if only for the sake of well-definedness.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.


From thomas@xs4all.net  Mon Jan 22 17:25:30 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 18:25:30 +0100
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122182530.E17295@xs4all.nl>

On Mon, Jan 22, 2001 at 12:12:29PM -0500, Guido van Rossum wrote:

> and then configure.in could be changed to

>     AC_SUBST(DLLLIBRARY)
>     LDLIBRARY=''
>     DLLLIBRARY=''
>        .
>        . (and later)
>        .
>     cygwin*)
> 	  LDLIBRARY='libpython$(VERSION).dll.a'
> 	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
> 	  ;;

You mean 
 	  DLLLIBRARY='$(basename $(LDLIBRARY))'

But yes, that fixes it.

> Or am I missing something?

Well, on *that* I'm not sure, that's why I asked :P If things in the Python
source boggle me, they are always there for a good reason. Well, maybe just
'almost always', but practically always :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From nas@arctrix.com  Mon Jan 22 10:39:59 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:39:59 -0800
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122023959.A25798@glacier.fnational.com>

[Guido on changing SET_DLLLIBRARY]
> Or am I missing something?

I don't think so.  My new Makefile uses "FOO = @FOO@" everywhere.
SET_CXX is the same way in the current Makefile.

  Neil


From esr@thyrsus.com  Mon Jan 22 17:41:59 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:41:59 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
Message-ID: <20010122124159.A14999@thyrsus.com>

\section{\module{set} ---
         Basic set algebra for Python}

\declaremodule{standard}{set}
\modulesynopsis{Basic set algebra operations on sequences.}
\moduleauthor{Eric S. Raymond}{esr@thyrsus.com}
\sectionauthor{Eric S. Raymond}{esr@thyrsus.com}

The \module{set} module defines functions for treating lists and other
sequences as mathematical sets, and defines a set class that uses
these operations natively and overloads Python's standard operator set.

The \module{set} functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
Set or sequence elements may be of any type and may be mutable.
Comparisons and membership tests of elements against sequence objects
are done using \keyword{in}, and so can be customized by supplying a 
suitable \method{__getattr__} method for the sequence type.

The running time of these functions is O(n**2) in the worst case
unless otherwise noted.  For cases that can be short-circuited by 
cardinality comparisons, this has been done.

\begin{funcdesc}{setify}{list1}
Returns a list of the argument sequence's elements with duplicates removed.
\end{funcdesc}

\begin{funcdesc}{union}{list1, list2}
Set union.  All elements of both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{intersection}{list1, list2}
Set intersection.  All elements common to both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{difference}{list1, list2}
Set difference.  All elements of the first set or sequence not present
in the second are returned.
\end{funcdesc}

\begin{funcdesc}{symmetric_difference}{list1, list2}
Set symmetric difference.  All elements present in one sequence or the other
but not in both are returned.
\end{funcdesc}

\begin{funcdesc}{cartesian}{list1, list2}
Returns a list of tuples consisting of all possible pairs of elements
from the first and second sequences or sets.
\end{funcdesc}

\begin{funcdesc}{equality}{list1, list2}
Set comparison.  Return 1 if the two sets or sequences contain exactly
the same elements, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{subset}{list1, list2}
Set subset test.  Return 1 if all elements of the first set or
sequence are members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{proper_subset}{list1, list2}
Set subset test, excluding equality.  Return 1 if the arguments fail a
set equality test, and all elements of the first set or sequence are
members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{powerset}{list1}
Return the set of all subsets of the argument set or
sequence. Warning: this produces huge results from small arguments and
is O(2**n) in both running time and space requirements; you can
readily run yourself out of memory using it.
\end{funcdesc}

\subsection{set Objects \label{set-objects}}

A \class{set} instance uses the \module{set} module functions to
implement set semantics on the list it contains, and to support 
a full set of Python list methods and operators.  Thus, the set
methods can take a set or any sequence type as an argument.  

A set object contains a single data member:

\begin{memberdesc}{elements}
List containing the elements of the set.  
\end{memberdesc}

Set objects can be treated as mutable sequences; they support the
special methods 
\method{__len__}, 
\method{__getattr__},
\method{__setattr__}, 
and \method{__delattr__}.  
Through
\method{__getattr__}, they support the membership test via
\keyword{in}. All the standard mutable-sequence methods
\method{list}, 
\method{append}, 
\method{extend}, 
\method{count}, 
\method{index}, 
\method{insert} (the index argument is ignored), 
\method{pop}, 
\method{remove}, 
\method{reverse}, 
and \method{sort}
are also supported.  After method calls that add elements
(\method{setattr},
\method{append}, \method{extend}, \method{insert}), the
elements of the data member are re-setified, so it is not possible to
introduce duplicates.

Calling \function{repr()} on a set returns the result of calling
\function{repr} on its element list.  Calling \function{str()} returns
a representation resembling mathematical notation for the set; an
open set bracket, followed by a comma-separated list of \function{str()}
representations of the elements, followed by a close set bracket.

Set objects support the following Python operators:

\begin {tableiii}{l|l|l}{code}{Operator}{Function}{Description}
\lineiii{|,+}{union}{Union}
\lineiii{&}{intersection}{Intersection}
\lineiii{-}{difference}{Difference}
\lineiii{^}{symmetric_difference}{Symmetric difference}
\lineiii{*}{cartesian}{Cartesian product}
\lineiii{==}{equality}{Equality test}
\lineiii{!=,<>}{}{Inequality test}
\lineiii{<}{proper_subset}{Proper-subset test}
\lineiii{<=}{subset}{Subset test}
\lineiii{>}{}{Proper superset test}
\lineiii{>=}{}{Superset test}
\end {tableiii}
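
For readers who would rather see code than LaTeX, here is a rough
sketch of how a few of the list-based functions described above might
behave.  The bodies are an assumption made for illustration and are not
the module's actual source; only the function names come from the
documentation.  Membership is tested with "in", so elements need not be
hashable, and the cost is quadratic as noted above.  The sketch returns
True/False where the documentation says 1/0.

```python
def setify(seq):
    """Return a list of seq's elements with duplicates removed."""
    result = []
    for item in seq:
        if item not in result:      # linear scan, hence O(n**2) overall
            result.append(item)
    return result


def union(list1, list2):
    """All elements of both sequences, without duplicates."""
    return setify(list(list1) + list(list2))


def intersection(list1, list2):
    """Elements common to both sequences."""
    return [x for x in setify(list1) if x in list2]


def difference(list1, list2):
    """Elements of the first sequence not present in the second."""
    return [x for x in setify(list1) if x not in list2]


def symmetric_difference(list1, list2):
    """Elements present in one sequence or the other but not both."""
    return difference(list1, list2) + difference(list2, list1)


def subset(list1, list2):
    """True if every element of the first sequence is in the second."""
    return all(x in list2 for x in list1)


def equality(list1, list2):
    """True if both sequences contain exactly the same elements."""
    return subset(list1, list2) and subset(list2, list1)


print(setify([[1], [1], [2]]))          # mutable elements are fine
print(union([1, 2], [2, 3]))
print(symmetric_difference([1, 2], [2, 3]))
```

Because nothing here hashes its elements, lists of lists work, at the
price of the quadratic running time the documentation warns about.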

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 


From esr@snark.thyrsus.com  Mon Jan 22 18:28:57 2001
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 13:28:57 -0500
Subject: [Python-Dev] I still can't build HTML in a current CVS tree.
Message-ID: <200101221828.f0MISvH15121@snark.thyrsus.com>

Fred, I still can't build HTML documentation in a current CVS tree -- same
complaint about lib/modindex.html being absent.  Can we get this fixed
before 2.1 ships?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

...Virtually never are murderers the ordinary, law-abiding people
against whom gun bans are aimed.  Almost without exception, murderers
are extreme aberrants with lifelong histories of crime, substance
abuse, psychopathology, mental retardation and/or irrational violence
against those around them, as well as other hazardous behavior, e.g.,
automobile and gun accidents."
        -- Don B. Kates, writing on statistical patterns in gun crime


From fredrik@effbot.org  Mon Jan 22 18:33:56 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 19:33:56 +0100
Subject: [Python-Dev] Python 2.1 article
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>
Message-ID: <059b01c084a1$e431e490$e46940d5@hagrid>

> I've put together an almost-complete first draft of a "What's New in
> 2.1" article.  The only missing piece is a section on the Nested
> Scopes PEP, which obviously has to wait for the changes to get checked
> in.

what's the current 2.1a1 eta?  (pep 226 still
says last friday)

today?  wednesday?  this week?  this month?

Curious /F



From mal@lemburg.com  Mon Jan 22 18:33:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 19:33:24 +0100
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <20010122124159.A14999@thyrsus.com>
Message-ID: <3A6C7CF4.F10AA77B@lemburg.com>

[LaTeX file]

Eric, we are all hackers, but plain LaTeX is not really the right
format for a posting to a mailing list... at least not if
you really expect feedback ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From martin@mira.cs.tu-berlin.de  Mon Jan 22 18:36:16 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 22 Jan 2001 19:36:16 +0100
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>

> A few comments to explain this highly stylized and macro-laden code
> would be appreciated.

I probably can't do that before 2.1a1, but I promise to suggest
something right afterwards.

In general, the macro magic is designed to make the many expat
callbacks available to Python. RC_HANDLER (for return code) is the
most general template; VOID_HANDLER and INT_HANDLER are common
specializations. In the core of RC_HANDLER, a tuple is built and
a Python function is called.

The code used to do PyEval_CallObject right inside the macro; the
call_with_frame feature is new compared to 2.0. It solves the specific
problem of incomprehensible tracebacks.

In a typical SAX application, the user code calls
expatreader.ExpatParser.parse, which in turn calls 

            self._parser.Parse(data, isFinal)

Now, in 2.0, a common problem was a traceback

            self._parser.Parse(data, isFinal)
TypeError: not enough arguments; expected 4, got 2

Everybody assumes a problem in the call to Parse; the real problem is
in the call to the callback inside RC_HANDLER, which tried to call a
user's function with two arguments that expected four.

2.1 would improve this slightly on its own, writing

            self._parser.Parse(data, isFinal)
TypeError: characters() takes exactly 4 arguments (2 given)

With the call_with_frame code, you get

  File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 81, in feed
    self._parser.Parse(data, isFinal)
  File "pyexpat.c", line 379, in CharacterData
TypeError: characters() takes exactly 4 arguments (2 given)

So that tells you that it is the CharacterData handler that invokes
characters(). You are right that the frame object is not used
otherwise; it is just there to make a nice traceback.
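
The same effect can be seen in pure Python.  The sketch below is only
an illustration with made-up names (FakeParser, _character_data,
frames_for_bad_callback); it is not pyexpat.  It assumes a dispatcher
that passes one argument to a user callback expecting three, and then
lists the frames recorded in the traceback:

```python
import traceback


def characters(content, start, length):
    """User callback written against an older, three-argument signature."""
    return content[start:start + length]


class FakeParser:
    """Stand-in for a parser object that dispatches to user callbacks."""

    def __init__(self, handler):
        self.handler = handler

    def Parse(self, data, is_final):
        # The user only ever sees this call in their own code.
        self._character_data(data)

    def _character_data(self, data):
        # The real mistake happens here: wrong number of arguments.
        self.handler(data)


def frames_for_bad_callback():
    """Return the function names recorded in the failing traceback."""
    parser = FakeParser(characters)
    try:
        parser.Parse("some text", True)
    except TypeError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []


print(frames_for_bad_callback())
```

The last frame named is the dispatcher, not Parse, which is the extra
hint the synthetic frame in pyexpat.c is meant to provide.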

> I simply don't understand what's going on -- and I'm deeply
> suspicious that it is the source of whatever problems Tim is seeing
> with test_sax.

I thought so, too, at first; it turned out that the problem was
elsewhere.

Regards,
Martin


From guido@digicool.com  Mon Jan 22 19:04:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:04:02 -0500
Subject: [Python-Dev] Python 2.1 article
In-Reply-To: Your message of "Mon, 22 Jan 2001 19:33:56 +0100."
 <059b01c084a1$e431e490$e46940d5@hagrid>
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>
 <059b01c084a1$e431e490$e46940d5@hagrid>
Message-ID: <200101221904.OAA01170@cj20424-a.reston1.va.home.com>

> what's the current 2.1a1 eta?  (pep 226 still
> says last friday)

You missed my email that I sent out Friday.  Tentatively it's going
out tonight.  No point in updating the PEP each time there's slippage.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan 22 19:10:54 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:10:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 12:41:59 EST."
 <20010122124159.A14999@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
Message-ID: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>

Eric,

There's already a PEP on a set object type, and everybody and their
aunt has already implemented a set datatype.

If *your* set module is ready for prime time, why not publish it in
the Vaults of Parnassus?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Mon Jan 22 19:29:18 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 14:29:18 -0500 (EST)
Subject: [Python-Dev] Re: getcode() function in pyexpat.c
In-Reply-To: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
References: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
Message-ID: <14956.35342.724657.865367@localhost.localdomain>

>>>>> "MvL" == Martin v Loewis <martin@mira.cs.tu-berlin.de> writes:

  >> I simply don't understand what's going on -- and I'm deeply
  >> suspicious that it is the source of whatever problems Tim is
  >> seeing with test_sax.

  MvL> I thought so, too, at first; it turned out that the problem was
  MvL> elsewhere.

What was the cause of that problem?  I didn't see any mail after Tim's
middle-of-the-night message "Worse news."

Jeremy



From tim.one@home.com  Mon Jan 22 20:01:59 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:01:59 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGIIKAA.tim.one@home.com>

[Guido]
> ...
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
>
> Is this worth it?

First, an attempt to see what Python did in this morning's CVS turned up an
internal error for Jeremy:

>>> [None < x for x in (1, 1L, 1j, 1.0, [1], {}, (1,))]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global

abnormal program termination

A simpler way to provoke that:

>>> [None < 2 for x in "x"]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global


Anyway, I think forcing None to be "the smallest" is cute!  Inexpensive to
do, and while I don't see a compelling *use* for it, I bet it would be least
surprising to newbies.  +1.



From fdrake@acm.org  Mon Jan 22 20:08:54 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 15:08:54 -0500 (EST)
Subject: [Python-Dev] Re: I still can't build HTML in a current CVS tree.
In-Reply-To: <200101221828.f0MISvH15121@snark.thyrsus.com>
References: <200101221828.f0MISvH15121@snark.thyrsus.com>
Message-ID: <14956.37718.968912.189834@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > Fred, I still can't build HTML documentation in a current CVS tree -- same
 > complaint about lib/modindex.html being absent.  Can we get this fixed
 > before 2.1 ships?

  I'm guessing I've lost a previous email on the topic, or it's buried
in my inbox.  If this is still a problem after today's checkins, could
you please file a bug report and assign it to me?
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Mon Jan 22 20:26:15 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:26:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGJIKAA.tim.one@home.com>

[Guido]
> IMHO, *this* *particular* gripe of Insure++ is just a pain in the
> butt, and I wish there was a way to turn it off in Insure++ without
> having to fix the code.

Maybe there is.  Barry?

> IMHO, this was included in the standard to allow segmented-memory
> implementations of C.  Think certain DOS or Windows 3.1 memory models
> where a pointer is a segment plus an offset.  This is not current
> practice even on Palmpilots!

I could ask Tom MacDonald (former X3J11 chair), but don't want to bother
him.  The way these things usually turn out:  the committee debated it 100
times over 10 years, but some committee member steadfastly claimed it was
important.  Since ANSI/ISO committees work via consensus, one implacable
objector is enough.

WRT pointers, I know that while the C committee did worry about segmented
architectures a lot in the past, tagged architectures gave them much
thornier problems (the HW tags each "word" with some manner of metadata
(such as a busy/free or empty/full bit, or read+write permission bits, or a
data type identifier, or a "capability" tag tying into a HW-enforced
security architecture, ...), and checks those on each access, and some of
the metadata can propagate into a pointer, and the HW can raise faults on
pointer comparisons if the metadata doesn't match).  While such machines
aren't in common use, the US Govt does all sorts of things they don't talk
about -- if it's not IBM's representative protecting a 40-year old
architecture, it's someone emphatically not from the NSA <wink> protecting
something they're not at liberty to discuss.  Of course Python wants to run
there too, even if we never hear about it ...

> The standard may say that such comparisons are undefined, but I don't
> care about this particular undefinedness, and I'm annoyed by the
> required patches.

Ya, and I'm annoyed that MS stdio corrupts itself -- but they're just
clinging to the letter of the std too, and I've learned to live with it
gracefully <wink>.

pointer-ordering-comparisons-should-be-very-rare-anyway-ly y'rs  - tim



From tim.one@home.com  Mon Jan 22 20:55:30 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:55:30 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGNIKAA.tim.one@home.com>

[Michael Hudson]
> Hmm - my machine's done 28 exemplary "make clean; make test" runs this
> morning.  I last updated yesterday afternoon my time (~1700 GMT).

So does mine now.  The remaining failures require *unusual* ways of running
the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
under Linux; and in an extremely specialized and seemingly Windows-specific
way to get test_extcall to blow up w/ a bad pointer).



From tim.one@home.com  Mon Jan 22 21:07:27 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:07:27 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.16758.68050.257212@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>

[Jeremy Hylton]
> Funny (strange or haha?) that test_extcall is failing since the two
> pieces of code I've modified most recently are compile.c and the
> section of ceval.c that handles extended call syntax.

Ya, I knew that, but I avoided wagging a Finger of Shame in your direction
because coincidence isn't proof <wink>.

> ...
> As for the test_sax failure,

There is no test_sax failure anywhere anymore that I know of (Martin found a
dead-wrong array decl in contributed pyexpat.c code and repaired it).

And I believe my "rt -x test_sax" failure in test_extcall almost certainly
has nothing to do with test_sax -- far more likely the connection to
test_sax is an accident, and that if I spend umpteen hours trying other
things at random I'll provoke the same memory accident leading to a bad
pointer via excluding some other test.  I just picked test_sax because that
*was* broken and I wanted to get thru the rest of the tests.

BTW, delighted(?) to hear that test_cpickle fails for you too!  I'm sure
test_extcall is going to blow up for other people eventually too -- but it
is sooooo hard to provoke even for me.  I've dropped the effort pending news
from someone running Insure++ or efence or whatever.



From guido@digicool.com  Mon Jan 22 21:18:26 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 16:18:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:07:27 EST."
 <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>

[Tim]
> So does mine now.  The remaining failures require *unusual* ways of running
> the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
> under Linux;
[and later]
> BTW, delighted(?) to hear that test_cpickle fails for you too!

This (test_cpickle) is a red herring -- it's a shallow failure in the
test suite.  test_cpickle imports test_pickle, but test_pickle first
outputs the test output from testing pickle -- unless test_pickle has
been run before!  This succeeds:

  ./python Lib/test/regrtest.py test_cpickle test_pickle

and this fails:

  ./python Lib/test/regrtest.py test_pickle test_cpickle

Use regrtest.py -v to find out why. :-)
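
The underlying mechanism is that module-level code runs only on the
first import.  Here is a small sketch of that, assuming a throwaway
module named noisy_pickle_test that is written to a temporary directory
purely for the demonstration (it is not part of the test suite):

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

MODULE_NAME = "noisy_pickle_test"   # hypothetical module, created below


def import_and_capture(name):
    """Import a module and return whatever it printed while importing."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        importlib.import_module(name)
    return buffer.getvalue()


def demo():
    """Import the same noisy module twice and return both outputs."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, MODULE_NAME + ".py")
        with open(path, "w") as handle:
            handle.write('print("output from testing pickle")\n')
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            first = import_and_capture(MODULE_NAME)
            second = import_and_capture(MODULE_NAME)   # cached, prints nothing
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return first, second


first_output, second_output = demo()
print(repr(first_output), repr(second_output))
```

A test whose expected output includes what an imported module prints
will therefore pass or fail depending on whether that module was
already imported earlier in the same run.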

I'm not sure how to restructure this, but it's not of the same quality
as test_extcall or test_sax failing.  Neither of those has failed for
me on Linux during hours of testing.  However on Windows I get an
occasional appfail dialog box when using rt.bat.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@arctrix.com  Mon Jan 22 14:44:00 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 06:44:00 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122064400.A26543@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:07:27PM -0500, Tim Peters wrote:
> I've dropped the effort pending news from someone running
> Insure++ or efence or whatever.

efence to the rescue!  I compiled with -fpack-struct and used
EF_ALIGNMENT=0 and now I can trigger a core dump by running
test_extcall.  More news coming...

  Neil


From tim.one@home.com  Mon Jan 22 21:41:08 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:41:08 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: <20010122145733.85E51373C95@snelboot.oratrix.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com>

[Jack Jansen]
> I'm not sure whether this is really a bug, but I had the problem
> that there  was something wrong with the xml package I had
> installed into my Lib/site-python, and this caused test_sax to
> complain.
>
> If the test stuff is expected to test only the core functionality
> maybe sys.path should be edited so that it only contains directories
> that are part of the core distribution?

AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
released.  The wisdom of that decision is debatable with hindsight, but
AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
3rd-party code to work, but part of the core all the same.  The Windows
installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
Mac Python package also should.



From nas@arctrix.com  Mon Jan 22 15:00:57 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 07:00:57 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122070057.A26575@glacier.fnational.com>

Perhaps this will help someone track down the bug:

[running test_extcall...]
unbound method method() must be called with instance as first argument
unbound method method() must be called with instance as first argument

Program received signal SIGSEGV, Segmentation fault.
symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
4330                    if (TYPE(c) == DOUBLESTAR) {
(gdb) l
4325                            symtable_add_def(st, STR(CHILD(n, i)), 
4326                                             DEF_PARAM | DEF_STAR);
4327                            i += 2;
4328                            c = CHILD(n, i);
4329                    }
4330                    if (TYPE(c) == DOUBLESTAR) {
4331                            i++;
4332                            symtable_add_def(st, STR(CHILD(n, i)), 
4333                                             DEF_PARAM | DEF_DOUBLESTAR);
4334                    }
(gdb) p c
$3 = (node *) 0x42a43fff
(gdb) p *c
$4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
(gdb) p n
$5 = (node *) 0x42a3ffd7
(gdb) p *n
$6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
  n_child = 0x42a43fc3}
(gdb) bt 10
#0  symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
#1  0x8060126 in symtable_funcdef (st=0x429bafd0, n=0x42a23feb)
    at Python/compile.c:4245
#2  0x805fd29 in symtable_node (st=0x429bafd0, n=0x429b0fc3)
    at Python/compile.c:4128
#3  0x80600da in symtable_node (st=0x429bafd0, n=0x4290cfeb)
    at Python/compile.c:4232
#4  0x805f443 in symtable_build (c=0xbffff5c8, n=0x4290cfeb)
    at Python/compile.c:3816
#5  0x805f130 in jcompile (n=0x4290cfeb, filename=0x80a040f "<string>", 
    base=0x0) at Python/compile.c:3720
#6  0x805f0c2 in PyNode_Compile (n=0x4290cfeb, filename=0x80a040f "<string>")
    at Python/compile.c:3699
#7  0x8069adf in run_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:915
#8  0x8069ac0 in run_err_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:907
#9  0x8069a30 in PyRun_String (
    str=0x429f9fd1 "def zv(*v): print \"ok zv\", a, b, d, e, v, k", start=257, 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:881
(More stack frames follow...)



From thomas@xs4all.net  Mon Jan 22 22:13:29 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:13:29 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122070057.A26575@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 22, 2001 at 07:00:57AM -0800
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com>
Message-ID: <20010122231329.A27785@xs4all.nl>

On Mon, Jan 22, 2001 at 07:00:57AM -0800, Neil Schemenauer wrote:
> Perhaps this will help someone track down the bug:

> [running test_extcall...]
> unbound method method() must be called with instance as first argument
> unbound method method() must be called with instance as first argument
> 
> Program received signal SIGSEGV, Segmentation fault.
> symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> (gdb) l
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);
> 4327                            i += 2;
> 4328                            c = CHILD(n, i);
> 4329                    }
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4331                            i++;
> 4332                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4333                                             DEF_PARAM | DEF_DOUBLESTAR);
> 4334                    }

> (gdb) p c
> $3 = (node *) 0x42a43fff
> (gdb) p *c
> $4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
> (gdb) p n
> $5 = (node *) 0x42a3ffd7
> (gdb) p *n
> $6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
>   n_child = 0x42a43fc3}

n_child is 0x42a43fc3. That's n_child[0]. 0x42a43fff is the child being
handled now. That would be n_child[3] (0x42a43fff - 0x42a43fc3 == 60, and a
struct node is 20 bytes). But n_nchildren is 2, so it's an off-by-two error
somewhere -- and look, there's an 'i += 2' right above it! It *looks* like
this code will blow up whenever you use '*eggs' without '**spam' in a
function definition. That's a fairly wild guess, but it's worth a try. Try
this patch:

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.148
diff -c -c -r2.148 compile.c
*** Python/compile.c    2001/01/22 04:35:57     2.148
--- Python/compile.c    2001/01/22 22:12:31
***************
*** 4324,4329 ****
--- 4324,4331 ----
                        i++;
                        symtable_add_def(st, STR(CHILD(n, i)), 
                                         DEF_PARAM | DEF_STAR);
+                       if (NCH(n) <= i+2)
+                               return;
                        i += 2;
                        c = CHILD(n, i);
                }
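
The arithmetic behind this diagnosis can be checked with a small sketch.
It is Python rather than C and only models the index walk; the helper name
next_child_index is made up for illustration and is not code from
compile.c. The addresses and the 20-byte node size are taken from the gdb
session and the message above.

```python
# Addresses copied from the gdb session quoted above.
N_CHILD_BASE = 0x42a43fc3   # n->n_child, i.e. the address of n_child[0]
C_ADDR = 0x42a43fff         # the node pointer c that was being examined
NODE_SIZE = 20              # size of a struct node, as stated in the message
NCHILDREN = 2               # n->n_nchildren

# Index of c within the n_child array.
c_index = (C_ADDR - N_CHILD_BASE) // NODE_SIZE


def next_child_index(nch, i):
    """Model the patched step (hypothetical helper).

    Return the index that would be read after 'i += 2', or None when
    that index would fall outside an array of 'nch' children.
    """
    if nch <= i + 2:
        return None
    return i + 2
```

For "def zv(*v)" the parameter node has two children (the star and the
name), and i is 1 when the guard runs, so next_child_index(2, 1) gives None
instead of index 3. With five children, as in "*v, **k", the same call
gives 3, which is where the doublestar would sit.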


-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From esr@thyrsus.com  Mon Jan 22 20:13:09 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 15:13:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 02:10:54PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
Message-ID: <20010122151309.C15236@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> There's already a PEP on a set object type, and everybody and their
> aunt has already implemented a set datatype.

I've just read the PEP.  Greg's proposal has a couple of problems.
The biggest one is that the interface design isn't very Pythonic --
it's formally adequate, but doesn't exploit the extent to which sets
naturally have common semantics with existing Python sequence types.
This is bad; it means that a lot of code that could otherwise ignore
the difference between lists and sets would have to be specialized 
one way or the other for no good reason.

The only other set module I can find in the Vaults or anywhere else is
kjBuckets (which I knew about before).  Looks like a good design, but
complicated -- and requires installation of an extension.

> If *your* set module is ready for prime time, why not publish it in
> the Vaults of Parnassus?

I suppose that's what I'll do if you don't bless it for the standard
library.  But here are the reasons I suggest you should do so:

1. It supports a set of operations that are both often useful and
fiddly to get right, thus enhancing the "batteries are included"
effect.  (I used its ancestor for representing seen-message numbers in
a specialized mailreader, for example.)

2. It's simple for application programmers to use.  No extension module
to integrate.

3. It's unsurprising.  My set objects behave almost exactly like other
mutable sequences, with all the same built-in methods working, except for 
the fact that you can't introduce duplicates with the mutators.

4. It's already completely documented in a form suitable for the library.

5. It's simple enough not to cause you maintenance hassles down the
road, and even if it did the maintainer is unlikely to disappear :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The United States is in no way founded upon the Christian religion
	-- George Washington & John Adams, in a diplomatic message to Malta.


From guido@digicool.com  Mon Jan 22 22:29:26 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 17:29:26 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:41:08 EST."
 <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com>
Message-ID: <200101222229.RAA28667@cj20424-a.reston1.va.home.com>

> [Jack Jansen]
> > I'm not sure whether this is really a bug, but I had the problem
> > that there  was something wrong with the xml package I had
> > installed into my Lib/site-python, and this caused test_sax to
> > complain.
> >
> > If the test stuff is expected to test only the core functionality
> > maybe sys.path should be edited so that it only contains directories
> > that are part of the core distribution?
> 
[Tim]
> AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
> released.  The wisdom of that decision is debatable with hindsight, but
> AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
> 3rd-party code to work, but part of the core all the same.  The Windows
> installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
> Mac Python package also should.

Yes, but Jack was talking about a non-std xml package in
site-python...  I agree that this shouldn't be picked up.  But is it
worth taking draconian measures to avoid this?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Mon Jan 22 22:35:08 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 22 Jan 2001 17:35:08 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHLIKAA.tim.one@home.com>

[Guido]
> This (test_cpickle) is a red herring -- it's a shallow failure in the
> test suite.

Fixed now -- thanks!

Please note that Neil got test_extcall to fail in exactly the same place
(see his recent Python-Dev mail).  That's the only remaining failure I know
of.

> ...
> However on Windows I get an occasional appfail dialog box when
> using rt.bat.

I don't believe I've ever seen one of those ("appfail" rings no bells), and
rt has never acted strangely for me.   Your DOS-box properties may be
screwed up:  use Start -> Find -> Files or Folders ...; set "Look in" to C:;
enter *.pif in the "Named:" box; click Find.  You'll probably get a dozen
hits.  One of them will correspond to the method you use to open a DOS box
(which I don't know).  Right-click on that one and select Properties.  On
the Memory tab of the dialog that pops up, the four dropdown lists should
have "Auto" selected.  "Uses HMA" should be checked.  Hmm ... looks like
"Protected" *should* be checked but mine isn't ... oh, this goes on and on.
I don't even know which version of Windows you're using here!  How about I
look at it next time I'm at your house ...



From greg@cosc.canterbury.ac.nz  Mon Jan 22 22:50:07 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 11:50:07 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
Message-ID: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>

> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);

Shouldn't line 4330 say if (TYPE(c) == STAR) ?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From thomas@xs4all.net  Mon Jan 22 22:56:02 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:56:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 11:50:07AM +1300
References: <20010122231329.A27785@xs4all.nl> <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>
Message-ID: <20010122235602.B27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:50:07AM +1300, Greg Ewing wrote:
> > 4330                    if (TYPE(c) == DOUBLESTAR) {
> > 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> > 4326                                             DEF_PARAM | DEF_STAR);

> Shouldn't line 4330 say if (TYPE(c) == STAR) ?

No, that's line 4323. You can't have doublestar without having star, and
star should precede doublestar. (Grammar should enforce that.) 

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From paulp@ActiveState.com  Mon Jan 22 23:02:07 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Mon, 22 Jan 2001 15:02:07 -0800
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com> <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <3A6CBBEF.4732BFF2@ActiveState.com>

Guido van Rossum wrote:
> 
> ....
>
> Yes, wow!
> 
> ....

I apologize but I'm not clear on my responsibilities here, if any. I
wrote a PEP for online help. I submitted a partial implementation. Ping
wrote a full implementation that basically supersedes mine. There are
various ideas for improving it, but I think that we agree that the core
is solid. Several people have said that it should be moved into the core
library. Nobody has said that it shouldn't. Whose move is it? What's
next?

 Paul Prescod


From fredrik@effbot.org  Mon Jan 22 23:08:40 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 00:08:40 +0100
Subject: [Python-Dev] test___all__ fails if bsddb not available
Message-ID: <079a01c084c8$43023e40$e46940d5@hagrid>

test___all__
test test___all__ failed -- dbhash has no __all__ attribute

maybe this test shouldn't depend on optional modules?

</F>



From nas@arctrix.com  Mon Jan 22 16:24:34 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 08:24:34 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 22, 2001 at 11:13:29PM +0100
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com> <20010122231329.A27785@xs4all.nl>
Message-ID: <20010122082433.B26765@glacier.fnational.com>

On Mon, Jan 22, 2001 at 11:13:29PM +0100, Thomas Wouters wrote:
> That's a fairly wild guess, but it's worth a try. Try this
> patch:
[...]

Works for me.

  Neil


From greg@cosc.canterbury.ac.nz  Mon Jan 22 23:21:14 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 12:21:14 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122235602.B27785@xs4all.nl>
Message-ID: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas@xs4all.net>:

> You can't have doublestar without having star

What?!? You could in 1.5.2. Has that changed?

Anyway, it just looked a bit odd that it seemed to be testing
for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
But I guess I should shut up until I've seen all of the code.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From thomas@xs4all.net  Mon Jan 22 23:26:02 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:26:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 12:21:14PM +1300
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>
Message-ID: <20010123002602.C27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> Thomas Wouters <thomas@xs4all.net>:

> > You can't have doublestar without having star

> What?!? You could in 1.5.2. Has that changed?

Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
so ignore my delusions :)

> Anyway, it just looked a bit odd that it seemed to be testing
> for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
> But I guess I should shut up until I've seen all of the code.

No, it's not doing that. It's adding the symbol name to the symtab, with
DEF_DOUBLESTAR as one of its flags. Not sure what the flag does, but I could
guess. (But see the above mentioned delusions as to why I'm not doing that
out loud anymore :-) The 'if' in front of it adds the symbol to the symtab
with DEF_STAR as a flag, in the case of 'STAR' (rather than DOUBLESTAR).
Really. go check :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Mon Jan 22 23:31:03 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:31:03 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010123002602.C27785@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 23, 2001 at 12:26:02AM +0100
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz> <20010123002602.C27785@xs4all.nl>
Message-ID: <20010123003103.D27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:26:02AM +0100, Thomas Wouters wrote:
> On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> > Thomas Wouters <thomas@xs4all.net>:
> 
> > > You can't have doublestar without having star
> 
> > What?!? You could in 1.5.2. Has that changed?

> Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
> way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
> so ignore my delusions :)

Ah, yeah, what I meant to *think* was: you can't have *spam *after* **eggs:

>>> def foo(x, **kwarg, *arg)
  File "<stdin>", line 1
    def foo(x, **kwarg, *arg)
                      ^
SyntaxError: invalid syntax

So the logic of the latter part of the function seems okay (after the little
patch I posted before.) Jeremy should give his expert opinion before it goes
in, though, since it's his code :)
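
The ordering rule can also be checked without typing at the prompt. This
is a minimal sketch assuming a current Python 3 interpreter; the helper
name compiles is made up for the example.

```python
def compiles(source):
    """Return True if 'source' is accepted by the compiler, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Each combination discussed in the thread.
star_only = compiles("def f(x, *arg): pass")
doublestar_only = compiles("def f(x, **kwarg): pass")
star_then_doublestar = compiles("def f(x, *arg, **kwarg): pass")
doublestar_then_star = compiles("def f(x, **kwarg, *arg): pass")
```

Only the last form, with the star after the doublestar, is rejected.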

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jan 22 23:36:17 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 18:36:17 -0500
Subject: [Python-Dev] test___all__ fails if bsddb not available
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:08:40 +0100."
 <079a01c084c8$43023e40$e46940d5@hagrid>
References: <079a01c084c8$43023e40$e46940d5@hagrid>
Message-ID: <200101222336.SAA30480@cj20424-a.reston1.va.home.com>

> test test___all__ failed -- dbhash has no __all__ attribute
> 
> maybe this test shouldn't depend on optional modules?

Fixed -- I just skip dbhash if bsddb can't be imported.
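
A rough sketch of the idea, not the actual test code: check a wrapper
module only when the optional module it depends on can be imported. The
helper names and the module name no_such_backend_xyz are made up, and
importlib is the modern way to spell the import.

```python
import importlib


def importable(name):
    """Return True if the module 'name' can be imported in this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def modules_to_check(candidates):
    """Keep modules whose prerequisite imports.

    'candidates' is a list of (module, prerequisite) pairs; a prerequisite
    of None means the module is always checked.
    """
    return [mod for mod, prereq in candidates
            if prereq is None or importable(prereq)]


# 'no_such_backend_xyz' stands in for a missing optional module.
selected = modules_to_check([
    ("json", None),
    ("string", None),
    ("some_wrapper", "no_such_backend_xyz"),
])
```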

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Tue Jan 23 00:38:28 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 19:38:28 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
References: <14956.16758.68050.257212@localhost.localdomain>
 <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
 <20010122070057.A26575@glacier.fnational.com>
 <20010122231329.A27785@xs4all.nl>
Message-ID: <14956.53892.651549.493268@localhost.localdomain>

Thomas,

Your patch has the right diagnosis, although I would write it a tad
differently.  NCH(n) <= i + 2 should be NCH(n) < i + 2, because
CHILD(n, NCH(n)) is not valid.

I'll check it in.

Jeremy


From jeremy@alum.mit.edu  Tue Jan 23 01:23:56 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:23:56 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <20010119232323.70B03116392@oratrix.oratrix.nl>
References: <guido@digicool.com>
 <200101191634.LAA29239@cj20424-a.reston1.va.home.com>
 <20010119232323.70B03116392@oratrix.oratrix.nl>
Message-ID: <14956.56620.706531.647341@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack@oratrix.nl> writes:

  JJ> Recently, Guido van Rossum <guido@digicool.com> said:
  >> > I get the impression that I'm currently seeing a non-NULL third
  >> > argument in my (C) methods even though the method is called
  >> > without keyword arguments.
  >>
  >> > Is this new semantics that I missed the discussion about, or is
  >> > this a bug?
  >>
  >> [...]  Do you really need the NULL?

  JJ> The places that I know I was counting on the NULL now have "if (
  JJ> kw && PyObject_IsTrue(kw))", so I'll just have to hope there
  JJ> aren't any more lingering in there.

Guido,

Does your query ("Do you really need the NULL?") mean that you don't
care whether the argument is NULL or an empty dictionary?  I could
change the code to do either for 2.1a2, if you have a preference.

Jeremy


From guido@digicool.com  Tue Jan 23 01:33:20 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 20:33:20 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:23:56 EST."
 <14956.56620.706531.647341@localhost.localdomain>
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl>
 <14956.56620.706531.647341@localhost.localdomain>
Message-ID: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>

> Guido,
> 
> Does your query ("Do you really need the NULL?") mean that you don't
> care whether the argument is NULL or an empty dictionary?  I could
> change the code to do either for 2.1a2, if you have a preference.
> 
> Jeremy

Robust code IMO should treat NULL and {} the same.  But since
traditionally we passed NULL, it's better to pass NULL rather than {}.
I believe that's the status quo now, right?
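
As a Python-level analogy only (the real issue is in C extension code),
here is a sketch in which None stands in for NULL. The function name
call_info is made up for the example.

```python
def call_info(args, kw=None):
    """Describe a call; 'kw' may be None (standing in for C NULL) or a dict."""
    # A truth test covers both cases: None and {} are both false.
    if kw:
        return "%d positional, keywords: %s" % (len(args), sorted(kw))
    return "%d positional, no keywords" % len(args)
```

call_info((1, 2)) and call_info((1, 2), {}) give the same answer, which is
the behaviour the "kw && PyObject_IsTrue(kw)" test buys in C.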

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Tue Jan 23 01:54:53 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:54:53 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>
References: <guido@digicool.com>
 <200101191634.LAA29239@cj20424-a.reston1.va.home.com>
 <20010119232323.70B03116392@oratrix.oratrix.nl>
 <14956.56620.706531.647341@localhost.localdomain>
 <200101230133.UAA04378@cj20424-a.reston1.va.home.com>
Message-ID: <14956.58477.874472.190937@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:

  [Jeremy wrote:]
  >> Does your query ("Do you really need the NULL?") mean that you
  >> don't care whether the argument is NULL or an empty dictionary?
  >> I could change the code to do either for 2.1a2, if you have a
  >> preference.

  GvR> Robust code IMO should treat NULL and {} the same.  But since
  GvR> traditionally we passed NULL, it's better to pass NULL rather
  GvR> than {}.  I believe that's the status quo now, right?

The current status in CVS is to pass {}, because there appeared to be
some case where a PyCFunction was not expecting NULL.  I assumed,
without checking, that {} was required and changed the implementation
to always pass a dictionary to METH_KEYWORDS functions.  I could
change it back to NULL and see if I can reproduce the error I was
seeing.

Jeremy


From guido@digicool.com  Tue Jan 23 02:01:12 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:01:12 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:54:53 EST."
 <14956.58477.874472.190937@localhost.localdomain>
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl> <14956.56620.706531.647341@localhost.localdomain> <200101230133.UAA04378@cj20424-a.reston1.va.home.com>
 <14956.58477.874472.190937@localhost.localdomain>
Message-ID: <200101230201.VAA15993@cj20424-a.reston1.va.home.com>

>   [Jeremy wrote:]
>   >> Does your query ("Do you really need the NULL?") mean that you
>   >> don't care whether the argument is NULL or an empty dictionary?
>   >> I could change the code to do either for 2.1a2, if you have a
>   >> preference.
> 
>   GvR> Robust code IMO should treat NULL and {} the same.  But since
>   GvR> traditionally we passed NULL, it's better to pass NULL rather
>   GvR> than {}.  I believe that's the status quo now, right?
> 
> The current status in CVS is to pass {}, because there appeared to be
> some case where a PyCFunction was not expecting NULL.  I assumed,
> without checking, that {} was required and changed the implementation
> to always pass a dictionary to METH_KEYWORDS functions.  I could
> change it back to NULL and see if I can reproduce the error I was
> seeing.

Yes, that's a good idea.  I hope that the {} in alpha 1 won't make
folks think that they will never see a NULL in the future and code
accordingly...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 23 02:15:11 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:15:11 -0500
Subject: [Python-Dev] 2.1a1 release tonight -- but no nested scopes or weak refs
Message-ID: <200101230215.VAA16577@cj20424-a.reston1.va.home.com>

We've decided to release 2.1a1 without further ado, but without two
big hopeful patches: Jeremy's nested scopes aren't finished and will
take considerably more time, and Fred's weak references need more
review (I haven't had the time to look at the code).  Rather than wait
longer, I've decided to try and release 2.1a1 tonight -- there's
nothing I'm waiting for now before I can cut a tarball.  There will be
an alpha2 release around February 1.

Please don't make any check-ins until I announce the 2.1a1 release
here.  (PythonLabs: please mail or phone me if you need to check in a
last-minute thing -- I'm tagging the tree now.)

More news as it happens,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com  Tue Jan 23 02:36:24 2001
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 22 Jan 2001 20:36:24 -0600 (CST)
Subject: [Python-Dev] test_grammar failing
Message-ID: <14956.60968.363878.643640@beluga.mojam.com>

At the end of this:

    make distclean ; ./configure ; make OPT='-g -pipe' ; make test

I get this:

    rm -f ./Lib/test/*.py[co]
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted

Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
this evening.

Skip


From gvwilson@nevex.com  Tue Jan 23 02:47:33 2001
From: gvwilson@nevex.com (Greg Wilson)
Date: Mon, 22 Jan 2001 21:47:33 -0500 (EST)
Subject: [Python-Dev] re: I think my set module is ready for prime time
Message-ID: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>

> > Guido van Rossum:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

> Eric Raymond:
> Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> ...doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

I agree with Eric's point; I put the interface design on hold while I
went off to try to find an efficient implementation capable of
handling mutable values (i.e. one that would allow things like sets of
sets).  I'm still looking :-(, but would appreciate comments from this
list on Eric's interface.

Thanks,
Greg



From guido@digicool.com  Tue Jan 23 03:02:50 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:02:50 -0500
Subject: [Python-Dev] test_grammar failing
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:36:24 CST."
 <14956.60968.363878.643640@beluga.mojam.com>
References: <14956.60968.363878.643640@beluga.mojam.com>
Message-ID: <200101230302.WAA27104@cj20424-a.reston1.va.home.com>

> At the end of this:
> 
>     make distclean ; ./configure ; make OPT='-g -pipe' ; make test
> 
> I get this:
> 
>     rm -f ./Lib/test/*.py[co]
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
> 
> Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
> this evening.

Try another cvs update and rebuild.  The test that Jeremy checked in
is supposed to catch a bug in the compiler code that he checked in.
The latest compile.c is 103277 bytes long (in Unix).

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 23 03:33:02 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Python 2.1 alpha 1 released!
Message-ID: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>

Thanks to the PythonLabs developers and the many hard-working
volunteers, I'm proud to release Python 2.1a1 -- the first alpha
release of Python version 2.1.

The release mechanics are different than for previous releases: we're
only releasing through SourceForge for now.  The official source
tarball is already available from the download page:

  http://sourceforge.net/project/showfiles.php?group_id=5470

Additional files will be released soon: a Windows installer,
Linux RPMs, and documentation.

Please give it a good try!  The only way Python 2.1 can become a
rock-solid product is if people test the alpha releases.  Especially
if you are using Python for demanding applications or on extreme
platforms we are interested in hearing your feedback.  Are you
embedding Python or using threads?  Please test your application using
Python 2.1a1!  Please submit all bug reports through SourceForge:

  http://sourceforge.net/bugs/?group_id=5470

Here's the NEWS file:

What's New in Python 2.1 alpha 1?
=================================

Core language, builtins, and interpreter

- There is a new Unicode companion to the PyObject_Str() API
  called PyObject_Unicode(). It behaves in the same way as the
  former, but assures that the returned value is a Unicode object
  (applying the usual coercion if necessary).

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reflected argument" versions of
  these; instead, __lt__ and __gt__ are each other's reflection,
  likewise for __le__ and __ge__; __eq__ and __ne__ are their own
  reflection (similar at the C level).  No other implications are
  made; in particular, Python does not assume that == is the Boolean
  inverse of !=, or that < is the Boolean inverse of >=.  This makes
  it possible to define types with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose rich comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.
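
A minimal sketch of a partially ordered type, assuming a current Python 3
interpreter. The class Tags is made up for illustration; it defines only
__lt__ and __le__ and leaves > and >= to the reflection rule described
above.

```python
class Tags:
    """A bag of tags ordered by inclusion, which is only a partial order."""

    def __init__(self, *names):
        self.names = frozenset(names)

    def __eq__(self, other):
        if not isinstance(other, Tags):
            return NotImplemented
        return self.names == other.names

    def __hash__(self):
        return hash(self.names)

    def __lt__(self, other):
        if not isinstance(other, Tags):
            return NotImplemented
        return self.names < other.names     # proper subset

    def __le__(self, other):
        if not isinstance(other, Tags):
            return NotImplemented
        return self.names <= other.names    # subset or equal


a = Tags("x", "y")
b = Tags("y", "z")
c = Tags("x")

# a and b are unrelated: neither '<' nor '>=' holds between them.
unrelated = (a < b, a >= b)

# 'a > c' is answered by the reflected call c.__lt__(a).
reflected = a > c
```

Here unrelated comes out as (False, False), which is the case where < is
not the Boolean inverse of >=.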

- Complex numbers use rich comparisons to define == and != but raise
  an exception for <, <=, > and >=.  Unfortunately, this also means
  that cmp() of two complex numbers raises an exception when the two
  numbers differ.  Since it is not mathematically meaningful to compare
  complex numbers except for equality, I hope that this doesn't break
  too much code.
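
  The behavior can be checked with a short sketch.  It is written in
  current Python syntax (the "except ... as" form postdates this
  release), and the exception type shown is the one current
  interpreters raise:

```python
a = 1 + 2j
b = 1 + 2j
c = 3 - 1j

equal = (a == b)        # equality is still defined
different = (a != c)    # and so is inequality

try:
    a < c               # ordering is rejected
    ordering_error = None
except TypeError as exc:
    ordering_error = exc
```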

- Functions and methods now support getting and setting arbitrarily
  named attributes (PEP 232).  Functions have a new __dict__
  (a.k.a. func_dict) which holds the function attributes.  Methods get
  and set attributes on their underlying im_func.  It is a TypeError
  to set an attribute on a bound method.
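
  A small sketch of the feature, using hypothetical names
  (parse_header, Greeter) and current Python syntax.  Current
  interpreters raise AttributeError rather than TypeError when a
  bound method is assigned to, so the sketch accepts either:

```python
def parse_header(line):
    return line.split(":", 1)

# Arbitrary attributes are stored in the function's __dict__.
parse_header.grammar_rule = "header ::= name ':' value"
parse_header.calls_expected = 0


class Greeter:
    def greet(self):
        return "hello"

    # Set on the plain function while the class body is executing.
    greet.note = "set on the function itself"


# Reading through a bound method reaches the underlying function.
note_via_method = Greeter().greet.note

try:
    Greeter().greet.note = "changed"
    rejected = False
except (TypeError, AttributeError):
    rejected = True
```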

- The xrange() object implementation has been improved so that
  xrange(sys.maxint) can be used on 64-bit platforms.  There's still a
  limitation that in this case len(xrange(sys.maxint)) can't be
  calculated, but the common idiom "for i in xrange(sys.maxint)" will
  work fine as long as the index i doesn't actually reach 2**31.
  (Python uses regular ints for sequence and string indices; fixing
  that is much more work.)

- Two changes to from...import:

  1) "from M import X" now works even if M is not a real module; it's
     basically a getattr() operation with AttributeError exceptions
     changed into ImportError.

  2) "from M import *" now looks for M.__all__ to decide which names to
     import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
     filters out names starting with '_' as before.  Whether or not
     __all__ exists, there's no restriction on the type of M.
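
  The second rule can be tried without writing files to disk.  The
  sketch below assumes current Python and builds two throwaway
  modules in memory; the module names demo_all_listed and
  demo_all_default are made up for the illustration:

```python
import sys
import types

# A module that declares __all__: only the listed names are imported,
# even one that starts with an underscore.
listed = types.ModuleType("demo_all_listed")
listed.public = 1
listed._private = 2
listed.helper = 3
listed.__all__ = ["public", "_private"]
sys.modules["demo_all_listed"] = listed

# A module without __all__: every name not starting with '_' is imported.
default = types.ModuleType("demo_all_default")
default.visible = 1
default._hidden = 2
sys.modules["demo_all_default"] = default

ns_listed = {}
exec("from demo_all_listed import *", ns_listed)
ns_default = {}
exec("from demo_all_default import *", ns_default)

listed_names = sorted(k for k in ns_listed if k != "__builtins__")
default_names = sorted(k for k in ns_default if k != "__builtins__")
```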

- File objects have a new method, xreadlines().  This is the fastest
  way to iterate over all lines in a file:

  for line in file.xreadlines():
      ...do something to line...

  See the xreadlines module (mentioned below) for how to do this for
  other file-like objects.

- Even if you don't use file.xreadlines(), you may expect a speedup on
  line-by-line input.  The file.readline() method has been optimized
  quite a bit in platform-specific ways:  on systems (like Linux) that
  support flockfile(), getc_unlocked(), and funlockfile(), those are
  used by default.  On systems (like Windows) without getc_unlocked(),
  a complicated (but still thread-safe) method using fgets() is used by
  default.

  You can force use of the fgets() method by #define'ing 
  USE_FGETS_IN_GETLINE at build time (it may be faster than 
  getc_unlocked()).

  You can force fgets() not to be used by #define'ing 
  DONT_USE_FGETS_IN_GETLINE (this is the first thing to try if std test 
  test_bufio.py fails -- and let us know if it does!).

- In addition, the fileinput module, while still slower than the other
  methods on most platforms, has been sped up too, by using
  file.readlines(sizehint).

- Support for run-time warnings has been added, including a new
  command line option (-W) to specify the disposition of warnings.
  See the description of the warnings module below.

- Extensive changes have been made to the coercion code.  This mostly
  affects extension modules (which can now implement mixed-type
  numerical operators without having to use coercion), but
  occasionally, in boundary cases the coercion semantics have changed
  subtly.  Since this was a terrible gray area of the language, this
  is considered an improvement.  Also note that __rcmp__ is no longer
  supported -- instead of calling __rcmp__, __cmp__ is called with
  reflected arguments.

- In connection with the coercion changes, a new built-in singleton
  object, NotImplemented is defined.  This can be returned for
  operations that wish to indicate they are not implemented for a
  particular combination of arguments.  From C, this is
  Py_NotImplemented.
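
  A sketch of how NotImplemented is meant to be used from Python
  code.  Meters and Feet are hypothetical classes, and the example
  relies on the operator dispatch of current Python:

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Tell the interpreter to try the other operand instead.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


# Meters.__add__ declines, so Feet.__radd__ gets a chance.
total = Meters(1.0) + Feet(10)

# When both sides decline, the interpreter raises TypeError itself.
try:
    Meters(1.0) + "ten feet"
    unsupported = False
except TypeError:
    unsupported = True
```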

- The interpreter now accepts bytecode files on the command line even
  if they do not have a .pyc or .pyo extension. On Linux, after executing

  echo ':pyc:M::\x87\xc6\x0d\x0a::/usr/local/bin/python:' > /proc/sys/fs/binfmt_misc/register

  any byte code file can be used as an executable (i.e. as an argument
  to execve(2)).

- %[xXo] formats of negative Python longs now produce a sign
  character.  In 1.6 and earlier, they never produced a sign,
  and raised an error if the value of the long was too large
  to fit in a Python int.  In 2.0, they produced a sign if and
  only if too large to fit in an int.  This was inconsistent
  across platforms (because the size of an int varies across
  platforms), and inconsistent with hex() and oct().  Example:

  >>> "%x" % -0x42L
  '-42'      # in 2.1
  'ffffffbe' # in 2.0 and before, on 32-bit machines
  >>> hex(-0x42L)
  '-0x42L'   # in all versions of Python

  The behavior of %d formats for negative Python longs remains
  the same as in 2.0 (although in 1.6 and before, they raised
  an error if the long didn't fit in a Python int).

  %u formats don't make sense for Python longs, but are allowed
  and treated the same as %d in 2.1.  In 2.0, a negative long
  formatted via %u produced a sign if and only if too large to
  fit in an int.  In 1.6 and earlier, a negative long formatted
  via %u raised an error if it was too big to fit in an int.

- Dictionary objects have an odd new method, popitem().  This removes
  an arbitrary item from the dictionary and returns it (in the form of
  a (key, value) pair).  This can be useful for algorithms that use a
  dictionary as a bag of "to do" items and repeatedly need to pick one
  item.  Such algorithms normally end up running in quadratic time;
  using popitem() they can usually be made to run in linear time.
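
  As an illustration of the "to do" bag idiom, here is a minimal
  reachability sketch.  The graph and the function name reachable are
  invented for the example, and it uses the "in" operator on
  dictionaries, which is current syntax:

```python
graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
    "e": ["a"],
}


def reachable(graph, start):
    """Return the sorted list of nodes reachable from start."""
    todo = {start: None}     # dictionary used as a bag of pending nodes
    seen = {}
    while todo:
        node, _ = todo.popitem()     # pick and remove an arbitrary item
        seen[node] = None
        for neighbour in graph[node]:
            if neighbour not in seen:
                todo[neighbour] = None
    return sorted(seen)


from_a = reachable(graph, "a")
```

  Which node popitem() returns at each step is arbitrary, but the
  final set of visited nodes does not depend on that order.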

Standard library

- In the time module, the time argument to the functions strftime,
  localtime, gmtime, asctime and ctime is now optional, defaulting to
  the current time (in the local timezone).

- The ftplib module now defaults to passive mode, which is deemed a
  more useful default given that clients are often inside firewalls
  these days.  Note that this could break if ftplib is used to connect
  to a *server* that is inside a firewall, from outside; this is
  expected to be a very rare situation.  To fix that, you can call
  ftp.set_pasv(0).

- The site module now uses .pth files not only for path configuration
  but also to extend the initialization code:  lines starting with
  "import" are executed.

- There's a new module, warnings, which implements a mechanism for
  issuing and filtering warnings.  There are some new built-in
  exceptions that serve as warning categories, and a new command line
  option, -W, to control warnings (e.g. -Wi ignores all warnings, -We
  turns warnings into errors).  warnings.warn(message[, category])
  issues a warning message; this can also be called from C as
  PyErr_Warn(category, message).
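
  A sketch of issuing and filtering a warning from Python code.  It
  uses the warnings API as found in current Python;
  warnings.catch_warnings() in particular is a later addition and is
  used here only to keep the filter changes local.  The function
  old_api is hypothetical:

```python
import warnings


def old_api():
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return 42


# Roughly what -Wi does: ignore all warnings.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    quiet_result = old_api()

# Roughly what -We does: turn warnings into exceptions.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True
```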

- A new module xreadlines was added.  This exports a single factory
  function, xreadlines().  The intention is that this code is the
  absolutely fastest way to iterate over all lines in an open
  file(-like) object:

  import xreadlines
  for line in xreadlines.xreadlines(file):
      ...do something to line...

  This is equivalent to the previous speed record holder using
  file.readlines(sizehint).  Note that if file is a real file object
  (as opposed to a file-like object), this is equivalent:

  for line in file.xreadlines():
      ...do something to line...

- The bisect module has new functions bisect_left, insort_left,
  bisect_right and insort_right.  The old names bisect and insort
  are now aliases for bisect_right and insort_right.  XXX_right
  and XXX_left methods differ in what happens when the new element
  compares equal to one or more elements already in the list:  the
  XXX_left methods insert to the left, the XXX_right methods to the
  right.  Code that doesn't care where equal elements end up should
  continue to use the old, short names ("bisect" and "insort").
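
  The difference only shows when equal elements are present.  In the
  sketch below the list contents are made up, and 20.0 is inserted
  next to 20 so that the position of the new element is visible:

```python
import bisect

scores = [10, 20, 20, 30]

left = bisect.bisect_left(scores, 20)     # index before the run of 20s: 1
right = bisect.bisect_right(scores, 20)   # index after the run of 20s: 3
old_name = bisect.bisect(scores, 20)      # old name, same as bisect_right

# 20.0 compares equal to 20, so the two insort variants place it on
# opposite sides of the existing element.
left_list = [10, 20, 30]
bisect.insort_left(left_list, 20.0)

right_list = [10, 20, 30]
bisect.insort_right(right_list, 20.0)
```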

- The new curses.panel module wraps the panel library that forms part
  of SYSV curses and ncurses.  Contributed by Thomas Gellekum.

- The SocketServer module now sets the allow_reuse_address flag by
  default in the TCPServer class.

- A new function, sys._getframe(), returns the stack frame pointer of
  the caller.  This is intended only as a building block for
  higher-level mechanisms such as string interpolation.
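
  A sketch of the kind of string interpolation helper this makes
  possible.  The helper name interpolate is hypothetical; it looks up
  names in the caller's frame and feeds them to the % operator:

```python
import sys


def interpolate(template):
    """Fill %(name)s fields from the caller's local and global names."""
    caller = sys._getframe(1)          # frame of whoever called us
    names = dict(caller.f_globals)
    names.update(caller.f_locals)      # locals take precedence
    return template % names


def greet():
    user = "guido"
    count = 3
    return interpolate("%(user)s has %(count)d new messages")


message = greet()
```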

Build issues

- For Unix (and Unix-compatible) builds, configuration and building of
  extension modules is now greatly automated.  Rather than having to
  edit the Modules/Setup file to indicate which modules should be
  built and where their include files and libraries are, a
  distutils-based setup.py script now takes care of building most
  extension modules.  All extension modules built this way are built
  as shared libraries.  Only a few modules that must be linked
  statically are still listed in the Setup file; you won't need to
  edit their configuration.

- Python should now build out of the box on Cygwin.  If it doesn't,
  mail to Jason Tishler (jlt63 at users.sourceforge.net).

- Python now always uses its own (renamed) implementation of getopt()
  -- there's too much variation among C library getopt()
  implementations.

- C++ compilers are better supported; the CXX macro is always set to a
  C++ compiler if one is found.

Windows changes

- select module:  By default under Windows, a select() call
  can specify no more than 64 sockets.  Python now boosts
  this Microsoft default to 512.  If you need even more than
  that, see the MS docs (you'll need to #define FD_SETSIZE
  and recompile Python from source).

- Support for Windows 3.1, DOS and OS/2 is gone.  The Lib/dos-8x3
  subdirectory is no more!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ping@lfw.org  Tue Jan 23 04:11:09 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:11:09 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>

Guido van Rossum wrote:
> Yes, wow!

Paul Prescod wrote:
> I apologize but I'm not clear on my responsibilities here, if any. I
> wrote a PEP for online help. I submitted a partial implementation.

Hi, guys.  Sorry i haven't been sending updates on what i'm doing.
Here's the current picture as i see it.

> Ping wrote a full implementation that basically supersedes mine.

My implementation is "full" in that it deploys and seems to work on
arbitrary modules as it stands, but it doesn't really supersede Paul's
because it leaves out the big piece of Paul's work that did conversion
from packaged HTML docs to plain text.

It also has the deficiency that it imports modules live; for untrusted
modules, this is a security risk.  I know Paul has been working on
stuff to compile a module into a kind of skeleton object that has all
the same name bindings but no live contents, and if that works reliably,
we should definitely try plugging that in.

> There are various ideas for improving it, but I think that we agree
> that the core is solid.

Yes.  I believe that as it stands, pydoc is useful enough to be a net
positive addition to the core.  inspect.py alone has been stable and
alpha-ready for some time, i believe.

Here is a summary of its status and work that remains.  pydoc has:

    inspecting live objects
    generating text docs from live objects
    generating HTML docs from live objects
    serving HTML docs from a little web server
    showing docs from the command line
    showing docs from within the interactive interpreter
    apropos-style module listing

It's missing the following, and Paul had stuff for this:

    inspecting unsafe modules
    generating text docs from packaged HTML (e.g. language reference)

It also needs these:

    generating docs from a file given on the command line (easy)
    more Windows and Mac testing and decisions
    various small bugfixes

This past week i've been messing around with Windows and Mac stuff,
trying to see whether it's possible to reliably spawn a webserver
and launch a web browser at the same time (this would seem to be a
good default action to do on GUI platforms).

In trying to do the latter i've found the webbrowser module pretty
unreliable, by the way.  For example, it relies on a constant delay
of 4 seconds to launch a new browser that can't be expected on all
platforms, and fails to launch Netscape 3 because it supplies an
illegal command-line option.  When i've found good cross-platform
ways to make this work i'll suggest some patches.

I've so far considered this project blocked only on cross-platform
testing -- do you agree?  While i know that inspecting unsafe modules
and processing packaged HTML are important features, i don't consider
them essential.


-- ?!ng



From ping@lfw.org  Tue Jan 23 04:14:50 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:14:50 -0800 (PST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> In trying to do the latter i've found the webbrowser module pretty
> unreliable, by the way.  For example, it relies on a constant delay
> of 4 seconds to launch a new browser that can't be expected on all
> platforms, and fails to launch Netscape 3 because it supplies an
> illegal command-line option.  When i've found good cross-platform
> ways to make this work i'll suggest some patches.

Oh, and i forgot to mention... i was pretty disappointed that:

    setenv BROWSER my_browser_program
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

doesn't execute "my_browser_program http://python.org/" as i would
have hoped.  Even for a known browser type:

    setenv BROWSER lynx
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

does not work as expected, either.  (Red Hat Linux here.)


-- ?!ng



From ping@lfw.org  Tue Jan 23 04:22:56 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:22:56 -0800 (PST)
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>

We can implement abstract interfaces (sequence, mapping, number) in
Python with the appropriate __special__ methods, but i don't see an
easy way to test if something supports one of these abstract interfaces
in Python.

At the moment, to see if something is a sequence i believe i have to
say something like

    try:
        x[0]
    except:
        # not a sequence
    else:
        # okay, it's a sequence

or

    if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
        ...

Is there, or should there be, a better way to do this?



-- ?!ng



From greg@cosc.canterbury.ac.nz  Tue Jan 23 04:46:26 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 17:46:26 +1300 (NZDT)
Subject: [Python-Dev] re: I think my set module is ready for prime time
In-Reply-To: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>
Message-ID: <200101230446.RAA01992@s454.cosc.canterbury.ac.nz>

Greg Wilson <gvwilson@nevex.com>:

> an efficient implementation capable of
> handling mutable values (i.e. one that would allow things like sets of
> sets)

I suspect that such a thing is impossible. To avoid a
linear search you have to take advantage of some kind
of hashing or ordering, which you can't do if your
objects can change their values out from under you.

Also, there's nothing to stop someone from mutating
two previously unequal elements so that they're equal.
Then you have a "set" with two identical elements,
which isn't a set any more, it's just a collection.

So, I submit that the very concept of a set only
makes sense for immutable values.
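
The second failure mode can be reproduced with a short sketch.  It
assumes a hash-based set, uses the built-in set type of later Python
versions for brevity, and Box is a hypothetical mutable class that
hashes and compares by its current value:

```python
class Box:
    """Hypothetical mutable value that hashes and compares by content."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


a = Box(1)
b = Box(2)
members = {a, b}          # two unequal elements

b.value = 1               # mutate one element after insertion

now_equal = (a == b)              # the two elements compare equal...
still_two = len(members)          # ...yet the "set" still holds both
stale_lookup = Box(2) in members  # and the old value can no longer be found
```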

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Tue Jan 23 05:03:18 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 23 Jan 2001 00:03:18 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJBIKAA.tim.one@home.com>

[?!ng]
> ...
> At the moment, to see if something is a sequence i believe i have to
> say something like
>
>     try:
>         x[0]
>     except:
>         # not a sequence
>     else:
>         # okay, it's a sequence
>
> or
>
>     if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
>         ...
>
> Is there, or should there be, a better way to do this?

Dunno.  What's a sequence?  If you want to know whether x[0] will blow up,
trying x[0] is the most obvious way.  BTW, I expect trying x[:0] is a better
idea:  doesn't succeed for dicts, and doesn't blow up for an irrelevant
reason if x is an empty sequence.  BTW2, your second method suggests an
uncomfortable truth:  many contexts that want "a sequence" don't want
strings to pass the test, despite that strings are as much sequences as
lists in Python, no matter how "a sequence" is defined.
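
A sketch of the x[:0] probe, written for current Python.  The helper
name accepts_empty_slice is made up, and it catches Exception broadly
because the error a non-sequence raises varies by type and by
interpreter version:

```python
def accepts_empty_slice(x):
    """Return True if x[:0] works, which is a rough 'is it a sequence?' test."""
    try:
        x[:0]
    except Exception:
        return False
    return True


results = {
    "list": accepts_empty_slice([1, 2, 3]),
    "empty list": accepts_empty_slice([]),    # x[0] would fail here
    "tuple": accepts_empty_slice((1, 2)),
    "string": accepts_empty_slice("abc"),     # strings pass, wanted or not
    "dict": accepts_empty_slice({"a": 1}),
    "int": accepts_empty_slice(5),
}
```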

afraid that-what-you-want-to-do-with-it-is-more-important-than-what-
    python-calls-it-ly y'rs  - tim



From ping@lfw.org  Tue Jan 23 05:27:30 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 21:27:30 -0800 (PST)
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: <20010122124159.A14999@thyrsus.com>
Message-ID: <Pine.LNX.4.10.10101222125340.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Eric S. Raymond wrote:
> \section{\module{set} ---
>          Basic set algebra for Python}

I'd like to look at the module.  Did you actually show us the code
for this, or am i a blind doofus?

(Please, no answers to the unasked question of whether i am a doofus.)


-- ?!ng



From tim.one@home.com  Tue Jan 23 06:05:26 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 23 Jan 2001 01:05:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122064400.A26543@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEJEIKAA.tim.one@home.com>

In finding and repairing the test_extcall bug, Neil and Thomas have once
again contributed beyond the call of duty.  Thank you!  It took some doing
to convince Guido to release his Dutch Death Grip on the PythonLabs coffers,
but in the end he was overcome by the moral necessity of rewarding you
sterling fellows for your golden deeds:  you're both entitled to free(*)--
yes, FREE(*)! --copies of all Python 2.1 alpha, *and* beta, releases(*)!

you-wouldn't-believe-how-much-he-charges-us-ly y'rs  - tim


(*) Does not apply to Jython releases.  All applicable taxes are the
responsibility of the recipient.  No warranty is expressed or implied.  This
offer has not been reviewed or approved by CWI, CNRI, BeOpen.com, or Digital
Creations 2.  Export restrictions may apply.  By acceptance of this offer,
recipient grants perpetual license to use their name, image and likeness in
Python promotional materials without compensation.  Packaging, handling,
shipping and insurance costs to be borne by recipient, but in no case to
exceed 1 (one) US$/byte.  This offer may be withdrawn at any time, including
but not limited to retroactively, at the sole discretion of Guido van
Rossum, or such of his heirs and successors as he may designate from time to
time.



From martin@mira.cs.tu-berlin.de  Tue Jan 23 08:14:32 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 23 Jan 2001 09:14:32 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>

> i don't see an easy way to test if something supports one of these
> abstract interfaces in Python.

Why do you want to test for that? If you have an algorithm that only
operates on integer-indexed things, what can you do if the test fails?

So it is always better to just use the object in the algorithm, and
let it break with an exception if somebody passes a bad object.

Regards,
Martin


From mal@lemburg.com  Tue Jan 23 09:08:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:08:24 +0100
Subject: [Python-Dev] webbrowser.py
References: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A6D4A08.B3806984@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> > In trying to do the latter i've found the webbrowser module pretty
> > unreliable, by the way.  For example, it relies on a constant delay
> > of 4 seconds to launch a new browser that can't be expected on all
> > platforms, and fails to launch Netscape 3 because it supplies an
> > illegal command-line option.  When i've found good cross-platform
> > ways to make this work i'll suggest some patches.
> 
> Oh, and i forgot to mention... i was pretty disappointed that:
> 
>     setenv BROWSER my_browser_program
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> doesn't execute "my_browser_program http://python.org/" as i would
> have hoped.  Even for a known browser type:
> 
>     setenv BROWSER lynx
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> does not work as expected, either.  (Red Hat Linux here.)

Hmm, lynx should work (the module has explicit support for it)
and yes, I agree, webbrowser should trust BROWSER and use a
generic calling mechanism (program <url>) for opening the
URL.

Too late for 2.1a1, but maybe for a2 ?!

BTW, I think that the second line here is causing the problem:

class CommandLineBrowser:
    _browsers = [] # <- this overrides the global of the same name
    if os.environ.get("DISPLAY"):
        _browsers.extend([
            ("netscape", "netscape %s >/dev/null &"),
            ("mosaic", "mosaic %s >/dev/null &"),
            ])
    _browsers.extend([
        ("lynx", "lynx %s"),
        ("w3m", "w3m %s"),
        ])
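
If that diagnosis is right, the effect can be reproduced in isolation.
The sketch below is not the webbrowser source; the class name and the
"grail" entry are placeholders chosen for the illustration:

```python
# Module-level registry (the entry is a placeholder).
_browsers = [("grail", "grail %s")]


class CommandLineBrowserDemo:
    # This assignment creates a new class attribute.  From here on, the
    # name _browsers inside the class body refers to it, so the
    # module-level list above is never extended.
    _browsers = []
    _browsers.extend([
        ("lynx", "lynx %s"),
        ("w3m", "w3m %s"),
    ])


global_names = [name for name, _ in _browsers]
class_names = [name for name, _ in CommandLineBrowserDemo._browsers]
```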


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Tue Jan 23 09:15:11 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:15:11 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>
Message-ID: <3A6D4B9F.38B17046@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > i don't see an easy way to test if something supports one of these
> > abstract interfaces in Python.
> 
> Why do you want to test for that? If you have an algorithm that only
> operates on integer-indexed things, what can you do if the test fails?
> 
> So it is always better to just use the object in the algorithm, and
> let it break with an exception if somebody passes a bad object.

Right. 

Polymorphic code will usually get you more out of an
algorithm than type-safe or interface-safe code.

BTW, there are Python interfaces to PySequence_Check() and
PyMapping_Check() buried in the builtin operator module in case
you really do care ;) ...

	operator.isSequenceType()
	operator.isMappingType()
	+ some other C style _Check() APIs

These only look at the type slots though, so Python instances
will appear to support everything but when used fail with
an exception if they don't provide the proper __xxx__ hooks.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Tue Jan 23 09:17:30 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 04:17:30 -0500
Subject: [Python-Dev] webbrowser.py
Message-ID: <20010123041730.A25165@thyrsus.com>

Ping's complaints are justified -- I've been looking at and testing
webbrowser.py and it's a mess.  Among other things:

1. The BROWSER variable is not interpreted properly.

2. The code is stupid about loading platform support it doesn't need.

3. It's not possible to specify lynx as a browser under Unix, because the
   computation of available browsers is split in two and partly done inside
   the CommandLineBrowser class.

4. The module code is excessively hard to read, obscuring these bugs.

Our mistake was hurriedly merging the launcher code from IDLE with the
browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
code is a bad, overcomplicated architecture with a nasty seam in it.

As co-designer/implementor I should have caught this sooner, but I was
in a hurry to get a CML2 prototype out the door and didn't test
anything but the case I needed.  My apologies to all.

I'm rewriting to fix these problems now.  Documented semantics of entry
points will be preserved.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat


From mal@lemburg.com  Tue Jan 23 10:26:16 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 11:26:16 +0100
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com>
Message-ID: <3A6D5C48.A076DA0@lemburg.com>

"Eric S. Raymond" wrote:
> 
> Guido van Rossum <guido@digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.
> 
> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized
> one way or the other for no good reason.
> 
> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.

There's also a kjSet.py available at Aaron's site:

	http://www.chordate.com/kwParsing/index.html

which is a pure Python version of the C extension's kjSet type.
 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)
> 
> 2. It's simple for application programmers to use.  No extension module
> to integrate.
> 
> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for
> the fact that you can't introduce duplicates with the mutators.
> 
> 4. It's already completely documented in a form suitable for the library.
> 
> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

All very well, but are sets really that essential to every
day Python programming ? If we include sets then we ought to
also include graphs, tries, btrees and all those other goodies
we have in computer science. All of these types are available
out there, but I believe the audience who really cares for these
types is also capable of downloading the extensions and installing
them.

It would be nice if all of these extensions could go into a SUMO
edition of Python though... together with your set module.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Tue Jan 23 11:08:06 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 06:08:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D5C48.A076DA0@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 11:26:16AM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com>
Message-ID: <20010123060806.A25436@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> All very well, but are sets really that essential to every
> day Python programming ? If we include sets then we ought to
> also include graphs, tries, btrees and all those other goodies
> we have in computer science.

I use sets a lot.  And there was enough demand to generate a PEP.

But the wider question here is how seriously we take "batteries are
included" as a design principle.  Does a facility have to be useful
*every day* to be worth being in the standard library?  And if so,
what are things like the POP3 and IMAP libraries (or, for that matter,
my own shlex and netrc modules) doing there?

I don't think so.  I think there are at least four different
possible reasons for something to be in the standard library:

1. It's useful every day.

2. It's useful less frequently than every day, but is a stable
cross-platform implementation of a wheel that would otherwise have to
be reinvented frequently.  That is, you can solve it *once* and have a
zero-maintenance increment to the power of the language.

3. It's a technique that's not often used, and not necessarily stable 
in the face of platform variations, but nothing else will do
when you need it and it's notably difficult to get right.  (popen2 and
BaseHTTPServer would be good examples of this.)

4. It's a developer checklist feature that improves Python's competitive
position against Perl, Tcl, and other contenders for the same ecological
niche.

IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
even if not under 1 and 3.  

This question keeps coming up in different guises.  I'm often the one to
raise it, because I favor an aggressive interpretation of "batteries
are included" that would pull in a lot of stuff.  Yes, this makes more
work for us -- but I think it's work we should be doing.  

While minimalism is an excellent design heuristic for the core language,
I think it's a bad one for the libraries.  Python is a high-level language
and programmers using it both expect and deserve high-level libraries --
yes, including graphs/tries/btrees and all that computer science stuff.

Just as much to the point, Python is competing against languages like
Perl that frequently get design wins against it because of the
richness of the environment *they* are willing to carry around.

Guido and Tim and others are more conservative than I, which would be
OK -- but it seems to me that the conservatives do not have consistent
or well-thought-out criteria for what to include, which is *not* OK.
We need to solve this problem.

Some time back I initiated a library guidelines PEP, then dropped it
due to press of overwork.  But the general question is going to keep
coming up and we ought to have policy guidelines that potential 
library developers can understand.  

Should I pick this up again?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

I do not find in orthodox Christianity one redeeming feature.
	-- Thomas Jefferson


From mal@lemburg.com  Tue Jan 23 11:50:39 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 12:50:39 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com>
Message-ID: <3A6D700F.7A9E2509@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > All very well, but are sets really that essential to every
> > day Python programming ? If we include sets then we ought to
> > also include graphs, tries, btrees and all those other goodies
> > we have in computer science.
> 
> I use sets a lot.  And there was enough demand to generate a PEP.

Sure, but sets are fairly easy to implement using Python dictionaries
-- at least at the level normally needed by Python programs. Sets, queues
and graphs are examples of data types which can have many
different faces; it is hard to design APIs for these which meet 
everyone's needs.
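
A minimal sketch of the dictionary-backed approach mentioned above, written
in present-day Python.  The class name DictSet and its method names are
hypothetical, chosen only for illustration; this is not the API of any
proposal discussed in this thread.  Members are stored as dictionary keys,
so they have to be hashable.

```python
class DictSet:
    """Hypothetical minimal set type; the dictionary keys are the members."""

    def __init__(self, items=()):
        self._members = {}
        for item in items:
            self._members[item] = None   # the value is irrelevant

    def add(self, item):
        self._members[item] = None

    def discard(self, item):
        # Removing an absent member is silently ignored.
        self._members.pop(item, None)

    def __contains__(self, item):
        return item in self._members

    def __len__(self):
        return len(self._members)

    def __iter__(self):
        return iter(self._members)

    def union(self, other):
        result = DictSet(self)
        for item in other:
            result.add(item)
        return result

    def intersection(self, other):
        return DictSet(item for item in other if item in self)
```

Even this small sketch has to take a position on questions such as
hashability and what the operations accept as arguments, which is where
the API design gets hard.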
 
> But the wider question here is how seriously we take "batteries are
> included" as a design principle.  Does a facility have to be useful
> *every day* to be worth being in the standard library?  And if so,
> what are things like the POP3 and IMAP libraries (or, for that matter,
> my own shlex and netrc modules) doing there?

You can argue the same way for all kinds of extensions and
packages you find in the Vaults. That's why there's demand for
a different packaging of Python and this is what Moshe's
PEP 206 addresses:

	http://python.sourceforge.net/peps/pep-0206.html

> I don't think so. I think there are at least four different
> possible reasons for something to be in the standard library:
> 
> 1. It's useful every day.
> 
> 2. It's useful less frequently than every day, but is a stable
> cross-platform implementation of a wheel that would otherwise have to
> be reinvented frequently.  That is, you can solve it *once* and have a
> zero-maintenance increment to the power of the language.
> 
> 3. It's a technique that's not often used, and not necessarily stable
> in the face of platform variations, but nothing else will do
> when you need it and it's notably difficult to get right.  (popen2 and
> BaseHTTPServer would be good examples of this.)
> 
> 4. It's a developer checklist feature that improves Python's competitive
> position against Perl, Tcl, and other contenders for the same ecological
> niche.
> 
> IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
> even if not under 1 and 3.
> 
> This question keeps coming up in different guises.  I'm often the one to
> raise it, because I favor an aggressive interpretation of "batteries
> are included" that would pull in a lot of stuff.  Yes, this makes more
> work for us -- but I think it's work we should be doing.
> 
> While minimalism is an excellent design heuristic for the core language,
> I think it's a bad one for the libraries.  Python is a high-level language
> and programmers using it both expect and deserve high-level libraries --
> yes, including graphs/tries/btrees and all that computer science stuff.
> 
> Just as much to the point, Python is competing against languages like
> Perl that frequently get design wins against it because of the
> richness of the environment *they* are willing to carry around.
> 
> Guido and Tim and others are more conservative than I, which would be
> OK -- but it seems to me that the conservatives do not have consistent
> or well-thought-out criteria for what to include, which is *not* OK.
> We need to solve this problem.
> 
> Some time back I initiated a library guidelines PEP, then dropped it
> due to press of overwork.  But the general question is going to keep
> coming up and we ought to have policy guidelines that potential
> library developers can understand.
> 
> Should I pick this up again?

Hmm, we already have PEP 206, which focuses on the topic.
Perhaps you could work with Moshe to sort out the "which
batteries do we need" sub-topic ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Tue Jan 23 12:20:46 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 07:20:46 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D700F.7A9E2509@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 12:50:39PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com>
Message-ID: <20010123072046.A25593@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> > But the wider question here is how seriously we take "batteries are
> > included" as a design principle.  Does a facility have to be useful
> > *every day* to be worth being in the standard library?  And if so,
> > what are things like the POP3 and IMAP libraries (or, for that matter,
> > my own shlex and netrc modules) doing there?
> 
> You can argue the same way for all kinds of extensions and
> packages you find in the Vaults. That's why there's demand for
> a different packaging of Python and this is what Moshe's
> PEP 206 addresses:
> 
> 	http://python.sourceforge.net/peps/pep-0206.html

Muttering "PEP 206" evades the fundamental problem rather than solving it.

Not that I'm saying Moshe hasn't made a valiant effort, within the political
constraint that the BDFL and others seem unwilling to confront the deeper 
issue.  But PEP 206 is not enough.  Here is why:

1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
Guido issues will quickly become of mostly theoretical interest -- because
Red Hat and everybody else will move to Sumo instantly, figuring they have
nothing to lose by including more features.

2. If by some chance I'm wrong about 1, the outcome will be worse;
we'll in effect have fragmented the language, because there won't be
consistency in what library stuff is available between Sumo and
non-Sumo builds on the same platform.

3. There are documentation issues as well.  It's already a blot on
Python that the standard documentation set doesn't cover Tkinter.  In
the Sumo distribution, the gap between what's installed and what's
documented is likely to widen further.  Developers will see this as
pointlessly irritating -- and they'll be right.

The stock distribution should *be* the Sumo distribution.  If we're really
so terrified of the extra maintenance load, then the right fix is to
mark some modules and documentation as "externally maintained" with 
prominent pointers back to the responsible people.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The day will come when the mystical generation of Jesus by the Supreme
Being as his father, in the womb of a virgin, will be classed with the
fable of the generation of Minerva in the brain of Jupiter.
	-- Thomas Jefferson, 1823


From mal@lemburg.com  Tue Jan 23 12:48:09 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 13:48:09 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com>
Message-ID: <3A6D7D89.A6BE1B74@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > > But the wider question here is how seriously we take "batteries are
> > > included" as a design principle.  Does a facility have to be useful
> > > *every day* to be worth being in the standard library?  And if so,
> > > what are things like the POP3 and IMAP libraries (or, for that matter,
> > > my own shlex and netrc modules) doing there?
> >
> > You can argue the same way for all kinds of extensions and
> > packages you find in the Vaults. That's why there's demand for
> > a different packaging of Python and this is what Moshe's
> > PEP 206 addresses:
> >
> >       http://python.sourceforge.net/peps/pep-0206.html
> 
> Muttering "PEP 206" evades the fundamental problem rather than solving it.
> 
> Not that I'm saying Moshe hasn't made a valiant effort, within the political
> constraint that the BDFL and others seem unwilling to confront the deeper
> issue.  But PEP 206 is not enough.  Here is why:
> 
> 1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
> Guido issues will quickly become of mostly theoretical interest -- because
> Red Hat and everybody else will move to Sumo instantly, figuring they have
> nothing to lose by including more features.
> 
> 2. If by some chance I'm wrong about 1, the outcome will be worse;
> we'll in effect have fragmented the language, because there won't be
> consistency in what library stuff is available between Sumo and
> non-Sumo builds on the same platform.
> 
> 3. There are documentation issues as well.  It's already a blot on
> Python that the standard documentation set doesn't cover Tkinter.  In
> the Sumo distribution, the gap between what's installed and what's
> documented is likely to widen further.  Developers will see this as
> pointlessly irritating -- and they'll be right.
> 
> The stock distribution should *be* the Sumo distribution.  If we're really
> so terrified of the extra maintenance load, then the right fix is to
> mark some modules and documentation as "externally maintained" with
> prominent pointers back to the responsible people.

That's your POV; others think differently, and since this is not
a democracy, the Sumo distribution is a feasible way of satisfying
both needs.

There are a few other issues to consider as well:

* licensing is a problem (and this is also mentioned in the PEP 206)
  since some of the nicer additions are GPLed and thus not
  in the spirit of Python's closed-source friendliness which
  has provided it with a large user base in the commercial field

* package authors are not all the same and some may not want
  to split their distribution due to the integration of their
  package in a Sumo-distribution

* the packages mentioned in PEP 206 are very complex and usually
  largish; maintaining them will cause much more effort compared
  to the standard lib modules and extensions

* the build process varies widely between packages; even though
  we have distutils, some of the packages extend it to fit
  their specific needs (which is OK, but causes extra efforts
  in getting the build process combined)

I'm not objecting to the Sumo-distribution project; to the 
contrary -- I tried a similar project a few years ago:
the Python PowerTools distribution which you can download
from:

	http://www.lemburg.com/python/PowerTools-0.2.zip

The project died quickly though, as I wasn't able to keep
up with the maintenance effort.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From akuchlin@mems-exchange.org  Tue Jan 23 13:40:06 2001
From: akuchlin@mems-exchange.org (Andrew Kuchling)
Date: Tue, 23 Jan 2001 08:40:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D7D89.A6BE1B74@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 01:48:09PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com>
Message-ID: <20010123084006.A23485@newcnri.cnri.reston.va.us>

On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
>There are a few other issues to consider as well:
>   <good list deleted>

To add a few:

* The larger the amount of code in the distribution, the more effort it is
  to maintain it all.

* Minor fixes aren't available until the next Python release.  For example,
  to drag out the XML code again: there have been two PyXML releases since
  Python 2.0 fixing various bugs, but someone who sticks to installing just 
  Python will not be able to get at those bugfixes until April (when 2.1
  is supposed to get finalized). 

If there were a core Python distribution and a sumo distribution, and the
sumo distribution was the one that most people downloaded and used, that
would be perfectly OK.  Practically no one assembles their own Linux
distribution, and that's not considered a problem.  To some degree, if
you're using a well-packaged Linux distribution such as Debian, you also
have a Python distribution mechanism with intermodule dependencies; we just
have to reinvent the wheel for people on other platforms.

>The project died quickly though, as I wasn't able to keep
>up with the maintenance effort.

Interesting.  Did you get much feedback indicating that people used it much?
Perhaps when you were doing that effort the Python community was composed
more of self-reliant early adopter types; there are probably more newbies
around now.

--amk


From mal@lemburg.com  Tue Jan 23 14:05:13 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 15:05:13 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com> <20010123084006.A23485@newcnri.cnri.reston.va.us>
Message-ID: <3A6D8F99.53A0F411@lemburg.com>

Andrew Kuchling wrote:
> 
> On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
> >There are a few other issues to consider as well:
> >   <good list deleted>
> 
> To add a few:
> 
> * The larger the amount of code in the distribution, the more effort it is
>   to maintain it all.
> 
> * Minor fixes aren't available until the next Python release.  For example,
>   to drag out the XML code again: there have been two PyXML releases since
>   Python 2.0 fixing various bugs, but someone who sticks to installing just
>   Python will not be able to get at those bugfixes until April (when 2.1
>   is supposed to get finalized).
> 
> If there were a core Python distribution and a sumo distribution, and the
> sumo distribution was the one that most people downloaded and used, that
> would be perfectly OK.  Practically no one assembles their own Linux
> distribution, and that's not considered a problem.  To some degree, if
> you're using a well-packaged Linux distribution such as Debian, you also
> have a Python distribution mechanism with intermodule dependencies; we just
> have to reinvent the wheel for people on other platforms.
> 
> >The project died quickly though, as I wasn't able to keep
> >up with the maintenance effort.
> 
> Interesting.  Did you get much feedback indicating that people used it much?

Not much -- the interested parties were mostly Python experts (the
lib started out as a project called expert-lib).

> Perhaps when you were doing that effort the Python community was composed
> more of self-reliant early adopter types; there are probably more newbies
> around now.

True. The included packages are dated 1997-1998 -- at that time
Starship was just starting to get off the ground (things are moving
at a much faster pace now).

The PowerTools package still uses the Makefile.pre.in mechanism
(with much success though) as distutils wasn't even considered
at the time. Perhaps Moshe could pick this up to have a head
start for Sumo-Python ?!

Some of the included packages are not available elsewhere, AFAIK,
so it may well be worthwhile having a look (e.g. the LGPLed trie and
btree implementations donated by John W. M. Stevens).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Tue Jan 23 14:06:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 09:06:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 04:17:30 EST."
 <20010123041730.A25165@thyrsus.com>
References: <20010123041730.A25165@thyrsus.com>
Message-ID: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>

> Ping's complaints are justified -- I've been looking at and testing
> webbrowser.py and it's a mess.  Among other things:
> 
> 1. The BROWSER variable is not interpreted properly.
> 
> 2. The code is stupid about loading platform support it doesn't need.
> 
> 3. It's not possible to specify lynx as a browser under Unix, because the
>    computation of available browsers is split in two and partly done inside
>    the CommandLineBrowser class.
> 
> 4. The module code is excessively hard to read, obscuring these bugs.
> 
> Our mistake was hurriedly merging the launcher code from IDLE with the
> browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
> code is a bad, overcomplicated architecture with a nasty seam in it.
> 
> As co-designer/implementor I should have caught this sooner, but I was
> in a hurry to get a CML2 prototype out the door and didn't test
> anything but the case I needed.  My apologies to all.
> 
> I'm rewriting to fix these problems now.  Documented semantics of entry
> points will be preserved.

Excellent, Eric!  That's the spirit.

Can you point me to docs explaining the meaning of the BROWSER
environment variable?  I've never heard of it...  The last new
environment variables I learned were PAGER and EDITOR, probably 15
years ago when 4.1BSD was released... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Tue Jan 23 14:22:26 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 09:22:26 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:06:47AM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
Message-ID: <20010123092226.A25968@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

You've never heard of BROWSER because I invented it and have not
widely popularized it yet :-).  Ping knew about it either because he
read the module code and saw that it was supposed to work, or because
he remembered the design discussion when webbrowser.py was first
implemented.

I've had conversations with some key Perl and Tcl people (Larry Wall,
Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
agree it's a good idea.  I'll probably hack support for it into Perl's
browser launcher next.

It's documented in the version of libwebbrowser.tex now in the CVS tree.
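
A rough sketch of how a launcher might read the BROWSER convention.  It
assumes the variable holds a list of browser commands separated by
os.pathsep, in order of preference, which is how the present-day
webbrowser documentation describes it.  The helper name
browser_candidates is hypothetical and is not part of webbrowser.py.

```python
import os


def browser_candidates(environ=None):
    """Return the browser commands named in BROWSER, most preferred first."""
    if environ is None:
        environ = os.environ
    raw = environ.get("BROWSER", "")
    # Skip empty entries produced by leading, trailing or doubled separators.
    return [entry for entry in raw.split(os.pathsep) if entry]
```

A caller would try each candidate in turn and fall back to the platform
defaults when the list is empty or none of the commands can be run.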
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857


From nas@arctrix.com  Tue Jan 23 08:30:56 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 00:30:56 -0800
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
Message-ID: <20010123003056.A28309@glacier.fnational.com>

Why is the configure.in file set to always use "install-sh"?
There is a comment that says:

    # Install just never works :-(

I don't think that statement is accurate.  /usr/bin/install works
quite well on my machine.  The only comments I can find in the
changelog are:

    revision 1.16
    date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
    add INSTALL_PROGRAM and INSTALL_DATA; check for getopt

and:

    revision 1.5
    date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
    Simplify value of INSTALL (always 'cp').

Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
documentation seems to indicate that it does what we want.

 Neil


From guido@digicool.com  Tue Jan 23 15:31:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:31:39 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: Your message of "Tue, 23 Jan 2001 10:15:11 +0100."
 <3A6D4B9F.38B17046@lemburg.com>
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>
 <3A6D4B9F.38B17046@lemburg.com>
Message-ID: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>

> Polymorphic code will usually get you more out of an 
> algorithm, than type-safe or interface-safe code.

Right.

But there are times when people want to write methods that take
e.g. either a sequence or a mapping, and need to distinguish between
the two.  That's not easy in Python!  Java and C++ support it very
well though, and thus we'll always keep seeing this kind of
complaint.  Not sure what to do, except to recommend "find out which
methods you expect in one case but not in the other (e.g. keys()) and
do a hasattr() test for that."
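
The hasattr() recommendation might be sketched as follows.  The function
name describe is hypothetical, and the second probe for __getitem__ is an
added assumption about what should count as a sequence; the advice above
only covers testing for a method such as keys().

```python
def describe(container):
    """Classify an argument as mapping-like or sequence-like by probing it."""
    # Mappings offer keys(); sequences such as lists and strings do not.
    if hasattr(container, "keys"):
        return "mapping"
    # Anything else that supports indexing is treated as a sequence.
    if hasattr(container, "__getitem__"):
        return "sequence"
    raise TypeError("expected a sequence or a mapping")
```

The test is deliberately shallow: any object that happens to define keys()
will be taken for a mapping, which is the price of probing for methods
rather than checking types.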

> BTW, there are Python interfaces to PySequence_Check() and
> PyMapping_Check() buried in the builtin operator module in case
> you really do care ;) ...
> 
> 	operator.isSequenceType()
> 	operator.isMappingType()
> 	+ some other C style _Check() APIs
> 
> These only look at the type slots though, so Python instances
> will appear to support everything but when used fail with
> an exception if they don't provide the proper __xxx__ hooks.

Yes, these should probably be deprecated.  I certainly have never used
them!  (The operator module doesn't seem to get much use in
general...  Was it a bad idea?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 23 15:49:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:49:23 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 15:13:09 EST."
 <20010122151309.C15236@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
 <20010122151309.C15236@thyrsus.com>
Message-ID: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

Actually, I thought that Greg's proposal has some charm: it seems to
be using a natural extension of the existing dictionary syntax, where
a set is a dictionary without the values.  I haven't thought about
this deeply enough, but I see a lot of potential here.

I understand that you have probably given this more thought than I
have recently, so I'd like to see your more detailed analysis of what
you do and don't like about Greg's proposal!

> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.
> 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)

I haven't read your docs yet (and no time because Digital Creations is
requiring my attention all of today), but I expect that designing a
universal set type, one that is good enough to be used in all sorts of
applications, is very difficult.  

> 2. It's simple for application programmers to use.  No extension module
> to integrate.

This is a silly argument for wanting something to be added to the
core.  If it's part of the core, the need for an extension is
immaterial because that extension will always be available.  So
I conclude that your module is set up perfectly for a popular module
in the Vaults. :-)

> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for 
> the fact that you can't introduce duplicates with the mutators.

Ah, so you see a set as an extension of a sequence.  That may be the
big rift between your version and Greg's PEP: are sets more like
sequences or more like dictionaries?

> 4. It's already completely documented in a form suitable for the library.

Much appreciated.

> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

I'll be the judge of that, and since you prefer not to show your
source code (why is that?), I can't tell yet.

[...time flows...]

Having just skimmed your docs, I'm disappointed that you choose lists
as your fundamental representation type -- this makes it slow to test
for membership and hence makes intersection and union slow.  I suppose
that you have evidence from using this that those operations aren't
used much, or not for large sets?  This is one of the problems with
coming up with a set type for the core: it has to work for (nearly)
everybody.  It's no big deal if the Vaults contain three or more set
modules -- perfect even, people can choose the best one for their
purpose.  But in the core, there's only room for one set type or
module.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Tue Jan 23 16:30:50 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 11:30:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:49:23AM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <20010123113050.A26162@thyrsus.com>

--tKW2IUtsqtDRztdT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Guido van Rossum <guido@digicool.com>: 
> I understand that you have probably given this more thought than I
> have recently, so I'd like to see your more detailed analysis of what
> you do and don't like about Greg's proposal!

I've already covered my big objection, the fact that it doesn't
support the degree of polymorphic crossover one might expect with
sequence types (and Greg has agreed that I have a point there).
Another problem is the lack of support for mutable elements (and yes,
I'm quite aware of the problems with this.)

One thing I do like is the proposal for an actual set input syntax.  Of course
this would require that the set type become one of the builtins, with 
compiler support.

> I haven't read your docs yet (and no time because Digital Creations is
> requiring my attention all of today), but I expect that designing a
> universal set type, one that is good enough to be used in all sorts of
> applications, is very difficult.  

For "difficult" read "can't be done".  This is one of those cases where
no matter what implementation you choose, some of the operations you want
to be cheap will be worst-case quadratic.  Life is like that.  So I chose
a dead-simple representation and accepted quadratic times for 
union/intersection.

> > 2. It's simple for application programmers to use.  No extension module
> > to integrate.
> 
> This is a silly argument for wanting something to be added to the
> core.  If it's part of the core, the need for an extension is
> immaterial because that extension will always be available.  So
> I conclude that your module is set up perfectly for a popular module
> in the Vaults. :-)

Reasonable point.
 
> > 3. It's unsurprising.  My set objects behave almost exactly like other
> > mutable sequences, with all the same built-in methods working, except for 
> > the fact that you can't introduce duplicates with the mutators.
> 
> Ah, so you see a set as an extension of a sequence.  That may be the
> big rift between your version and Greg's PEP: are sets more like
> sequences or more like dictionaries?

Indeed it is.  

> > 5. It's simple enough not to cause you maintenance hassles down the
> > road, and even if it did the maintainer is unlikely to disappear :-).
> 
> I'll be the judge of that, and since you prefer not to show your
> source code (why is that?), I can't tell yet.

No nefarious concealment going on here :-), I've sent versions of
the code to Greg and Ping already.  I'll shoot you a copy too.
 
> Having just skimmed your docs, I'm disappointed that you choose lists
> as your fundamental representation type -- this makes it slow to test
> for membership and hence makes intersection and union slow.

Not quite.  Membership test is still linear-time; so is adding and deleting
elements.  It's true that union and intersection are quadratic, but see below.

>                                                      I suppose
> that you have evidence from using this that those operations aren't
> used much, or not for large sets?

Exactly!  In my experience the usage pattern of a class like this runs
heavily to small sets (usually < 64 elements); membership tests
dominate usage, with addition and deletion of elements running second
and the "classical" boolean operations like union and intersection
being uncommon.

What you get by going with a dictionary representation is that
membership test becomes close to constant-time, while insertion and
deletion become sometimes cheap and sometimes quite expensive
(depending of course on whether you have to allocate a new 
hash bucket).  Given the usage pattern I described, the overall
difference in performance is marginal.
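
The membership-test half of this tradeoff can be made concrete with a small
experiment, written in present-day Python.  The class Counted is
hypothetical, introduced only to count equality comparisons; the set size
of 64 echoes the "small sets" figure above.  This is a sketch of the
general behaviour of lists and dictionaries, not a measurement of set.py.

```python
class Counted:
    """Hypothetical element type that counts equality comparisons."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


items = [Counted(i) for i in range(64)]
probe = Counted(63)                  # equal to the last element, worst case

Counted.comparisons = 0
found_in_list = probe in items       # linear scan over the list
list_cost = Counted.comparisons

as_dict = dict.fromkeys(items)
Counted.comparisons = 0
found_in_dict = probe in as_dict     # hash lookup
dict_cost = Counted.comparisons
```

The list scan performs one comparison per element until it reaches the
match, while the dictionary lookup hashes the probe and compares only
against entries with the same hash.  With sets this small the absolute
difference is modest, which is the point being argued.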

>                              This is one of the problems with
> coming up with a set type for the core: it has to work for (nearly)
> everybody.

As I pointed out above (and someone else on the list had made the same point
earlier), "works for everybody" isn't really possible here.  So my solution
does the next best thing -- pick a choice of tradeoffs that isn't obviously
worse than the alternatives and keeps things bog-simple.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!

--tKW2IUtsqtDRztdT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="set.py"

"""
A set-algebra module for Python.

The functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
They are insensitive to the types of the elements.

Lists are used rather than dictionaries so the elements can be mutable.

"""
# Design and implementation by ESR, January 2001.

def setify(list1):		# Used by set constructor
    "Remove duplicates in sequence."
    res = []
    for i in range(len(list1)):
	duplicate = 0
        for j in range(i):
	    if list1[i] == list1[j]:
		duplicate = 1
		break
	if not duplicate:
	    res.append(list1[i])
    return res

def union(list1, list2):		# Used for |
    "Compute set union of sequences."
    res = list1[:]
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def intersection(list1, list2):		# Used for &
    "Compute set intersection of sequences."
    res = []
    for x in list1:
	if x in list2:
	    res.append(x)
    return res

def difference(list1, list2):		# Used for -
    "Compute set difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    return res

def symmetric_difference(list1, list2):	# Used for ^
    "Compute set symmetric-difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def cartesian(list1, list2):		# Used for *
    "Cartesian product of sequences considered as sets."
    res = []
    for x in list1:
	for y in list2:
	    res.append((x,y))
    return res

def equality(list1, list2):
    "Test sequences considered as sets for equality."
    if len(list1) != len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    for x in list2:
        if not x in list1:
            return 0
    return 1

def proper_subset(list1, list2):
    "Return 1 if first argument is a proper subset of second, 0 otherwise."
    if not len(list1) < len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def subset(list1, list2):
    "Return 1 if first argument is a subset of second, 0 otherwise."
    if not len(list1) <= len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def powerset(base):
    "Compute the set of all subsets of a set."
    powerset = []
    for n in xrange(2 ** len(base)):
        subset = []
        for e in xrange(len(base)):
            if n & 2 ** e:
                subset.append(base[e])
        powerset.append(subset)
    return powerset

class set:
    "Lists with set-theoretic operations."

    def __init__(self, value):
        self.elements = setify(value)

    def __len__(self):
	return len(self.elements)

    def __getitem__(self, ind):
	return self.elements[ind]

    def __setitem__(self, ind, val):
        if val not in self.elements:
            self.elements[ind] = val

    def __delitem__(self, ind):
	del self.elements[ind]

    def list(self):
        return self.elements

    def append(self, new):
        if new not in self.elements:
            self.elements.append(new)

    def extend(self, new):
	self.elements.extend(new)
        self.elements = setify(self.elements)

    def count(self, x):
        return self.elements.count(x)

    def index(self, x):
        return self.elements.index(x)

    def insert(self, i, x):
        if x not in self.elements:
            self.elements.insert(i, x)

    def pop(self, i=-1):
        return self.elements.pop(i)

    def remove(self, x):
	self.elements.remove(x)

    def reverse(self):
	self.elements.reverse()

    def sort(self, cmp=None):
	self.elements.sort(cmp)

    def __or__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(union(self.elements, other))

    __add__ = __or__

    def __and__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(intersection(self.elements, other))

    def __sub__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(difference(self.elements, other))

    def __xor__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(symmetric_difference(self.elements, other))

    def __mul__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(cartesian(self.elements, other))

    def __eq__(self, other):
        if type(other) == type(self):
            other = other.elements
        # Compare as sets, so element order does not matter.
        return equality(self.elements, other)

    def __ne__(self, other):
        if type(other) == type(self):
            other = other.elements
        return not equality(self.elements, other)

    def __lt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(self.elements, other)

    def __le__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(self.elements, other)

    def __gt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(other, self.elements)

    def __ge__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(other, self.elements)

    def __str__(self):
        # Joining avoids emitting a bare "}" for the empty set.
        return "{" + ", ".join(map(str, self.elements)) + "}"

    def __repr__(self):
        return repr(self.elements)

if __name__ == '__main__':
    a = set([1, 2, 3, 4])
    b = set([1, 4])
    c = set([5, 6])
    d = [1, 1, 2, 1]
    print `d`, "setifies to", set(d)
    print `a`, "|", `b`, "is", `a | b`
    print `a`, "^", `b`, "is", `a ^ b`
    print `a`, "&", `b`, "is", `a & b`
    print `b`, "*", `c`, "is", `b * c`
    print `a`, '<', `b`, "is", `a < b`
    print `a`, '>', `b`, "is", `a > b`
    print `b`, '<', `c`, "is", `b < c`
    print `b`, '>', `c`, "is", `b > c`
    print "Power set of", `c`, "is", powerset(c)

# end

--tKW2IUtsqtDRztdT--


From sdm7g@virginia.edu  Tue Jan 23 17:12:22 2001
From: sdm7g@virginia.edu (Steven D. Majewski)
Date: Tue, 23 Jan 2001 12:12:22 -0500 (EST)
Subject: [Python-Dev] libraries=['m'] in config.py [Re: Python 2.1 alpha 1 released!]
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.NXT.4.21.0101231204010.227-100000@localhost.virginia.edu>


Is there a simple way (other than editing config.py) to remove the
effect of all of the "libraries=['m']" options from config.py ? 

This breaks the MacOSX build as there's no libm -- that functionality
is built into the System.framework.

Shouldn't these types of flags be acquired from configure or the
make environment somehow ? 
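
As a minimal sketch of what I mean, the library list could be computed from
the platform instead of being hard-coded.  The helper name math_libraries is
hypothetical, and the "darwin" prefix test is my assumption about how the
platform would be identified; it is written for a current Python 3
interpreter.

```python
import sys


def math_libraries(platform):
    """Return the libraries= value for extensions needing the C math library.

    Assumption: platforms whose name starts with "darwin" provide the math
    functions in the system framework, so no separate libm is linked there.
    """
    if platform.startswith("darwin"):
        return []
    return ["m"]


# A build script could then pass libraries=math_libraries(sys.platform).
print(sys.platform, math_libraries(sys.platform))
```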

-- Steve Majewski 


( BTW: OSX build also needs a "-traditional-cpp" flag to get thru 
  compiling classobject.c without error. ) 






From uche.ogbuji@fourthought.com  Tue Jan 23 17:28:18 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 10:28:18 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
In-Reply-To: Message from Martin von Loewis <loewis@informatik.hu-berlin.de>
 of "Mon, 22 Jan 2001 15:46:39 +0100." <200101221446.PAA05164@pandora.informatik.hu-berlin.de>
Message-ID: <200101231728.KAA03408@localhost.localdomain>

> > This has nothing to do with Python. UTF-8 marks the codes 
> > from 128-191 as illegal prefix. 
> [...]
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> Right on both accounts. If no encoding is specified, and if the
> document appears not to be UTF-16 in any endianness, an XML processor
> shall assume it is UTF-8. As Marc-Andre explains, your document is not
> proper UTF-8, hence the error.
> 
> The confusing thing is that expat itself does not care about it not
> being UTF-8; that is only detected when the callback is invoked in
> pyexpat, and therefore conversion to a Unicode object is attempted.

Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover" 
from well-formedness errors.  And I would classify blithely reporting the 
character data as "recovery".

However, I'm amazed that this wouldn't have come up before, considering the 
pedigree of expat.

I'll poke around, and raise a bug on the expat site if need be.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From tismer@tismer.com  Tue Jan 23 17:35:08 2001
From: tismer@tismer.com (Christian Tismer)
Date: Tue, 23 Jan 2001 18:35:08 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
References: <200101231728.KAA03408@localhost.localdomain>
Message-ID: <3A6DC0CC.C4FF83DF@tismer.com>


uche.ogbuji@fourthought.com wrote:
> 
> > > This has nothing to do with Python. UTF-8 marks the codes
> > > from 128-191 as illegal prefix.
> > [...]
> > > Perhaps the parser should catch the UnicodeError and
> > > instead return a not-wellformed exception ?!
> >
> > Right on both accounts. If no encoding is specified, and if the
> > document appears not to be UTF-16 in any endianness, an XML processor
> > shall assume it is UTF-8. As Marc-Andre explains, your document is not
> > proper UTF-8, hence the error.
> >
> > The confusing thing is that expat itself does not care about it not
> > being UTF-8; that is only detected when the callback is invoked in
> > pyexpat, and therefore conversion to a Unicode object is attempted.
> 
> Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover"
> from well-formedness errors.  And I would classify blithely reporting the
> character data as "recovery".
> 
> However, I'm amazed that this wouldn't have come up before, considering the
> pedigree of expat.

Well, I had to write a preprocessor which turns some "xml-like"
but not well-formed stuff into something useable. This was a
bulk of 100 MB of data, partially hand-written, partially
machine-generated, but not really well-formed. Some
special characters appeared very late in the data set, raising
an error in Python 2.0, but not in 1.5.2, so I perceived
it as an error in the parser first, not the data. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From uche.ogbuji@fourthought.com  Tue Jan 23 17:55:12 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 10:55:12 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
In-Reply-To: Message from Christian Tismer <tismer@tismer.com>
 of "Mon, 22 Jan 2001 16:05:24 +0100." <3A6C4C34.4D1252C9@tismer.com>
Message-ID: <200101231755.KAA03471@localhost.localdomain>

> "M.-A. Lemburg" wrote:
> ...
> > > The codes from 192 to 236, 238-243 produce
> > > "UTF-8 decoding error: invalid data",
> > > the rest gives "not well-formed".
> > >
> > > I would like to know if this happens with your (Tim) modified
> > > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> > 
> > This has nothing to do with Python. UTF-8 marks the codes
> > from 128-191 as illegal prefix. See Objects/unicodeobject.c:
> ...
> 
> Schade.
> 
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> I believe it would be better.

Yes, and given there is not much time before the 2.1 release, doing so is an 
acceptable stop-gap.  However, I think the real fix has to lie in expat.
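
To make the stop-gap concrete, here is a rough sketch of translating a
decoding failure into a well-formedness error.  It is not pyexpat's code:
NotWellFormedError and decode_character_data are hypothetical names, and it
is written for a current Python 3 interpreter, where bytes.decode raises
UnicodeDecodeError (a subclass of UnicodeError).

```python
class NotWellFormedError(ValueError):
    """Hypothetical stand-in for a parser's well-formedness error."""


def decode_character_data(raw):
    """Decode bytes as UTF-8, reporting bad input as a well-formedness error."""
    try:
        return raw.decode("utf-8")
    except UnicodeError as exc:
        raise NotWellFormedError(
            "not well-formed (invalid UTF-8): %s" % exc) from exc


def rejected_lead_bytes():
    """List the byte values in 128-191 that fail when they start a sequence."""
    rejected = []
    for value in range(128, 192):
        try:
            decode_character_data(bytes([value]) + b"A")
        except NotWellFormedError:
            rejected.append(value)
    return rejected


print(len(rejected_lead_bytes()), "lead bytes rejected")
```

The caller then sees a single error type for bad documents, whichever layer
noticed the problem.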

I just had a *very* quick and dirty perusal of expat 1.2 and 1.95.1, and not 
only do the UTF-8 validity checks (at the top of xmltok.c) seem wrong, but it 
doesn't look as if they're ever invoked.

I'll try to find some time to look into this more closely, or perhaps someone will 
straighten me out if I'm on the wrong trail.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From fredrik@effbot.org  Tue Jan 23 18:03:42 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 19:03:42 +0100
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <013901c08566$d2a8f360$e46940d5@hagrid>

It's probably just me, but the names of the two unicode
modules tend to irritate me:

> ls u*.pyd
ucnhash.pyd      unicodedata.pyd

(the former contains names, the latter data)

I've been meaning to rename the former, but I just realized
that it might be better to get rid of it completely, and move
its functionality into the unicodedata module.

The result is a single 200k unicodedata module, which contains
the name database as well as two new functions:

    name(character [, default]) => map unicode
    character to name.  if the name doesn't exist,
    return the default object, or raise ValueError.

    lookup(name) => unicode character
    (or raise KeyError if it doesn't exist)
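
A short usage sketch of the two functions with the semantics described
above.  The wrappers describe and char_or_none are hypothetical helpers
added for illustration, and the code assumes a current Python 3
interpreter, whose unicodedata module provides name() and lookup().

```python
import unicodedata


def describe(character):
    """Return the Unicode name of a character, or None if it has none."""
    return unicodedata.name(character, None)


def char_or_none(name):
    """Return the character with the given name, or None if unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None


print(describe("A"))
print(describe("\x00"))       # control characters have no name
print(char_or_none("LATIN SMALL LETTER A"))
print(char_or_none("NO SUCH CHARACTER NAME"))
```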

Should I check it in now, change the names/semantics and check
it in, or post it to sourceforge?

Cheers /F




From uche.ogbuji@fourthought.com  Tue Jan 23 18:00:19 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 11:00:19 -0700
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com>
 of "Mon, 22 Jan 2001 12:41:59 EST." <20010122124159.A14999@thyrsus.com>
Message-ID: <200101231800.LAA03515@localhost.localdomain>

> \section{\module{set} ---
>          Basic set algebra for Python}

Looks good.  Are you making this available for download?  I could put this to 
experimental use right away (experimental since, IIRC, you are using the new 
rich comparisons).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From uche.ogbuji@fourthought.com  Tue Jan 23 18:16:27 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 11:16:27 -0700
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com>
 of "Mon, 22 Jan 2001 15:13:09 EST." <20010122151309.C15236@thyrsus.com>
Message-ID: <200101231816.LAA03551@localhost.localdomain>

> Guido van Rossum <guido@digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

Tim mentioned that he had one, and he also claimed that every other dodder had 
a set class, but the only one listed in the vaults is kjBuckets, which I'm not 
sure is maintained any more.  (Is Aaron Watters hereabouts?)

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

IMO, Eric's Set interface is close to perfect.

PEP 218 is interesting, but I'm not sure it's worth slogging through the 
inevitable uproar over an entirely new syntactic construct (the "{}" notation) 
before getting something as useful as a set class into the standard library.


> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:

For what it's worth, I'm +1 on adding this to the standard library.  I've seen 
so many set hacks with dictionaries (memory ouch) and list hacks (speed ouch) 
in Python code out there, that I'm convinced it would meet much more common 
usage than, say zlib, xdr, or even expat.

On this hacker list everyone's aunt might whip up set extensions on boring 
weekends, but I doubt this describes the overall Python populace.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From uche.ogbuji@fourthought.com  Tue Jan 23 18:29:36 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 11:29:36 -0700
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: Message from "M.-A. Lemburg" <mal@lemburg.com>
 of "Tue, 23 Jan 2001 11:26:16 +0100." <3A6D5C48.A076DA0@lemburg.com>
Message-ID: <200101231829.LAA03575@localhost.localdomain>

> All very well, but are sets really that essential to every
> day Python programming ?

Not everyday, but as I said, the standard library has zlib, expat, tkinter, 
colorsys, and a whole lot of other stuff that is undoubtedly less useful than 
a set class.

> If we include sets then we ought to
> also include graphs, tries, btrees

I see all of these as far less commonly useful than sets (at least in 
situations where implementations using existing data structures won't suffice).

I run into needs for sets all the time.  I don't have as much trouble with 
your other examples, though I've always considered tries as a possible 
performance boost in XPath.  Oddly enough another data structure I often wish 
I had is a splay tree, and I hope to wrap my old C++ splay tree implementation 
for Python one of these days.

> and all those other goodies
> we have in computer science. All of these types are available
> out there, but I believe the audience who really cares for these
> types is also capable of downloading the extensions and installing
> them.
> 
> It would be nice if all of these extension could go into a SUMO
> edition of Python though... together with your set module.

Considering "batteries included", it's worth considering these very important 
"batteries".


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From skip@mojam.com (Skip Montanaro)  Tue Jan 23 18:35:04 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:35:04 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14957.52952.48739.53360@beluga.mojam.com>

    Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
    Guido>   locals now don't have to start with underscore.

Thanks.  I have just been incredibly short on time lately.

    Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
    Guido>   are more like this?)

Alpha testing should pick those up, yes? ;-)

    Guido> ! try:
    Guido> !     import bsddb
    Guido> ! except ImportError:
    Guido> !     if verbose:
    Guido> !         print "can't import bsddb, so skipping dbhash"
    Guido> ! else:
    Guido> !     check_all("dbhash")

Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
the module that's imported here?
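
Something along these lines is what I have in mind, as a sketch only.  The
helper check_all_if_importable is a hypothetical name, the check_all
argument stands in for the test's own function, and the code is written for
a current Python 3 interpreter (where dbhash no longer exists, so a generic
module name is used).

```python
import importlib


def check_all_if_importable(modname, check_all, verbose=False):
    """Run check_all(modname) only if the module itself can be imported.

    Returns True if the check ran and False if the module was skipped.
    """
    try:
        importlib.import_module(modname)
    except ImportError:
        if verbose:
            print("can't import %s, so skipping it" % modname)
        return False
    check_all(modname)
    return True


checked = []
check_all_if_importable("json", checked.append)
check_all_if_importable("no_such_module_xyz", checked.append, verbose=True)
print(checked)
```

That way the test doesn't need to know which modules a given module
depends on.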

Skip


From uche.ogbuji@fourthought.com  Tue Jan 23 18:36:59 2001
From: uche.ogbuji@fourthought.com (uche.ogbuji@fourthought.com)
Date: Tue, 23 Jan 2001 11:36:59 -0700
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com>
 of "Tue, 23 Jan 2001 11:30:50 EST." <20010123113050.A26162@thyrsus.com>
Message-ID: <200101231836.LAA03655@localhost.localdomain>

> """
> A set-algebra module for Python.
> 
> The functions work on any sequence type and return lists.
> The set methods can take a set or any sequence type as an argument.
> They are insensitive to the types of the elements.
> 
> Lists are used rather than dictionaries so the elements can be mutable.
> 
> """

Hmm.  I was hoping this was actually a C extension for the performance boost, 
esp. given the number of __foo__ methods in the set class.

Implementation in Python makes my interest in adding it to the standard lib 
more tepid (not to cast the least bit of aspersion on your work).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji@fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python




From skip@mojam.com (Skip Montanaro)  Tue Jan 23 18:37:44 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:37:44 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
 <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
 <3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <14957.53112.119272.797494@beluga.mojam.com>

    Paul> I apologize but I'm not clear on my responsibilities here, if
    Paul> any. I wrote a PEP for online help. I submitted a partial
    Paul> implementation. 

Perhaps I am the one who should apologize.  I started the thread.  I tried
Ping's code and was simply amazed at how useful it was.  I didn't bother
checking the list of PEPs to see if it overlapped with something there, and
I suspect any discussion of this stuff has taken place in the doc sig, where
I don't hang out.

Skip


From esr@thyrsus.com  Tue Jan 23 18:39:04 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:39:04 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231816.LAA03551@localhost.localdomain>; from uche.ogbuji@fourthought.com on Tue, Jan 23, 2001 at 11:16:27AM -0700
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain>
Message-ID: <20010123133904.B26487@thyrsus.com>

uche.ogbuji@fourthought.com <uche.ogbuji@fourthought.com>:
> I've seen so many set hacks with dictionaries (memory ouch) and list
> hacks (speed ouch) in Python code out there, that I'm convinced it
> would meet much more common usage than, say zlib, xdr, or even
> expat.

Uche brings up a point I meant to make in my reply to Guido.  The dict-
vs.-list choice in set representation is indeed a choice between 
memory ouch and speed ouch.  

I believe most uses of sets are small sets.  That reduces the speed ouch
of using a list representation and increases the proportional memory
ouch of a dictionary implementation.
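
A quick way to look at the memory side, as a rough sketch: compare the size
of the container objects themselves.  This assumes CPython and a current
Python 3 interpreter; sys.getsizeof measures only the container, not the
elements, and the exact byte counts vary between versions.  The function
name container_sizes is made up for the example.

```python
import sys


def container_sizes(n):
    """Return (list_bytes, dict_bytes) for containers holding n small ints."""
    elements = list(range(n))
    as_list = list(elements)
    as_dict = dict.fromkeys(elements)
    return sys.getsizeof(as_list), sys.getsizeof(as_dict)


for n in (4, 16, 64):
    list_bytes, dict_bytes = container_sizes(n)
    print(n, "elements: list", list_bytes, "bytes, dict", dict_bytes, "bytes")
```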
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Question with boldness even the existence of a God; because, if there
be one, he must more approve the homage of reason, than that of
blindfolded fear.... Do not be frightened from this inquiry from any
fear of its consequences. If it ends in the belief that there is no
God, you will find incitements to virtue in the comfort and
pleasantness you feel in its exercise...
	-- Thomas Jefferson, in a 1787 letter to his nephew


From jeremy@alum.mit.edu  Tue Jan 23 18:41:23 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 13:41:23 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
 <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
 <20010122151309.C15236@thyrsus.com>
 <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
 <20010123113050.A26162@thyrsus.com>
Message-ID: <14957.53331.342827.462297@localhost.localdomain>

--OvJPdPv5cJ
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

  ESR> Guido van Rossum <guido@digicool.com>:
  >> Having just skimmed your docs, I'm disappointed that you choose
  >> lists as your fundamental representation type -- this makes it
  >> slow to test for membership and hence makes intersection and
  >> union slow.

  ESR> Not quite.  Membership test is still linear-time; so is adding
  ESR> and deleting elements.  It's true that union and intersection
  ESR> are quadratic, but see below.

  >> I suppose that you have evidence from using this that those
  >> operations aren't used much, or not for large sets?

  ESR> Exactly!  In my experience the usage pattern of a class like
  ESR> this runs heavily to small sets (usually < 64 elements);
  ESR> membership tests dominate usage, with addition and deletion of
  ESR> elements running second and the "classical" boolean operations
  ESR> like union and intersection being uncommon.

I use a Set type in the compiler package (Tools/compiler/compiler) to
collect the names for a code block.  I implemented a trivial Set type
using a dictionary, because it supported the operations I was most
interested in: addition, membership tests, intersection, and get
elements as sequence (in arbitrary order).  Those are the only
operations the compiler uses.

I think I use sets for this purpose frequently, although I can't think
of any other good examples at the moment.  I usually just use a
dictionary explicitly.  In the compiler, I chose an explicit Set class
with unique method names (add, has_elt, elements) to make it obvious
for readers that I was using a set.
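
For readers who haven't seen the idiom, a dictionary-backed set with those
method names looks roughly like the sketch below.  This is not the code from
the compiler package: the class name DictSet and the intersection method's
signature are assumptions, and it is written for a current Python 3
interpreter.

```python
class DictSet:
    """Hypothetical dictionary-backed set; elements must be hashable."""

    def __init__(self, items=()):
        self._items = {}
        for item in items:
            self.add(item)

    def add(self, item):
        # The value is irrelevant; only the keys matter.
        self._items[item] = True

    def has_elt(self, item):
        return item in self._items

    def elements(self):
        # Returns the elements as a sequence, in arbitrary order.
        return list(self._items)

    def intersection(self, other):
        return DictSet(item for item in self._items if other.has_elt(item))

    def __len__(self):
        return len(self._items)


names = DictSet(["visitIf", "visitFor", "visitIf"])
print(len(names), names.has_elt("visitFor"))
```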

  ESR> What you get by going with a dictionary representation is that
  ESR> membership test becomes close to constant-time, while insertion
  ESR> and deletion become sometimes cheap and sometimes quite
  ESR> expensive (depending of course on whether you have to allocate
  ESR> a new hash bucket).  Given the usage pattern I described, the
  ESR> overall difference in performance is marginal.

The cost of insertion would presumably be dominated by the frequency
of dictionary resizes.  I don't know how often they occur, but I
assume the dictionary type is designed to accommodate efficient
insert.

I did a quick and dirty performance comparison of dictionary-based and
list-based sets.  (I'll include the code below.)  It uses sample data
collected from running the compiler; so it is measuring actual usage.

The tests showed that dictionary-based sets were always faster.  For
small tests (3 operations), the difference was about 10 percent.  For
larger tests (88 operations), the difference ranged from 180 to almost
700 percent.

  >> This is one of the problems with coming up with a set type for
  >> the core: it has to work for (nearly) everybody.

  ESR> As I pointed out above (and someone else on the list had made
  ESR> the same point earlier), "works for everybody" isn't really
  ESR> possible here.  So my solution does the next best thing -- pick
  ESR> a choice of tradeoffs that isn't obviously worse than the
  ESR> alternatives and keeps things bog-simple.

For my applications, the dictionary-based approach is faster and
offers a natural interface.  If a set implementation were included in
the standard library, I would like to see either (1) the
implementation that favors my needs <wink> or (2) multiple
implementations tuned for different uses.  I think it would be just as
easy to make set implementations available separately, though.

Jeremy


--OvJPdPv5cJ
Content-Type: text/plain
Content-Disposition: inline;
	filename="sets.tar"
Content-Transfer-Encoding: base64

c2V0cy8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAwNDA3NzUA
MDAwMTU1NgAwMDAwNzY1ADAwMDAwMDAwMDAwADA3MjMzMzUwMDA1ADAxMTIxNQAgNQAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB1c3RhciAgAGplcmVt
eQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRtaW4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABzZXRzL3Rlc3RzZXQxOC5weQAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAMDEwMDY2NAAwMDAxNTU2ADAwMDA3NjUAMDAwMDAwMDQ2MTQA
MDcyMzMzNDcyNDMAMDEzNDQ3ACAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAHVzdGFyICAAamVyZW15AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABh
ZG1pbgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHNp
emUgPSA4OA0KDQpkZWYgdGVzdChmYWN0b3J5KToNCiAgICBzZXQgPSBmYWN0b3J5KCkNCiAg
ICBzZXQuYWRkKCdvcHRpbWl6ZWQnKQ0KICAgIHNldC5hZGQoJ19faW5pdF9fJykNCiAgICBz
ZXQuYWRkKCdfc2V0dXBHcmFwaERlbGVnYXRpb24nKQ0KICAgIHNldC5hZGQoJ2dldENvZGUn
KQ0KICAgIHNldC5hZGQoJ2lzTG9jYWxOYW1lJykNCiAgICBzZXQuYWRkKCdzdG9yZU5hbWUn
KQ0KICAgIHNldC5hZGQoJ2xvYWROYW1lJykNCiAgICBzZXQuYWRkKCdkZWxOYW1lJykNCiAg
ICBzZXQuYWRkKCdfbmFtZU9wJykNCiAgICBzZXQuYWRkKCdzZXRfbGluZW5vJykNCiAgICBz
ZXQuYWRkKCd2aXNpdE1vZHVsZScpDQogICAgc2V0LmFkZCgndmlzaXRGdW5jdGlvbicpDQog
ICAgc2V0LmFkZCgndmlzaXRMYW1iZGEnKQ0KICAgIHNldC5hZGQoJ192aXNpdEZ1bmNPckxh
bWJkYScpDQogICAgc2V0LmFkZCgndmlzaXRDbGFzcycpDQogICAgc2V0LmFkZCgndmlzaXRJ
ZicpDQogICAgc2V0LmFkZCgndmlzaXRXaGlsZScpDQogICAgc2V0LmFkZCgndmlzaXRGb3In
KQ0KICAgIHNldC5hZGQoJ3Zpc2l0QnJlYWsnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0Q29udGlu
dWUnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0VGVzdCcpDQogICAgc2V0LmFkZCgndmlzaXRBbmQn
KQ0KICAgIHNldC5hZGQoJ3Zpc2l0T3InKQ0KICAgIHNldC5hZGQoJ3Zpc2l0Q29tcGFyZScp
DQogICAgc2V0LmFkZCgnX19saXN0X2NvdW50JykNCiAgICBzZXQuYWRkKCd2aXNpdExpc3RD
b21wJykNCiAgICBzZXQuYWRkKCd2aXNpdExpc3RDb21wRm9yJykNCiAgICBzZXQuYWRkKCd2
aXNpdExpc3RDb21wSWYnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0QXNzZXJ0JykNCiAgICBzZXQu
YWRkKCd2aXNpdFJhaXNlJykNCiAgICBzZXQuYWRkKCd2aXNpdFRyeUV4Y2VwdCcpDQogICAg
c2V0LmFkZCgndmlzaXRUcnlGaW5hbGx5JykNCiAgICBzZXQuYWRkKCd2aXNpdERpc2NhcmQn
KQ0KICAgIHNldC5hZGQoJ3Zpc2l0Q29uc3QnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0S2V5d29y
ZCcpDQogICAgc2V0LmFkZCgndmlzaXRHbG9iYWwnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0TmFt
ZScpDQogICAgc2V0LmFkZCgndmlzaXRQYXNzJykNCiAgICBzZXQuYWRkKCd2aXNpdEltcG9y
dCcpDQogICAgc2V0LmFkZCgndmlzaXRGcm9tJykNCiAgICBzZXQuYWRkKCdfcmVzb2x2ZURv
dHMnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0R2V0YXR0cicpDQogICAgc2V0LmFkZCgndmlzaXRB
c3NpZ24nKQ0KICAgIHNldC5hZGQoJ3Zpc2l0QXNzTmFtZScpDQogICAgc2V0LmFkZCgndmlz
aXRBc3NBdHRyJykNCiAgICBzZXQuYWRkKCdfdmlzaXRBc3NTZXF1ZW5jZScpDQogICAgc2V0
LmFkZCgndmlzaXRBc3NUdXBsZScpDQogICAgc2V0LmFkZCgndmlzaXRBc3NMaXN0JykNCiAg
ICBzZXQuYWRkKCd2aXNpdEFzc1R1cGxlJykNCiAgICBzZXQuYWRkKCd2aXNpdEFzc0xpc3Qn
KQ0KICAgIHNldC5hZGQoJ3Zpc2l0QXVnQXNzaWduJykNCiAgICBzZXQuYWRkKCdfYXVnbWVu
dGVkX29wY29kZScpDQogICAgc2V0LmFkZCgndmlzaXRBdWdOYW1lJykNCiAgICBzZXQuYWRk
KCd2aXNpdEF1Z0dldGF0dHInKQ0KICAgIHNldC5hZGQoJ3Zpc2l0QXVnU2xpY2UnKQ0KICAg
IHNldC5hZGQoJ3Zpc2l0QXVnU3Vic2NyaXB0JykNCiAgICBzZXQuYWRkKCd2aXNpdEV4ZWMn
KQ0KICAgIHNldC5hZGQoJ3Zpc2l0Q2FsbEZ1bmMnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0UHJp
bnQnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0UHJpbnRubCcpDQogICAgc2V0LmFkZCgndmlzaXRS
ZXR1cm4nKQ0KICAgIHNldC5hZGQoJ3Zpc2l0U2xpY2UnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0
U3Vic2NyaXB0JykNCiAgICBzZXQuYWRkKCdiaW5hcnlPcCcpDQogICAgc2V0LmFkZCgndmlz
aXRBZGQnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0U3ViJykNCiAgICBzZXQuYWRkKCd2aXNpdE11
bCcpDQogICAgc2V0LmFkZCgndmlzaXREaXYnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0TW9kJykN
CiAgICBzZXQuYWRkKCd2aXNpdFBvd2VyJykNCiAgICBzZXQuYWRkKCd2aXNpdExlZnRTaGlm
dCcpDQogICAgc2V0LmFkZCgndmlzaXRSaWdodFNoaWZ0JykNCiAgICBzZXQuYWRkKCd1bmFy
eU9wJykNCiAgICBzZXQuYWRkKCd2aXNpdEludmVydCcpDQogICAgc2V0LmFkZCgndmlzaXRV
bmFyeVN1YicpDQogICAgc2V0LmFkZCgndmlzaXRVbmFyeUFkZCcpDQogICAgc2V0LmFkZCgn
dmlzaXRVbmFyeUludmVydCcpDQogICAgc2V0LmFkZCgndmlzaXROb3QnKQ0KICAgIHNldC5h
ZGQoJ3Zpc2l0QmFja3F1b3RlJykNCiAgICBzZXQuYWRkKCdiaXRPcCcpDQogICAgc2V0LmFk
ZCgndmlzaXRCaXRhbmQnKQ0KICAgIHNldC5hZGQoJ3Zpc2l0Qml0b3InKQ0KICAgIHNldC5h
ZGQoJ3Zpc2l0Qml0eG9yJykNCiAgICBzZXQuYWRkKCd2aXNpdEVsbGlwc2lzJykNCiAgICBz
ZXQuYWRkKCd2aXNpdFR1cGxlJykNCiAgICBzZXQuYWRkKCd2aXNpdExpc3QnKQ0KICAgIHNl
dC5hZGQoJ3Zpc2l0U2xpY2VvYmonKQ0KICAgIHNldC5hZGQoJ3Zpc2l0RGljdCcpDQoAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAHNldHMvdGVzdHNldDg4LnB5AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwMTAw
NjY0ADAwMDE1NTYAMDAwMDc2NQAwMDAwMDAwMDUzNAAwNzIzMzM0NzI0MwAwMTM0NTMAIDAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdXN0YXIgIABq
ZXJlbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGFkbWluAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAc2l6ZSA9IDEzDQoNCmRlZiB0ZXN0KGZh
Y3RvcnkpOg0KICAgIHNldCA9IGZhY3RvcnkoKQ0KICAgIHNldC5hZGQoJ3NlbGYnKQ0KICAg
IHNldC5hZGQoJ2V4cHInKQ0KICAgIHNldC5hZGQoJ2ZsYWdzJykNCiAgICBzZXQuYWRkKCds
b3dlcicpDQogICAgc2V0LmFkZCgndXBwZXInKQ0KICAgIHNldC5oYXNfZWx0KCdleHByJykN
CiAgICBzZXQuaGFzX2VsdCgnc2VsZicpDQogICAgc2V0Lmhhc19lbHQoJ2ZsYWdzJykNCiAg
ICBzZXQuaGFzX2VsdCgnc2VsZicpDQogICAgc2V0Lmhhc19lbHQoJ2xvd2VyJykNCiAgICBz
ZXQuaGFzX2VsdCgnc2VsZicpDQogICAgc2V0Lmhhc19lbHQoJ3VwcGVyJykNCiAgICBzZXQu
aGFzX2VsdCgnc2VsZicpDQoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAHNldHMvdGVzdHNldDk4LnB5AAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAwMTAwNjY0ADAwMDE1NTYAMDAwMDc2NQAwMDAwMDAwMDE3NQAwNzIzMzM0
NzI0MwAwMTM0NTUAIDAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAdXN0YXIgIABqZXJlbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGFkbWluAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAc2l6ZSA9IDMN
Cg0KZGVmIHRlc3QoZmFjdG9yeSk6DQogICAgc2V0ID0gZmFjdG9yeSgpDQogICAgc2V0LmFk
ZCgnX19pbml0X18nKQ0KICAgIHNldC5hZGQoJ19nZXRDaGlsZHJlbicpDQogICAgc2V0LmFk
ZCgnX19yZXByX18nKQ0KAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAc2V0cy90aW1lc2V0LnB5AAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAADAxMDA2NjQAMDAwMTU1NgAwMDAwNzY1ADAwMDAwMDAxNDczADA3
MjMzMzQ3NjE0ADAxMzI1NwAgMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAB1c3RhciAgAGplcmVteQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRt
aW4AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABpbXBv
cnQgZXNyc2V0DQppbXBvcnQgamFoc2V0DQppbXBvcnQgb3MNCmltcG9ydCB0aW1lDQoNCmRl
ZiB0aW1laXQoZiwgaXRlcnM9cmFuZ2UoMzAwMCkpOg0KICAgIHQwID0gdGltZS5jbG9jaygp
DQogICAgZm9yIGkgaW4gaXRlcnM6DQogICAgICAgIGYoKQ0KICAgIHQxID0gdGltZS5jbG9j
aygpDQogICAgcmV0dXJuIHQxIC0gdDANCg0KY2xhc3MgZXNyd3JhcChlc3JzZXQuc2V0KToN
CiAgICBkZWYgX19pbml0X18oc2VsZik6DQogICAgICAgIHNlbGYuZWxlbWVudHMgPSBbXQ0K
DQogICAgYWRkID0gZXNyc2V0LnNldC5hcHBlbmQNCg0KICAgIGRlZiBoYXNfZWx0KHNlbGYs
IGVsdCk6DQogICAgICAgIHJldHVybiBlbHQgaW4gc2VsZi5lbGVtZW50cw0KDQogICAgZGVm
IHJlbW92ZShzZWxmLCBlbHQpOg0KICAgICAgICBpID0gc2VsZi5pbmRleChlbHQpDQogICAg
ICAgIGRlbCBzZWxmLmVsZW1lbnRzW2ldDQoNCmRlZiBsaXN0X3Rlc3QoKToNCiAgICBtb2R1
bGUudGVzdChlc3J3cmFwKQ0KDQpkZWYgZGljdF90ZXN0KCk6DQogICAgbW9kdWxlLnRlc3Qo
amFoc2V0LlNldCkNCg0KZm9yIGZpbGUgaW4gb3MubGlzdGRpcigiLiIpOg0KICAgIGlmIG5v
dCBmaWxlLnN0YXJ0c3dpdGgoJ3Rlc3RzZXQnKToNCiAgICAgICAgY29udGludWUNCiAgICBu
YW1lLCBleHQgPSBvcy5wYXRoLnNwbGl0ZXh0KGZpbGUpDQogICAgaWYgZXh0ICE9ICcucHkn
Og0KICAgICAgICBjb250aW51ZQ0KICAgIG1vZHVsZSA9IF9faW1wb3J0X18obmFtZSkNCg0K
ICAgIHByaW50IG5hbWUsIG1vZHVsZS5zaXplDQogICAgcHJpbnQgImRpY3QiLCB0aW1laXQo
ZGljdF90ZXN0KSwgImxpc3QiLCB0aW1laXQobGlzdF90ZXN0KQ0KICAgIHByaW50DQogICAg
DQoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHNldHMvZXNyc2V0LnB5AAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwMTAwNjY0ADAwMDE1NTYAMDAwMDc2NQAw
MDAwMDAxMzA0MgAwNzIzMzM0NzI1MwAwMTMxMDQAIDAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdXN0YXIgIABqZXJlbXkAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAGFkbWluAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAIyBEZXNpZ24gYW5kIGltcGxlbWVudGF0aW9uIGJ5IEVTUiwgSmFudWFyeSAy
MDAxLg0KDQpkZWYgc2V0aWZ5KGxpc3QxKToJCSMgVXNlZCBieSBzZXQgY29uc3RydWN0b3IN
CiAgICAiUmVtb3ZlIGR1cGxpY2F0ZXMgaW4gc2VxdWVuY2UuIg0KICAgIHJlcyA9IFtdDQog
ICAgZm9yIGkgaW4gcmFuZ2UobGVuKGxpc3QxKSk6DQoJZHVwbGljYXRlID0gMA0KICAgICAg
ICBmb3IgaiBpbiByYW5nZShpKToNCgkgICAgaWYgbGlzdDFbaV0gPT0gbGlzdDFbal06DQoJ
CWR1cGxpY2F0ZSA9IDENCgkJYnJlYWsNCglpZiBub3QgZHVwbGljYXRlOg0KCSAgICByZXMu
YXBwZW5kKGxpc3QxW2ldKQ0KICAgIHJldHVybiByZXMNCg0KZGVmIHVuaW9uKGxpc3QxLCBs
aXN0Mik6CQkjIFVzZWQgZm9yIHwNCiAgICAiQ29tcHV0ZSBzZXQgaW50ZXJzZWN0aW9uIG9m
IHNlcXVlbmNlcy4iDQogICAgcmVzID0gbGlzdDFbOl0NCiAgICBmb3IgeCBpbiBsaXN0MjoN
CglpZiBub3QgeCBpbiBsaXN0MToNCgkgICAgcmVzLmFwcGVuZCh4KQ0KICAgIHJldHVybiBy
ZXMNCg0KZGVmIGludGVyc2VjdGlvbihsaXN0MSwgbGlzdDIpOgkJIyBVc2VkIGZvciAmDQog
ICAgIkNvbXB1dGUgc2V0IGludGVyc2VjdGlvbiBvZiBzZXF1ZW5jZXMuIg0KICAgIHJlcyA9
IFtdDQogICAgZm9yIHggaW4gbGlzdDE6DQoJaWYgeCBpbiBsaXN0MjoNCgkgICAgcmVzLmFw
cGVuZCh4KQ0KICAgIHJldHVybiByZXMNCg0KZGVmIGRpZmZlcmVuY2UobGlzdDEsIGxpc3Qy
KToJCSMgVXNlZCBmb3IgLQ0KICAgICJDb21wdXRlIHNldCBkaWZmZXJlbmNlIG9mIHNlcXVl
bmNlcy4iDQogICAgcmVzID0gW10NCiAgICBmb3IgeCBpbiBsaXN0MToNCglpZiBub3QgeCBp
biBsaXN0MjoNCgkgICAgcmVzLmFwcGVuZCh4KQ0KICAgIHJldHVybiByZXMNCg0KZGVmIHN5
bW1ldHJpY19kaWZmZXJlbmNlKGxpc3QxLCBsaXN0Mik6CSMgVXNlZCBmb3IgXg0KICAgICJD
b21wdXRlIHNldCBzeW1tZXRyaWMtZGlmZmVyZW5jZSBvZiBzZXF1ZW5jZXMuIg0KICAgIHJl
cyA9IFtdDQogICAgZm9yIHggaW4gbGlzdDE6DQoJaWYgbm90IHggaW4gbGlzdDI6DQoJICAg
IHJlcy5hcHBlbmQoeCkNCiAgICBmb3IgeCBpbiBsaXN0MjoNCglpZiBub3QgeCBpbiBsaXN0
MToNCgkgICAgcmVzLmFwcGVuZCh4KQ0KICAgIHJldHVybiByZXMNCg0KZGVmIGNhcnRlc2lh
bihsaXN0MSwgbGlzdDIpOgkJIyBVc2VkIGZvciAqDQogICAgIkNhcnRlc2lhbiBwcm9kdWN0
IG9mIHNlcXVlbmNlcyBjb25zaWRlcmVkIGFzIHNldHMuIg0KICAgIHJlcyA9IFtdDQogICAg
Zm9yIHggaW4gbGlzdDE6DQoJZm9yIHkgaW4gbGlzdDI6DQoJICAgIHJlcy5hcHBlbmQoKHgs
eSkpDQogICAgcmV0dXJuIHJlcw0KDQpkZWYgZXF1YWxpdHkobGlzdDEsIGxpc3QyKToNCiAg
ICAiVGVzdCBzZXF1ZW5jZXMgY29uc2lkZXJlZCBhcyBzZXRzIGZvciBlcXVhbGl0eS4iDQog
ICAgaWYgbGVuKGxpc3QxKSAhPSBsZW4obGlzdDIpOg0KICAgICAgICByZXR1cm4gMA0KICAg
IGZvciB4IGluIGxpc3QxOg0KICAgICAgICBpZiBub3QgeCBpbiBsaXN0MjoNCiAgICAgICAg
ICAgIHJldHVybiAwDQogICAgZm9yIHggaW4gbGlzdDI6DQogICAgICAgIGlmIG5vdCB4IGlu
IGxpc3QxOg0KICAgICAgICAgICAgcmV0dXJuIDANCiAgICByZXR1cm4gMQ0KDQpkZWYgcHJv
cGVyX3N1YnNldChsaXN0MSwgbGlzdDIpOg0KICAgICJSZXR1cm4gMSBpZiBmaXJzdCBhcmd1
bWVudCBpcyBhIHByb3BlciBzdWJzZXQgb2Ygc2Vjb25kLCAwIG90aGVyd2lzZS4iDQogICAg
aWYgbm90IGxlbihsaXN0MSkgPCBsZW4obGlzdDIpOg0KICAgICAgICByZXR1cm4gMA0KICAg
IGZvciB4IGluIGxpc3QxOg0KICAgICAgICBpZiBub3QgeCBpbiBsaXN0MjoNCiAgICAgICAg
ICAgIHJldHVybiAwDQogICAgcmV0dXJuIDENCg0KZGVmIHN1YnNldChsaXN0MSwgbGlzdDIp
Og0KICAgICJSZXR1cm4gMSBpZiBmaXJzdCBhcmd1bWVudCBpcyBhIHN1YnNldCBvZiBzZWNv
bmQsIDAgb3RoZXJ3aXNlLiINCiAgICBpZiBub3QgbGVuKGxpc3QxKSA8PSBsZW4obGlzdDIp
Og0KICAgICAgICByZXR1cm4gMA0KICAgIGZvciB4IGluIGxpc3QxOg0KICAgICAgICBpZiBu
b3QgeCBpbiBsaXN0MjoNCiAgICAgICAgICAgIHJldHVybiAwDQogICAgcmV0dXJuIDENCg0K
ZGVmIHBvd2Vyc2V0KGJhc2UpOg0KICAgICJDb21wdXRlIHRoZSBzZXQgb2YgYWxsIHN1YnNl
dHMgb2YgYSBzZXQuIg0KICAgIHBvd2Vyc2V0ID0gW10NCiAgICBmb3IgbiBpbiB4cmFuZ2Uo
MiAqKiBsZW4oYmFzZSkpOg0KCXN1YnNldCA9IFtdDQoJZm9yIGUgaW4geHJhbmdlKGxlbihi
YXNlKSk6DQoJICAgICBpZiBuICYgMiAqKiBlOg0KCQlzdWJzZXQuYXBwZW5kKGJhc2VbZV0p
DQoJcG93ZXJzZXQuYXBwZW5kKHN1YnNldCkNCiAgICByZXR1cm4gcG93ZXJzZXQNCg0KY2xh
c3Mgc2V0Og0KICAgICJMaXN0cyB3aXRoIHNldC10aGVvcmV0aWMgb3BlcmF0aW9ucy4iDQoN
CiAgICBkZWYgX19pbml0X18oc2VsZiwgdmFsdWUpOg0KICAgICAgICBzZWxmLmVsZW1lbnRz
ID0gc2V0aWZ5KHZhbHVlKQ0KDQogICAgZGVmIF9fbGVuX18oc2VsZik6DQoJcmV0dXJuIGxl
bihzZWxmLmVsZW1lbnRzKQ0KDQogICAgZGVmIF9fZ2V0aXRlbV9fKHNlbGYsIGluZCk6DQoJ
cmV0dXJuIHNlbGYuZWxlbWVudHNbaW5kXQ0KDQogICAgZGVmIF9fc2V0aXRlbV9fKHNlbGYs
IGluZCwgdmFsKToNCiAgICAgICAgaWYgdmFsIG5vdCBpbiBzZWxmLmVsZW1lbnRzOg0KICAg
ICAgICAgICAgc2VsZi5lbGVtZW50c1tpbmRdID0gdmFsDQoNCiAgICBkZWYgX19kZWxpdGVt
X18oc2VsZiwgaW5kKToNCglkZWwgc2VsZi5lbGVtZW50c1tpbmRdDQoNCiAgICBkZWYgbGlz
dChzZWxmKToNCiAgICAgICAgcmV0dXJuIHNlbGYuZWxlbWVudHMNCg0KICAgIGRlZiBhcHBl
bmQoc2VsZiwgbmV3KToNCiAgICAgICAgaWYgbmV3IG5vdCBpbiBzZWxmLmVsZW1lbnRzOg0K
ICAgICAgICAgICAgc2VsZi5lbGVtZW50cy5hcHBlbmQobmV3KQ0KDQogICAgZGVmIGV4dGVu
ZChzZWxmLCBuZXcpOg0KCXNlbGYuZWxlbWVudHMuZXh0ZW5kKG5ldykNCiAgICAgICAgc2Vs
Zi5lbGVtZW50cyA9IHNldGlmeShzZWxmLmVsZW1lbnRzKQ0KDQogICAgZGVmIGNvdW50KHNl
bGYsIHgpOg0KCXNlbGYuZWxlbWVudHMuY291bnQoeCkNCg0KICAgIGRlZiBpbmRleChzZWxm
LCB4KToNCglzZWxmLmVsZW1lbnRzLmluZGV4KHgpDQoNCiAgICBkZWYgaW5zZXJ0KHNlbGYs
IGksIHgpOg0KICAgICAgICBpZiB4IG5vdCBpbiBzZWxmLmVsZW1lbnRzOg0KICAgICAgICAg
ICAgc2VsZi5lbGVtZW50cy5pbmRleChpLCB4KQ0KDQogICAgZGVmIHBvcChzZWxmLCBpPU5v
bmUpOg0KCXNlbGYuZWxlbWVudHMucG9wKGkpDQoNCiAgICBkZWYgcmVtb3ZlKHNlbGYsIHgp
Og0KCXNlbGYuZWxlbWVudHMucmVtb3ZlKHgpDQoNCiAgICBkZWYgcmV2ZXJzZShzZWxmKToN
CglzZWxmLmVsZW1lbnRzLnJldmVyc2UoKQ0KDQogICAgZGVmIHNvcnQoc2VsZiwgY21wPU5v
bmUpOg0KCXNlbGYuZWxlbWVudHMuc29ydChjbXApDQoNCiAgICBkZWYgX19vcl9fKHNlbGYs
IG90aGVyKToNCglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNlbGYpOg0KCSAgICBvdGhlciA9
IG90aGVyLmVsZW1lbnRzDQogICAgICAgIHJldHVybiBzZXQodW5pb24oc2VsZi5lbGVtZW50
cywgb3RoZXIpKQ0KDQogICAgX19hZGRfXyA9IF9fb3JfXw0KDQogICAgZGVmIF9fYW5kX18o
c2VsZiwgb3RoZXIpOg0KCWlmIHR5cGUob3RoZXIpID09IHR5cGUoc2VsZik6DQoJICAgIG90
aGVyID0gb3RoZXIuZWxlbWVudHMNCiAgICAgICAgcmV0dXJuIHNldChpbnRlcnNlY3Rpb24o
c2VsZi5lbGVtZW50cywgb3RoZXIpKQ0KDQogICAgZGVmIF9fc3ViX18oc2VsZiwgb3RoZXIp
Og0KCWlmIHR5cGUob3RoZXIpID09IHR5cGUoc2VsZik6DQoJICAgIG90aGVyID0gb3RoZXIu
ZWxlbWVudHMNCiAgICAgICAgcmV0dXJuIHNldChkaWZmZXJlbmNlKHNlbGYuZWxlbWVudHMs
IG90aGVyKSkNCg0KICAgIGRlZiBfX3hvcl9fKHNlbGYsIG90aGVyKToNCglpZiB0eXBlKG90
aGVyKSA9PSB0eXBlKHNlbGYpOg0KCSAgICBvdGhlciA9IG90aGVyLmVsZW1lbnRzDQogICAg
ICAgIHJldHVybiBzZXQoc3ltbWV0cmljX2RpZmZlcmVuY2Uoc2VsZi5lbGVtZW50cywgb3Ro
ZXIpKQ0KDQogICAgZGVmIF9fbXVsX18oc2VsZiwgb3RoZXIpOg0KCWlmIHR5cGUob3RoZXIp
ID09IHR5cGUoc2VsZik6DQoJICAgIG90aGVyID0gb3RoZXIuZWxlbWVudHMNCiAgICAgICAg
cmV0dXJuIHNldChjYXJ0ZXNpYW4oc2VsZi5lbGVtZW50cywgb3RoZXIpKQ0KDQogICAgZGVm
IF9fZXFfXyhzZWxmLCBvdGhlcik6DQoJaWYgdHlwZShvdGhlcikgPT0gdHlwZShzZWxmKToN
CgkgICAgb3RoZXIgPSBvdGhlci5lbGVtZW50cw0KICAgICAgICByZXR1cm4gc2VsZi5lbGVt
ZW50cyA9PSBvdGhlcg0KDQogICAgZGVmIF9fbmVfXyhzZWxmLCBvdGhlcik6DQoJaWYgdHlw
ZShvdGhlcikgPT0gdHlwZShzZWxmKToNCgkgICAgb3RoZXIgPSBvdGhlci5lbGVtZW50cw0K
ICAgICAgICByZXR1cm4gc2VsZi5lbGVtZW50cyAhPSBvdGhlcg0KDQogICAgZGVmIF9fbHRf
XyhzZWxmLCBvdGhlcik6DQoJaWYgdHlwZShvdGhlcikgPT0gdHlwZShzZWxmKToNCgkgICAg
b3RoZXIgPSBvdGhlci5lbGVtZW50cw0KICAgICAgICByZXR1cm4gcHJvcGVyX3N1YnNldChz
ZWxmLmVsZW1lbnRzLCBvdGhlcikNCg0KICAgIGRlZiBfX2xlX18oc2VsZiwgb3RoZXIpOg0K
CWlmIHR5cGUob3RoZXIpID09IHR5cGUoc2VsZik6DQoJICAgIG90aGVyID0gb3RoZXIuZWxl
bWVudHMNCiAgICAgICAgcmV0dXJuIHN1YnNldChzZWxmLmVsZW1lbnRzLCBvdGhlcikNCg0K
ICAgIGRlZiBfX2d0X18oc2VsZiwgb3RoZXIpOg0KCWlmIHR5cGUob3RoZXIpID09IHR5cGUo
c2VsZik6DQoJICAgIG90aGVyID0gb3RoZXIuZWxlbWVudHMNCiAgICAgICAgcmV0dXJuIHBy
b3Blcl9zdWJzZXQob3RoZXIsIHNlbGYuZWxlbWVudHMpDQoNCiAgICBkZWYgX19nZV9fKHNl
bGYsIG90aGVyKToNCglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNlbGYpOg0KCSAgICBvdGhl
ciA9IG90aGVyLmVsZW1lbnRzDQogICAgICAgIHJldHVybiBzdWJzZXQob3RoZXIsIHNlbGYu
ZWxlbWVudHMpDQoNCiAgICBkZWYgX19zdHJfXyhzZWxmKToNCiAgICAgICAgcmVzID0gInsi
DQogICAgICAgIGZvciB4IGluIHNlbGYuZWxlbWVudHM6DQogICAgICAgICAgICByZXMgPSBy
ZXMgKyBzdHIoeCkgKyAiLCAiDQogICAgICAgIHJlcyA9IHJlc1swOi0yXSArICJ9Ig0KICAg
ICAgICByZXR1cm4gcmVzDQoNCiAgICBkZWYgX19yZXByX18oc2VsZik6DQogICAgICAgIHJl
dHVybiByZXByKHNlbGYuZWxlbWVudHMpDQoNCmlmIF9fbmFtZV9fID09ICdfX21haW5fXyc6
DQogICAgYSA9IHNldChbMSwgMiwgMywgNF0pDQogICAgYiA9IHNldChbMSwgNF0pDQogICAg
YyA9IHNldChbNSwgNl0pDQogICAgZCA9IFsxLCAxLCAyLCAxXQ0KICAgIHByaW50IGBkYCwg
InNldGlmaWVzIHRvIiwgc2V0KGQpDQogICAgcHJpbnQgYGFgLCAifCIsIGBiYCwgImlzIiwg
YGEgfCBiYA0KICAgIHByaW50IGBhYCwgIl4iLCBgYmAsICJpcyIsIGBhIF4gYmANCiAgICBw
cmludCBgYWAsICImIiwgYGJgLCAiaXMiLCBgYSAmIGJgDQogICAgcHJpbnQgYGJgLCAiKiIs
IGBjYCwgImlzIiwgYGIgKiBjYA0KICAgIHByaW50IGBhYCwgJzwnLCBgYmAsICJpcyIsIGBh
IDwgYmANCiAgICBwcmludCBgYWAsICc+JywgYGJgLCAiaXMiLCBgYSA+IGJgDQogICAgcHJp
bnQgYGJgLCAnPCcsIGBjYCwgImlzIiwgYGIgPCBjYA0KICAgIHByaW50IGBiYCwgJz4nLCBg
Y2AsICJpcyIsIGBiID4gY2ANCiAgICBwcmludCAiUG93ZXIgc2V0IG9mIiwgYGNgLCAiaXMi
LCBwb3dlcnNldChjKQ0KDQojIGVuZA0KAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAHNldHMvamFoc2V0LnB5AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAwMTAwNjY0ADAwMDE1NTYAMDAwMDc2NQAwMDAwMDAwMDYwMQAwNzIzMzM0Nzcx
NQAwMTMwNTUAIDAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAdXN0YXIgIABqZXJlbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGFkbWluAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAY2xhc3MgU2V0Og0K
ICAgIGRlZiBfX2luaXRfXyhzZWxmKToNCiAgICAgICAgc2VsZi5lbHRzID0ge30NCiMjICAg
ICAgICBzZXRzIGFyZSBmYXN0ZXIgd2hlbiBtZXRob2Qgb3ZlcmhlYWQgaXMgcmVtb3ZlZDoN
CiMjICAgICAgICBzZWxmLmVsZW1lbnRzID0gc2VsZi5lbHRzLmtleXMNCiMjICAgICAgICBz
ZWxmLmhhc19lbHQgPSBzZWxmLmVsdHMuaGFzX2tleQ0KDQogICAgZGVmIGFkZChzZWxmLCBl
bHQpOg0KICAgICAgICBzZWxmLmVsdHNbZWx0XSA9IE5vbmUNCg0KICAgIGRlZiBlbGVtZW50
cyhzZWxmKToNCiAgICAgICAgcmV0dXJuIHNlbGYuZWx0cy5rZXlzKCkNCg0KICAgIGRlZiBo
YXNfZWx0KHNlbGYsIGVsdCk6DQogICAgICAgIHJldHVybiBzZWxmLmVsdHMuaGFzX2tleShl
bHQpDQogICAgDQoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==

--OvJPdPv5cJ--


From loewis@informatik.hu-berlin.de  Tue Jan 23 18:51:37 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 19:51:37 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing
 (Windows))
In-Reply-To: <200101231755.KAA03471@localhost.localdomain>
 (uche.ogbuji@fourthought.com)
References: <200101231755.KAA03471@localhost.localdomain>
Message-ID: <200101231851.TAA19488@pandora.informatik.hu-berlin.de>

> I'll try to find some time to look into this more closely, or perhaps
> someone will straighten me out if I'm on the wrong trail.

Having spent only a little time on it myself, I'd agree with your
conclusions.

Regards,
Martin


From esr@thyrsus.com  Tue Jan 23 18:55:30 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:55:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>; from jeremy@alum.mit.edu on Tue, Jan 23, 2001 at 01:41:23PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com> <20010123113050.A26162@thyrsus.com> <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <20010123135530.A26565@thyrsus.com>

Jeremy Hylton <jeremy@alum.mit.edu>:
> The tests showed that dictionary-based sets were always faster.  For
> small tests (3 operations), the difference was about 10 percent.  For
> larger tests (88 operations), the difference ranged from 180 to almost
> 700 percent.

Not surprising.  88 elements is getting pretty large.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Hoplophobia (n.): The irrational fear of weapons, correctly described by 
Freud as "a sign of emotional and sexual immaturity".  Hoplophobia, like
homophobia, is a displacement symptom; hoplophobes fear their own
"forbidden" feelings and urges to commit violence.  This would be
harmless, except that they project these feelings onto others.  The
sequelae of this neurosis include irrational and dangerous behaviors
such as passing "gun-control" laws and trashing the Constitution.


From petrilli@amber.org  Tue Jan 23 19:06:05 2001
From: petrilli@amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 14:06:05 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123133904.B26487@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 01:39:04PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com>
Message-ID: <20010123140604.E18796@trump.amber.org>

Eric S. Raymond [esr@thyrsus.com] wrote:
> I believe most uses of sets are small sets.  That reduces the speed ouch
> of using a list representation and increases the proportional memory
> ouch of a dictionary implementation.

The problem is that there are a lot of uses for large sets, especially 
when you begin to introduce intersections and unions.  If an
implementation is only useful for a few dozen (or a hundred) items in 
the set, that eliminates a lot of the places where set types are really
useful: optimizing large-scale manipulations.

Zope, for example, manipulates sets with 10,000 items in them on a
regular basis when doing text index manipulation.  The data structures 
are heavily optimized for this kind of behaviour, without a major
sacrifice in space.  I think Jim can perhaps speak to this. 

Unfortunately, for me, a Python implementation of Sets is only
interesting academically.  Any time I've needed to work with them at a
large scale, I've needed them *much* faster than Python could achieve
without a C extension.

Perhaps the difference is in problem domain.  In the "scripting"
problem domain, I would agree that Sets would rarely reach large sizes, 
and so an algorithm which performed in quadratic time might be fine,
because the actual resultant time is small.  However, in more
full-blown applications, this would be counterproductive, and the
user would be forced to implement their own (or use Aaron's excellent
kjBuckets).
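
To make the quadratic-versus-hashed point concrete, here is a rough
sketch, not taken from either posted module.  The names union_list and
union_dict are made up for illustration; the first mirrors the
list-scanning approach, the second uses dictionary keys as the set.

```python
def union_list(a, b):
    """Union of two duplicate-free lists, list-style."""
    res = a[:]
    for x in b:
        # Each 'in' test scans the list, so the loop as a whole costs
        # on the order of len(a) * len(b) comparisons.
        if x not in a:
            res.append(x)
    return res


def union_dict(a, b):
    """Union using dictionary keys as the set (elements must be hashable)."""
    res = {}
    for x in a:
        res[x] = None
    for x in b:
        # Hashed insert: roughly constant time per element on average.
        res[x] = None
    return res


small = union_list([1, 2, 3], [3, 4])
hashed = sorted(union_dict([1, 2, 3], [3, 4]))
```

For a handful of elements the two are hard to tell apart; the gap only
opens up as the inputs grow, which is the case I care about.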

Just my opinion, of course.
Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From ping@lfw.org  Tue Jan 23 19:27:38 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:27:38 -0800 (PST)
Subject: [Python-Dev] Sets: elt in dict, lst.include
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>

On Tue, 23 Jan 2001, Jeremy Hylton wrote:
> For my applications, the dictionary-based approach is faster and
> offers a natural interface.

The only change that needs to be made to support sets of immutable
elements is to provide "in" on dictionaries.  The rest is then all
quite natural:

    dict[key] = 1
    if key in dict: ...
    for key in dict: ...

(Then we can also get rid of the ugly has_key method.)
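
A minimal sketch of how that reads in practice, assuming an
interpreter where dictionaries support "in" and iteration over their
keys as proposed here.  The variable names are illustrative only.

```python
seen = {}
for word in ["spam", "eggs", "spam"]:
    seen[word] = 1              # adding an element twice is harmless

has_spam = "spam" in seen       # membership test, no has_key call
has_ham = "ham" in seen

members = []
for key in seen:                # iterating the dict visits its keys
    members.append(key)
members.sort()
```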

For those that need mutable set elements badly enough to sacrifice
a little speed, we can add two methods to lists:

    lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
    lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

(These are generally useful methods to have anyway.)
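
Since lists have no such methods today, the sketch below spells the
two operations as free functions.  The names include() and exclude()
are hypothetical stand-ins for the proposed methods; the bodies are
just the "same as" comments above.

```python
def include(lst, elt):
    """Append elt unless it is already present (hypothetical helper)."""
    if elt not in lst:
        lst.append(elt)


def exclude(lst, elt):
    """Remove every occurrence of elt (hypothetical helper)."""
    while elt in lst:
        lst.remove(elt)


names = ["a", "b", "a"]
include(names, "c")     # not present, so it is appended
include(names, "b")     # already present, so nothing happens
exclude(names, "a")     # both occurrences of "a" are removed

# Elements only need ==, not a hash, so mutable elements work too.
pairs = [[1, 2]]
include(pairs, [1, 2])
include(pairs, [3, 4])
```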


This proposal has the following advantages:

    1. You still get to choose which implementation best suits your needs.

    2. No new types are introduced; lists and dicts are well understood.

    3. Both features are extremely simple to understand and explain.

    4. Both features are useful in their own right, and could stand as
       independent proposals to improve lists and dicts respectively.
       (For instance, i spotted about 10 places in the std library where
       the 'include' method could be used, and i know i would use it
       myself -- certainly more often than pop or reverse!)

    5. In all cases this is faster than a new Python class.  (For instance,
       Jeremy's implementation even contained a commented-out optimization
       that stored self.elts.has_key as self.has_elt to speed things up a
       bit.  Using straight dicts would see this optimization and raise it
       one, with no effort at all.)

    6. Either feature can be independently approved or rejected without
       affecting the other.
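
For reference, the optimization mentioned in point 5 looks roughly
like the sketch below.  This is a reconstruction under assumptions,
not Jeremy's file verbatim: it binds the dictionary's __contains__
where the original comment bound has_key, so that it also runs on
interpreters without has_key.

```python
class Set:
    def __init__(self):
        self.elts = {}
        # Bind the dictionary's own methods as instance attributes, so
        # a call goes straight to the dict instead of through an extra
        # Python-level method.  __contains__ is an assumed stand-in
        # for has_key.
        self.has_elt = self.elts.__contains__
        self.elements = self.elts.keys

    def add(self, elt):
        self.elts[elt] = None


s = Set()
s.add("x")
found = s.has_elt("x")
missing = s.has_elt("y")
```

Using a plain dict skips even the attribute lookup on the wrapper,
which is the "raise it one" in point 5.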


-- ?!ng



From loewis@informatik.hu-berlin.de  Tue Jan 23 19:33:00 2001
From: loewis@informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 20:33:00 +0100 (MET)
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>

> Should I check it in now, change the names/semantics and check it
> in, or post it to sourceforge?

Is that two or three options? If three, what change in semantics did
you propose?

Anyway, I feel it could go in right now; the only breakage would be to
applications that use ucnhash.ucnhashAPI, right?

Regards,
Martin


From fredrik@effbot.org  Tue Jan 23 19:49:09 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 20:49:09 +0100
Subject: [Python-Dev] Re:  getting rid of ucnhash
References: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>
Message-ID: <01e801c08575$8f71c680$e46940d5@hagrid>

martin wrote:

> > Should I check it in now, change the names/semantics and check it
> > in, or post it to sourceforge?
> 
> Is that two or three options?

three, I think.

> If three, what change in semantics did you propose?

none -- but maybe someone else has a better name for "lookup"?

(the "name" function behaves like the existing property methods
in 2.0's unicodedata)
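
For readers following along, a small sketch of how the two functions are
used, based on the unicodedata module as it exists in current Python 3.
Whether the patch discussed here had exactly these signatures is an
assumption.

```python
import unicodedata

# lookup() maps a character name to the character.
ch = unicodedata.lookup("LATIN SMALL LETTER A")
print(ch)                        # a

# name() goes the other way, like the per-character property functions.
print(unicodedata.name(ch))      # LATIN SMALL LETTER A
print(unicodedata.category(ch))  # Ll

# name() accepts an optional default for characters that have no name,
# instead of raising ValueError.
print(unicodedata.name("\x00", "<unnamed>"))   # <unnamed>
```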

> Anyway, I feel it could go in right now; the only breakage would be to
> applications that use ucnhash.ucnhashAPI, right?

yup -- and those applications are already broken, since the CObject
was renamed in 2.1a1.

(well, any code using 2.1a1's new ucnhash.getcode/getname functions
will of course also break.  but I think we can live with that ;-)

Cheers /F



From ping@lfw.org  Tue Jan 23 19:43:50 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:43:50 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101231135570.1568-100000@skuld.kingmanhall.org>

Christopher Petrilli wrote:
> The problem is that there are a lot of uses for large sets, especially 
> when you begin to introduce intersections and unions.
[...]
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

On Tue, 23 Jan 2001, Ka-Ping Yee wrote:
> This proposal has the following advantages:
[six nice things about 'in dict' and 'lst.include']

I forgot to mention an important seventh advantage:

    7. The list and dictionary data structures are implemented
       in the C core, so we leave open the possibility of a
       wizard going and optimizing the snot out of them later.

Just as there's e.g. a boundary on recursion levels before Python
invokes the cycle detection algorithm during comparison, if we
decide we need more speed for big sets, Python could notice when
a list or dictionary gets very big and invoke more powerful
optimizations.  We don't have to do this now, but the important
thing is that we will always have the option to make Christopher's
dream come true.  (A wizard can do this once, and every Python
script on the planet benefits.)

In general i support Python deciding on the Right Thing to do
under the hood, performance-wise, so that the programmer doesn't
have to think too hard about what data structure to choose.
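
A rough pure-Python sketch of that idea: a container that starts out as a
plain list and switches to a dict once it grows past a threshold.  The
class name AdaptiveSet, the cutoff of 8, and the requirement that elements
be hashable after the switch are all assumptions for illustration; a real
optimization of this kind would live in the C core, not in a Python class.

```python
class AdaptiveSet:
    """Hypothetical container: linear list when small, dict-backed when large."""

    THRESHOLD = 8  # arbitrary cutoff chosen for illustration

    def __init__(self):
        self._items = []     # small representation: linear search
        self._index = None   # large representation: created on demand

    def include(self, elt):
        if self._index is not None:
            self._index[elt] = None
            return
        if elt not in self._items:
            self._items.append(elt)
            if len(self._items) > self.THRESHOLD:
                # Switch representations; assumes elements are hashable.
                self._index = dict.fromkeys(self._items)
                self._items = None

    def __contains__(self, elt):
        if self._index is not None:
            return elt in self._index
        return elt in self._items

    def __len__(self):
        return len(self._index if self._index is not None else self._items)


s = AdaptiveSet()
for n in range(20):
    s.include(n % 10)
print(len(s), 3 in s, 42 in s)   # 10 True False
```

The caller never sees which representation is in use, which is the point:
the choice of data structure is made under the hood.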


-- ?!ng



From nas@arctrix.com  Tue Jan 23 13:08:07 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 05:08:07 -0800
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>; from petrilli@amber.org on Tue, Jan 23, 2001 at 02:06:05PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org>
Message-ID: <20010123050807.A29115@glacier.fnational.com>

On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

I think this argues that if sets are added to the core they
should be implemented as an extension type with the speed of
dictionaries and the memory usage of lists.  Basically, we would
use the implementation of PyDict but drop the values.

  Neil


From jeremy@alum.mit.edu  Tue Jan 23 19:48:18 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:48:18 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
References: <20010122124159.A14999@thyrsus.com>
 <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
 <20010122151309.C15236@thyrsus.com>
 <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
 <20010123113050.A26162@thyrsus.com>
 <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <14957.57346.248852.656387@localhost.localdomain>

--lebymX04xi
Content-Type: text/plain; charset=us-ascii
Content-Description: message body text
Content-Transfer-Encoding: 7bit

Sorry about the garbled attachment on the previous message; I think I
got the content-type wrong.  Here's a second try.

Jeremy


--lebymX04xi
Content-Type: application/octet-stream
Content-Disposition: attachment;
	filename="sets.tar"
Content-Transfer-Encoding: base64

c2V0cy8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAwNDA3NzUA
MDAwMTU1NgAwMDAwNzY1ADAwMDAwMDAwMDAwADA3MjMzMzUwMDA1ADAxMTIxNQAgNQAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB1c3RhciAgAGplcmVt
eQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRtaW4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABzZXRzL3Rlc3RzZXQxOC5weQAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAMDEwMDY2NAAwMDAxNTU2ADAwMDA3NjUAMDAwMDAwMDQ2MTQA
MDcyMzMzNDcyNDMAMDEzNDQ3ACAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAHVzdGFyICAAamVyZW15AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABh
ZG1pbgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHNp
emUgPSA4OAoKZGVmIHRlc3QoZmFjdG9yeSk6CiAgICBzZXQgPSBmYWN0b3J5KCkKICAgIHNl
dC5hZGQoJ29wdGltaXplZCcpCiAgICBzZXQuYWRkKCdfX2luaXRfXycpCiAgICBzZXQuYWRk
KCdfc2V0dXBHcmFwaERlbGVnYXRpb24nKQogICAgc2V0LmFkZCgnZ2V0Q29kZScpCiAgICBz
ZXQuYWRkKCdpc0xvY2FsTmFtZScpCiAgICBzZXQuYWRkKCdzdG9yZU5hbWUnKQogICAgc2V0
LmFkZCgnbG9hZE5hbWUnKQogICAgc2V0LmFkZCgnZGVsTmFtZScpCiAgICBzZXQuYWRkKCdf
bmFtZU9wJykKICAgIHNldC5hZGQoJ3NldF9saW5lbm8nKQogICAgc2V0LmFkZCgndmlzaXRN
b2R1bGUnKQogICAgc2V0LmFkZCgndmlzaXRGdW5jdGlvbicpCiAgICBzZXQuYWRkKCd2aXNp
dExhbWJkYScpCiAgICBzZXQuYWRkKCdfdmlzaXRGdW5jT3JMYW1iZGEnKQogICAgc2V0LmFk
ZCgndmlzaXRDbGFzcycpCiAgICBzZXQuYWRkKCd2aXNpdElmJykKICAgIHNldC5hZGQoJ3Zp
c2l0V2hpbGUnKQogICAgc2V0LmFkZCgndmlzaXRGb3InKQogICAgc2V0LmFkZCgndmlzaXRC
cmVhaycpCiAgICBzZXQuYWRkKCd2aXNpdENvbnRpbnVlJykKICAgIHNldC5hZGQoJ3Zpc2l0
VGVzdCcpCiAgICBzZXQuYWRkKCd2aXNpdEFuZCcpCiAgICBzZXQuYWRkKCd2aXNpdE9yJykK
ICAgIHNldC5hZGQoJ3Zpc2l0Q29tcGFyZScpCiAgICBzZXQuYWRkKCdfX2xpc3RfY291bnQn
KQogICAgc2V0LmFkZCgndmlzaXRMaXN0Q29tcCcpCiAgICBzZXQuYWRkKCd2aXNpdExpc3RD
b21wRm9yJykKICAgIHNldC5hZGQoJ3Zpc2l0TGlzdENvbXBJZicpCiAgICBzZXQuYWRkKCd2
aXNpdEFzc2VydCcpCiAgICBzZXQuYWRkKCd2aXNpdFJhaXNlJykKICAgIHNldC5hZGQoJ3Zp
c2l0VHJ5RXhjZXB0JykKICAgIHNldC5hZGQoJ3Zpc2l0VHJ5RmluYWxseScpCiAgICBzZXQu
YWRkKCd2aXNpdERpc2NhcmQnKQogICAgc2V0LmFkZCgndmlzaXRDb25zdCcpCiAgICBzZXQu
YWRkKCd2aXNpdEtleXdvcmQnKQogICAgc2V0LmFkZCgndmlzaXRHbG9iYWwnKQogICAgc2V0
LmFkZCgndmlzaXROYW1lJykKICAgIHNldC5hZGQoJ3Zpc2l0UGFzcycpCiAgICBzZXQuYWRk
KCd2aXNpdEltcG9ydCcpCiAgICBzZXQuYWRkKCd2aXNpdEZyb20nKQogICAgc2V0LmFkZCgn
X3Jlc29sdmVEb3RzJykKICAgIHNldC5hZGQoJ3Zpc2l0R2V0YXR0cicpCiAgICBzZXQuYWRk
KCd2aXNpdEFzc2lnbicpCiAgICBzZXQuYWRkKCd2aXNpdEFzc05hbWUnKQogICAgc2V0LmFk
ZCgndmlzaXRBc3NBdHRyJykKICAgIHNldC5hZGQoJ192aXNpdEFzc1NlcXVlbmNlJykKICAg
IHNldC5hZGQoJ3Zpc2l0QXNzVHVwbGUnKQogICAgc2V0LmFkZCgndmlzaXRBc3NMaXN0JykK
ICAgIHNldC5hZGQoJ3Zpc2l0QXNzVHVwbGUnKQogICAgc2V0LmFkZCgndmlzaXRBc3NMaXN0
JykKICAgIHNldC5hZGQoJ3Zpc2l0QXVnQXNzaWduJykKICAgIHNldC5hZGQoJ19hdWdtZW50
ZWRfb3Bjb2RlJykKICAgIHNldC5hZGQoJ3Zpc2l0QXVnTmFtZScpCiAgICBzZXQuYWRkKCd2
aXNpdEF1Z0dldGF0dHInKQogICAgc2V0LmFkZCgndmlzaXRBdWdTbGljZScpCiAgICBzZXQu
YWRkKCd2aXNpdEF1Z1N1YnNjcmlwdCcpCiAgICBzZXQuYWRkKCd2aXNpdEV4ZWMnKQogICAg
c2V0LmFkZCgndmlzaXRDYWxsRnVuYycpCiAgICBzZXQuYWRkKCd2aXNpdFByaW50JykKICAg
IHNldC5hZGQoJ3Zpc2l0UHJpbnRubCcpCiAgICBzZXQuYWRkKCd2aXNpdFJldHVybicpCiAg
ICBzZXQuYWRkKCd2aXNpdFNsaWNlJykKICAgIHNldC5hZGQoJ3Zpc2l0U3Vic2NyaXB0JykK
ICAgIHNldC5hZGQoJ2JpbmFyeU9wJykKICAgIHNldC5hZGQoJ3Zpc2l0QWRkJykKICAgIHNl
dC5hZGQoJ3Zpc2l0U3ViJykKICAgIHNldC5hZGQoJ3Zpc2l0TXVsJykKICAgIHNldC5hZGQo
J3Zpc2l0RGl2JykKICAgIHNldC5hZGQoJ3Zpc2l0TW9kJykKICAgIHNldC5hZGQoJ3Zpc2l0
UG93ZXInKQogICAgc2V0LmFkZCgndmlzaXRMZWZ0U2hpZnQnKQogICAgc2V0LmFkZCgndmlz
aXRSaWdodFNoaWZ0JykKICAgIHNldC5hZGQoJ3VuYXJ5T3AnKQogICAgc2V0LmFkZCgndmlz
aXRJbnZlcnQnKQogICAgc2V0LmFkZCgndmlzaXRVbmFyeVN1YicpCiAgICBzZXQuYWRkKCd2
aXNpdFVuYXJ5QWRkJykKICAgIHNldC5hZGQoJ3Zpc2l0VW5hcnlJbnZlcnQnKQogICAgc2V0
LmFkZCgndmlzaXROb3QnKQogICAgc2V0LmFkZCgndmlzaXRCYWNrcXVvdGUnKQogICAgc2V0
LmFkZCgnYml0T3AnKQogICAgc2V0LmFkZCgndmlzaXRCaXRhbmQnKQogICAgc2V0LmFkZCgn
dmlzaXRCaXRvcicpCiAgICBzZXQuYWRkKCd2aXNpdEJpdHhvcicpCiAgICBzZXQuYWRkKCd2
aXNpdEVsbGlwc2lzJykKICAgIHNldC5hZGQoJ3Zpc2l0VHVwbGUnKQogICAgc2V0LmFkZCgn
dmlzaXRMaXN0JykKICAgIHNldC5hZGQoJ3Zpc2l0U2xpY2VvYmonKQogICAgc2V0LmFkZCgn
dmlzaXREaWN0JykKAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAABzZXRzL3Rlc3RzZXQ4OC5weQAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAMDEwMDY2NAAwMDAxNTU2ADAwMDA3NjUAMDAwMDAwMDA1MzQAMDcyMzMz
NDcyNDMAMDEzNDUzACAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAHVzdGFyICAAamVyZW15AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABhZG1pbgAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHNpemUgPSAx
MwoKZGVmIHRlc3QoZmFjdG9yeSk6CiAgICBzZXQgPSBmYWN0b3J5KCkKICAgIHNldC5hZGQo
J3NlbGYnKQogICAgc2V0LmFkZCgnZXhwcicpCiAgICBzZXQuYWRkKCdmbGFncycpCiAgICBz
ZXQuYWRkKCdsb3dlcicpCiAgICBzZXQuYWRkKCd1cHBlcicpCiAgICBzZXQuaGFzX2VsdCgn
ZXhwcicpCiAgICBzZXQuaGFzX2VsdCgnc2VsZicpCiAgICBzZXQuaGFzX2VsdCgnZmxhZ3Mn
KQogICAgc2V0Lmhhc19lbHQoJ3NlbGYnKQogICAgc2V0Lmhhc19lbHQoJ2xvd2VyJykKICAg
IHNldC5oYXNfZWx0KCdzZWxmJykKICAgIHNldC5oYXNfZWx0KCd1cHBlcicpCiAgICBzZXQu
aGFzX2VsdCgnc2VsZicpCgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAc2V0cy90ZXN0c2V0OTgucHkAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAADAxMDA2NjQAMDAwMTU1NgAwMDAwNzY1ADAwMDAwMDAwMTc1ADA3MjMzMzQ3
MjQzADAxMzQ1NQAgMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAB1c3RhciAgAGplcmVteQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRtaW4AAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABzaXplID0gMwoK
ZGVmIHRlc3QoZmFjdG9yeSk6CiAgICBzZXQgPSBmYWN0b3J5KCkKICAgIHNldC5hZGQoJ19f
aW5pdF9fJykKICAgIHNldC5hZGQoJ19nZXRDaGlsZHJlbicpCiAgICBzZXQuYWRkKCdfX3Jl
cHJfXycpCgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAHNldHMvdGltZXNldC5weQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAwMTAwNjY0ADAwMDE1NTYAMDAwMDc2NQAwMDAwMDAwMTQ3MwAwNzIzMzM0NzYx
NAAwMTMyNTcAIDAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAdXN0YXIgIABqZXJlbXkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGFkbWluAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAaW1wb3J0IGVzcnNl
dAppbXBvcnQgamFoc2V0CmltcG9ydCBvcwppbXBvcnQgdGltZQoKZGVmIHRpbWVpdChmLCBp
dGVycz1yYW5nZSgzMDAwKSk6CiAgICB0MCA9IHRpbWUuY2xvY2soKQogICAgZm9yIGkgaW4g
aXRlcnM6CiAgICAgICAgZigpCiAgICB0MSA9IHRpbWUuY2xvY2soKQogICAgcmV0dXJuIHQx
IC0gdDAKCmNsYXNzIGVzcndyYXAoZXNyc2V0LnNldCk6CiAgICBkZWYgX19pbml0X18oc2Vs
Zik6CiAgICAgICAgc2VsZi5lbGVtZW50cyA9IFtdCgogICAgYWRkID0gZXNyc2V0LnNldC5h
cHBlbmQKCiAgICBkZWYgaGFzX2VsdChzZWxmLCBlbHQpOgogICAgICAgIHJldHVybiBlbHQg
aW4gc2VsZi5lbGVtZW50cwoKICAgIGRlZiByZW1vdmUoc2VsZiwgZWx0KToKICAgICAgICBp
ID0gc2VsZi5pbmRleChlbHQpCiAgICAgICAgZGVsIHNlbGYuZWxlbWVudHNbaV0KCmRlZiBs
aXN0X3Rlc3QoKToKICAgIG1vZHVsZS50ZXN0KGVzcndyYXApCgpkZWYgZGljdF90ZXN0KCk6
CiAgICBtb2R1bGUudGVzdChqYWhzZXQuU2V0KQoKZm9yIGZpbGUgaW4gb3MubGlzdGRpcigi
LiIpOgogICAgaWYgbm90IGZpbGUuc3RhcnRzd2l0aCgndGVzdHNldCcpOgogICAgICAgIGNv
bnRpbnVlCiAgICBuYW1lLCBleHQgPSBvcy5wYXRoLnNwbGl0ZXh0KGZpbGUpCiAgICBpZiBl
eHQgIT0gJy5weSc6CiAgICAgICAgY29udGludWUKICAgIG1vZHVsZSA9IF9faW1wb3J0X18o
bmFtZSkKCiAgICBwcmludCBuYW1lLCBtb2R1bGUuc2l6ZQogICAgcHJpbnQgImRpY3QiLCB0
aW1laXQoZGljdF90ZXN0KSwgImxpc3QiLCB0aW1laXQobGlzdF90ZXN0KQogICAgcHJpbnQK
ICAgIAoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAHNldHMvZXNyc2V0LnB5
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAwMTAwNjY0ADAwMDE1NTYAMDAwMDc2
NQAwMDAwMDAxMzA0MgAwNzIzMzM0NzI1MwAwMTMxMDQAIDAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAdXN0YXIgIABqZXJlbXkAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAGFkbWluAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAIyBEZXNpZ24gYW5kIGltcGxlbWVudGF0aW9uIGJ5IEVTUiwgSmFudWFy
eSAyMDAxLgoKZGVmIHNldGlmeShsaXN0MSk6CQkjIFVzZWQgYnkgc2V0IGNvbnN0cnVjdG9y
CiAgICAiUmVtb3ZlIGR1cGxpY2F0ZXMgaW4gc2VxdWVuY2UuIgogICAgcmVzID0gW10KICAg
IGZvciBpIGluIHJhbmdlKGxlbihsaXN0MSkpOgoJZHVwbGljYXRlID0gMAogICAgICAgIGZv
ciBqIGluIHJhbmdlKGkpOgoJICAgIGlmIGxpc3QxW2ldID09IGxpc3QxW2pdOgoJCWR1cGxp
Y2F0ZSA9IDEKCQlicmVhawoJaWYgbm90IGR1cGxpY2F0ZToKCSAgICByZXMuYXBwZW5kKGxp
c3QxW2ldKQogICAgcmV0dXJuIHJlcwoKZGVmIHVuaW9uKGxpc3QxLCBsaXN0Mik6CQkjIFVz
ZWQgZm9yIHwKICAgICJDb21wdXRlIHNldCBpbnRlcnNlY3Rpb24gb2Ygc2VxdWVuY2VzLiIK
ICAgIHJlcyA9IGxpc3QxWzpdCiAgICBmb3IgeCBpbiBsaXN0MjoKCWlmIG5vdCB4IGluIGxp
c3QxOgoJICAgIHJlcy5hcHBlbmQoeCkKICAgIHJldHVybiByZXMKCmRlZiBpbnRlcnNlY3Rp
b24obGlzdDEsIGxpc3QyKToJCSMgVXNlZCBmb3IgJgogICAgIkNvbXB1dGUgc2V0IGludGVy
c2VjdGlvbiBvZiBzZXF1ZW5jZXMuIgogICAgcmVzID0gW10KICAgIGZvciB4IGluIGxpc3Qx
OgoJaWYgeCBpbiBsaXN0MjoKCSAgICByZXMuYXBwZW5kKHgpCiAgICByZXR1cm4gcmVzCgpk
ZWYgZGlmZmVyZW5jZShsaXN0MSwgbGlzdDIpOgkJIyBVc2VkIGZvciAtCiAgICAiQ29tcHV0
ZSBzZXQgZGlmZmVyZW5jZSBvZiBzZXF1ZW5jZXMuIgogICAgcmVzID0gW10KICAgIGZvciB4
IGluIGxpc3QxOgoJaWYgbm90IHggaW4gbGlzdDI6CgkgICAgcmVzLmFwcGVuZCh4KQogICAg
cmV0dXJuIHJlcwoKZGVmIHN5bW1ldHJpY19kaWZmZXJlbmNlKGxpc3QxLCBsaXN0Mik6CSMg
VXNlZCBmb3IgXgogICAgIkNvbXB1dGUgc2V0IHN5bW1ldHJpYy1kaWZmZXJlbmNlIG9mIHNl
cXVlbmNlcy4iCiAgICByZXMgPSBbXQogICAgZm9yIHggaW4gbGlzdDE6CglpZiBub3QgeCBp
biBsaXN0MjoKCSAgICByZXMuYXBwZW5kKHgpCiAgICBmb3IgeCBpbiBsaXN0MjoKCWlmIG5v
dCB4IGluIGxpc3QxOgoJICAgIHJlcy5hcHBlbmQoeCkKICAgIHJldHVybiByZXMKCmRlZiBj
YXJ0ZXNpYW4obGlzdDEsIGxpc3QyKToJCSMgVXNlZCBmb3IgKgogICAgIkNhcnRlc2lhbiBw
cm9kdWN0IG9mIHNlcXVlbmNlcyBjb25zaWRlcmVkIGFzIHNldHMuIgogICAgcmVzID0gW10K
ICAgIGZvciB4IGluIGxpc3QxOgoJZm9yIHkgaW4gbGlzdDI6CgkgICAgcmVzLmFwcGVuZCgo
eCx5KSkKICAgIHJldHVybiByZXMKCmRlZiBlcXVhbGl0eShsaXN0MSwgbGlzdDIpOgogICAg
IlRlc3Qgc2VxdWVuY2VzIGNvbnNpZGVyZWQgYXMgc2V0cyBmb3IgZXF1YWxpdHkuIgogICAg
aWYgbGVuKGxpc3QxKSAhPSBsZW4obGlzdDIpOgogICAgICAgIHJldHVybiAwCiAgICBmb3Ig
eCBpbiBsaXN0MToKICAgICAgICBpZiBub3QgeCBpbiBsaXN0MjoKICAgICAgICAgICAgcmV0
dXJuIDAKICAgIGZvciB4IGluIGxpc3QyOgogICAgICAgIGlmIG5vdCB4IGluIGxpc3QxOgog
ICAgICAgICAgICByZXR1cm4gMAogICAgcmV0dXJuIDEKCmRlZiBwcm9wZXJfc3Vic2V0KGxp
c3QxLCBsaXN0Mik6CiAgICAiUmV0dXJuIDEgaWYgZmlyc3QgYXJndW1lbnQgaXMgYSBwcm9w
ZXIgc3Vic2V0IG9mIHNlY29uZCwgMCBvdGhlcndpc2UuIgogICAgaWYgbm90IGxlbihsaXN0
MSkgPCBsZW4obGlzdDIpOgogICAgICAgIHJldHVybiAwCiAgICBmb3IgeCBpbiBsaXN0MToK
ICAgICAgICBpZiBub3QgeCBpbiBsaXN0MjoKICAgICAgICAgICAgcmV0dXJuIDAKICAgIHJl
dHVybiAxCgpkZWYgc3Vic2V0KGxpc3QxLCBsaXN0Mik6CiAgICAiUmV0dXJuIDEgaWYgZmly
c3QgYXJndW1lbnQgaXMgYSBzdWJzZXQgb2Ygc2Vjb25kLCAwIG90aGVyd2lzZS4iCiAgICBp
ZiBub3QgbGVuKGxpc3QxKSA8PSBsZW4obGlzdDIpOgogICAgICAgIHJldHVybiAwCiAgICBm
b3IgeCBpbiBsaXN0MToKICAgICAgICBpZiBub3QgeCBpbiBsaXN0MjoKICAgICAgICAgICAg
cmV0dXJuIDAKICAgIHJldHVybiAxCgpkZWYgcG93ZXJzZXQoYmFzZSk6CiAgICAiQ29tcHV0
ZSB0aGUgc2V0IG9mIGFsbCBzdWJzZXRzIG9mIGEgc2V0LiIKICAgIHBvd2Vyc2V0ID0gW10K
ICAgIGZvciBuIGluIHhyYW5nZSgyICoqIGxlbihiYXNlKSk6CglzdWJzZXQgPSBbXQoJZm9y
IGUgaW4geHJhbmdlKGxlbihiYXNlKSk6CgkgICAgIGlmIG4gJiAyICoqIGU6CgkJc3Vic2V0
LmFwcGVuZChiYXNlW2VdKQoJcG93ZXJzZXQuYXBwZW5kKHN1YnNldCkKICAgIHJldHVybiBw
b3dlcnNldAoKY2xhc3Mgc2V0OgogICAgIkxpc3RzIHdpdGggc2V0LXRoZW9yZXRpYyBvcGVy
YXRpb25zLiIKCiAgICBkZWYgX19pbml0X18oc2VsZiwgdmFsdWUpOgogICAgICAgIHNlbGYu
ZWxlbWVudHMgPSBzZXRpZnkodmFsdWUpCgogICAgZGVmIF9fbGVuX18oc2VsZik6CglyZXR1
cm4gbGVuKHNlbGYuZWxlbWVudHMpCgogICAgZGVmIF9fZ2V0aXRlbV9fKHNlbGYsIGluZCk6
CglyZXR1cm4gc2VsZi5lbGVtZW50c1tpbmRdCgogICAgZGVmIF9fc2V0aXRlbV9fKHNlbGYs
IGluZCwgdmFsKToKICAgICAgICBpZiB2YWwgbm90IGluIHNlbGYuZWxlbWVudHM6CiAgICAg
ICAgICAgIHNlbGYuZWxlbWVudHNbaW5kXSA9IHZhbAoKICAgIGRlZiBfX2RlbGl0ZW1fXyhz
ZWxmLCBpbmQpOgoJZGVsIHNlbGYuZWxlbWVudHNbaW5kXQoKICAgIGRlZiBsaXN0KHNlbGYp
OgogICAgICAgIHJldHVybiBzZWxmLmVsZW1lbnRzCgogICAgZGVmIGFwcGVuZChzZWxmLCBu
ZXcpOgogICAgICAgIGlmIG5ldyBub3QgaW4gc2VsZi5lbGVtZW50czoKICAgICAgICAgICAg
c2VsZi5lbGVtZW50cy5hcHBlbmQobmV3KQoKICAgIGRlZiBleHRlbmQoc2VsZiwgbmV3KToK
CXNlbGYuZWxlbWVudHMuZXh0ZW5kKG5ldykKICAgICAgICBzZWxmLmVsZW1lbnRzID0gc2V0
aWZ5KHNlbGYuZWxlbWVudHMpCgogICAgZGVmIGNvdW50KHNlbGYsIHgpOgoJc2VsZi5lbGVt
ZW50cy5jb3VudCh4KQoKICAgIGRlZiBpbmRleChzZWxmLCB4KToKCXNlbGYuZWxlbWVudHMu
aW5kZXgoeCkKCiAgICBkZWYgaW5zZXJ0KHNlbGYsIGksIHgpOgogICAgICAgIGlmIHggbm90
IGluIHNlbGYuZWxlbWVudHM6CiAgICAgICAgICAgIHNlbGYuZWxlbWVudHMuaW5kZXgoaSwg
eCkKCiAgICBkZWYgcG9wKHNlbGYsIGk9Tm9uZSk6CglzZWxmLmVsZW1lbnRzLnBvcChpKQoK
ICAgIGRlZiByZW1vdmUoc2VsZiwgeCk6CglzZWxmLmVsZW1lbnRzLnJlbW92ZSh4KQoKICAg
IGRlZiByZXZlcnNlKHNlbGYpOgoJc2VsZi5lbGVtZW50cy5yZXZlcnNlKCkKCiAgICBkZWYg
c29ydChzZWxmLCBjbXA9Tm9uZSk6CglzZWxmLmVsZW1lbnRzLnNvcnQoY21wKQoKICAgIGRl
ZiBfX29yX18oc2VsZiwgb3RoZXIpOgoJaWYgdHlwZShvdGhlcikgPT0gdHlwZShzZWxmKToK
CSAgICBvdGhlciA9IG90aGVyLmVsZW1lbnRzCiAgICAgICAgcmV0dXJuIHNldCh1bmlvbihz
ZWxmLmVsZW1lbnRzLCBvdGhlcikpCgogICAgX19hZGRfXyA9IF9fb3JfXwoKICAgIGRlZiBf
X2FuZF9fKHNlbGYsIG90aGVyKToKCWlmIHR5cGUob3RoZXIpID09IHR5cGUoc2VsZik6Cgkg
ICAgb3RoZXIgPSBvdGhlci5lbGVtZW50cwogICAgICAgIHJldHVybiBzZXQoaW50ZXJzZWN0
aW9uKHNlbGYuZWxlbWVudHMsIG90aGVyKSkKCiAgICBkZWYgX19zdWJfXyhzZWxmLCBvdGhl
cik6CglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNlbGYpOgoJICAgIG90aGVyID0gb3RoZXIu
ZWxlbWVudHMKICAgICAgICByZXR1cm4gc2V0KGRpZmZlcmVuY2Uoc2VsZi5lbGVtZW50cywg
b3RoZXIpKQoKICAgIGRlZiBfX3hvcl9fKHNlbGYsIG90aGVyKToKCWlmIHR5cGUob3RoZXIp
ID09IHR5cGUoc2VsZik6CgkgICAgb3RoZXIgPSBvdGhlci5lbGVtZW50cwogICAgICAgIHJl
dHVybiBzZXQoc3ltbWV0cmljX2RpZmZlcmVuY2Uoc2VsZi5lbGVtZW50cywgb3RoZXIpKQoK
ICAgIGRlZiBfX211bF9fKHNlbGYsIG90aGVyKToKCWlmIHR5cGUob3RoZXIpID09IHR5cGUo
c2VsZik6CgkgICAgb3RoZXIgPSBvdGhlci5lbGVtZW50cwogICAgICAgIHJldHVybiBzZXQo
Y2FydGVzaWFuKHNlbGYuZWxlbWVudHMsIG90aGVyKSkKCiAgICBkZWYgX19lcV9fKHNlbGYs
IG90aGVyKToKCWlmIHR5cGUob3RoZXIpID09IHR5cGUoc2VsZik6CgkgICAgb3RoZXIgPSBv
dGhlci5lbGVtZW50cwogICAgICAgIHJldHVybiBzZWxmLmVsZW1lbnRzID09IG90aGVyCgog
ICAgZGVmIF9fbmVfXyhzZWxmLCBvdGhlcik6CglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNl
bGYpOgoJICAgIG90aGVyID0gb3RoZXIuZWxlbWVudHMKICAgICAgICByZXR1cm4gc2VsZi5l
bGVtZW50cyAhPSBvdGhlcgoKICAgIGRlZiBfX2x0X18oc2VsZiwgb3RoZXIpOgoJaWYgdHlw
ZShvdGhlcikgPT0gdHlwZShzZWxmKToKCSAgICBvdGhlciA9IG90aGVyLmVsZW1lbnRzCiAg
ICAgICAgcmV0dXJuIHByb3Blcl9zdWJzZXQoc2VsZi5lbGVtZW50cywgb3RoZXIpCgogICAg
ZGVmIF9fbGVfXyhzZWxmLCBvdGhlcik6CglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNlbGYp
OgoJICAgIG90aGVyID0gb3RoZXIuZWxlbWVudHMKICAgICAgICByZXR1cm4gc3Vic2V0KHNl
bGYuZWxlbWVudHMsIG90aGVyKQoKICAgIGRlZiBfX2d0X18oc2VsZiwgb3RoZXIpOgoJaWYg
dHlwZShvdGhlcikgPT0gdHlwZShzZWxmKToKCSAgICBvdGhlciA9IG90aGVyLmVsZW1lbnRz
CiAgICAgICAgcmV0dXJuIHByb3Blcl9zdWJzZXQob3RoZXIsIHNlbGYuZWxlbWVudHMpCgog
ICAgZGVmIF9fZ2VfXyhzZWxmLCBvdGhlcik6CglpZiB0eXBlKG90aGVyKSA9PSB0eXBlKHNl
bGYpOgoJICAgIG90aGVyID0gb3RoZXIuZWxlbWVudHMKICAgICAgICByZXR1cm4gc3Vic2V0
KG90aGVyLCBzZWxmLmVsZW1lbnRzKQoKICAgIGRlZiBfX3N0cl9fKHNlbGYpOgogICAgICAg
IHJlcyA9ICJ7IgogICAgICAgIGZvciB4IGluIHNlbGYuZWxlbWVudHM6CiAgICAgICAgICAg
IHJlcyA9IHJlcyArIHN0cih4KSArICIsICIKICAgICAgICByZXMgPSByZXNbMDotMl0gKyAi
fSIKICAgICAgICByZXR1cm4gcmVzCgogICAgZGVmIF9fcmVwcl9fKHNlbGYpOgogICAgICAg
IHJldHVybiByZXByKHNlbGYuZWxlbWVudHMpCgppZiBfX25hbWVfXyA9PSAnX19tYWluX18n
OgogICAgYSA9IHNldChbMSwgMiwgMywgNF0pCiAgICBiID0gc2V0KFsxLCA0XSkKICAgIGMg
PSBzZXQoWzUsIDZdKQogICAgZCA9IFsxLCAxLCAyLCAxXQogICAgcHJpbnQgYGRgLCAic2V0
aWZpZXMgdG8iLCBzZXQoZCkKICAgIHByaW50IGBhYCwgInwiLCBgYmAsICJpcyIsIGBhIHwg
YmAKICAgIHByaW50IGBhYCwgIl4iLCBgYmAsICJpcyIsIGBhIF4gYmAKICAgIHByaW50IGBh
YCwgIiYiLCBgYmAsICJpcyIsIGBhICYgYmAKICAgIHByaW50IGBiYCwgIioiLCBgY2AsICJp
cyIsIGBiICogY2AKICAgIHByaW50IGBhYCwgJzwnLCBgYmAsICJpcyIsIGBhIDwgYmAKICAg
IHByaW50IGBhYCwgJz4nLCBgYmAsICJpcyIsIGBhID4gYmAKICAgIHByaW50IGBiYCwgJzwn
LCBgY2AsICJpcyIsIGBiIDwgY2AKICAgIHByaW50IGBiYCwgJz4nLCBgY2AsICJpcyIsIGBi
ID4gY2AKICAgIHByaW50ICJQb3dlciBzZXQgb2YiLCBgY2AsICJpcyIsIHBvd2Vyc2V0KGMp
CgojIGVuZAoAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
c2V0cy9qYWhzZXQucHkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADAxMDA2NjQA
MDAwMTU1NgAwMDAwNzY1ADAwMDAwMDAwNjAxADA3MjMzMzQ3NzE1ADAxMzA1NQAgMAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAB1c3RhciAgAGplcmVt
eQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAYWRtaW4AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABjbGFzcyBTZXQ6CiAgICBkZWYgX19pbml0X18o
c2VsZik6CiAgICAgICAgc2VsZi5lbHRzID0ge30KIyMgICAgICAgIHNldHMgYXJlIGZhc3Rl
ciB3aGVuIG1ldGhvZCBvdmVyaGVhZCBpcyByZW1vdmVkOgojIyAgICAgICAgc2VsZi5lbGVt
ZW50cyA9IHNlbGYuZWx0cy5rZXlzCiMjICAgICAgICBzZWxmLmhhc19lbHQgPSBzZWxmLmVs
dHMuaGFzX2tleQoKICAgIGRlZiBhZGQoc2VsZiwgZWx0KToKICAgICAgICBzZWxmLmVsdHNb
ZWx0XSA9IE5vbmUKCiAgICBkZWYgZWxlbWVudHMoc2VsZik6CiAgICAgICAgcmV0dXJuIHNl
bGYuZWx0cy5rZXlzKCkKCiAgICBkZWYgaGFzX2VsdChzZWxmLCBlbHQpOgogICAgICAgIHJl
dHVybiBzZWxmLmVsdHMuaGFzX2tleShlbHQpCiAgICAKAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

--lebymX04xi--


From petrilli@amber.org  Tue Jan 23 20:06:16 2001
From: petrilli@amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 15:06:16 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>; from nas@arctrix.com on Tue, Jan 23, 2001 at 05:08:07AM -0800
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com>
Message-ID: <20010123150616.F18796@trump.amber.org>

Neil Schemenauer [nas@arctrix.com] wrote:
> On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > Unfortunately, for me, a Python implementation of Sets is only
> > interesting academically.  Any time I've needed to work with them at a
> > large scale, I've needed them *much* faster than Python could achieve
> > without a C extension.
> 
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

This is effectively the implementation that Zope has for Sets.  In
addition we have "buckets" that have scores on them (which are
implemented as a modified BTree).  

Unfortunately Jim Fulton (who wrote all the code for that level) is in 
a meeting, but I hope he'll comment on the implementation that was
chosen for our software.

Chris
-- 
| Christopher Petrilli
| petrilli@amber.org


From jeremy@alum.mit.edu  Tue Jan 23 19:56:05 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:56:05 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123135530.A26565@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
 <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
 <20010122151309.C15236@thyrsus.com>
 <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
 <20010123113050.A26162@thyrsus.com>
 <14957.53331.342827.462297@localhost.localdomain>
 <20010123135530.A26565@thyrsus.com>
Message-ID: <14957.57813.23072.723418@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy@alum.mit.edu>:
  >> The tests showed that dictionary-based sets were always faster.
  >> For small tests (3 operations), the difference was about 10
  >> percent.  For larger tests (88 operations), the difference ranged
  >> from 180 to almost 700 percent.

  ESR> Not surprising.  88 elements is getting pretty large.

Large for what?  I've got directories with that many files and modules
with that many names defined at the top-level :-).  I'm just reporting
the range of set sizes I've encountered for a real application.  In
general, I expect a few hundred elements should be handled without
trouble by most Python containers.

Jeremy


From gvwilson@nevex.com  Tue Jan 23 20:26:22 2001
From: gvwilson@nevex.com (Greg Wilson)
Date: Tue, 23 Jan 2001 15:26:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123200601.87817EF68@mail.python.org>
Message-ID: <001101c0857a$c0dce420$770a0a0a@nevex.com>

Greg Wilson:
Meta-question: do people want to continue to discuss sets on the
general python-dev list, or take it out-of-line (e.g. to an egroups
list)?  I'm finding all of the discussion very useful, but I realize
that many readers might prefer to concentrate on the 2.1 release...

> Jeremy Hylton <jeremy@alum.mit.edu>:
> > The tests showed that dictionary-based sets were always faster.
> > For small tests (3 operations), the difference was about 10 percent.
> > For larger tests (88 operations), the difference ranged from
> > 180 to almost 700 percent.

> Eric Raymond <esr@thyrsus.com>:
> Not surprising.  88 elements is getting pretty large.

Greg Wilson:
Really?  I was testing my implementation with sets of email addresses
grep'd out of old mail folders --- typical sizes were several thousand
elements.

> From: Christopher Petrilli <petrilli@amber.org>
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

Greg Wilson:
I had been expecting to implement this in C, not in pure Python, for
performance.

> From: Christopher Petrilli <petrilli@amber.org>
> In the "scripting" problem domain, I would agree that Sets would
> rarely reach large sizes,
> and so an algorithm which performed in quadratic time might be fine,

Greg Wilson:
I strongly disagree (see the email address example above --- it was
the first thing that occurred to me to try).  I am still hoping to
find a sub-quadratic (preferably sub-linear) implementation.  I can
do it in C++ with observer/observable (contained items notify containers
of changes in value, sets store all equivalent items in the same bucket),
but that doesn't really help...

> From: Ka-Ping Yee <ping@lfw.org>
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries...

and:

> From: Neil Schemenauer <nas@arctrix.com>
> ...if sets are added to the core...we would
> use the implementation of PyDict but drop the values.

Unfortunately, if values are required to be immutable, then sets of
sets aren't possible... :-(

Thanks, everyone,
Greg



From esr@thyrsus.com  Tue Jan 23 20:38:39 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 15:38:39 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Tue, Jan 23, 2001 at 11:27:38AM -0800
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010123153839.B26676@thyrsus.com>

Ka-Ping Yee <ping@lfw.org>:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.  The rest is then all
> quite natural:
> 
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

Independently of implementation issues about sets, I think this is a
damn fine idea. +1.

> (Then we can also get rid of the ugly has_key method.)
> 
> For those that need mutable set elements badly enough to sacrifice
> a little speed, we can add two methods to lists:
> 
>     lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
>     lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

+1 on the concept, -0 on the names.
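
The two proposed list methods can be approximated today as plain
functions.  This is only a sketch of the semantics given in the quoted
comments; include() and exclude() were never list methods, and the
function names here simply mirror the proposal.

```python
def include(lst, elt):
    """Append elt only if it is not already present (linear scan)."""
    if elt not in lst:
        lst.append(elt)


def exclude(lst, elt):
    """Remove every occurrence of elt (one linear scan per removal)."""
    while elt in lst:
        lst.remove(elt)


s = []
include(s, [1, 2])   # mutable (unhashable) elements are fine in a list
include(s, [1, 2])   # duplicate, ignored
include(s, [3])
exclude(s, [1, 2])
print(s)             # [[3]]
```

Both operations scan the list, which is the "little speed" being
sacrificed for mutable elements.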
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

[The disarming of citizens] has a double effect, it palsies the hand
and brutalizes the mind: a habitual disuse of physical forces totally
destroys the moral [force]; and men lose at once the power of
protecting themselves, and of discerning the cause of their
oppression.
        -- Joel Barlow, "Advice to the Privileged Orders", 1792-93


From tim.one@home.com  Tue Jan 23 22:02:41 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 23 Jan 2001 17:02:41 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELFIKAA.tim.one@home.com>

>> 	operator.isMappingType()
>> 	+ some other C style _Check() APIs

[Guido]
> Yes, these should probably be deprecated.  I certainly have never
> used them!  (The operator module doesn't seem to get much use in
> general...

It's used heavily by test_operator.py <wink>.  Outside of that, it's used
maybe three times in the std distribution, nowhere essential; the

    return map(operator.__div__, rgbtuple, _maxtuple)

in Pynche's ColorDB.py is typical.  2.0's

    return [x / 256. for x in rgbtuple]

does the same thing more clearly (_maxtuple is a module constant).

It appeals to functional-language fans and extreme micro-optimizers, so they
don't have to type "lambda" in the simplest cases.  At least
operator.truth(x) is *clearer* than "not not x".

> Was it a bad idea?)

Mixed, but I'd say more bad than good overall.
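
For comparison, the two spellings above can be run side by side.  This
is a sketch in present-day Python, where operator.__div__ no longer
exists and operator.truediv stands in for it; the sample rgbtuple and
the value of _maxtuple are assumptions made up for illustration.

```python
import operator

# Hypothetical sample data; in Pynche, _maxtuple is a module constant.
rgbtuple = (255, 128, 0)
_maxtuple = (256.0,) * 3

# Functional spelling: pairwise division via the operator module.
via_operator = list(map(operator.truediv, rgbtuple, _maxtuple))

# List-comprehension spelling from the message above.
via_listcomp = [x / 256. for x in rgbtuple]

assert via_operator == via_listcomp

# operator.truth(x) versus the "not not x" idiom.
assert operator.truth([]) == (not not [])
assert operator.truth("x") == (not not "x")
```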



From thomas@xs4all.net  Tue Jan 23 23:38:14 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 00:38:14 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010123153839.B26676@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 03:38:39PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>
Message-ID: <20010124003814.F27785@xs4all.nl>

On Tue, Jan 23, 2001 at 03:38:39PM -0500, Eric S. Raymond wrote:

> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:

> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> Independently of implementation issues about sets, I think this is a
> damn fine idea. +1.

It's come up before. The problem with it is that it's not quite obvious
whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
example it's obvious what you *expect*, but I suspect that 'for x in dict'
will result in a 40/60 split in expectations, and like American voters, the
20% middle section will change their vote each recount :-)

Now, if only there was a terribly obvious way to spell it... so that it's
immediately obvious which of the two you wanted.... something like, oh, I
donno, this, maybe:

  if key in dict.keys: ...
  if value in dict.values: ...

Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fredrik@effbot.org  Wed Jan 24 00:13:20 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 01:13:20 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>
Message-ID: <02f401c0859a$765d07c0$e46940d5@hagrid>

> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

you forgot "if (key, value) in dict"

on the other hand, it's not quite obvious that "list.sort"
doesn't return the sorted list, "print >>None" prints to
standard output, "except KeyError, ValueError" doesn't
catch a ValueError exception, etc, etc, etc.

(nor that it's "has_key" and "hasattr", and not "has_key"
and "has_attr" or "haskey" and "hasattr" ;-)

let's just say that "in" is the same thing as "has_key",
and be done with it.

Cheers /F



From tim.one@home.com  Wed Jan 24 01:51:22 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 23 Jan 2001 20:51:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>

[Christopher Petrilli]
> ....
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

How do you know that?  I've used large sets in Python happily without
resorting to C or kjbuckets (which is really aiming at fast operations on
*graphs*, in which area it has no equal).

Everyone (except Eric <wink>) uses dicts to implement sets in Python, and
"most" set operations can work at full C speed then; e.g., assuming both
sets have N elements:

    membership testing
        O(1) -- it's just dict.has_key()
    element insertion
        O(1) -- dict[element] = 1
    element removal
        O(1) -- del dict[element]
    union
        O(N), but at full C speed -- dict1.update(dict2)
    intersection
        O(N), but at Python speed (the only 2.1 dog in the bunch!)
    choose some element and remove it
        took O(N) time and additional space in 2.0, but
        is O(1) in both since dict.popitem() was introduced
    iteration
        O(N), with O(N) additional space using dict.keys(),
        or O(1) additional space using dict.popitem() repeatedly

What are you going to do in C that's faster than using a Python dict for
this purpose?  Most key set operations are straightforward Python dict
1-liners then, and Python dicts are very fast.  kjbuckets sets were slower
last time I timed them (several years ago, but Python dicts have gotten
faster since then while kjbuckets has been stagnant).
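
The dict-as-set idioms in the table above can be sketched as follows.
This assumes present-day Python, where "in" and iteration work directly
on dicts (in 2.0 these would be has_key() and keys()); the helper names
make_set, union and intersection are made up for illustration.

```python
def make_set(elements):
    """Build a dict-based set: elements are keys, every value is a dummy 1."""
    d = {}
    for e in elements:
        d[e] = 1
    return d


def union(a, b):
    """O(N): copy one dict, then let update() do the merge at C speed."""
    result = a.copy()
    result.update(b)
    return result


def intersection(a, b):
    """O(N) at Python speed: loop over the smaller set, probe the larger."""
    if len(a) > len(b):
        a, b = b, a
    result = {}
    for e in a:
        if e in b:
            result[e] = 1
    return result


s = make_set("abc")
t = make_set("bcd")

s["z"] = 1                      # element insertion
del s["z"]                      # element removal
assert "a" in s                 # membership test (has_key() in 2.0)

assert sorted(union(s, t)) == ["a", "b", "c", "d"]
assert sorted(intersection(s, t)) == ["b", "c"]

# Choose some element and remove it; which one comes out is arbitrary.
elt, _ = t.popitem()
assert elt not in t and len(t) == 2
```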

There's a long tradition in the Lisp world of using unordered lists to
represent sets (when the only tool you have is a hammer ... <0.5 wink>), but
it's been easy to do much better than that in Python almost since the start.
Even in the Python list world, enormous improvements for large sets can be
gotten by maintaining lists in sorted order (then most O(N) operations drop
to O(log2(N)), and O(N**2) to O(N)).  Curiously, though, in 2.1 we can still
use a dict-set for complex numbers, but no longer a sorted-list-set!
Requiring a total ordering can get in the way more than requiring
hashability (and vice versa -- that's a tough one).
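
A sorted-list set can be sketched with the standard bisect module.
The helper names contains and add are hypothetical, and the last part
assumes present-day Python, where ordering comparisons on complex
numbers raise TypeError.

```python
import bisect


def contains(sorted_list, elt):
    """Membership by binary search: O(log N) comparisons."""
    i = bisect.bisect_left(sorted_list, elt)
    return i < len(sorted_list) and sorted_list[i] == elt


def add(sorted_list, elt):
    """Insert elt if absent, keeping the list sorted."""
    i = bisect.bisect_left(sorted_list, elt)
    if i == len(sorted_list) or sorted_list[i] != elt:
        # Finding the slot is O(log N); shifting the tail is still O(N).
        sorted_list.insert(i, elt)


s = []
for x in [5, 1, 3, 1, 5]:
    add(s, x)
assert s == [1, 3, 5]
assert contains(s, 3) and not contains(s, 4)

# Complex numbers hash fine, so a dict-set accepts them ...
dict_set = {1j: 1, 2j: 1}
assert 2j in dict_set

# ... but they have no ordering, so the sorted-list-set cannot place them.
try:
    add([1j], 2j)
    ordered = True
except TypeError:
    ordered = False
assert not ordered
```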

measurement-is-the-measure-of-all-measurable-things-ly y'rs  - tim



From greg@cosc.canterbury.ac.nz  Wed Jan 24 02:45:01 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:45:01 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <200101240245.PAA02098@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas@xs4all.net>:

> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted...

Well, in the case of

  for key in d:

or

  for value in d:

it's immediately obvious to a *human* reader what is meant,
so all we need to do is make the compiler a bit smarter. This
can easily be done by the use of a small table, containing
the equivalents of the words 'key' and 'value' in all known
natural languages, against which the target variable name is
matched using some suitable fuzzy matching algorithm.
Soundex could be used for this, if we can decide on which
version to use...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@digicool.com  Wed Jan 24 02:46:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:46:37 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: Your message of "Tue, 23 Jan 2001 19:03:42 +0100."
 <013901c08566$d2a8f360$e46940d5@hagrid>
References: <013901c08566$d2a8f360$e46940d5@hagrid>
Message-ID: <200101240246.VAA06336@cj20424-a.reston1.va.home.com>

> It's probably just me, but the names of the two unicode
> modules tend to irritate me:
> 
> > ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
> 
> (the former contains names, the latter data)
> 
> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
> 
> The result is a single 200k unicodedata module, which contains
> the name database as well as two new functions:
> 
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
> 
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
> 
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

To me, both of these are irrelevant details of the Unicode
implementation. :-)   IOW, feel free to check it in.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Wed Jan 24 02:49:21 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:49:21 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>
Message-ID: <200101240249.PAA02101@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one@home.com>:

> Requiring a total ordering can get in the way more than requiring
> hashability

Often it's useful to have *some* total ordering, and
you don't really care what it is as long as it's consistent.

Maybe all types should be required to support cmp(x,y) even 
if doing x < y via the rich comparison route raises a
NotOrderable exception.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Jan 24 02:52:43 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:52:43 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>

Neil Schemenauer <nas@arctrix.com>:

> Basically, we would
> use the implementation of PyDict but drop the values.

This could be incorporated into PyDict. Instead of storing keys and
values in the same array, keep them in separate arrays and only
allocate the values array the first time someone stores a value other
than 1.
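
A toy model of the idea, in pure Python rather than in the C of PyDict.
Everything here is an assumption for illustration: the class name
LazyValueDict is made up, an ordinary dict stands in for the hash table
of keys, and deletion is left out.

```python
class LazyValueDict:
    """Keys and values live apart; values are allocated only on demand."""

    DEFAULT = 1

    def __init__(self):
        self._slots = {}      # key -> slot index (stand-in for the key table)
        self._values = None   # allocated on the first non-default store

    def __setitem__(self, key, value):
        slot = self._slots.setdefault(key, len(self._slots))
        is_default = type(value) is int and value == self.DEFAULT
        if self._values is None:
            if is_default:
                return        # still a pure set: nothing to store
            # First non-default value: allocate, backfilling the default.
            self._values = [self.DEFAULT] * len(self._slots)
        elif slot == len(self._values):
            self._values.append(self.DEFAULT)   # room for a new key
        self._values[slot] = value

    def __getitem__(self, key):
        slot = self._slots[key]                 # KeyError if absent
        if self._values is None:
            return self.DEFAULT
        return self._values[slot]

    def __contains__(self, key):
        return key in self._slots

    def values_allocated(self):
        return self._values is not None


d = LazyValueDict()
d["a"] = 1
d["b"] = 1
assert not d.values_allocated()   # used as a set so far
d["c"] = "x"
assert d.values_allocated()       # first "real" value triggers allocation
assert d["a"] == 1 and d["c"] == "x"
```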

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@digicool.com  Wed Jan 24 02:58:59 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 01:13:20 +0100."
 <02f401c0859a$765d07c0$e46940d5@hagrid>
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>
 <02f401c0859a$765d07c0$e46940d5@hagrid>
Message-ID: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>

> let's just say that "in" is the same thing as "has_key",
> and be done with it.

You know, I've long resisted this, but I agree now -- this is the
right thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 24 03:11:30 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:11:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Tue, 23 Jan 2001 12:35:04 CST."
 <14957.52952.48739.53360@beluga.mojam.com>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
 <14957.52952.48739.53360@beluga.mojam.com>
Message-ID: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>

>     Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
>     Guido>   locals no longer have to start with underscore.
> 
> Thanks.  I have just been incredibly short on time lately.

You're welcome.

>     Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
>     Guido>   are more like this?)
> 
> Alpha testing should pick those up, yes? ;-)

Yes. :-)

>     Guido> ! try:
>     Guido> !     import bsddb
>     Guido> ! except ImportError:
>     Guido> !     if verbose:
>     Guido> !         print "can't import bsddb, so skipping dbhash"
>     Guido> ! else:
>     Guido> !     check_all("dbhash")
> 
> Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
> the module that's imported here?

I think I saw a complaint about this that specifically said that when
dbhash is imported when bsddb can't be imported, an incomplete dbhash
is left behind in sys.modules, and then a second import of dbhash will
succeed -- but of course it will define no objects.  Since dbhash may
be imported elsewhere, testing for bsddb is safer.
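
The guard pattern from the checkin, testing for the prerequisite rather
than for the module that depends on it, might be generalized like this.
The helper name check_if_available and the module names in the usage
lines are hypothetical; bsddb and dbhash are not used because they are
absent from present-day Python.

```python
import importlib


def check_if_available(prerequisite, target, check):
    """Run check(target) only if the prerequisite module can be imported."""
    try:
        importlib.import_module(prerequisite)
    except ImportError:
        return False          # skip quietly, as the test does for dbhash
    check(target)
    return True


checked = []
# A prerequisite that cannot be imported: the target is skipped.
assert check_if_available("no_such_module_xyz", "dependent", checked.append) is False
# A prerequisite that can be imported: the target is checked.
assert check_if_available("json", "json", checked.append) is True
assert checked == ["json"]
```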

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 24 03:22:14 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:22:14 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
In-Reply-To: Your message of "Tue, 23 Jan 2001 08:24:38 PST."
 <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net>
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <200101240322.WAA06671@cj20424-a.reston1.va.home.com>

> A few miscellaneous helpers.
> 
> PyObject_Dump(): New function that is useful when debugging Python's C
> runtime.  In something like gdb it can be a pain to get some useful
> information out of PyObject*'s.  This function prints the str() of the
> object to stderr, along with the object's refcount and hex address.
> 
> PyGC_Dump(): Similar to PyObject_Dump() but knows how to cast from the
> garbage collector prefix back to the PyObject* structure.
> 
> [See Misc/gdbinit for some useful gdb hooks]
> 
> none_dealloc(): Rather than SEGV if we accidentally decref None out of
> existence, we assign None's and NotImplemented's destructor slot to
> this function, which just calls abort().

Barry, since these are only gdb helpers, would it perhaps be better if
their names started with "_Py" to indicate that they aren't part of
the regular API?  They violate an important rule: you shouldn't write
to stderr directly, but always to sys.stderr.  (There's a helper
routine to write to sys.stderr: PySys_WriteStderr().)  I understand that
for the gdb helper it's important to use the real stderr, and I don't
object to having these functions present at all times (they're so
small), but I do think that we should make it clear (by a _Py name,
and also by a comment) that they should not be called!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From ping@lfw.org  Wed Jan 24 03:29:24 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 19:29:24 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101231921030.1568-100000@skuld.kingmanhall.org>

I wrote:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.

Thomas Wouters wrote:
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

Yes, and i've seen this objection before, and i think it's silly.

> Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations,

No way... it's at least 90/10.

How often do you write 'dict.has_key(x)'?          (std lib says: 206)
How often do you write 'for x in dict.keys()'?     (std lib says: 49)

How often do you write 'x in dict.values()'?       (std lib says: 0)
How often do you write 'for x in dict.values()'?   (std lib says: 3)

I rest my case.


-- ?!ng



From barry@digicool.com  Wed Jan 24 03:44:31 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 23 Jan 2001 22:44:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net>
 <200101240322.WAA06671@cj20424-a.reston1.va.home.com>
Message-ID: <14958.20383.795064.832967@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:

    GvR> Barry, since these are only gdb helpers, would it perhaps be
    GvR> better if their names started with "_Py" to indicate that
    GvR> they aren't part of the regular API?  They violate an
    GvR> important rule: you shouldn't write to stderr directly, but
    GvR> always to sys.stderr.  (There's a helper routine to write to
    GvR> sys.stderr: PySys_WriteStderr().)  I understand that for the gdb
    GvR> helper it's important to use the real stderr, and I don't
    GvR> object to having these functions present at all times
    GvR> (they're so small), but I do think that we should make it
    GvR> clear (by a _Py name, and also by a comment) that they should
    GvR> not be called!

I thought about it, couldn't decide and figured I'd check it in
anyway, knowing that you'd let me know.  See how wise I was?  :)

I will rename them as _Py* and fix the gdbinit file accordingly.  One
note: these functions /ought/ to be useful for dbx or any other
command line debugger.  I just haven't used anything but gdb for
years.  If anybody's got a dbxinit equivalent I could add that to Misc
too.

nothing-an-adjacent-office-wouldn't-have-solved-much-more-quick-ly y'rs,
-Barry


From guido@digicool.com  Wed Jan 24 03:46:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:46:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 09:22:26 EST."
 <20010123092226.A25968@thyrsus.com>
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
 <20010123092226.A25968@thyrsus.com>
Message-ID: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido@digicool.com>:
> > Can you point me to docs explaining the meaning of the BROWSER
> > environment variable?  I've never heard of it...  The last new
> > environment variables I learned were PAGER and EDITOR, probably 15
> > years ago when 4.1BSD was released... :-)

ESR replies:
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).  Ping knew about it either because he
> read the module code and saw that it was supposed to work, or because
> he remembered the design discussion when webbrowser.py was first
> implemented.
> 
> I've had conversations with some key Perl and Tcl people (Larry Wall,
> Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
> agree it's a good idea.  I'll probably hack support for it into Perl's
> browser launcher next.
> 
> It's documented in the version of libwebbrowser.tex now in the CVS
> tree.

Grumble.  That wasn't the kind of answer I expected.  I don't like it
if Python is used as a wedge to get a particular thing introduced to
the rest of the world, no matter how useful it may seem at the time.
If something is already a popular convention, I'll happily adopt it,
but I'm not comfortable being put in front of somebody else's cart.
There just are too many carts that would like to be pulled by a horse
as strong as Python, and I don't want to take sides if I can avoid it.
BROWSER seems unlikely to take the world by storm and I don't feel I
need to be involved in the effort to get it accepted.

(And yes, I know there are enough cases where I *did* take sides.
There are some cases where I *do* want to take a side, and there were
some mistakes -- which is one of the reasons why I'm shy about taking
sides now.)

Anyway, shouldn't you also talk to the developers of packages like KDE
and Gnome?  Surely their users would like to be able to configure the
default webbrowser.  Talking just to the scripting language people
seems like you're thinking too small.  There must be lots of C apps
with the desire to invoke a browser.  Also Emacs, which has an
extensive list of browse-url-* functions (you might even learn a few
tricks from it about how to invoke various external browsers) but
AFAIK no default browser selection.
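
For reference, the BROWSER convention under discussion treats the
variable as an os.pathsep-separated list of browsers to try in order.
The helper below is a hypothetical sketch of how a program might read
it, not the code in webbrowser.py.

```python
import os


def browser_candidates(environ):
    """Return the browser commands named by BROWSER, in order of preference."""
    value = environ.get("BROWSER", "")
    return [entry for entry in value.split(os.pathsep) if entry]


# A made-up environment, so the example does not depend on the real one.
fake_env = {"BROWSER": os.pathsep.join(["lynx", "mozilla"])}
print(browser_candidates(fake_env))   # ['lynx', 'mozilla']
print(browser_candidates({}))         # []
```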

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 24 03:54:25 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:54:25 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 15:52:43 +1300."
 <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>

> Neil Schemenauer <nas@arctrix.com>:
> 
> > Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Not a bad idea!  (But shouldn't the default value be something else,
like None?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 24 04:20:56 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 23:20:56 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:38:14 +0100."
 <20010124003814.F27785@xs4all.nl>
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>
 <20010124003814.F27785@xs4all.nl>
Message-ID: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>

> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...
> 
> > Independently of implementation issues about sets, I think this is a
> > damn fine idea. +1.
> 
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations, and like American voters, the
> 20% middle section will change their vote each recount :-)
> 
> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted.... something like, oh, I
> donno, this, maybe:
> 
>   if key in dict.keys: ...
>   if value in dict.values: ...
> 
> Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,

No chance of a time-machine escape, but I *can* say that I agree that
Ping's proposal makes a lot of sense.  This is a reversal of my
previous opinion on this matter.  (Take note -- those don't happen
very often! :-)

First to submit a working patch gets a free copy of 2.1a2 and
subsequent releases,

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jan 24 04:50:49 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 23 Jan 2001 23:50:49 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: <013901c08566$d2a8f360$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMCIKAA.tim.one@home.com>

[/F]
> It's probably just me, but the names of the two unicode
> modules tend to irritate me:

I don't care much about the names, but having two Unicode subprojects in the
MS build seems overkill <wink>.

> ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
>
> (the former contains names, the latter data)

Maybe that's the reason:  the names don't get loaded at all unless you *use*
one of the name APIs?  Hard to say whether that's worth the bother; now that
everything has been nicely compressed, it's sure not as compelling as it may
have been earlier.

> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
>
> The result is a single 200k unicodedata module, which con-
> tains the name database as well as two new functions:
>
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
>
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
>
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

I have no opinion on what's best:  you're working with it, you're the best
judge of that.  I only vote for checking in whatever you decide sooner
rather than later; I'll fiddle the MS project files and readmes accordingly
ASAP after that.
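
For reference, a minimal sketch of the two functions /F describes, as they can be called on a present-day Python 3 interpreter (an assumption; the proposal above was made against 2.1). The sample characters and names are chosen only for illustration.

```python
import unicodedata

# name(character[, default]): map a character to its name.
print(unicodedata.name("A"))

# A character with no name yields the default if one is given;
# without a default, ValueError is raised instead.
print(unicodedata.name("\x00", "<unnamed>"))

# lookup(name): map a name back to the character; KeyError if unknown.
print(unicodedata.lookup("LATIN SMALL LETTER A"))
try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
except KeyError:
    print("lookup raised KeyError")
```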



From moshez@zadka.site.co.il  Wed Jan 24 14:07:08 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:07:08 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001, Greg Ewing <greg@cosc.canterbury.ac.nz> wrote:

> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Cool idea, but even cooler (would catch more idioms, that is) is
"the first time someone stores something not 'is'  something in the
dict, allocate the values array". This would catch small numbers,
None and identifier-looking strings, for the measly cost of one
pointer/dict object.
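
A toy model of that idea in pure Python, only to make the bookkeeping concrete. The class name LazyValueDict and its attributes are hypothetical, a Python set and dict stand in for the C arrays, and deletion is left out; it is a sketch of the scheme, not of any PyDict code.

```python
class LazyValueDict:
    """Keep only keys until a stored value is not 'is' the shared one."""

    _UNSET = object()              # marker: nothing stored yet

    def __init__(self):
        self._keys = set()         # stands in for the keys array
        self._shared = self._UNSET # the one extra pointer per dict
        self._values = None        # values storage, allocated lazily

    def __setitem__(self, key, value):
        if self._values is not None:
            self._values[key] = value
        elif self._shared is self._UNSET or value is self._shared:
            # Same object as everything stored so far: keys only.
            self._shared = value
            self._keys.add(key)
        else:
            # First value that differs by identity: materialize the values.
            self._values = dict.fromkeys(self._keys, self._shared)
            self._values[key] = value

    def __getitem__(self, key):
        if self._values is not None:
            return self._values[key]
        if key in self._keys:
            return self._shared
        raise KeyError(key)

    def __contains__(self, key):
        if self._values is not None:
            return key in self._values
        return key in self._keys

    def has_values_array(self):
        return self._values is not None


PRESENT = 1                        # one shared object used as the dummy value
d = LazyValueDict()
for word in ["a", "b", "c"]:
    d[word] = PRESENT
print(d.has_values_array())        # still keys only
d["a"] = "something else"
print(d.has_values_array(), d["a"], d["b"])
```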

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From moshez@zadka.site.co.il  Wed Jan 24 14:15:39 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:15:39 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
 <20010123092226.A25968@thyrsus.com>
Message-ID: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>

On Tue, 23 Jan 2001 22:46:47 -0500, Guido van Rossum <guido@digicool.com> wrote:

[ESR]
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).

[Guido v. Rossum]
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Guido, I think you're being over-dramatic. BROWSER is right in the
tradition of PAGER and EDITOR, and a lot of other programs need it.
I know Eric uses RH and mutt, so probably RH's urlview program (which
mutt uses to jump to URLs) uses BROWSER. I was just about to submit
a bug report to Debian that their urlview doesn't respect it.

And if you really don't want to be a horse in front of a cart...

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.

Yes -- via GNOME/KDE specific mechanisms. I have 0 experience with KDE,
but I'm guessing the GNOME guys would do it via the GNOME "registry".
KDE probably has something similar. I'm sure you wouldn't want Python
to depend on GNOME, though it would be nice to make the browser-choosing
part pluggable so when "import gnome" is done, it automatically tries
to choose the user's browser.

On UNIX (as opposed to GNOME/KDE, which are pretty much operating systems
themselves), these things are done via environment variable. And $BROWSER
doesn't seem like that much of an innovation.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From skip@mojam.com (Skip Montanaro)  Wed Jan 24 06:28:21 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:28:21 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
 <14957.52952.48739.53360@beluga.mojam.com>
 <200101240311.WAA06582@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30213.325584.373062@beluga.mojam.com>

    Guido> I think I saw a complaint about this that specifically said that
    Guido> when dbhash is imported when bsddb can't be imported, an
    Guido> incomplete dbhash is left behind in sys.modules, and then a
    Guido> second import of dbhash will succeed -- but of course it will
    Guido> define no objects.

So it does:

    % ./python
    Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
    [GCC 2.95.3 19991030 (prerelease)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> import dbhash
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
        import bsddb
    ImportError: No module named bsddb
    >>> import dbhash
    >>>

Can that be construed as a bug?  If import fails, shouldn't the stub module
that was inserted in sys.modules be removed?
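
The question can be checked on whatever interpreter is at hand. The sketch below assumes a present-day CPython 3, where a failed import is expected to be removed from sys.modules again (the opposite of the 2.1a1 session above). The module names halfbroken_demo and no_such_module_for_demo are made up for the test.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A throwaway module whose own import fails part-way through.
    path = os.path.join(tmp, "halfbroken_demo.py")
    with open(path, "w") as f:
        f.write("loaded_so_far = 1\nimport no_such_module_for_demo\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        try:
            import halfbroken_demo
        except ImportError:
            first_failed = True
        else:
            first_failed = False
        # Was a half-initialized stub left behind?
        left_behind = "halfbroken_demo" in sys.modules
    finally:
        sys.path.remove(tmp)

print("first import failed:", first_failed)
print("stub left in sys.modules:", left_behind)
```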

Skip


From skip@mojam.com (Skip Montanaro)  Wed Jan 24 06:31:08 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:31:08 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <20010123041730.A25165@thyrsus.com>
 <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
 <20010123092226.A25968@thyrsus.com>
 <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30380.851599.764535@beluga.mojam.com>

    Guido> BROWSER seems unlikely to take the world by storm and I don't
    Guido> feel I need to be involved in the effort to get it accepted.

Editors and web browsers are classes of tools which (one would hope) will
always come in several varieties.  Users have to have some way to specify
what to launch.  BROWSER seems analogous to the EDITOR environment variable
which is commonly used in Unix environments for just that purpose.

Skip


From thomas@xs4all.net  Wed Jan 24 07:03:09 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 08:03:09 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 11:20:56PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <200101240420.XAA07153@cj20424-a.reston1.va.home.com>
Message-ID: <20010124080308.G27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:20:56PM -0500, Guido van Rossum wrote:

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Patch submitted. It only implements 'if key in dict', not 'for key in dict'.
The latter is kind of hard until we have a separate iteration protocol.
(PEP, anyone ?) Once we have it, we could consider 'for key, value in dict',
which is now easily explained with 'dict.popitem()'.
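
A minimal sketch of the semantics being agreed on in this thread, with 'in' asking the same question as has_key(). It is written for a present-day Python 3 interpreter (an assumption; at the time only the 'if key in dict' half had a patch), and the sample dict is made up.

```python
# Hypothetical sample data, only for illustration.
d = {"spam": 1, "eggs": 2}

# 'in' tests keys, the question has_key() used to answer.
assert "spam" in d
assert 1 not in d

# Spelling out the other readings removes the ambiguity.
assert 1 in d.values()
assert ("spam", 1) in d.items()

# Iterating over the dict itself yields keys.
keys = sorted(key for key in d)
print(keys)

# Iterating over items() yields (key, value) pairs.
pairs = sorted((key, value) for key, value in d.items())
print(pairs)
```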

Does this mean I get a legally sound and thus empty legal statement with
every Python release for the rest of your, its or my life, Guido, or will
you just make me 'Free Python Release Receiver For Life' ? :-)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From pf@artcom-gmbh.de  Wed Jan 24 07:31:30 2001
From: pf@artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 08:31:30 +0100 (MET)
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Jan 23, 2001 11:20:56 pm"
Message-ID: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
[...]
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)

It gives a warm and fuzzy feeling to see that happen sometimes at all. ;-)

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

This repeated offer of free copies of Python becomes increasingly
boring.  For quite a while I myself have not contributed anything useful 
and I am nevertheless hoarding free copies of Python here. ;-)

What about offering another immaterial reward to potential contributors
instead?  What about "fame points"?  Anybody contributing something
useful to Python receives a certain number of "fame points":  These
fame points will be added and placed in front of the name of
the contributor into the ACKS file and the file will be sorted
accordingly turning the ACKS file effectively into some kind of
"Python contribution high score" ...   ;-)

Just kidding, Peter



From tim.one@home.com  Wed Jan 24 08:08:50 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:08:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMPIKAA.tim.one@home.com>

[Neil Schemenauer]
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

They'll be slower than dicts and take more memory than lists then.  WRT
memory, dicts cache the hash code with each entry for speed (so double the
memory of a list even without the value field), and are never more than 2/3
full anyway.  The dict implementation also gets low-level speed benefits out
of using both the key and value fields to characterize the nature of a slot
(the key field is NULL iff the slot is virgin; the value field is NULL iff
the slot is available (virgin or dummy)).
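
To make the virgin/dummy distinction concrete, here is a toy open-addressing set. It is a sketch under stated assumptions and not the PyDict code: the class name ProbeSet is hypothetical, it uses plain linear probing instead of the real probe sequence, and it marks slot states with None and a sentinel object where the C code uses NULL key and value fields.

```python
_DUMMY = object()    # a slot whose entry was deleted


class ProbeSet:
    """Toy open-addressing set; the table size must stay a power of two."""

    def __init__(self, size=8):
        self._slots = [None] * size   # None plays the role of a virgin slot
        self._used = 0                # active entries
        self._filled = 0              # active entries plus dummies

    def _find(self, key):
        """Follow the probe chain; return (index, found)."""
        h = hash(key)
        mask = len(self._slots) - 1
        i = h & mask
        free = None
        while True:
            slot = self._slots[i]
            if slot is None:                    # virgin: key is absent
                return (i if free is None else free), False
            if slot is _DUMMY:
                if free is None:
                    free = i                    # reusable, but keep probing
            elif slot[0] == h and slot[1] == key:
                return i, True
            i = (i + 1) & mask

    def add(self, key):
        i, found = self._find(key)
        if found:
            return
        if self._slots[i] is None:
            self._filled += 1
        self._slots[i] = (hash(key), key)       # hash cached with the entry
        self._used += 1
        if self._filled * 3 >= len(self._slots) * 2:
            self._resize()                      # grow at 2/3 full

    def discard(self, key):
        i, found = self._find(key)
        if found:
            self._slots[i] = _DUMMY             # keep probe chains intact
            self._used -= 1

    def __contains__(self, key):
        return self._find(key)[1]

    def __len__(self):
        return self._used

    def _resize(self):
        live = [s for s in self._slots if s is not None and s is not _DUMMY]
        self._slots = [None] * (len(self._slots) * 2)
        self._used = self._filled = 0
        for _, key in live:
            self.add(key)
```

The dummy marker is what lets a lookup keep walking past a deleted entry; the chained-list alternative discussed next does away with it at the cost of extra allocations.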

Dummy slots can be avoided (and so also the need for runtime code to
distinguish them from active slots) by using a hash table of pointers to
linked lists-- or flex vectors, or linked lists of small vectors --instead,
and in most ways that leads to much simpler code (no more fiddling with
dummies, no more probe-sequence hassles, no more boosting the size before
the table is full).  But without fine control over the internals of malloc,
that takes even more memory in the end.

Interesting twist:  "a dict" *is* "a set", but a set of (key, value) pairs
further constrained so that no two elements have the same key.  So any set
implementation can be used as-is to implement a dict as a set of 2-tuples,
customizing the hash and "is equal" functions to look at just the tuples'
first elements.  That was the view taken by SETL in 1969, although their
"map" (dict) type was eventually optimized to get away from actually
constructing 2-tuples.  Indeed, SETL eventually grew an elaborate optional
type declaration sublanguage, allowing the user to influence many details of
its many internal set-storage schemes; e.g., from pg 399 of "Programming
With Sets:  An Introduction to SETL":

    For example, we can declare [I'm putting their keywords in UPPERCASE
    for, umm, clarity]

        successors: LOCAL MMAP(ELMT b) REMOTE SET(ELMT b);

    This declaration specifies that for each x in b the image set
    successors{x} is stored in the element block of x, and that this
    image set is always to be represented as a bit vector.  Similarly,
    the declaration

        successors: LOCAL MMAP(ELMT b) SPARSE SET(ELMT b);

    specifies that for each x in b the image set successors{x} is to
    be stored as a hash table containing pointers to elements of b.
    Note that the attribute LOCAL cannot be used for image sets of
    multivalued maps.  This follows from the remarks in section 10.4.3
    on the awkwardness of making local objects into subparts of
    composite objects.

Clear?  Snort.  Here are some citations lifted from the web for their
experience in trying to make these kinds of decisions by magic:

@article{dewar:79,
title="Programming by Refinement, as Exemplified by the {SETL}
Representation Sublanguage",
author="Robert B. K. Dewar and Arthur Grand and Ssu-Cheng Liu and
Jacob T. Schwartz and Edmond Schonberg",
journal=toplas,
year=1979,
month=jul,
volume=1,
number=1,
pages="27--49"
}

@article{schonberg:81,
title="An Automatic Technique for Selection of Data Structures in
{SETL} Programs",
author="Edmond Schonberg and Jacob T. Schwartz and Micha Sharir",
journal=toplas,
year=1981,
month=apr,
volume=3,
number=2,
pages="126--143"
}

@article{freudenberger:83,
title="Experience with the {SETL} Optimizer",
author="Stefan M. Freudenberger and Jacob T. Schwartz and Micha Sharir",
pages="26--45",
journal=toplas,
year=1983,
month=jan,
volume=5,
number=1
}

If someone wanted to take sets seriously today, a better approach would be
to define a minimal "set interface" ("abstract base class" in C++ terms),
then supply multiple implementations of that interface, letting the user
choose directly which implementation strategy they want for each of their
sets.  And people are doing just that in the C++ and Java worlds; e.g.,

http://developer.java.sun.com/developer/onlineTraining/
    collections/Collection.html#SetInterface

Curiously, the newer Java Collections Framework (covering multiple
implementations of list, set, and dict interfaces) gave up on thread-safety
by default, because it cost too much at runtime.  Just another thing to
argue about <wink>.
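
A rough Python rendering of that "one interface, several implementations" approach. Everything here is hypothetical (the names SetInterface, HashSet and SortedListSet, and the choice of methods); it only illustrates the user picking the storage strategy per set while shared code is written once against the interface.

```python
from abc import ABC, abstractmethod
import bisect


class SetInterface(ABC):
    """Minimal set interface; implementations choose the storage."""

    @abstractmethod
    def add(self, item): ...

    @abstractmethod
    def __contains__(self, item): ...

    @abstractmethod
    def __iter__(self): ...

    def union(self, other):
        # Written once against the interface, shared by all implementations.
        result = type(self)()
        for item in self:
            result.add(item)
        for item in other:
            result.add(item)
        return result


class HashSet(SetInterface):
    """Backed by a dict with dummy values: fast membership tests."""

    def __init__(self):
        self._d = {}

    def add(self, item):
        self._d[item] = 1

    def __contains__(self, item):
        return item in self._d

    def __iter__(self):
        return iter(self._d)


class SortedListSet(SetInterface):
    """Backed by a sorted list: compact and ordered, O(n) insertion."""

    def __init__(self):
        self._items = []

    def add(self, item):
        i = bisect.bisect_left(self._items, item)
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)

    def __contains__(self, item):
        i = bisect.bisect_left(self._items, item)
        return i < len(self._items) and self._items[i] == item

    def __iter__(self):
        return iter(self._items)


a = HashSet()
b = SortedListSet()
for n in (5, 1, 3):
    a.add(n)
for n in (4, 1, 2):
    b.add(n)
print(list(b.union(a)))      # result type follows the left operand
```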

we're-not-exactly-pioneers-here-ly y'rs  - tim



From fredrik@effbot.org  Wed Jan 24 08:29:30 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 09:29:30 +0100
Subject: [Python-Dev] getting rid of ucnhash
References: <013901c08566$d2a8f360$e46940d5@hagrid>  <200101240246.VAA06336@cj20424-a.reston1.va.home.com>
Message-ID: <019801c085df$c7ee0540$e46940d5@hagrid>

guido wrote:
> > It's probably just me, but the names of the two unicode
> > modules tend to irritate me:
> > 
> > > ls u*.pyd
> > ucnhash.pyd      unicodedata.pyd
> 
> To me, both of these are irrelevant details of the Unicode
> implementation. :-)   IOW, feel free to check it in.

Done.

Note that Include/ucnhash.h is still there; it declares the
"ucnhash_CAPI" structure used to access names from the
unicodeobject module.

(and all name-related tests are still kept in test_ucn)

I'll leave it to Tim to update the MSVC build files.

Cheers /F



From tim.one@home.com  Wed Jan 24 08:28:34 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:28:34 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>

[Guido]
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

I gotta say, politics aside, BROWSER is a screamingly natural answer to the
question "what comes next in this sequence?":

    PAGER, EDITOR, ...

Dear Lord, even *I* use a browser almost every week <wink>.

explicit-is-better-than-implicit-ly y'rs  - tim



From esr@thyrsus.com  Wed Jan 24 09:02:59 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:02:59 -0500
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>; from pf@artcom-gmbh.de on Wed, Jan 24, 2001 at 08:31:30AM +0100
References: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <20010124040259.A28086@thyrsus.com>

Peter Funk <pf@artcom-gmbh.de>:
> What about offering another immaterial reward to potential contributors
> instead?  What about "fame points"?  Anybody contributing something
> useful to Python receives a certain number of "fame points":  These
> fame points will be added and placed in front of the name of
> the contributor into the ACKS file and the file will be sorted
> accordingly turning the ACKS file effectively into some kind of
> "Python contribution high score" ...   ;-)
> 
> Just kidding, Peter

You may be joking, but as an observer of how gift cultures work I say this
isn't a bad idea.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"One of the ordinary modes, by which tyrants accomplish their purposes
without resistance, is, by disarming the people, and making it an
offense to keep arms."
        -- Constitutional scholar and Supreme Court Justice Joseph Story, 1840


From esr@thyrsus.com  Wed Jan 24 09:09:18 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:09:18 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:58:59PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid> <200101240258.VAA06479@cj20424-a.reston1.va.home.com>
Message-ID: <20010124040918.B28086@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> > let's just say that "in" is the same thing as "has_key",
> > and be done with it.
> 
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

I think we've just justified the time and energy that went into this 
discussion.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What is a magician but a practicing theorist?
	-- Obi-Wan Kenobi, 'Return of the Jedi'


From esr@thyrsus.com  Wed Jan 24 09:14:27 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:14:27 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:46:47PM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <20010124041427.D28086@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Oh, stop!  I'm not using Python as an argument for other people to adopt
the BROWSER convention.  The idea sells itself quite nicely by analogy to
EDITOR and PAGER the second people hear it.

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.  Talking just to the scripting language people
> seems like you're thinking too small.  There must be lots of C apps
> with the desire to invoke a browser.  Also Emacs, which has an
> extensive list of browser-url-* functions (you might even learn a few
> tricks from it about how to invoke various external browsers) but
> AFAIK no default browser selection.

All on my TO-DO list.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is proper to take alarm at the first experiment on our
liberties. We hold this prudent jealousy to be the first duty of
citizens and one of the noblest characteristics of the late
Revolution. The freemen of America did not wait till usurped power had
strengthened itself by exercise and entangled the question in
precedents. They saw all the consequences in the principle, and they
avoided the consequences by denying the principle. We revere this
lesson too much ... to forget it
	-- James Madison.


From esr@thyrsus.com  Wed Jan 24 09:16:12 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:16:12 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124041612.E28086@thyrsus.com>

Tim Peters <tim.one@home.com>:
> I gotta say, politics aside, BROWSER is a screamingly natural answer to the
> question "what comes next in this sequence?":
> 
>     PAGER, EDITOR, ...

That's exactly what I thought when I was struck by the obvious.  Everybody
I spread this meme to seems to agree.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 


From esr@thyrsus.com  Wed Jan 24 09:21:56 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:21:56 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Wed, Jan 24, 2001 at 04:15:39PM +0200
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>
Message-ID: <20010124042156.F28086@thyrsus.com>

Moshe Zadka <moshez@zadka.site.co.il>:
> I know Eric uses RH and mutt, so probably RH's urlview program (which
> mutt uses to jump to URLs) uses BROWSER. I was just about to submit
> a bug report to Debian that their urlview doesn't respect it.

Oh, *do* that!  Note: BROWSER may consist of a colon-separated series
of parts, browser commands to be tried in order (this is useful so you
can put an X browser first, then a console browser, and have the right
thing happen).  If a part contains %s, the URL is substituted there;
otherwise, the URL is concatenated to the command after a space.
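
The rule just described can be sketched as a small function. The name browser_commands and the sample value are hypothetical, and this only builds the candidate command lines in order; trying them, and quoting the URL for a shell, is left out.

```python
def browser_commands(browser_value, url):
    """Expand a BROWSER-style value into command lines, in the order given."""
    commands = []
    for part in browser_value.split(":"):
        part = part.strip()
        if not part:
            continue                      # ignore empty parts
        if "%s" in part:
            # The URL is substituted where %s appears.
            commands.append(part.replace("%s", url))
        else:
            # Otherwise the URL is appended after a space.
            commands.append(part + " " + url)
    return commands


# "xbrowser" is a made-up X browser command; lynx is the console fallback.
for command in browser_commands("xbrowser --url=%s:lynx",
                                "http://www.python.org/"):
    print(command)
```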
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Gun Control: The theory that a woman found dead in an alley, raped and
strangled with her panty hose, is somehow morally superior to a
woman explaining to police how her attacker got that fatal bullet wound.
	-- L. Neil Smith


From tim.one@home.com  Wed Jan 24 09:24:26 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 04:24:26 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENIIKAA.tim.one@home.com>

[Greg Ewing]
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

[Guido]
> Not a bad idea!

In theory, but if Vladimir were here he'd bust a gut over the possibly bad
cache effects on "real dicts" (by keeping everything together, simply
accessing the cached hash code brings both the key and value pointers into
L1 cache too).  We would need to quantify the effect of breaking that
connection.

> (But shouldn't the default value be something else,
> like none?)

Bleech.  I hate the idiom of using a false value to mean "present".

    d = {}
    for x in seq:
        d[x] = 1

runs faster too (None needs a LOAD_GLOBAL now).



From tim.one@home.com  Wed Jan 24 10:01:36 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:01:36 -0500
Subject: [Python-Dev] test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>

> python  ../lib/test/regrtest.py test___all__

test___all__
test test___all__ crashed -- exceptions.AttributeError:
     'locale' module has no attribute 'LC_MESSAGES'

And indeed it does not:

> python
Python 2.1a1 (#9, Jan 24 2001, 04:40:55) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import locale
>>> dir(locale)
['CHAR_MAX', 'Error', 'LC_ALL', 'LC_COLLATE', 'LC_CTYPE',
 'LC_MONETARY', 'LC_NUMERIC', 'LC_TIME', '__all__', '__builtins__',
 '__doc__', '__file__', '__name__', '_build_localename', '_group',
 '_parse_localename', '_print_locale', '_setlocale', '_test', 'atof',
 'atoi', 'encoding_alias', 'format', 'getdefaultlocale', 'getlocale',
 'locale_alias', 'localeconv', 'normalize', 'resetlocale', 'setlocale',
 'str', 'strcoll', 'string', 'strxfrm', 'sys', 'windows_locale']
>>>

Nor is LC_MESSAGES std C (the other LC_XXX guys are).

I pin the blame on

    from _locale import *

in locale.py -- who knows what that's supposed to export?  Certainly not
Skip <wink>.
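
One way code can cope with a name that the star-import only supplies on some platforms is to probe for it. A minimal sketch, assuming a present-day Python 3; it is not the hack that was checked in, only an illustration of treating LC_MESSAGES as optional.

```python
import locale

# Categories defined by standard C.
categories = ["LC_ALL", "LC_COLLATE", "LC_CTYPE",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME"]

# LC_MESSAGES is not standard C; list it only if this build exports it.
if hasattr(locale, "LC_MESSAGES"):
    categories.append("LC_MESSAGES")

missing = [name for name in categories if not hasattr(locale, name)]
print(categories)
print("missing:", missing)
```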



From tim.one@home.com  Wed Jan 24 10:17:47 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:17:47 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>

Never mind; checked in a hack to stop the error on Windows.



From mal@lemburg.com  Wed Jan 24 13:00:28 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 14:00:28 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid>
Message-ID: <3A6ED1EC.237B5B1D@lemburg.com>

Fredrik Lundh wrote:
> 
> > It's come up before. The problem with it is that it's not quite obvious
> > whether it is 'if key in dict' or 'if value in dict'.
> 
> you forgot "if (key, value) in dict"
> 
> on the other hand, it's not quite obvious that "list.sort"
> doesn't return the sorted list, "print >>None" prints to
> standard output, "except KeyError, ValueError" doesn't
> catch a ValueError exception, etc, etc, etc.
> 
> (nor that it's "has_key" and "hasattr", and not "has_key"
> and "has_attr" or "haskey" and "hasattr" ;-)
> 
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

+1 all the way :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 24 14:01:33 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:01:33 +0100
Subject: [Python-Dev] Interfaces (Is X a (sequence|mapping)?)
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>
 <3A6D4B9F.38B17046@lemburg.com> <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <3A6EE03D.4D5DFD17@lemburg.com>

Guido van Rossum wrote:
> 
> > Polymorphic code will usually get you more out of an
> > algorithm, than type-safe or interface-safe code.
> 
> Right.
> 
> But there are times when people want to write methods that take
> e.g. either a sequence or a mapping, and need to distinguish between
> the two.  That's not easy in Python!  Java and C++ support it very
> well though, and thus we'll always keep seeing this kind of
> complaint.  Not sure what to do, except to recommend "find out which
> methods you expect in one case but not in the other (e.g. keys()) and
> do a hasattr() test for that."

Perhaps we should provide simple means for testing a set of
available methods and slots ?!

E.g. hasinterface(obj, ('keys', 'items', '__len__'))

Objects could provide an __interface__ special attribute for this
purpose (since not all slots can be auto-detected and -verified
without side-effects).
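
A minimal sketch of what such a helper might look like. hasinterface is not an existing builtin, and the way __interface__ is consulted here (a container of names the object declares) is an assumption made for illustration.

```python
def hasinterface(obj, names):
    """Return True if obj provides every attribute named in names."""
    declared = getattr(obj, "__interface__", None)
    if declared is not None:
        # Trust the object's own declaration, with no probing side effects.
        return all(name in declared for name in names)
    # Otherwise fall back to a plain attribute check.
    return all(hasattr(obj, name) for name in names)


mapping_methods = ("keys", "items", "__len__")
print(hasinterface({}, mapping_methods))
print(hasinterface([], mapping_methods))
```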

> > BTW, there are Python interfaces to PySequence_Check() and
> > PyMapping_Check() buried in the builtin operator module in case
> > you really do care ;) ...
> >
> >       operator.isSequenceType()
> >       operator.isMappingType()
> >       + some other C style _Check() APIs
> >
> > These only look at the type slots though, so Python instances
> > will appear to support everything but when used fail with
> > an exception if they don't provide the proper __xxx__ hooks.
> 
> Yes, these should probably be deprecated.  I certainly have never used
> them!  (The operator module doesn't seem to get much use in
> general...  Was it a bad idea?)

Some of these are nice to have and provide some good performance
boost (e.g. the numeric slot access APIs). The type slot checking 
APIs are not too useful though.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jim@digicool.com  Wed Jan 24 09:05:44 2001
From: jim@digicool.com (Jim Fulton)
Date: Wed, 24 Jan 2001 04:05:44 -0500
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org>
Message-ID: <3A6E9AE8.6C2D3CF0@digicool.com>

Christopher Petrilli wrote:
> 
> Neil Schemenauer [nas@arctrix.com] wrote:
> > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > Unfortunately, for me, a Python implementation of Sets is only
> > > > interesting academically.  Any time I've needed to work with them at a
> > > large scale, I've needed them *much* faster than Python could achieve
> > > without a C extension.
> >
> > I think this argues that if sets are added to the core they
> > should be implemented as an extension type with the speed of
> > > dictionaries and the memory usage of lists.  Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This is effectively the implementation that Zope has for Sets. 

Except we use sorted collections with binary search for sets.

I think that a simple hash-based set would make a lot of sense.

> In
> addition we have "buckets" that have scores on them (which are
> implemented as a modified BTree).
> 
> Unfortunately Jim Fulton (who wrote all the code for that level) is in
> a meeting, but I hope he'll comment on the implementation that was
> chosen for our software.

We have a number of special needs:

  - Scalability is critical. We make some special optimizations,
    like sets of integers and mapping objects with integer keys
    and values. In these cases, data are stored using C int arrays, 
    allowing very efficient data storage and manipulation, especially
    when using integer keys.

  - We need to spread data over multiple database records. Our data
    structures may be hundreds of megabytes in size. We have ZODB-aware
    structures that use multiple independently stored database objects.

  - Range searches are very common, and under some circumstances,
    sorted collections and BTrees can have very little overhead
    compared to dictionaries. For this reason, our mapping objects
    and sets have been based on BTrees and sorted collections.
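
A minimal sketch of the sorted-collection idea above, in plain Python
rather than C. This is not the Zope code: the class name SortedIntSet
and its methods are made up for illustration, and it only assumes the
standard bisect module. It shows why a sorted collection makes range
searches cheap: both ends of the range are found by binary search.

```python
import bisect


class SortedIntSet:
    """Set of integers kept as a sorted list (illustrative only)."""

    def __init__(self, items=()):
        # Deduplicate and sort once up front.
        self._items = sorted(set(items))

    def add(self, value):
        # Binary search for the insertion point; skip duplicates.
        i = bisect.bisect_left(self._items, value)
        if i == len(self._items) or self._items[i] != value:
            self._items.insert(i, value)

    def __contains__(self, value):
        i = bisect.bisect_left(self._items, value)
        return i < len(self._items) and self._items[i] == value

    def range(self, low, high):
        """Return the members m with low <= m < high, in order."""
        lo = bisect.bisect_left(self._items, low)
        hi = bisect.bisect_left(self._items, high)
        return self._items[lo:hi]
```

A hash-based set answers membership at least as fast, but it has no
cheap equivalent of range() because its members are not kept in order.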

Unfortunately, our current BTree implementation has a flaw that
causes an excessive number of objects to be updated when items are
added and removed. (Each BTree internal node keeps track of the number
of objects contained in it.)  Also, our current sets are limited
to integers and cannot be spread over multiple database records.

We are completing a new BTree implementation that overcomes these 
limitations.  In this implementation, we will provide sets as
value-less BTrees.

Jim

--
Jim Fulton           mailto:jim@digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org


From gvwilson@nevex.com  Wed Jan 24 14:10:41 2001
From: gvwilson@nevex.com (Greg Wilson)
Date: Wed, 24 Jan 2001 09:10:41 -0500
Subject: [Python-Dev] re: sets
In-Reply-To: <20010124032401.EB329F199@mail.python.org>
Message-ID: <000301c0860f$6fa29010$770a0a0a@nevex.com>

1. I did a poll overnight by email of 22 friends and colleagues,
none of whom are regular Python users (yet).  My question was,

   "Would you expect the interface of a set class to be like
    the interface of a vector or list, or like the interface
    of a map or hash?"

15 people have replied; all 15 have said, "map or hash".
Several respondents are Perl hackers, so I'm sure the answer
is influenced by previous exposure to the set-as-valueless-hash
idiom.  Still, I think 15-0 is a pretty convincing score...

Four, unprompted, said that they thought the STL's hierarchy of
containers was as good as it gets, and that other languages
should mirror it.  (One of those added that this makes teaching
much simpler --- students can transfer instincts from one language
to another.)

2. Is there enough interest in sets for a BOF at IPC9?  Please
reply to me point-to-point if you're interested; I'll summarize
and post the result.  I volunteer to bring the donuts...

> > Ka-Ping Yee:
> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> > various:
> > [but what about 'value in dict' or '(key, value) in dict'?]

> Fredrik Lundh:
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

> Guido van Rossum:
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

Greg Wilson:
Woo hoo!  Now, on a related note, what is the status of the 'indices()'
proposal, as in:

    for i in indices(someList):

instead of:

    for i in range(len(someList)):

Would 'indices(dict)' be the same as 'dict.keys()', to allow
uniform iteration?  Or would it be more economical to introduce
a 'keys()' method on lists and tuples, so that:

    for i in collection.keys():

would work on dicts, lists, and tuples?  I know that 'keys()'
is the wrong name for lists and tuples, but dicts are already
using it, and it's completely unambiguous...
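
One way to read the first option, as a rough sketch only: indices()
here is the proposed function, not an existing built-in, and treating
anything with a keys() method as a mapping is an assumption made for
this example.

```python
def indices(collection):
    """Return the valid subscripts of a list, tuple, or dict.

    Hypothetical helper illustrating the proposal; not a built-in.
    """
    if hasattr(collection, "keys"):
        # Mappings: the subscripts are the keys.
        return list(collection.keys())
    # Sequences: the subscripts are 0 .. len-1.
    return list(range(len(collection)))
```

With this, "for i in indices(x): print(x[i])" reads the same whether x
is a list, a tuple, or a dict.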

Thanks,
Greg


From mal@lemburg.com  Wed Jan 24 14:46:10 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:46:10 +0100
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org> <3A6E9AE8.6C2D3CF0@digicool.com>
Message-ID: <3A6EEAB2.5E6A4E83@lemburg.com>

Jim Fulton wrote:
> 
> Christopher Petrilli wrote:
> >
> > Neil Schemenauer [nas@arctrix.com] wrote:
> > > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > > Unfortunately, for me, a Python implementation of Sets is only
> > > > interesting academically.  Any time I've needed to work with them at a
> > > > large scale, I've needed them *much* faster than Python could achieve
> > > > without a C extension.
> > >
> > > I think this argues that if sets are added to the core they
> > > should be implemented as an extension type with the speed of
> > > dictionaries and the memory usage of lists.  Basically, we would
> > > use the implementation of PyDict but drop the values.
> >
> > This is effectively the implementation that Zope has for Sets.
> 
> Except we use sorted collections with binary search for sets.
> 
> I think that a simple hash-based set would make a lot of sense.
> 
> > In
> > addition we have "buckets" that have scores on them (which are
> > implemented as a modified BTree).
> >
> > Unfortunately Jim Fulton (who wrote all the code for that level) is in
> > a meeting, but I hope he'll comment on the implementation that was
> > chosen for our software.
> 
> We have a number of special needs:
> 
>   - Scalability is critical. We make some special optimizations,
>     like sets of integers and mapping objects with integer keys
>     and values. In these cases, data are stored using C int arrays,
>     allowing very efficient data storage and manipulation, especially
>     when using integer keys.
> 
>   - We need to spread data over multiple database records. Our data
>     structures may be hundreds of megabytes in size. We have ZODB-aware
>     structures that use multiple independently stored database objects.
> 
>   - Range searches are very common, and under some circumstances,
>     sorted collections and BTrees can have very little overhead
>     compared to dictionaries. For this reason, our mapping objects
>     and sets have been based on BTrees and sorted collections.
> 
> Unfortunately, our current BTree implementation has a flaw that
> causes an excessive number of objects to be updated when items are
> added and removed. (Each BTree internal node keeps track of the number
> of objects contained in it.)  Also, our current sets are limited
> to integers and cannot be spread over multiple database records.
> 
> We are completing a new BTree implementation that overcomes these
> limitations.  In this implementation, we will provide sets as
> value-less BTrees.

You may want to check out a soon-to-be-released new mx
package: mxBeeBase. This is an on-disk B+tree implementation
which supports data files up to 2GB on 32-bit platforms.

Here's a preview:

	http://www.lemburg.com/python/mxBeeBase.html

(The links on that page are not functional.)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From skip@mojam.com (Skip Montanaro)  Wed Jan 24 14:42:23 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:42:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <14958.59855.4855.52638@beluga.mojam.com>

    Tim> Nor is LC_MESSAGES std C (the other LC_XXX guys are).

    Tim> I pin the blame on

    Tim>     from _locale import *

    Tim> in locale.py -- who knows what that's supposed to export?
    Tim> Certainly not Skip <wink>.

Was that a roundabout way of complimenting me for having found a bug? ;-)

Skip





From skip@mojam.com (Skip Montanaro)  Wed Jan 24 14:50:02 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:50:02 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
 <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
Message-ID: <14958.60314.482226.825611@beluga.mojam.com>

    Tim> Nevermind; checked in a hack to stop the error on Windows.

Probably should file a bug report (if you haven't already) so the root
problem isn't forgotten because the hack obscures it.  I see this code in
localemodule.c:

    #ifdef LC_MESSAGES
	x = PyInt_FromLong(LC_MESSAGES);
	PyDict_SetItemString(d, "LC_MESSAGES", x);
	Py_XDECREF(x);
    #endif /* LC_MESSAGES */

Martin, looks like this module is your baby.  Care to hazard a guess about
whether LC_MESSAGES should always or never be there?

Skip



From fredrik@effbot.org  Wed Jan 24 15:11:33 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 16:11:33 +0100
Subject: [Python-Dev] test___all__ failing; Windows
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <04de01c08617$f56216f0$e46940d5@hagrid>

Skip wrote:

> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
>       x = PyInt_FromLong(LC_MESSAGES);
>       PyDict_SetItemString(d, "LC_MESSAGES", x);
>       Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

I think the correct answer is "sometimes":

    ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    LC_MONETARY, LC_NUMERIC, and LC_TIME

    Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
    LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    LC_TIME

in other words, if it's supported, it should be exposed by
the Python bindings.
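
Seen from the Python side, "sometimes" can be handled by asking the
module at run time. A small sketch: the helper name exposed_categories
and the STANDARD_C list are made up for this example, and the only
thing relied on is that the locale module defines a given LC_* name
only where the platform supports it.

```python
import locale

# Categories every ANSI C platform must provide.
STANDARD_C = ["LC_ALL", "LC_COLLATE", "LC_CTYPE",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME"]


def exposed_categories(module=locale):
    """List the LC_* names that the given module actually defines."""
    names = STANDARD_C + ["LC_MESSAGES"]  # LC_MESSAGES is optional
    return [name for name in names if hasattr(module, name)]
```

On a platform without LC_MESSAGES the returned list simply omits it,
which is the behaviour one would want for an export list as well.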

Cheers /F



From tismer@tismer.com  Wed Jan 24 14:40:04 2001
From: tismer@tismer.com (Christian Tismer)
Date: Wed, 24 Jan 2001 16:40:04 +0200
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <3A6EE944.C8CC6EF7@tismer.com>


Greg Ewing wrote:
> 
> Neil Schemenauer <nas@arctrix.com>:
> 
> > Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Very good idea. It also fits my view of how dicts should be
implemented: Keep keys and values apart, since this information
has different access patterns.
I think (or at least hope) that dictionaries become faster
when hashes, keys and values are in separate areas, giving more
cache hits. Not sure if hashes and keys should be apart, but
sure for values.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@digicool.com  Wed Jan 24 15:37:03 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:37:03 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:28:21 CST."
 <14958.30213.325584.373062@beluga.mojam.com>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net> <14957.52952.48739.53360@beluga.mojam.com> <200101240311.WAA06582@cj20424-a.reston1.va.home.com>
 <14958.30213.325584.373062@beluga.mojam.com>
Message-ID: <200101241537.KAA27039@cj20424-a.reston1.va.home.com>

>     Guido> I think I saw a complaint about this that specifically said that
>     Guido> when dbhash is imported when bsddb can't be imported, an
>     Guido> incomplete dbhash is left behind in sys.modules, and then a
>     Guido> second import of dbhash will succeed -- but of course it will
>     Guido> define no objects.
> 
> So it does:
> 
>     % ./python
>     Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
>     [GCC 2.95.3 19991030 (prerelease)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> import dbhash
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
> 	import bsddb
>     ImportError: No module named bsddb
>     >>> import dbhash
>     >>>
> 
> Can that be construed as a bug?  If import fails, shouldn't the stub module
> that was inserted in sys.modules be removed?

Yep, but not a very important bug -- typically this isn't caught.
Feel free to check in a change; I think you should be able to insert
something like

    import sys
    try:
        import bsddb
    except ImportError:
        del sys.modules[__name__]
        raise

into dbhash.
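
The pattern can be tried without touching dbhash itself. The sketch
below is only an illustration: the module name fragile_demo and the
missing dependency no_such_backend_hypothetical are invented, and it
uses sys.modules.pop() instead of del so it is harmless if the entry is
already gone. As far as I know, later Python releases drop a module
whose import failed from sys.modules on their own, so there the guard
is redundant rather than wrong.

```python
import importlib
import os
import sys
import tempfile

# Source of a throwaway module that guards its own failed import.
MODULE_SOURCE = '''\
import sys
try:
    import no_such_backend_hypothetical
except ImportError:
    # Remove the half-initialised stub so a second import fails too.
    sys.modules.pop(__name__, None)
    raise
'''


def import_fails_cleanly(name="fragile_demo"):
    """Import the throwaway module; report whether no stub was left."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, name + ".py"), "w") as fh:
            fh.write(MODULE_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            try:
                importlib.import_module(name)
            except ImportError:
                return name not in sys.modules
            return False  # the import unexpectedly succeeded
        finally:
            sys.path.remove(tmp)
```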

If this works for you in testing, forget the patch manager, just check
it in.  (I'm too busy to do much myself, the company needs me. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From pf@artcom-gmbh.de  Wed Jan 24 15:32:55 2001
From: pf@artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 16:32:55 +0100 (MET)
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <14958.60314.482226.825611@beluga.mojam.com> from Skip Montanaro at "Jan 24, 2001  8:50: 2 am"
Message-ID: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Skip Montanaro:
> 
>     Tim> Nevermind; checked in a hack to stop the error on Windows.
> 
> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
> 	x = PyInt_FromLong(LC_MESSAGES);
> 	PyDict_SetItemString(d, "LC_MESSAGES", x);
> 	Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

As far as I could find out, LC_MESSAGES was added to the POSIX "standard"
in POSIX.2.  Systems that are not POSIX.2-compatible probably lack the
proper functionality behind 'setlocale()'.  So the best solution would be
to add a clever emulation/approximation of this feature if the underlying
platform (here Windows) doesn't provide it.  This would require wrapping
'setlocale()'.  But I'm not sure how to emulate, for example,
'setlocale(LC_MESSAGES, 'DE_de')' on a Windows box.  Maybe it is
impossible to achieve.

What I would love to see is that the typical query
'setlocale(LC_MESSAGES)' would return 'DE_de' on a box running, for example,
the German version of Windows or MacOS.  This would eliminate the need for
ugly language selection menus on these platforms in a portable fashion.

Regards, Peter



From guido@digicool.com  Wed Jan 24 15:41:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:41:07 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 16:07:08 +0200."
 <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
 <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
Message-ID: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>

> > This could be incorporated into PyDict. Instead of storing keys and
> > values in the same array, keep them in separate arrays and only
> > allocate the values array the first time someone stores a value other
> > than 1.
> 
> Cool idea, but even cooler (would catch more idioms, that is) is
> "the first time someone stores something not 'is'  something in the
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> dict, allocate the values array". This would catch small numbers,
> None and identifier-looking strings, for the measly cost of one
> pointer/dict object.

Sorry, but I don't understand what you mean by the ^^^ marked phrase.
Can you please elaborate?

Regarding storing one for "present", that's all well and fine, but it
suggests to me that storing a false value could mean "not present".
Do we really want that?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Thu Jan 25 00:50:13 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:50:13 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
 <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010125005013.58C12A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 10:41:07 -0500, Guido van Rossum <guido@digicool.com> wrote:

> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> > dict, allocate the values array". This would catch small numbers,
> > None and identifier-looking strings, for the measly cost of one
> > pointer/dict object.
> 
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I should really stop writing incomprehensible bits like that. Heck,
I can't even understand it on second reading.

I meant that the dictionary would keep a slot for "the one and only
value". First time someone puts a value in the dict, it puts it
in the "one and only value" slot, and doesn't initialize the value
array. The second time someone puts a value, it checks for pointer
equality with that "one and only value". If it is the same,
it still doesn't initialize the value array. The only time when
the dictionary initializes the value array is when two pointer-different
values are put in.

This would let me code

a[key] = None

For my sets (but consistent in the same set!)

a[key] = 1

When the timbot codes (again, consistent in the same set)

and

a[key] = 'present'

If you're really weird.

(identifier-like strings get interned)

That's not *semantics*, that's *optimization* for a commonly
used (I think) idiom with dictionaries -- you can't predict
the value, but it will probably remain the same.
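
To make the idea concrete, here is a small Python model of it. It is a
sketch only: the real change would live in the C dict implementation,
and the names SharedValueDict, _UNSET and has_value_array are invented
for this example. A Python set stands in for the key table and an
ordinary dict for the lazily allocated value array.

```python
_UNSET = object()  # marks the "one and only value" slot as unused


class SharedValueDict:
    """Mapping that stores values only once two different ones appear."""

    def __init__(self):
        self._keys = set()      # stands in for the hash table of keys
        self._shared = _UNSET   # the "one and only value" slot
        self._values = None     # per-key values, allocated lazily

    def __setitem__(self, key, value):
        if self._values is None:
            if self._shared is _UNSET or value is self._shared:
                # Same object as before: nothing extra to store.
                self._shared = value
            else:
                # Second, pointer-different value: allocate the array.
                self._values = {k: self._shared for k in self._keys}
        if self._values is not None:
            self._values[key] = value
        self._keys.add(key)

    def __getitem__(self, key):
        if key not in self._keys:
            raise KeyError(key)
        return self._shared if self._values is None else self._values[key]

    def has_value_array(self):
        return self._values is not None
```

The comparison uses "is", not "==", which is why the trick relies on
None, small integers and interned strings being shared objects.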

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From skip@mojam.com (Skip Montanaro)  Wed Jan 24 16:44:17 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:44:17 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
 <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
 <14958.60314.482226.825611@beluga.mojam.com>
 <04de01c08617$f56216f0$e46940d5@hagrid>
Message-ID: <14959.1633.163407.779930@beluga.mojam.com>

    Fredrik> I think the correct answer is "sometimes":

    Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME

    Fredrik>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
    Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    Fredrik>     LC_TIME

    Fredrik> in other words, if it's supported, it should be exposed by
    Fredrik> the Python bindings.

Then this suggests that either Tim's hack is the correct fix (leave it out
because we can't rely on it always being there) or I should add it to
__all__ at the bottom of the file if and only if it's present in the
module's namespace.

Skip





From moshez@zadka.site.co.il  Thu Jan 25 00:57:22 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:57:22 +0200 (IST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <04de01c08617$f56216f0$e46940d5@hagrid>, <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 16:11:33 +0100, "Fredrik Lundh" <fredrik@effbot.org> wrote:

> I think the correct answer is "sometimes":
> 
>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     LC_TIME
> 
> in other words, if it's supported, it should be exposed by
> the Python bindings.

In that case, the __all__ attribute in the module has to be calculated
dynamically. Say, adding code like

try:
    LC_MESSAGES
except NameError:
    pass
else:
    __all__.append('LC_MESSAGES')

Ditto for anything else.

Should I check in a patch?
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From trentm@ActiveState.com  Wed Jan 24 16:49:17 2001
From: trentm@ActiveState.com (Trent Mick)
Date: Wed, 24 Jan 2001 08:49:17 -0800
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124084917.C29977@ActiveState.com>

How will the expected adherence of apps to BROWSER jibe with the current (and
poorly understood by me) Windows convention of specifying the "default"
browser somewhere in the registry? 

Trent


-- 
Trent Mick
TrentM@ActiveState.com


From skip@mojam.com (Skip Montanaro)  Wed Jan 24 16:49:23 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:49:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>
References: <04de01c08617$f56216f0$e46940d5@hagrid>
 <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
 <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
 <14958.60314.482226.825611@beluga.mojam.com>
 <20010125005722.D2229A840@darjeeling.zadka.site.co.il>
Message-ID: <14959.1939.398029.896891@beluga.mojam.com>

    Moshe> In that case, the __all__ attribute in the module has to be
    Moshe> calculated dynamically. Say, adding code like

No need.  I've already got this exact change in my local copy and I'll be
adding a few more __all__ lists later today.

Skip


From paulp@ActiveState.com  Wed Jan 24 16:56:26 2001
From: paulp@ActiveState.com (Paul Prescod)
Date: Wed, 24 Jan 2001 08:56:26 -0800
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
 <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
Message-ID: <3A6F093A.A311C71E@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I wasn't clear about that either. The idea is:

def add(self, new_value):
    if not self.values_array:
        # NULL means the slot has never been filled (C-level sentinel)
        if self.magic_value is NULL:
            self.magic_value = new_value
        elif new_value is not self.magic_value:
            self.values_array=[self.magic_value, new_value, ... ]
        else:
            # new_value is self.magic_value: do nothing
            pass

I am neutral on this proposal myself. I think that even if we optimize
any code where you pass the same thing over and over again, we should
document a convention for consistency. So I'm not sure there is much
advantage.

 Paul Prescod


From esr@thyrsus.com  Wed Jan 24 16:53:31 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 11:53:31 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>; from trentm@ActiveState.com on Wed, Jan 24, 2001 at 08:49:17AM -0800
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com>
Message-ID: <20010124115331.A15059@thyrsus.com>

Trent Mick <trentm@ActiveState.com>:
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

BROWSER overrides the registry setting.  Which is OK; under Windows, only
wizards are going to muck with it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke


From guido@digicool.com  Wed Jan 24 16:59:00 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 11:59:00 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: Your message of "Wed, 24 Jan 2001 10:44:17 CST."
 <14959.1633.163407.779930@beluga.mojam.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com> <04de01c08617$f56216f0$e46940d5@hagrid>
 <14959.1633.163407.779930@beluga.mojam.com>
Message-ID: <200101241659.LAA27650@cj20424-a.reston1.va.home.com>

>     Fredrik> I think the correct answer is "sometimes":
> 
>     Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Fredrik>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     Fredrik>     LC_TIME
> 
>     Fredrik> in other words, if it's supported, it should be exposed by
>     Fredrik> the Python bindings.
> 
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

The latter.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Thu Jan 25 17:05:44 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 19:05:44 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm@ActiveState.com> wrote:
 
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

The "webbrowser" module should prefer to take the setting from the
registry on windows.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From guido@digicool.com  Wed Jan 24 17:17:09 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 12:17:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Thu, 25 Jan 2001 02:50:13 +0200."
 <20010125005013.58C12A840@darjeeling.zadka.site.co.il>
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
 <20010125005013.58C12A840@darjeeling.zadka.site.co.il>
Message-ID: <200101241717.MAA27852@cj20424-a.reston1.va.home.com>

> I meant that the dictionary would keep a slot for "the one and only
> value". First time someone puts a value in the dict, it puts it
> in the "one and only value" slot, and doesn't initialize the value
> array. The second time someone puts a value, it checks for pointer
> equality with that "one and only value". If it is the same,
> it still doesn't initialize the value array. The only time when
> the dictionary initializes the value array is when two pointer-different
> values are put in.
> 
> This would let me code
> 
> a[key] = None
> 
> For my sets (but consistent in the same set!)
> 
> a[key] = 1
> 
> When the timbot codes (again, consistent in the same set)
> 
> and
> 
> a[key] = 'present'
> 
> If you're really weird.
> 
> (identifier-like strings get interned)
> 
> That's not *semantics*, that's *optimization* for a commonly
> used (I think) idiom with dictionaries -- you can't predict
> the value, but it will probably remain the same.

This I like!

But note that a dict currently uses 12 bytes per slot in the hash
table (on a 32-bit platform: long me_hash; PyObject *me_key,
*me_value).  The hash table's fill factor is typically between 50 and
67%.

I think removing the hashes would slow down lookups too much, so
optimizing identical values out would only save 6-8 bytes per existing
key on average.  Not clear if it's worth enough.  I think I have to
agree with Tim's expectation that two (or three) separate parallel
arrays will reduce the cache locality and thus slow things down.  Once
you start probing, you jump through the hashtable at large random
strides, causing bad cache performance (for largeish hash tables); but
since often enough the first slot tried is right, you have the hash,
key and value right next together, typically on the same cache line.
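
The 6-8 byte figure can be checked with a few lines of arithmetic. This
is my reading of the numbers above, not a measurement: it assumes that
only the 4-byte value pointer is dropped from each 12-byte slot, and
that the cost per key is the cost per slot divided by the fill factor.
The function name is made up.

```python
POINTER_BYTES = 4   # one PyObject * on a 32-bit platform
SLOT_BYTES = 12     # me_hash + me_key + me_value


def bytes_saved_per_key(fill_factor, bytes_dropped_per_slot=POINTER_BYTES):
    """Bytes saved per stored key when each slot shrinks by the given amount."""
    # Every stored key pays for 1/fill_factor slots on average.
    return bytes_dropped_per_slot / fill_factor


# At the two ends of the typical fill-factor range:
low_fill = bytes_saved_per_key(0.5)       # table half full
high_fill = bytes_saved_per_key(2 / 3)    # table two-thirds full
```

By the same reasoning a key currently costs SLOT_BYTES / fill_factor
bytes of table space, so the saving is one third of that at best.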

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Wed Jan 24 17:31:55 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:31:55 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124123155.A15203@thyrsus.com>

Moshe Zadka <moshez@zadka.site.co.il>:
> > How will the expected adherence of apps to BROWSER jive with the
> > current (and poorly understood by me) Windows convention of
> > specifying the "default" browser somewhere in the registry?
> 
> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Um, that's not the way it works right now. The windows-default browser choice 
launches the registered default browser, but BROWSER may have something else
in its search list first.
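
Roughly, the lookup order works like this. The sketch is a simplified
stand-in, not the module's actual code: the function name
browser_candidates is invented, and it assumes BROWSER holds a list of
entries separated by os.pathsep, tried in order before the platform
default.

```python
import os


def browser_candidates(environ=None, fallback="windows-default"):
    """Return browser choices in the order they would be tried."""
    environ = os.environ if environ is None else environ
    raw = environ.get("BROWSER", "")
    # Entries named in BROWSER come first, in the order given.
    candidates = [entry for entry in raw.split(os.pathsep) if entry]
    # The registered default is tried only after those.
    if fallback not in candidates:
        candidates.append(fallback)
    return candidates
```

So with BROWSER unset, the registered default is the only candidate;
with BROWSER set, the default is reached only if the earlier entries
cannot be launched.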
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The real point of audits is to instill fear, not to extract revenue;
the IRS aims at winning through intimidation and (thereby) getting
maximum voluntary compliance
	-- Paul Strassel, former IRS Headquarters Agent Wall St. Journal 1980


From esr@thyrsus.com  Wed Jan 24 17:52:11 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:52:11 -0500
Subject: [Python-Dev] BROWSER status
Message-ID: <20010124125211.A15276@thyrsus.com>

I spent the morning writing and testing patches to make urlview and GNU Emacs
BROWSER-aware, and have sent them off to the relevant maintainers.  I've
also sent a patch to Andries Brouwer for the environ(5) man page.  

Those of you interested in my latest bit of social engineering can
take a look at

	http://www.tuxedo.org/~esr/BROWSER/

A bow in Guido's direction -- if he hadn't been grouchy about this I
probably wouldn't have gotten to shipping those patches for a while.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A right is not what someone gives you; it's what no one can take from you. 
	-- Ramsey Clark


From thomas@xs4all.net  Wed Jan 24 18:33:27 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 19:33:27 +0100
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124193326.B962@xs4all.nl>

On Thu, Jan 25, 2001 at 07:05:44PM +0200, Moshe Zadka wrote:
> On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm@ActiveState.com> wrote:
>  
> > How will the expected adherence of apps to BROWSER jibe with the current (and
> > poorly understood by me) Windows convention of specifying the "default"
> > browser somewhere in the registry? 

> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Why ? That's a lot harder to change, and not settable per
'shell'/'thread'/'process'.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Wed Jan 24 19:54:47 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:54:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124115331.A15059@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPDIKAA.tim.one@home.com>

Guys, while I like BROWSER, don't think it has anything to do with Windows!
Windows is not Unix; doesn't have PAGER or EDITOR either; and, in general,
use of envars is an abomination under Windows.  The old webbrowser.py uses
the Windows-specific os.startfile(url) because that's the *right* way to do
it on Windows, wizard or not.  And you would have to be a Windows wizard to
succeed in launching a browser under Windows in any other way anyway.  You
may as well try to sell the notion that, on Unix, Python should maintain a
dict mapping file extensions to the user's preferred ways of opening such
files <0.9 wink>.



From tim.one@home.com  Wed Jan 24 19:56:32 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:56:32 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124193326.B962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPEIKAA.tim.one@home.com>

>> The "webbrowser" module should prefer to take the setting from the
>> registry on windows.

> Why ? That's a lot harder to change, and not settable per
> 'shell'/'thread'/'process'.

A Windows user has a legitimate expectation that *every* time an .html file
is opened, it will come up in their browser of choice.  That choice is made
via the registry, and this is how *all* apps work under Windows.  Ditto for
.htm files (and that may be a different browser than is used for .html
files, but again the user has set up their registry to do what *they* want
done with it).  It's not supposed to be easy to change; it is supposed to be
consistent.  Using a different browser per shell/thread/process is a foreign
concept; it's also a useless concept on Windows <0.5 wink>.



From tim.one@home.com  Wed Jan 24 20:32:35 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:32:35 -0500
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPGIKAA.tim.one@home.com>

[Peter Funk]
> ...
> AFAI found out, LC_MESSAGES was added to the POSIX "standard" in Posix.2.

FYI, it appears that C99 declined to adopt this extension to C89, but don't
know why (the C99 Rationale doesn't mention it).  That means the vendors who
don't already support it can (well, *will*) use the new C99 std as "a
reason" to continue leaving it out.



From tim.one@home.com  Wed Jan 24 20:15:28 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:15:28 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <14959.1633.163407.779930@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPFIKAA.tim.one@home.com>

[Skip]
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

What you suggest at the end *is* the hack I checked in.  That is, it's
already done.  The existence of LC_MESSAGES is clearly platform-specific; if
anyone can say for sure a priori *which* platforms it's available on, tell
Fred Drake so he can update the docs accordingly.
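
The "if and only if it's present" approach can be sketched roughly as
follows.  The helper extend_all and the two simulated namespaces are
hypothetical, for illustration only; the checked-in code in locale.py may
look different.

```python
def extend_all(exported, namespace, optional_names):
    """Append each optional name to `exported` only if `namespace` has it."""
    for name in optional_names:
        if name in namespace and name not in exported:
            exported.append(name)
    return exported


# Simulated module namespaces (values are placeholders, not real constants):
# one platform defines LC_MESSAGES, the other does not.
has_lc_messages = {"LC_ALL": 6, "LC_MESSAGES": 5}
lacks_lc_messages = {"LC_ALL": 0}

with_it = extend_all(["LC_ALL"], has_lc_messages, ["LC_MESSAGES"])
without_it = extend_all(["LC_ALL"], lacks_lc_messages, ["LC_MESSAGES"])
```

Inside a real module the namespace argument would be globals().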



From skip@mojam.com (Skip Montanaro)  Wed Jan 24 21:25:45 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 15:25:45 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124123155.A15203@thyrsus.com>
References: <20010124084917.C29977@ActiveState.com>
 <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
 <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
 <20010124123155.A15203@thyrsus.com>
Message-ID: <14959.18521.648454.488731@beluga.mojam.com>

>>>>> "Eric" == Eric S Raymond <esr@thyrsus.com> writes:

    Moshe Zadka <moshez@zadka.site.co.il>:

    >> The "webbrowser" module should prefer to take the setting from the
    >> registry on windows.

    Eric> Um, that's not the way it works right now. The windows-default
    Eric> browser choice launches the registered default browser, but
    Eric> BROWSER may have something else in its search list first.

Why not have a special REGISTRY token you can place in the BROWSER path to
tell it when to consult the registry?  On non-Windows platforms it can
simply be ignored:

    BROWSER=netscape:REGISTRY:explorer

Skip



From esr@thyrsus.com  Wed Jan 24 21:30:44 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 16:30:44 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <14959.18521.648454.488731@beluga.mojam.com>; from skip@mojam.com on Wed, Jan 24, 2001 at 03:25:45PM -0600
References: <20010124084917.C29977@ActiveState.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il> <20010124123155.A15203@thyrsus.com> <14959.18521.648454.488731@beluga.mojam.com>
Message-ID: <20010124163044.A15877@thyrsus.com>

Skip Montanaro <skip@mojam.com>:
> Why not have a special REGISTRY token you can place in the BROWSER path to
> tell it when to consult the registry?  On non-Windows platforms it can
> simply be ignored:

In effect, windows-default is that special token.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln


From martin@mira.cs.tu-berlin.de  Wed Jan 24 21:41:11 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 24 Jan 2001 22:41:11 +0100
Subject: [Python-Dev] Tkinter documentation (Was:  What does "batteries are included" mean?)
Message-ID: <200101242141.f0OLfBT01812@mira.informatik.hu-berlin.de>

> It's already a blot on Python that the standard documentation set
> doesn't cover Tkinter.

Just point your friendly web browser to Ping's HTML generator and ask
for Tkinter, or invoke "pydoc.py Tkinter".

[I wouldn't have brought this up if it hadn't been the contribution of
my friend Nils Fischbeck:-]

Regards,
Martin


From nas@arctrix.com  Wed Jan 24 15:31:55 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Wed, 24 Jan 2001 07:31:55 -0800
Subject: [Python-Dev] Makefile changes
Message-ID: <20010124073155.B32266@glacier.fnational.com>

I've checked in my new makefile.  Hopefully everything goes well.
The following files are no longer used so please don't patch
them:

    Grammar/Makefile.in
    Include/Makefile
    Lib/Makefile
    Modules/Makefile.pre.in
    Objects/Makefile.in
    Parser/Makefile.in
    Python/Makefile.in
    Makefile.in

They will be removed in a few days assuming all goes well.  You
should re-run configure to use the new makefile.

I would appreciate it if people using platforms other than Linux
and GNU make could give me some feedback on the build process.
Do configure and make work okay?  Do "make test" and "make
install" work?  Thanks.

  Neil


From greg@cosc.canterbury.ac.nz  Wed Jan 24 22:55:00 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 25 Jan 2001 11:55:00 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <200101242255.LAA02208@s454.cosc.canterbury.ac.nz>

Guido:

> But shouldn't the default value be something else,
> like none?

It should really be whatever is the first value that gets
stored after the dict is created. That way people can
use whatever they want for their dummy value and it will
Just Work. And it will probably catch most existing uses
of a dict as a set as well.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From ping@lfw.org  Wed Jan 24 20:33:43 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 24 Jan 2001 12:33:43 -0800 (PST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
Message-ID: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>

Hi -- after updating my CVS tree today with Python 2.1a1, i ran
the tests and test_inspect failed.  This revealed that the format
of code.co_varnames has changed.  At first i tried to update the
inspect.py module to check the Python version number and track the
change, but now i believe this is actually symptomatic of a real
interpreter problem.

Consider the function:

    def f(a, (b, c), *d):
        x = 1
        print a, b, c, d, x

Whereas in Python 1.5.2:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('x', 'a', 'b', 'c', 'd')
    f.func_code.co_varnames = ('a', '.2', 'd', 'b', 'c', 'x')

In Python 2.1a1:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('b', 'c', 'x', 'a', 'd')
    f.func_code.co_varnames = ('a', '.2', 'b', 'c', 'd', 'x')

Notice how the ordering of the variable names has changed.
I went and looked at the CO_VARARGS clause in eval_code2 to
see if it put the varargs and kwdict arguments in different
slots, but it appears unchanged!  It still puts varargs at
locals[co_argcount] and kwdict at locals[co_argcount + 1].
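
The slot convention can be checked from Python itself.  This is a sketch
in modern Python 3 syntax, which is an assumption on my part: tuple
parameters such as (b, c) no longer exist there, so the function below
leaves them out and only shows where *varargs and **kwdict land in
co_varnames when there are no keyword-only arguments.

```python
def f(a, *d, **kw):
    x = 1
    return a, d, kw, x


code = f.__code__  # spelled f.func_code in the Python versions discussed here

# Per the convention above: varargs sits right after the named
# arguments, and kwdict right after varargs.
varargs_slot = code.co_argcount
kwdict_slot = code.co_argcount + 1

varargs_name = code.co_varnames[varargs_slot]
kwdict_name = code.co_varnames[kwdict_slot]
```

If the compiler listed other names ahead of 'd' in co_varnames, the
interpreter would store the varargs tuple under the wrong name, which is
the kind of failure shown below.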

Please try:

    >>> def f(a, (b, c), *d):
    ...     x = 1
    ...     print a, b, c, d, x
    ...
    >>> f(1, (2, 3), 4)
    1 2 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in f
    UnboundLocalError: local variable 'd' referenced before assignment
    >>> 

In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

I only have 1.5.2 and 2.1a1 to test.  I hope this problem
isn't present in 2.0...


Note that test_inspect was the only test to fail!  It might be the
only test that checks anonymous and *varargs at the same time.
(Yet another reason to put inspect in the core...)

I did recently check in additions to test_extcall that made the
test much beefier -- but that only tested combinations of regular,
keyword, varargs, and kwdict arguments; it neglected to test
anonymous (tuple) arguments as well.


-- ?!ng



From tim.one@home.com  Wed Jan 24 23:56:25 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 18:56:25 -0500
Subject: [Python-Dev] Re: test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPMIKAA.tim.one@home.com>

> In that case, the __all__ attribute in the module has to be calculated
> dynamically. Say, adding code like
>
> try:
>    LC_MESSAGES
> except NameError:
>    pass
> else:
>    __all__.append('LC_MESSAGES')
>
> Ditto for anything else.
>
> Should I check in a patch?

SourceForge CVS doesn't appear to be broken, so I can only conclude everyone
decided this was a bad day to stop taking drugs <0.9 wink>.



From tim.one@home.com  Thu Jan 25 00:04:50 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 19:04:50 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>

[Skip]
> Why not have a special REGISTRY token you can place in the BROWSER
> path to tell it when to consult the registry?  On non-Windows
> platforms it can simply be ignored:
>
>    BROWSER=netscape:REGISTRY:explorer

Because non-Windows platforms shouldn't be bothered with Windows silliness
any more than Windows users should be bothered with Unix silliness.  BROWSER
isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
may still *think* BROWSER is of use on Windows, but if so that's not really
a technical problem <wink>.



From thomas@xs4all.net  Thu Jan 25 00:25:54 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 01:25:54 +0100
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>; from nas@arctrix.com on Wed, Jan 24, 2001 at 07:31:55AM -0800
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <20010125012554.F962@xs4all.nl>

On Wed, Jan 24, 2001 at 07:31:55AM -0800, Neil Schemenauer wrote:

> I would appreciate it if people using platforms other than Linux
> and GNU make could give me some feedback on the build process.
> Do configure and make work okay?  Do "make test" and "make
> install" work?  Thanks.

Only have time for a quick check now, and no time whatsoever tomorrow, but
at first glance, it looks okay (read: it compiles Python) on BSDI 4.0.1,
BSDI 4.1 and FreeBSD 4.2.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From esr@thyrsus.com  Thu Jan 25 00:15:10 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 19:15:10 -0500
Subject: [Python-Dev] (no subject)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 07:04:50PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>
Message-ID: <20010124191510.A17782@thyrsus.com>

Tim Peters <tim.one@home.com>:
> Because non-Windows platforms shouldn't be bothered with Windows silliness
> any more than Windows users should be bothered with Unix silliness.  BROWSER
> isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
> may still *think* BROWSER is of use on Windows, but if so that's not really
> a technical problem <wink>.

Actually that's not something I have an opinion on.  I addressed the
original question because I know it would be technically possible to set
a BROWSER variable under Windows.  Yes, an unlikely move, but possible.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862


From tim.one@home.com  Thu Jan 25 04:38:54 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 24 Jan 2001 23:38:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time;  comments?
In-Reply-To: <3A6EE944.C8CC6EF7@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEADILAA.tim.one@home.com>

[Christian Tismer]
> ...
> Not sure if hashes and keys should be apart, but
> sure for values.

How so?  That is, under what assumptions?  Any savings from separation would
appear to require that I look up keys a lot more than I access the
associated values; while trivially true for dicts used as sets, it seems
dubious to me for use of dicts as mappings (count[word] += 1, etc).



From Jason.Tishler@dothill.com  Thu Jan 25 06:09:47 2001
From: Jason.Tishler@dothill.com (Jason Tishler)
Date: Thu, 25 Jan 2001 01:09:47 -0500
Subject: [Python-Dev] Re: Python 2.1 alpha 1 released!
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:33:02PM -0500
References: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <20010125010947.M1256@dothill.com>

On Mon, Jan 22, 2001 at 10:33:02PM -0500, Guido van Rossum wrote:
> - Python should now build out of the box on Cygwin.  If it doesn't,
>   mail to Jason Tishler (jlt63 at users.sourceforge.net).

Although Python CVS built OOTB under Cygwin until 2001/01/17 18:54:54,
Python 2.1a1 needs a small patch in order to build cleanly under Cygwin.
If interested, please see the following for details:

    http://www.cygwin.com/ml/cygwin-apps/2001-01/msg00019.html

Thanks,
Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler@dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com


From tim.one@home.com  Thu Jan 25 07:29:19 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:29:19 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBFILAA.tim.one@home.com>

[Guido]
> ...
> It's no big deal if the Vaults contain three or more set modules --
> perfect even, people can choose the best one for their purpose.

They really can't, not realistically, unless all the modules in question
conform to the same interface (which users can't control), and users
restrict themselves to methods defined only in the interface (which users
can control).  The problem is that "their purpose" changes over time, and in
some cases the effects of representation on performance simply can't be
out-guessed in advance of actual measurement.  If people need to change any
more than just the import statement, *then* a single implementation has to
be all things to all people.

I hate to say this (bet <wink>?), but I suspect the fact that Python's basic
types are all builtin and not classes has kept us from fully appreciating
the class-based "1 interface, N implementations" approach that C++ and Java
hackers are having so much fun with.  They're not all that easy to find, but
people who have climbed the steep STL learning curve often end up in the
same ecstatic trance I used to see only among fellow Pythoneers.

> But in the core, there's only room for one set type or module.

I don't like the conclusion:  it implies there's no room in the core for
more than one implementation of anything, yet one-size-fits-all doesn't.  I
have no problem with the idea that there's only room for one Set *interface*
in the core.  Then you only need Pronounce on a reasonable set of abstract
operations, and leave the implementation tradeoffs to be made by different
people in different ways (I've really got no use for Eric's list-based sets;
he's really got no use for my sets-of-sets).
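
A rough sketch of "1 interface, N implementations", assuming a tiny
abstract interface with only add, membership and length.  The class names
AbstractSet, ListSet and DictSet are hypothetical and are not the modules
under discussion; the abc module used here is also newer than this thread.

```python
from abc import ABC, abstractmethod


class AbstractSet(ABC):
    """The one interface; callers rely on nothing beyond these methods."""

    @abstractmethod
    def add(self, item): ...

    @abstractmethod
    def __contains__(self, item): ...

    @abstractmethod
    def __len__(self): ...


class ListSet(AbstractSet):
    """List-backed: O(n) membership, but accepts unhashable elements."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


class DictSet(AbstractSet):
    """Dict-backed: near constant-time membership, hashable elements only."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item] = None

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


def count_unique(words, set_factory):
    """Client code: switching implementation means changing one argument."""
    seen = set_factory()
    for word in words:
        seen.add(word)
    return len(seen)
```

Because count_unique touches only the interface, the representation can
be swapped after measuring, without editing the client.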

That said, if there can be at most one, and must be at least one, a
hashtable based set is the best compromise there is, and mutable objects as
elements should not be supported (they add great implementation complexity
for the benefit of relatively few applications).

jeremy's-set-class-couldn't-be-accused-of-overkill<wink>-ly y'rs  - tim



From tim.one@home.com  Thu Jan 25 07:57:18 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:57:18 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBLILAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> What you get by going with a dictionary representation is that
> membership test becomes close to constant-time, while insertion and
> deletion become sometimes cheap and sometimes quite expensive
> (depending of course on whether you have to allocate a new
> hash bucket).

Note that Python's dicts aren't vulnerable to that:  they use open
addressing in a contiguous, preallocated vector.  There are no mallocs() or
free()s going on for lookups, deletes, or inserts, unless an insert happens
to hit a "time to double the size of the vector" boundary.  Deletes never
cost more than a lookup; inserts never more unless the table-size boundary
is hit (one in 2**N unique inserts, at which point N goes up too).
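
A toy illustration of that scheme, written as a sketch under stated
assumptions: linear probing, a resize at two-thirds full, and tombstones
for deletes.  It is not CPython's actual dict code (which probes
differently and is written in C); the class name and its counters are made
up for the example.

```python
class OpenAddressingTable:
    """Toy open-addressing set of keys in one preallocated vector."""

    _EMPTY = object()    # slot never used
    _DELETED = object()  # tombstone left behind by a delete

    def __init__(self, size=8):
        self._slots = [self._EMPTY] * size  # size stays a power of two
        self._used = 0     # live keys
        self._filled = 0   # live keys plus tombstones
        self.resizes = 0   # how many times the vector was doubled

    def _probe(self, key):
        mask = len(self._slots) - 1
        i = hash(key) & mask
        while True:
            yield i
            i = (i + 1) & mask  # linear probing, wrapping around

    def _resize(self):
        live = [k for k in self._slots
                if k is not self._EMPTY and k is not self._DELETED]
        self._slots = [self._EMPTY] * (len(self._slots) * 2)
        self._used = self._filled = 0
        self.resizes += 1
        for key in live:
            self.add(key)

    def add(self, key):
        # The only allocation: doubling when about two-thirds full.
        if (self._filled + 1) * 3 >= len(self._slots) * 2:
            self._resize()
        first_free = None
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is self._EMPTY:
                if first_free is None:
                    first_free = i
                    self._filled += 1  # a fresh slot gets consumed
                self._slots[first_free] = key
                self._used += 1
                return
            if slot is self._DELETED:
                if first_free is None:
                    first_free = i     # reuse the tombstone later
            elif slot == key:
                return                 # already present

    def __contains__(self, key):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is self._EMPTY:
                return False
            if slot is not self._DELETED and slot == key:
                return True

    def discard(self, key):
        # A delete is a lookup plus one store; nothing is freed.
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is self._EMPTY:
                return
            if slot is not self._DELETED and slot == key:
                self._slots[i] = self._DELETED
                self._used -= 1
                return

    def __len__(self):
        return self._used
```

Inserting 100 keys into this toy table doubles the vector only a handful
of times; every other insert, lookup and delete works in place.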

> ...
> "works for everybody" isn't really possible here.  So my solution
> does the next best thing -- pick a choice of tradeoffs that isn't
> obviously worse than the alternatives and keeps things bog-simple.

I agree that this shouldn't be an either/or choice, but if it's going to be
forced into that mold I have to protest that the performance of unordered
lists would kill most of the set applications I've ever had.  I typically
have a small number of very large sets (and I'm talking not 100s, but often
100s of 1000s of elements).  The relatively large memory burden of a dict
representation wouldn't bother me unless I instead had 100s of 1000s of very
small sets.

which-may-happen-in-my-next-life-but-not-in-this-one-ly y'rs  - tim



From tim.one@home.com  Thu Jan 25 08:08:30 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 03:08:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <001101c0857a$c0dce420$770a0a0a@nevex.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBMILAA.tim.one@home.com>

[Greg Wilson]
> ...
> Unfortunately, if values are required to be immutable, then sets of
> sets aren't possible... :-(

Sure they are.  I wrote about how before, and Moshe put up a simple
implementation as a SourceForge patch.  Not bulletproof, though:  "consenting
adults".  No matter *what* you implement, I'll find *some* way to trick it
into believing my sets are immutable <wink>, so don't worry about that.
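
For illustration only: later Python versions grew built-in set and
frozenset types, which did not exist when this was written.  Assuming
those, a set of sets with immutable elements can be sketched like this.

```python
# Inner sets are frozen (immutable, hence hashable), so they can be
# elements of an ordinary mutable outer set.
inner_a = frozenset([1, 2])
inner_b = frozenset([2, 3])
outer = {inner_a, inner_b}

# Membership is by value: an equal frozenset built separately is found.
found = frozenset([2, 1]) in outer

# A mutable inner set is rejected outright.
try:
    {set([1, 2])}
    mutable_rejected = False
except TypeError:
    mutable_rejected = True
```

This is the strict variant; the "consenting adults" approach described
above would instead trust the caller not to mutate an element after
inserting it.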

Bulletproof is very hard, and is a minority distraction at best.  IIRC, SETL
had "by value" semantics when inserting a set into another set as an
element, and had some exceedingly hairy copy-on-write scheme under the
covers to make that bearably quick.  That may be wrong, though.  Herman
Venter's Slim (Sets, Lists and Maps) language does work that way (Guido,
Herman was a friend of the departed Stoffel Erasmus, who you may recall
fondly from Python's very early days -- if *that* doesn't make sets
attractive to you, nothing will <wink>).

Ah!  Meant to post this before:

    http://birch.eecs.lehigh.edu/~bacon/setlprog.ps.gz

That's a readable and very good intro to SETL Classic.  People pondering
computerized sets should at least catch up with what was common knowledge 30
years ago <wink>.



From thomas@xs4all.net  Thu Jan 25 09:24:24 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 10:24:24 +0100
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>; from ping@lfw.org on Wed, Jan 24, 2001 at 12:33:43PM -0800
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
Message-ID: <20010125102424.G962@xs4all.nl>

On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:

> Please try:

>     >>> def f(a, (b, c), *d):
>     ...     x = 1
>     ...     print a, b, c, d, x
>     ...
>     >>> f(1, (2, 3), 4)
>     1 2 3
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "<stdin>", line 3, in f
>     UnboundLocalError: local variable 'd' referenced before assignment
>     >>> 

> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

> I only have 1.5.2 and 2.1a1 to test.  I hope this problem
> isn't present in 2.0...

It isn't present in 2.0. This is probably related to Jeremy's changes
in the call mechanism or the compiler track, though Jeremy himself is the
best person to claim that for sure :)

> Note that test_inspect was the only test to fail!  It might be the
> only test that checks anonymous and *varargs at the same time.
> (Yet another reason to put inspect in the core...)

Well, this is not an inspect-specific test, so it shouldn't *be* in
test_inspect, it should be in test_extcall :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fredrik@effbot.org  Thu Jan 25 09:45:31 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 10:45:31 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
References: <E14Limt-0002Rf-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <003801c086b3$8ff41560$e46940d5@hagrid>

tim accidentally wrote:

>     \versionadded{1.5.3} % XXX fix this version number when release is scheduled!

1.5.3?  time for a 1.5.3 => 1.6 query replace?

> fgrep 1.5.3 doc/*/*.tex
doc/lib/libcmp.tex:\deprecated{1.5.3}{Use the \module{filecmp} module inste
doc/lib/libcmpcache.tex:\deprecated{1.5.3}{Use the \module{filecmp} module
ad.}
doc/lib/libwinsound.tex:  \versionadded{1.5.3} % XXX fix this version number

or am I missing something?

Cheers /F



From tim.one@home.com  Thu Jan 25 11:20:18 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
In-Reply-To: <003801c086b3$8ff41560$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECFILAA.tim.one@home.com>

Gotta ask Fred about this one!

> or am I missing something?

Yes, the Python 1.5.3 release.  I use it all the time <wink>.



From tismer@tismer.com  Thu Jan 25 12:22:32 2001
From: tismer@tismer.com (Christian Tismer)
Date: Thu, 25 Jan 2001 14:22:32 +0200
Subject: [Python-Dev] Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
Message-ID: <3A701A88.F2C68635@tismer.com>

In a function like this:

def f(x):
  return eval("x")

, eval uses the local function namespace, and the above works.
This is according to chapter 2.3 of the Python library ref.

Now on to my problem: When eval() is used with map, the same
mechanism takes place:

def f(x):
  return map(eval,["x"])

It works the same as the above, because map is a builtin function
that does not modify the frame chain, so eval finds the local
namespace.
Not so with Stackless Python (at the moment), since Stackless map
assigns its own frame to map without passing the correct namespaces
to it. (Reported by Bernd Rinn)

Question: Is this by chance, or is eval() *meant* to function with
the local namespace, even if it is executed in the context of
a function like map() ?

The description of map() does not state whether it has to pass
its surrounding namespace to the mapped function, and if one
simulates map() by writing one's own python implementation,
it will fail exactly like Stackless does today. The same
applies to apply().
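
The difference can be reproduced with a hand-written map.  This is a
sketch in modern Python 3 on CPython, which is an assumption: map() is
lazy there, so it is wrapped in list().  my_map and secret_value are
hypothetical names for the example.

```python
def my_map(func, items):
    """Pure-Python stand-in for map(); each call runs in my_map's frame."""
    result = []
    for item in items:
        result.append(func(item))
    return result


def with_builtin(secret_value):
    # The built-in adds no Python frame, so eval() sees this
    # function's locals.
    return list(map(eval, ["secret_value"]))


def with_simulation(secret_value):
    # eval() now runs with my_map's namespace, where the name is unknown.
    try:
        return my_map(eval, ["secret_value"])
    except NameError:
        return "NameError"
```

Calling with_builtin(5) finds the local, while with_simulation(5) reports
the NameError, matching the behaviour described above.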

I think I should fix Stackless here, anyway?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com


From guido@digicool.com  Thu Jan 25 13:35:12 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 08:35:12 -0500
Subject: [Python-Dev] Re: Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
In-Reply-To: Your message of "Thu, 25 Jan 2001 14:22:32 +0200."
 <3A701A88.F2C68635@tismer.com>
References: <3A701A88.F2C68635@tismer.com>
Message-ID: <200101251335.IAA16713@cj20424-a.reston1.va.home.com>

> In a function like this:
> 
> def f(x):
>   return eval("x")
> 
> , eval uses the local function namespace, and the above works.
> This is according to chapter 2.3 of the Python library ref.
> 
> Now on to my problem: When eval() is used with map, the same
> mechanism takes place:
> 
> def f(x):
>   return map(eval,["x"])
> 
> It works the same as the above, because map is a builtin function
> that does not modify the frame chain, so eval finds the local
> namespace.
> Not so with Stackless Python (at the moment), since Stackless map
> assigns its own frame to map without passing the correct namespaces
> to it. (Reported by Bernd Rinn)
> 
> Question: Is this by chance, or is eval() *meant* to function with
> the local namespace, even if it is executed in the context of
> a function like map() ?

Map, being a built-in, is transparent to namespaces.

> The description of map() does not state whether it has to pass
> its surrounding namespace to the mapped function, and if one
> simulates map() by writing one's own python implementation,
> it will fail exactly like Stackless does today. The same
> applies to apply().

So you can't simulate a built-in.

> I think I should fix Stackless here, anyway?

Yes.

Note: beware of Jeremy's nested scopes.  That adds a whole slew of
namespaces!  (But eval() is more crippled there.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From jeremy@alum.mit.edu  Thu Jan 25 15:20:45 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 10:20:45 -0500 (EST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <20010125102424.G962@xs4all.nl>
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
 <20010125102424.G962@xs4all.nl>
Message-ID: <14960.17485.549337.5476@localhost.localdomain>

>>>>> "TW" == Thomas Wouters <thomas@xs4all.net> writes:

  TW> On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:
  >> Please try:

  >> >>> def f(a, (b, c), *d):
  >> ...     x = 1
  >> ...     print a, b, c, d, x
  >> ...
  >> >>> f(1, (2, 3), 4)
  >> 1 2 3
  >> Traceback (most recent call last):
  >>   File "<stdin>", line 1, in ?
  >>   File "<stdin>", line 3, in f
  >> UnboundLocalError: local variable 'd' referenced before assignment
  >> >>>

  >> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

  >> I only have 1.5.2 and 2.1a1 to test.  I hope this problem isn't
  >> present in 2.0...

  TW> It isn't present in 2.0. This is probably related to Jeremy's
  TW> changes in the call mechanism or the compiler track, though
  TW> Jeremy himself is the best person to claim that for sure :)

The bug is in the compiler.  It creates varnames while it is parsing
the argument list.  While I got the handling of the anonymous tuples
right, I forgot to insert *varargs or **kwargs in varnames *before*
the names defined in the tuple.

I will fix it real soon now.

  >> Note that test_inspect was the only test to fail!  It might be
  >> the only test that checks anonymous and *varargs at the same
  >> time.  (Yet another reason to put inspect in the core...)

  TW> Well, this is not an inspect-specific test, so it shouldn't *be*
  TW> in test_inspect, it should be in test_extcall :)

It should probably be in test_grammar.  The ext call mechanism is only
invoked when the caller uses a form like 'f(*arg)'.  Perhaps the name
"ext call" isn't very clear.

Jeremy


From esr@thyrsus.com  Thu Jan 25 16:19:36 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:19:36 -0500
Subject: [Python-Dev] Waiting method for file objects
Message-ID: <20010125111936.A23512@thyrsus.com>

--UugvWAfsgieZRqgk
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I have been researching the question of how to ask a file descriptor how much
data it has waiting for the next sequential read, with a view to discovering
what cross-platform behavior we could count on for a hypothetical `waiting'
method in Python's built-in file class.

1:  Why bother?

I have these main applications in mind:

1. Detecting EOF on a static plain file.
2. Non-blocking poll of a socket opened in non-blocking mode.
3. Non-blocking poll of a FIFO opened in non-blocking mode.
4. Non-blocking poll of a terminal device opened in non-blocking mode.

These are all frequently requested capabilities on C newsgroups -- how
often have *you* seen the "how do I detect an individual keypress"
question from beginning programmers?  I believe having these
capabilities would substantially enhance Python's appeal.

2: What would be under the hood?

Summary: We can do this portably, and we can do it with only one (1)
new #ifdef.  Our tools for this purpose will be the fstat(2) st_size
field and the FIONREAD ioctl(2) call.  They are complementary.

In all supposedly POSIX-conformant environments I know of, the st_size
field has a documented meaning for plain files (S_IFREG) and may or
may not give a meaningful number for FIFOs, sockets, and tty devices.
The Single Unix Specification is silent on the meaning of st_size for
file types other than regular files (S_IFREG).  I have filed a defect
report about this with OpenGroup and am discussing appropriate language
with them.

(The last sentence of the Inferno operating system's language on
stat(2) is interesting: "If the file resides on permanent storage and
is not a directory, the length returned by stat is the number of bytes
in the file. For directories, the length returned is zero. Some
devices report a length that is the number of bytes that may be read
from the device without blocking.")

The FIONREAD ioctl(2) call, on the other hand, returns bytes waiting
on character devices such as FIFOs, sockets, or ttys -- but does not
return a useful value for files or directories or block devices. The
FIONREAD ioctl was supported in both SVr4 and 4.2BSD.  It's present in
all the open-source Unixes, SunOS, Solaris, and AIX.  Via Google
search I have discovered that it's also supported in the Windows
Sockets API and the GUSI POSIX libraries for the Macintosh.  Thus, it
can be considered portable for Python's purposes even though it's
rather sparsely documented.
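
As a rough illustration of the call itself, here is a sketch in
present-day Python 3 on a Unix-like system.  The helper name
bytes_waiting is made up for this example; fcntl.ioctl and
termios.FIONREAD are the standard-library spellings on Unix, and
nothing here is claimed to work on Windows or the Mac.

```python
import fcntl
import socket
import struct
import termios

def bytes_waiting(fd):
    """Ask the kernel how many bytes can be read from fd without blocking."""
    # FIONREAD writes a C int into the buffer we pass; with an immutable
    # bytes argument, fcntl.ioctl returns the filled-in buffer.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]

# A connected pair of local stream sockets: write on one end, poll the other.
left, right = socket.socketpair()
left.sendall(b"hello")
pending = bytes_waiting(right.fileno())
print("bytes waiting on the socket:", pending)
left.close()
right.close()
```

The query goes to the kernel, so it only sees data that has not yet
been pulled into any user-space buffer.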

I was able to obtain confirming information on Linux from Linus
Torvalds himself. My information on Windows and the Mac is from
Gavriel State, formerly a lead developer on Corel's WINE team and a
programmer with extensive cross-platform experience.  Gavriel reported
on the MSCRT POSIX environment, on the Metrowerks Standard Library
POSIX implementation for the Mac, and on the GUSI POSIX implementation
for the Mac.

2.1: Plain files

Torvalds and State confirm that for plain files (S_IFREG) the st_size
field is reliable on all three platforms.  On the Mac it gives the
file's data fork size.

One apparent difficulty with the plain-file case is that POSIX does
not guarantee anything about off_t quantities such as lseek(2)
returns and the st_size field except that they can be compared for
equality.  Thus, under the strict letter of POSIX law, `waiting' can
be used to detect EOF but not to get a reliable read-size return in
any other file position.

Fortunately, this is less an issue than it appears.  The weakness of
the POSIX language was a 1980s-era concession to a generation of
mainframe operating systems with record-oriented file structures --
all of which are now either thoroughly obsolete or (in the case of IBM
VM/CMS) have become Linux emulators :-).  On modern operating systems
under which files have character granularity, stat(2) emulations can
be and are written to give the right result.

2.2: Directories and block devices

The directory case (S_IFDIR) is a complete loss.  Under Unixes,
including Linux, the fstat(2) size field gives the allocated size of
the directory as if it were a plain file.  Under MSCRT POSIX the
meaning is undocumented and unclear.  Metrowerks returns garbage.
GUSI POSIX returns the number of files in the directory!  FIONREAD
cannot be used on directories.

Block devices (S_IFBLK) are a mess again.  Linus points out that a
system with removable or unmountable volumes *cannot* return a useful
st_size field -- what happens when the device is dismounted?

2.3: Character devices

Pipes and FIFOs (S_IFIFO) look better.  On MSCRT the fstat(2) size
field returns the number of bytes waiting to be read.  This is also
true under current Linuxes, though Torvalds says it is "an
implementation detail" and recommends polling with the FIONREAD ioctl
instead.  Fortunately, FIONREAD is available under Unix, Windows, and
the Mac.

Sockets (S_IFSOCK) look better too.  Under Linux, the fstat(2) size
field gives number of bytes waiting.  Torvalds again says this is "an
implementation detail" and recommends polling with the FIONREAD ioctl.
Neither MSCRT POSIX nor Metrowerks has direct support for sockets.
GUSI POSIX returns 1 (!) in the st_size field. But FIONREAD is
available under Unix, Windows, and the GUSI POSIX libraries on the
Mac.

Character devices (S_IFCHR) can be polled with FIONREAD.  This technique
has a long history of use with tty devices under Unix.  I don't know whether
it will work with the equivalents of terminal devices for Windows and the Mac.
Fortunately this is not a very important question, as those are GUI
environments in which terminal devices are rarely if ever used.

3: How does this turn into Python?

The upshot of our portability analysis is that by using FIONREAD and
fstat(2), we can get useful results for plain files, pipes, and
sockets on all three platforms.  Directories and block devices are a
complete loss.  Character devices (in particular, ttys) we can poll
reliably under Unix.  What we'll get polling the equivalents of tty or
character devices under Windows and the Mac is presently unknown, but
also unimportant.

My proposed semantics for a Python `waiting' method is that it reports
the amount of data that would be returned by a read() call at the time
of the waiting-method invocation.  The method raises an exception
(IOError or OSError in the enclosed patch) if such a report is
impossible or forbidden.
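
The dispatch described above can be sketched at the Python level.
This is not the enclosed C patch; it is a present-day Python 3
approximation under stated assumptions: it runs only on Unix-like
systems, it works on a raw file descriptor (so anything already
sitting in a user-space buffer is not counted), and the free function
name waiting is hypothetical.

```python
import fcntl
import os
import stat
import struct
import termios

def waiting(fd):
    """Bytes readable from descriptor fd right now, chosen by file type."""
    info = os.fstat(fd)
    mode = info.st_mode
    if stat.S_ISDIR(mode) or stat.S_ISBLK(mode):
        # st_size is meaningless here and FIONREAD does not apply.
        raise OSError("can't poll a block device or directory")
    if stat.S_ISREG(mode):
        # Plain file: size minus the current offset; 0 means EOF.
        return info.st_size - os.lseek(fd, 0, os.SEEK_CUR)
    if stat.S_ISFIFO(mode) or stat.S_ISSOCK(mode) or stat.S_ISCHR(mode):
        # Stream device: let the kernel report what is queued.
        raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
        return struct.unpack("i", raw)[0]
    raise OSError("unknown file type")
```

For a plain file, calling waiting(fd) after reading everything
returns 0, which is the EOF test from application 1 above.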

I have enclosed a patch against the current CVS sources, including
documentation.  This patch is tested and working against plain files,
sockets, and FIFOs under Linux.  I have also attached the
Python test program I used under Linux.

I would appreciate it if those of you on Windows and Macintosh
machines would test the waiting method. The test program will take
some porting, because it needs to write to a FIFO in background.
Under Linux I do it this way:

	(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &

I don't know how to do the equivalent under Windows or Mac.

When you run this program, it will try to mail me your test results.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address

--UugvWAfsgieZRqgk
Content-Type: text/plain; charset=us-ascii
Content-Description: Patch implementing the waiting method
Content-Disposition: attachment; filename="waiting.patch"

Index: fileobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v
retrieving revision 2.108
diff -c -r2.108 fileobject.c
*** fileobject.c	2001/01/18 03:03:16	2.108
--- fileobject.c	2001/01/25 16:16:10
***************
*** 35,40 ****
--- 35,44 ----
  #include <errno.h>
  #endif
  
+ #ifndef DONT_HAVE_IOCTL_H
+ #include <sys/ioctl.h>
+ #endif
+ 
  
  typedef struct {
  	PyObject_HEAD
***************
*** 423,428 ****
--- 427,513 ----
  }
  
  static PyObject *
+ file_waiting(PyFileObject *f, PyObject *args)
+ {
+ 	struct stat stbuf;
+ #ifdef HAVE_FSTAT
+ 	int ret;
+ #endif
+ 
+ 	if (f->f_fp == NULL)
+ 		return err_closed();
+ 	if (!PyArg_NoArgs(args))
+ 		return NULL;
+ #ifndef HAVE_FSTAT
+ 	PyErr_SetString(PyExc_OSError, "fstat(2) is not available.");
+ 	clearerr(f->f_fp);
+ 	return NULL;
+ #else
+ 	Py_BEGIN_ALLOW_THREADS
+ 	errno = 0;
+ 	ret = fstat(fileno(f->f_fp), &stbuf);
+ 	Py_END_ALLOW_THREADS
+ 	if (ret == -1) {			/* the fstat failed */
+ 		PyErr_SetFromErrno(PyExc_IOError);
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	} else if (S_ISDIR(stbuf.st_mode) || S_ISBLK(stbuf.st_mode)) {
+ 		PyErr_SetString(PyExc_IOError, 
+ 				"Can't poll a block device or directory.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	} else if (S_ISREG(stbuf.st_mode)) {	/* plain file */
+ #if defined(HAVE_LARGEFILE_SUPPORT) && SIZEOF_OFF_T < 8 && SIZEOF_FPOS_T >= 8
+ 		fpos_t pos;
+ #else
+ 		off_t pos;
+ #endif
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		pos = _portable_ftell(f->f_fp);
+ 		Py_END_ALLOW_THREADS
+ 		if (pos == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ #if !defined(HAVE_LARGEFILE_SUPPORT)
+ 		return PyInt_FromLong(stbuf.st_size - pos);
+ #else
+ 		return PyLong_FromLongLong(stbuf.st_size - pos);
+ #endif
+ 	} else if (S_ISFIFO(stbuf.st_mode) 
+ 		    || S_ISSOCK(stbuf.st_mode) 
+ 		    || S_ISCHR(stbuf.st_mode)) {	/* stream device */
+ #ifndef FIONREAD
+ 		PyErr_SetString(PyExc_OSError, 
+ 				"FIONREAD is not available.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ #else
+ 		int waiting;
+ 
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		ret = ioctl(fileno(f->f_fp), FIONREAD, &waiting);
+ 		Py_END_ALLOW_THREADS
+ 		if (ret == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ 
+ 		return Py_BuildValue("i", waiting);
+ #endif /* FIONREAD */
+ 	} else {				/* should never happen! */
+ 		PyErr_SetString(PyExc_OSError, "Unknown file type.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	}
+ #endif /* HAVE_FSTAT */
+ }
+ 
+ static PyObject *
  file_fileno(PyFileObject *f, PyObject *args)
  {
  	if (f->f_fp == NULL)
***************
*** 1263,1268 ****
--- 1348,1354 ----
  	{"truncate",	(PyCFunction)file_truncate, 1},
  #endif
  	{"tell",	(PyCFunction)file_tell, 0},
+ 	{"waiting",	(PyCFunction)file_waiting, 0},
  	{"readinto",	(PyCFunction)file_readinto, 0},
  	{"readlines",	(PyCFunction)file_readlines, 1},
  	{"xreadlines",	(PyCFunction)file_xreadlines, 1},

--UugvWAfsgieZRqgk
Content-Type: text/plain; charset=us-ascii
Content-Description: Test program for the waiting method
Content-Disposition: attachment; filename="waiting_test.py"

#!/usr/bin/env python
import sys, os, random, string, time, socket, smtplib, readline

print "This program tests the `waiting' method of file objects."

fp = open("waiting_test.py")
if hasattr(fp, "waiting"):
    print "Good, you're running a patched Python with `waiting' available."
else:
    print "You haven't installed the `waiting' patch yet.  This won't work."
    sys.exit(1)

successes = ""
failures = ""
nogo = ""

print ""
print "First, plain files:"

filesize = fp.waiting()
print "There are %d bytes waiting to be read in this file." % filesize
if os.name == 'posix':
    os.system("ls -l waiting_test.py")
    print "That should match the number in the ls listing above."
else:
    print "Please check this with your OS's directory tools."

get = random.randrange(fp.waiting())
print "I'll now read a random number (%d) of bytes." % get
fp.read(get)
print "The waiting method sees %d bytes left." % fp.waiting()
if get + fp.waiting() == filesize:
    print  "%d + %d = %d.  That's consistent.  Test passed." % \
          (get, fp.waiting(), filesize)
    successes += "Plain file random-read test passed.\n"
else:
    print "That's not consistent. Test failed."
    failures += "Plain file random-read test failed\n"

print "Now let's see if we can detect EOF reliably."
fp.read()
left = fp.waiting()
print "I'll do a read()...the waiting method now returns %d" % left
if left == 0:
    print "That looks like EOF."
    successes += "Plain file EOF test passed.\n"
else:
    print "%d bytes left. Test failed." % left
    failures += "Plain file EOF test failed\n"
fp.close()

print ""
print "Now sockets:"
print "Connecting to imap.netaxs.com's IMAP server now..."
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
file = sock.makefile('rb')
sock.connect(("imap.netaxs.com", 143))
print "Waiting a few seconds to avoid a race condition..."
time.sleep(3)
greetsize = file.waiting()
print "There appear to be %d bytes waiting..." % greetsize
greeting = file.readline()
print "I just read the greeting line..."
sys.stdout.write(greeting)
if len(greeting) == greetsize:
    print "...and the size matches.  Test passed."
    successes += "Socket test passed.\n"
else:
    print "That's not right.  Test failed."
    failures += "Socket test failed.\n"
sock.close()

print ""
if not hasattr(os, "mkfifo"):
    print "Your platform doesn't have FIFOs (mkfifo() is absent), so I can't test them."
    nogo = "FIFO test could not be performed.\n"
else:
    print "Now FIFOs:"
    print "I'm making a FIFO named testfifo."; os.mkfifo("testfifo")
    str = string.letters[:random.randrange(len(string.letters))]
    print "I'm going to send it the following string '%s' of random length %d:" \
          % (str, len(str),)
    # Note: Unix dependency here!
    os.system("(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &" % str)
    fp = open("testfifo", "r")
    print "Waiting a few seconds to avoid a race condition..."
    time.sleep(3)
    ready = fp.waiting()
    print "I see %d bytes waiting in the FIFO." % ready
    if ready == len(str):
        print "That's consistent.  Test passed."
        successes += "FIFO test passed.\n"
    else:
        print "That's not consistent. Test failed."
        failures += "FIFO test failed\n"
    os.remove("testfifo")

print "\nSummary:"
report = "Platform is: %s, version is %s\n" % (sys.platform, sys.version)
if successes:
    report += "The following tests succeeded:\n" + successes
if failures:
    report += "The following tests failed:\n" + failures
if nogo:
    report += "The following tests could not be performed:\n" + nogo
if not nogo:
    report += "No tests were skipped.\n"
if not failures:
    report += "All tests succeeded.\n"
print report

if os.name == 'posix':
    me = os.environ["USER"] + "@" + socket.getfqdn()
else:
    me = raw_input("Enter your email address, please?")

try:
    server = smtplib.SMTP('localhost')
    report = ("From: %s\nTo: esr@thyrsus.com\nSubject: waiting_test\n\n" % me) + report
    server.sendmail(me, ["esr@thyrsus.com"], report)
    server.quit()
except:
    print "The attempt to mail your test result failed.\n"

--UugvWAfsgieZRqgk--


From esr@snark.thyrsus.com  Thu Jan 25 16:46:20 2001
From: esr@snark.thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:46:20 -0500
Subject: [Python-Dev] Documentation patch for waiting method.
Message-ID: <200101251646.f0PGkKM23567@snark.thyrsus.com>

Index: libstdtypes.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libstdtypes.tex,v
retrieving revision 1.50
diff -u -r1.50 libstdtypes.tex
--- libstdtypes.tex	2001/01/17 01:18:00	1.50
+++ libstdtypes.tex	2001/01/25 16:46:40
@@ -1142,6 +1142,24 @@
   \UNIX{} versions support this operation).
 \end{methoddesc}
 
+\begin{methoddesc}[file]{waiting}{}
+  Return the number of bytes waiting to be read from this file object.
+  For regular files, this returns the size of the file in bytes minus
+  the current seek address, as would be returned by \method{tell()}; a
+  zero return can be used to detect EOF.  For streams such as FIFOs,
+  sockets, Unix ttys, and other Unix character devices, this method
+  returns the number of bytes currently buffered up and waiting to be
+  read.  Attempts to call this method on Unix block devices or
+  on directories will raise an error.
+	\footnote{The \method{waiting()} method uses
+  	\cfunction{fstat(2)} and \cfunction{lseek(2)} on plain files;
+  	these should be reliable on all of Unix, Windows, and MacOS.
+  	It uses the FIONREAD ioctl(2) call to query FIFOs, sockets,
+  	Unix ttys, and other POSIX character devices; FIFO and socket
+  	behavior should be consistent across all three platforms, but
+  	the results from querying other character devices may vary.}
+\end{methoddesc}
+
 \begin{methoddesc}[file]{write}{str}
   Write a string to the file.  There is no return value.  Note: Due to
   buffering, the string may not actually show up in the file until

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"To disarm the people... was the best and most effectual way to enslave them."
        -- George Mason, speech of June 14, 1788


From fredrik@effbot.org  Thu Jan 25 19:23:50 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 20:23:50 +0100
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution)
Message-ID: <00f701c08704$59bde510$e46940d5@hagrid>

I'm pretty sure Tim's seen this already, but just
in case...

----- Original Message ----- 
From: "Ivan Frohne" <frohne@gci.net>
Newsgroups: comp.lang.python
Sent: Thursday, January 25, 2001 5:20 PM
Subject: Re: random.py gives wrong results (+ a solution)


> 
> "Janne Sinkkonen" <janne@oops.nnets.fi> wrote in message
> news:m3u26oy1rw.fsf@kinos.nnets.fi...
> >
> > At least in Python 2.0 and earlier, the samples returned by the
> > function betavariate() of random.py are not from a beta distribution
> > although the function name misleadingly suggests so.
> >
> > The following would give beta-distributed samples:
> >
> > def betavariate(alpha, beta):
> >      y = gammavariate(alpha,1)
> >      if y==0: return 0.0
> >      else: return  y/(y+gammavariate(beta,1))
> >
> > This is from matlab. A comment in the original matlab code refers to
> > Devroye, L. (1986) Non-Uniform Random Variate Generation, theorem 4.1A
> > (p. 430). Another reference would be Gelman, A. et al. (1995) Bayesian
> > data analysis, p. 481, which I have checked and found to agree with
> > the code above.
> 
> 
> I'm convinced that Janne Sinkkonen is right:  The beta distribution
> generator in module random.py does not return Beta-distributed
> random numbers.  Janne's suggested fix should work just fine.
> 
> Here's my guess on how and why this bug bit -- it won't be of interest to
> most, but this subject is so obscure sometimes that there needs to be a
> detailed analysis.
> 
> The probability density function of the gamma distribution with (positive)
> parameters A and B is usually written
> 
>     g(x; A, B) = (x**(A-1) * exp(-x/B)) / (Gamma(A) * B**A),
>     where x, A, and B > 0.
> 
> Here Gamma(A) is the gamma function -- for A a positive integer, Gamma(A)
> is the factorial of A - 1, Gamma(A) = (A-1)!.  In fact, this is the
> definition used by the authors of random.py in defining
> gammavariate(alpha, beta), the gamma distribution random number generator.
> 
> Now it happens that a gamma-distributed random variable with parameters
> A = 1 and B has the (much simpler) exponential distribution with density
> function
> 
>     g(x; 1, B) = exp(-x/B) / B.
> 
> Keep that in mind.
> 
> The reference "Discrete Event Simulation in ," by Kevin Watkins
> (McGraw-Hill, 1993) was consulted by the random.py authors.  But this
> reference defines the gamma probability distribution a little differently,
> as
> 
>     g1(x; A, B) = (B**A * x**(A-1) * exp(-B*x)) / Gamma(A),
>     where x, A, B > 0.
> 
> (See p. 85).  On page 87, Watkins states (incorrectly) that if grv(A, B) is
> a function which returns a gamma random variable with parameters A and B
> (using his definition on p. 85), then the function
> 
>     brv(A, B) = grv(1, 1/B) / ( grv(1, 1/B) + grv(1, A) )       [not true!]
> 
> will return a random variable which has the beta distribution with
> parameters A and B.
> 
> Believing Watkins to be correct, the random.py authors remembered that a
> gamma random variable with parameter A = 1 is just an exponential random
> variable and further simplified their beta generator to
> 
>     brv(A, B) = erv(1/B) / (erv(1/B) + erv(A)),
> 
> where erv(K) is a random variable having the exponential distribution with
> parameter K.
> 
> The corrected equation for a beta random variable, using Watkins' definition
> of the gamma density, is
> 
>     brv(A, B) = grv(A, 1) / ( grv(A, 1) + grv(1/B, 1) ),
> 
> which translates to
> 
>     brv(A, B) = grv(A, 1) / (grv(A, 1) + grv(B, 1))
> 
> using the more common gamma density definition (the one used in random.py).
> Many standard statistical references give this equation -- two are
> "Non-Uniform Random Variate Generation," by Luc Devroye, Springer-Verlag,
> 1986, p. 432, and "Monte Carlo Concepts, Algorithms and Applications," by
> George S. Fishman, Springer, 1996, p. 200.
> 
> --Ivan Frohne
> 
> 
> 
> 
> >>>
> 
> 
> 
> 
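
For anyone who wants to check the forwarded claim numerically, here is
a small sketch of the suggested fix in present-day Python 3.  It is
written as a free function taking a generator object; the rng
parameter and the choice of alpha=2, beta=5 are just for illustration.
A Beta(alpha, beta) variable has mean alpha/(alpha+beta), so the
sample mean gives a cheap sanity check (it does not prove the whole
distribution is right).

```python
import random

def betavariate(alpha, beta, rng=random):
    """Beta sample built from two unit-scale gamma samples (the suggested fix)."""
    y = rng.gammavariate(alpha, 1.0)
    if y == 0:
        return 0.0
    return y / (y + rng.gammavariate(beta, 1.0))

# Compare the sample mean with the theoretical mean alpha / (alpha + beta).
rng = random.Random(12345)
alpha, beta = 2.0, 5.0
samples = [betavariate(alpha, beta, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print("sample mean %.4f, expected %.4f" % (sample_mean, alpha / (alpha + beta)))
```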



From jeremy@alum.mit.edu  Thu Jan 25 17:13:03 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 12:13:03 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <14960.24223.599357.388059@localhost.localdomain>

Neil,

What would it take to add useful dependency information to the
Makefile?  Or does it already exist?

When I was working on the nested scopes, building was tedious at times
because a change to funcobject.h meant that, e.g., newmodule.c needed
to be recompiled.  The Makefiles didn't capture that information, so I
had been adding it to the individual Makefiles, e.g.

newmodule.o: newmodule.c ../Include/funcobject.h

(I think this worked.)

It would be great if the Makefile captured all the dependencies.
Could we just use makedepend?

Jeremy


From MarkH@ActiveState.com  Thu Jan 25 19:43:35 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Thu, 25 Jan 2001 11:43:35 -0800
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <20010125111936.A23512@thyrsus.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEBFDAAA.MarkH@ActiveState.com>

> I would appreciate it if those of you on Windows and Macintosh
> machines would test the waiting method. The test program will take
> some porting, because it needs to write to a FIFO in background.

This didn't compile under Windows.  I have a patch (against CVS) that
compiles, but doesn't appear to work (and will be forwarded to Eric under
separate cover) [news flash :-)  Changing the open call to add "rb" as the
mode makes it work - text v binary bites again]

I didn't try any sort of fifo test.

The sockets test failed with a socket error, but would certainly have failed
had the socket connected, as my patch includes:

#ifndef S_ISSOCK
#	define S_ISSOCK(mode) (0)
#endif

I have no idea if it managed to mail the results, but I guess not, so the
output is below.  The test file (after some small mods, including the "rb"
param) is indeed 4252 bytes long.

Hope this is useful!

Mark.

This program tests the `waiting' method of file objects.
Good, you're running a patched Python with `waiting' available.

First, plain files:
There are 4252 bytes waiting to be read in this file.
Please check this with your OS's directory tools.
I'll now read a random number (3091) of bytes.
The waiting method sees 1161 bytes left.
3091 + 1161 = 4252.  That's consistent.  Test passed.
Now let's see if we can detect EOF reliably.
I'll do a read()...the waiting method now returns 0
That looks like EOF.

Now sockets:
Connecting to imap.netaxs.com's IMAP server now...
Traceback (most recent call last):
  File "c:\temp\waiting_test.py", line 57, in ?
    sock.connect(("imap.netaxs.com", 143))
  File "<string>", line 1, in connect
socket.error: (10060, 'Operation timed out')



From nas@arctrix.com  Thu Jan 25 13:07:53 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 05:07:53 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14960.24223.599357.388059@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 25, 2001 at 12:13:03PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain>
Message-ID: <20010125050753.A1573@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:13:03PM -0500, Jeremy Hylton wrote:
> What would it take to add useful dependency information to the
> Makefile?  Or does it already exist?

Some of it exists but I don't think it's complete.

> When I was working the nested scopes, building was tedious at times
> because a change to funcobject.h meant that, e.g., newmodule.c needed
> to be recompiled.  The Makefiles didn't capture that information, so I
> had been adding it to the individual Makefiles, e.g.
> 
> newmodule.o: newmodule.c ../Include/funcobject.h
> 
> (I think this worked.)


Hmm, I don't think so.  Which makefile did you add this to?  Are
you using the new makefile?  The Makefile.pre.in file contains a
line like:

    $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

but newmodule.o is not in LIBRARY_OBJS.  By default it's not compiled
by make but by distutils.  If you add newmodule to Setup then a
line like:

    Modules/newmodule.o: $(PYTHON_HEADERS)

would do the trick.  I think I will add a line like:

    $(MODOBJS): $(PYTHON_HEADERS)

to fix the problem.

I could easily restore the mkdep target but my feeling right now is
that explicitly including the header dependencies is better.
What do you think?  

  Neil


From jeremy@alum.mit.edu  Thu Jan 25 20:02:46 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:02:46 -0500 (EST)
Subject: [Python-Dev] PEP 227 checkins to follow
Message-ID: <14960.34406.342961.834827@localhost.localdomain>

I am about to check in the changes that implement PEP 227.  There
are many changes, which I will make via separate commits.  You might
want to wait until the checkins are done to do an update.  I'll send a
note when I'm done.

I also wanted to mention that the PEP has fallen a little out of
date.  There are a few wrinkles that it doesn't deal with, e.g.
    def f(x):
        def g(y):
            return x + y
        del x
        return g

For now, this raises a SyntaxError.
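
For contrast, the case without the `del' is the ordinary nested-scope
behavior the PEP describes.  The sketch below assumes a present-day
Python 3 interpreter; the attribute names used to peek at the cell
(__closure__, cell_contents, co_cellvars) are the modern spellings and
are shown only to make the shared variable visible.

```python
def f(x):
    def g(y):
        # x is not local to g; it is found in the enclosing scope of f.
        return x + y
    return g

add_one = f(1)
print(add_one(2))

# The enclosing function marks x as a cell variable, and the inner
# function carries a reference to that cell.
print(f.__code__.co_cellvars)
print(add_one.__closure__[0].cell_contents)
```

Deleting x in f after g has been defined would pull the variable out
from under g, which is why that wrinkle needs a ruling.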

I'll flesh out the PEP to reflect the current implementation and spec
out some of the less obvious cases.

I'd welcome any comments on the code itself.  I know there are a
number of rough edges and also, most likely, a bunch of memory leaks.
I'll be working to clean things up before 2.1a2, but wanted to get the
code into CVS ASAP.

Jeremy


From jeremy@alum.mit.edu  Thu Jan 25 20:15:01 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:15:01 -0500 (EST)
Subject: [Python-Dev] checkins done for PEP 227
Message-ID: <14960.35141.237252.468467@localhost.localdomain>

It looks like python-dev is very slow, so you'll see my original
warning well after the checkins occurred.  Oh, well.  They're done.

Jeremy



From tim.one@home.com  Thu Jan 25 20:58:03 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 15:58:03 -0500
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution)
Message-ID: <LNBBLJKPBEHFEDALKOLCCEDDILAA.tim.one@home.com>

[/F, fwds a c.l.py claim that random.betavariate is dead wrong]

Not to worry; I had already entered that into the SF bug database and
assigned it to me (hmm:  why would you send it to Python-Dev instead of
putting it in the database?).  I suspect he's correct, and, more
importantly, so does Ivan Frohne.  We'll settle it before 2.1a2, but perhaps
not today.  Alas, I have no idea where the original code came from ("Guido"
isn't a useful answer -- he was just converting somebody else's C++ code to
Python).



From fredrik@effbot.org  Thu Jan 25 20:42:05 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 21:42:05 +0100
Subject: [Python-Dev] Waiting method for file objects
References: <20010125111936.A23512@thyrsus.com>
Message-ID: <01fb01c0870f$48517110$e46940d5@hagrid>

eric wrote:

> Fortunately, this is less an issue than it appears.

only if you ignore Windows...

-1 on making this a file method

+0 on adding it as an optional support function to
the os module.

</F>



From martin@mira.cs.tu-berlin.de  Thu Jan 25 20:42:39 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 25 Jan 2001 21:42:39 +0100
Subject: [Python-Dev] jeremy@alum.mit.edu
Message-ID: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>

> It would be great if the Makefile captured all the dependencies.

That would be great, yes. However, setup.py should probably also
consider dependencies.

> Could we just use makedepend?

Not sure. Certainly not in the build process. I dislike distributions
which, as the first thing, perform dependency generation. Dependencies
change less often than the actual source, so it should be
sufficient to update them manually. Furthermore, generated files as
part of the CVS repository fail to work properly unless everybody uses
the exact same generator. For autoconf alone, that's a problem because
of multiple autoconf versions. I don't know how many different
makedepend versions are in use.

Regards,
Martin



From tim.one@home.com  Thu Jan 25 21:02:11 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 16:02:11 -0500
Subject: [Python-Dev] Windows compile broken
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>

Linking...
   Creating library ./python21.lib and object ./python21.exp
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
./python21.dll : fatal error LNK1120: 3 unresolved externals
Error executing link.exe.


Sorry if this has already been discussed.  I don't see mention of it in the
Python-Dev archive, and my email is almost worse than useless (random delays
of minutes to days, due to what appears to be the simultaneous worldwide
wedging of every email server servicing every email account I have).



From esr@thyrsus.com  Thu Jan 25 21:12:25 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:12:25 -0500
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <01fb01c0870f$48517110$e46940d5@hagrid>; from fredrik@effbot.org on Thu, Jan 25, 2001 at 09:42:05PM +0100
References: <20010125111936.A23512@thyrsus.com> <01fb01c0870f$48517110$e46940d5@hagrid>
Message-ID: <20010125161225.A24305@thyrsus.com>

Fredrik Lundh <fredrik@effbot.org>:
> > Fortunately, this is less an issue than it appears.
> 
> only if you ignore Windows...

I don't understand this.  Explain?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"


From esr@thyrsus.com  Thu Jan 25 21:13:31 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:13:31 -0500
Subject: [Python-Dev] jeremy@alum.mit.edu
In-Reply-To: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 25, 2001 at 09:42:39PM +0100
References: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>
Message-ID: <20010125161331.B24305@thyrsus.com>

Martin v. Loewis <martin@mira.cs.tu-berlin.de>:
> Not sure. Certainly not in the build process. I dislike distributions
> which, as the first thing, perform dependency generation. Dependencies
> change less often than the actual source, so it should be
> sufficient to update them manually. Furthermore, generated files as
> part of the CVS repository fail to work properly unless everybody uses
> the exact same generator. For autoconf alone, that's a problem because
> of multiple autoconf versions. I don't know how many different
> makedepend versions are in use.

Easily solved -- there are script versions of makedepend we can just ship
with the distribution.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Morality is always the product of terror; its chains and
strait-waistcoats are fashioned by those who dare not trust others,
because they dare not trust themselves, to walk in liberty.
	-- Aldous Huxley 


From mal@lemburg.com  Thu Jan 25 21:26:04 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 25 Jan 2001 22:26:04 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
Message-ID: <3A7099EC.81689EA5@lemburg.com>

Tim Peters wrote:
> 
> Linking...
>    Creating library ./python21.lib and object ./python21.exp
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
> frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
> ./python21.dll : fatal error LNK1120: 3 unresolved externals
> Error executing link.exe.
> 
> Sorry if this has already been discussed.  I don't see mention of it in the
> Python-Dev archive, and my email is almost worse than useless (random delays
> of minutes to days, due to what appears to be the simultaneous worldwide
> wedging of every email server servicing every email account I have).

These must be related to checkins by Jeremy and his nested
scopes... (I knew these would get us into trouble ;-)

I think Jeremy forgot to check in the needed change for 
Objects/Makefile.in and probably the Windows project file is
missing the new object type too (Objects/cellobject.c).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy@alum.mit.edu  Thu Jan 25 21:14:52 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:14:52 -0500 (EST)
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
 <3A7099EC.81689EA5@lemburg.com>
Message-ID: <14960.38732.773129.793360@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:

  MAL> Tim Peters wrote:
  >>
  >> Linking...  Creating library ./python21.lib and object
  >> ./python21.exp ceval.obj : error LNK2001: unresolved external
  >> symbol _PyCell_Set ceval.obj : error LNK2001: unresolved external
  >> symbol _PyCell_Get frameobject.obj : error LNK2001: unresolved
  >> external symbol _PyCell_New ./python21.dll : fatal error LNK1120:
  >> 3 unresolved externals Error executing link.exe.
  >>
  >> Sorry if this has already been discussed.  I don't see mention of
  >> it in the Python-Dev archive, and my email is almost worse than
  >> useless (random delays of minutes to days, due to what appears to
  >> be the simultaneous worldwide wedging of every email server
  >> servicing every email account I have).

  MAL> These must be related to checkins by Jeremy and his nested
  MAL> scopes... (I knew these would get us into trouble ;-)

Just you wait and see!

  MAL> I think Jeremy forgot to check in the needed change for
  MAL> Objects/Makefile.in and probably the Windows project file is
  MAL> missing the new object type too (Objects/cellobject.c).

That's right.  I didn't change the Makefile in Objects or do anything
with Windows.  Don't know how to do the latter, but perhaps Tim will
stop by my desk next week and show me.  As for the Makefile, I thought
I saw a message from Neil saying not to update those anymore.

Jeremy


From nas@arctrix.com  Thu Jan 25 15:10:56 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:10:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Thu, Jan 25, 2001 at 12:04:16PM -0800
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125071056.A2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
> A cell contains a reference to a single PyObject.  It could be
> implemented as a mutable, one-element sequence, but the separate type
> has less overhead.

Can this object be involved in reference cycles?  If so, it
should probably have the GC methods added to it.

  Neil
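
For readers unfamiliar with cells, here is a minimal sketch in present-day
Python. The `nonlocal` statement and the `__closure__` attribute postdate
this thread, so this illustrates the idea rather than the 2.1
implementation. A nested function reaches a variable of its enclosing
function through a cell, which behaves much like the one-element mutable
sequence mentioned in the checkin message. The function names are made up
for the example.

```python
def make_counter():
    """Return a closure whose 'count' variable lives in a cell."""
    count = 0

    def bump():
        nonlocal count  # rebinding goes through the shared cell
        count += 1
        return count

    return bump


def make_counter_with_list():
    """Same behaviour, emulating the cell with a one-element list."""
    box = [0]

    def bump():
        box[0] += 1
        return box[0]

    return bump


bump = make_counter()
bump()
bump()
cell = bump.__closure__[0]  # the cell object holding 'count'
print(cell.cell_contents)

list_bump = make_counter_with_list()
list_bump()
print(list_bump())
```

The list version works without any interpreter support, which is presumably
why it is named as the alternative; the dedicated type only saves overhead.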


From jeremy@alum.mit.edu  Thu Jan 25 21:42:04 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:42:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <20010125071056.A2390@glacier.fnational.com>
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
 <20010125071056.A2390@glacier.fnational.com>
Message-ID: <14960.40364.594582.353511@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas@arctrix.com> writes:

  NS> On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
  >> A cell contains a reference to a single PyObject.  It could be
  >> implemented as a mutable, one-element sequence, but the separate
  >> type has less overhead.

  NS> Can this object be involved in reference cycles?  If so, it
  NS> should probably have the GC methods added to it.

It's already there.  (Last five lines of cellobject.c quoted as
proof.) 

>	Py_TPFLAGS_DEFAULT | Py_TPFLAGS_GC,	/* tp_flags */
> 	0,					/* tp_doc */
> 	(traverseproc)cell_traverse,		/* tp_traverse */
> 	(inquiry)cell_clear,			/* tp_clear */
>};
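
Neil's question can be made concrete with a small sketch, assuming a
current CPython (the `weakref` and `gc` usage below is an illustration, not
something taken from cellobject.c). A nested function that refers to itself
forms the cycle function -> closure tuple -> cell -> function, which
reference counting alone cannot free; the traverse and clear slots quoted
above are what let the collector break it.

```python
import gc
import weakref


def make_self_referencing_function():
    def inner():
        return inner  # 'inner' is a free variable, so it is stored in a cell
    return inner


def cycle_is_collected():
    """Return (alive_before, alive_after) around an explicit gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        func = make_self_referencing_function()
        ref = weakref.ref(func)
        del func  # only the cycle through the cell keeps the function alive
        alive_before = ref() is not None
        gc.collect()
        alive_after = ref() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after


print(cycle_is_collected())
```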


From nas@arctrix.com  Thu Jan 25 15:19:22 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:19:22 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>; from mal@lemburg.com on Thu, Jan 25, 2001 at 10:26:04PM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com>
Message-ID: <20010125071922.B2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> I think Jeremy forgot to check in the needed change for 
> Objects/Makefile.in

That file is dead.  Should I remove it now?  I haven't heard any
major complaints about Makefile.pre.in yet.  Maybe the messages
are all sitting in the python.org mail spool.  Barry, what the
hell is going on?  You need to drop that Postfix crap and get
qmail. :-)

  Neil


From thomas@xs4all.net  Thu Jan 25 22:19:37 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 23:19:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>; from fdrake@users.sourceforge.net on Thu, Jan 25, 2001 at 02:13:36PM -0800
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125231937.I962@xs4all.nl>

On Thu, Jan 25, 2001 at 02:13:36PM -0800, Fred L. Drake wrote:

> The addition of new parameters to functions in the Python/C API requires
> that PYTHON_API_VERSION be incremented.

When we update the API version, isn't it time to clean up the TP_HASFEATURE
stuff ? Since we updated the API, all the current slots should be there,
right ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Thu Jan 25 22:32:32 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 17:32:32 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: Your message of "Thu, 25 Jan 2001 23:19:37 +0100."
 <20010125231937.I962@xs4all.nl>
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>
 <20010125231937.I962@xs4all.nl>
Message-ID: <200101252232.RAA20013@cj20424-a.reston1.va.home.com>

> > The addition of new parameters to functions in the Python/C API requires
> > that PYTHON_API_VERSION be incremented.
> 
> When we update the API version, isn't it time to clean up the TP_HASFEATURE
> stuff ? Since we updated the API, all the current slots should be there,
> right ?

No, we're issuing a warning about old API versions but still try to
work with them.  After all most extensions don't create frame or code
objects.

I added the flags for the tp_richcompare field when I tried 2.1a1 with
Zope's ExtensionClasses and Acquisition modules.  Turns out I got a
core dump, while 2.1 ran flawlessly.  The reason: they have their own
type struct which has the same lay-out as the Python 1.5.2 (or even
older) type struct, followed by fields of their own.  They have the
tp_flags field set to 0, so up to 2.0, it was compatible.  I expect
that 2.1a2 will work with the unchanged Zope code because of the flag
I added.

--Guido van Rossum (home page: http://www.python.org/~guido/)
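
The flag mechanism described above can be modelled in a few lines of
Python. This is only a sketch: the flag value, class names, and attribute
names are invented for illustration and are not the real C definitions. The
point is that the interpreter looks at a newer slot only when the type's
flags say the slot exists, so a type with flags set to 0 and private data
where the new slot would be is never misread.

```python
# Hypothetical flag bit, chosen only for illustration.
HAVE_RICHCOMPARE = 1 << 0


class OldLayoutType:
    """Stands in for an extension type laid out like the 1.5.2 struct."""
    flags = 0                        # no feature bits set
    private_field = "extension data" # sits where a newer slot would be


class NewLayoutType:
    """Stands in for a type compiled against the newer struct."""
    flags = HAVE_RICHCOMPARE
    richcompare = staticmethod(lambda a, b: a == b)


def get_richcompare(tp):
    """Return the rich-comparison slot only if the type advertises it."""
    if tp.flags & HAVE_RICHCOMPARE:
        return tp.richcompare
    return None  # never touch the slot on old-layout types


print(get_richcompare(OldLayoutType))
print(get_richcompare(NewLayoutType)(1, 1))
```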


From mal@lemburg.com  Thu Jan 25 23:04:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 00:04:54 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>
Message-ID: <3A70B116.12BF756B@lemburg.com>

Neil Schemenauer wrote:
> 
> On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> > I think Jeremy forgot to check in the needed change for
> > Objects/Makefile.in
> 
> That file is dead.  Should I remove it now?  I haven't heard any
> major complaints about Makefile.pre.in yet.

What about that file ? Are you saying that Makefile.pre.in
will no longer work in 2.1 ??? 

Please don't remove that mechanism -- it has been in use for
quite a while and is much more stable than distutils. We should
at least wait a few more distutils releases for the dust to
settle before removing the old fallback solution.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Thu Jan 25 23:06:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 18:06:40 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: Your message of "Fri, 26 Jan 2001 00:04:54 +0100."
 <3A70B116.12BF756B@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>
 <3A70B116.12BF756B@lemburg.com>
Message-ID: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>

> > That file is dead.  Should I remove it now?  I haven't heard any
> > major complaints about Makefile.pre.in yet.
> 
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 
> 
> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

Let's at least mark it clearly as obsolete though -- it's a pain to
maintain.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@arctrix.com  Thu Jan 25 16:31:28 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:31:28 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A70B116.12BF756B@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 12:04:54AM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com>
Message-ID: <20010125083128.A2699@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 

I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
you talking about?  Modules/Makefile.pre.in is dead too.  There
is a Makefile.pre.in in the toplevel directory which does the
same thing.  There is also Misc/Makefile.pre.in.  That file gets
installed into lib and still works as it always did.  The toplevel
Makefile.pre.in can use Modules/Setup* just like the old
Modules/Makefile.pre.in could.  Does this address your concerns?

> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

No doubt.

  Neil


From nas@arctrix.com  Thu Jan 25 16:33:48 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:33:48 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 25, 2001 at 06:06:40PM -0500
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <200101252306.SAA20173@cj20424-a.reston1.va.home.com>
Message-ID: <20010125083348.B2699@glacier.fnational.com>

On Thu, Jan 25, 2001 at 06:06:40PM -0500, Guido van Rossum wrote:
> Let's at least mark it clearly as obsolete though -- it's a pain to
> maintain.

Are you talking about Misc/Makefile.pre.in?  If so, how do you
suggest we mark it?

I don't think Modules/Setup should go away any time soon.  I
often like to build lots of modules statically into the
interpreter.  setup.py has no support for building static
modules.

  Neil
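
One way to see which modules ended up linked statically into a given
interpreter is `sys.builtin_module_names`. A minimal sketch follows; the
helper name is made up, and the exact list varies from build to build.

```python
import sys


def is_statically_linked(module_name):
    """True if the module is compiled into this interpreter binary."""
    return module_name in sys.builtin_module_names


print(is_statically_linked("sys"))
print(sorted(sys.builtin_module_names)[:5])
```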


From tim.one@home.com  Thu Jan 25 23:27:52 2001
From: tim.one@home.com (Tim Peters)
Date: Thu, 25 Jan 2001 18:27:52 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <14960.38732.773129.793360@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBILAA.tim.one@home.com>

Thanks for the clues, everyone!  I'll fix it for Windows.  Note that I'm
getting email in wild bursts, and most often delayed.  So I'm generally not
seeing any checkin msgs, or SF bug email, or Python-Dev email, ..., anywhere
near the time (or, alas, sometimes even day) they're generated.  So I simply
didn't see the checkin msg introducing cellobject.c.

all's-well-that-looks-like-it-may-end-ly y'rs  - tim



From mal@lemburg.com  Fri Jan 26 09:32:14 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:32:14 +0100
Subject: [Python-Dev] Makefile.pre.in (Windows compile broken)
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <20010125083128.A2699@glacier.fnational.com>
Message-ID: <3A71441E.4584A5C8@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> > What about that file ? Are you saying that Makefile.pre.in
> > will no longer work in 2.1 ???
> 
> I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
> you talking about?  Modules/Makefile.pre.in is dead too.  There
> is a Makefile.pre.in in the toplevel directory which does the
> same thing.  There is also Misc/Makefile.pre.in.  That file gets
> installed into lib and still works as it always did.  The toplevel
> Makefile.pre.in can use Modules/Setup* just like the old
> Modules/Makefile.pre.in could.  Does this address your concerns?

Yes. Thanks. I was talking about the Misc/Makefile.pre.in mechanism
which was used in the past by many Python C extensions to provide
a portable way of compiling the extension into a shared module or
statically into the Python interpreter.
 
I have been using that mechanism for years now and with much
success. Even though I am currently moving to distutils I have
no idea how stable distutils is on exotic platforms or ones which
have special needs (like e.g. AIX).

> > Please don't remove that mechanism -- it has been in use for
> > quite a while and is much more stable than distutils. We should
> > at least wait a few more distutils releases for the dust to
> > settle before removing the old fallback solution.
> 
> No doubt.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Fri Jan 26 09:37:12 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:37:12 +0100
Subject: [Python-Dev] setup.py
Message-ID: <3A714548.C487DCC9@lemburg.com>

I have posted two messages here regarding the new setup.py
mechanism for building Modules/ but have received no comments
on them so far. Here's another go:

1. I think that setup.py should output warnings about modules 
   which cannot be built for some reason rather than aborting
   the build process completely.

2. I suggest adding -L/usr/lib/termcap to the readline extension.
   This doesn't hurt anywhere and will get this extension to compile
   on SuSE Linux too.

Thoughts ?
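
Point 1 could look roughly like the sketch below. It does not use the
distutils API; `build_all`, `fake_build`, and the module list are
hypothetical stand-ins, and the failure message is invented. The idea is
simply to catch a per-module failure, report it, and carry on.

```python
import sys


def build_all(modules, build_one):
    """Try every module; warn about failures instead of stopping."""
    built, failed = [], []
    for name in modules:
        try:
            build_one(name)
        except Exception as exc:
            print("warning: skipping module %r: %s" % (name, exc),
                  file=sys.stderr)
            failed.append(name)
        else:
            built.append(name)
    return built, failed


def fake_build(name):
    """Pretend build step: only 'readline' fails in this demo."""
    if name == "readline":
        raise RuntimeError("link step failed")


built, failed = build_all(["struct", "readline", "time"], fake_build)
print("built:", built)
print("failed:", failed)
```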

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Fri Jan 26 12:27:56 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 07:27:56 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A714548.C487DCC9@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 10:37:12AM +0100
References: <3A714548.C487DCC9@lemburg.com>
Message-ID: <20010126072756.A5013@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> 1. I think that setup.py should output warnings about modules 
>    which cannot be built for some reason rather than aborting
>    the build process completely.
> 
> 2. I suggest adding -L/usr/lib/termcap to the readline extension.
>    This doesn't hurt anywhere and will get this extension to compile
>    on SuSE Linux too.

Both good ideas.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.


From mal@lemburg.com  Fri Jan 26 14:13:45 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:13:45 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com>
Message-ID: <3A718619.6278AF41@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > 1. I think that setup.py should output warnings about modules
> >    which cannot be built for some reason rather than aborting
> >    the build process completely.
> >
> > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> >    This doesn't hurt anywhere and will get this extension to compile
> >    on SuSE Linux too.
> 
> Both good ideas.

Should I implement the two and check these in ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Fri Jan 26 14:25:59 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 09:25:59 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A718619.6278AF41@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 03:13:45PM +0100
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com>
Message-ID: <20010126092559.A5623@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> "Eric S. Raymond" wrote:
> > 
> > M.-A. Lemburg <mal@lemburg.com>:
> > > 1. I think that setup.py should output warnings about modules
> > >    which cannot be built for some reason rather than aborting
> > >    the build process completely.
> > >
> > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > >    This doesn't hurt anywhere and will get this extension to compile
> > >    on SuSE Linux too.
> > 
> > Both good ideas.
> 
> Should I implement the two and check these in ?

I may not channel Guido the way Tim does, but I suspect he gave you
developer privileges because he trusts you to do routine stuff like this.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The saddest life is that of a political aspirant under democracy. His
failure is ignominious and his success is disgraceful.
        -- H.L. Mencken


From mal@lemburg.com  Fri Jan 26 14:29:18 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:29:18 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com> <20010126092559.A5623@thyrsus.com>
Message-ID: <3A7189BE.C6C2806E@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > "Eric S. Raymond" wrote:
> > >
> > > M.-A. Lemburg <mal@lemburg.com>:
> > > > 1. I think that setup.py should output warnings about modules
> > > >    which cannot be built for some reason rather than aborting
> > > >    the build process completely.
> > > >
> > > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > > >    This doesn't hurt anywhere and will get this extension to compile
> > > >    on SuSE Linux too.
> > >
> > > Both good ideas.
> >
> > Should I implement the two and check these in ?
> 
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Just asking because setup.py is Andrew's baby. I'll add the above
two later today.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mwh21@cam.ac.uk  Fri Jan 26 16:40:47 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 26 Jan 2001 16:40:47 +0000
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
Message-ID: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>

Following discussion on c.l.py I've just submitted:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103441&group_id=5470

which implements a syntax for adding function attributes inline:

>>> def f(a) having (publish=1):
...  print 1
... 
>>> f.publish
1

It uses an "import-as" like strategy to avoid making "having" a
keyword (which interacts a bit badly with error reporting, as it
happens).  Obviously, it would be easy to change "having" to a
different word.

Another idea I had was:

>>> def f(a) having (.publish=1):
...  print 1
... 
>>> f.publish
1

to emphasize the attributeness of what's going on, but I didn't like
this as much in practice (I always forgot the period!).

Emile van Sebille also suggested

>>> d = {'a':1}
>>> def f(a) having (**d):
...  print 1
... 
>>> f.a
1

which I haven't implemented, because I didn't really like it, but I
thought I'd mention.

I'll do test suites and documentation in time, but I thought I'd call
in here to check the idea wasn't DOA.  What do you all think?

Cheers,
M.

-- 
  surely, somewhere, somehow, in the history of computing, at least
  one manual has been written that you could at least remotely
  attempt to consider possibly glancing at.              -- Adam Rixey




From nas@arctrix.com  Fri Jan 26 09:55:57 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 01:55:57 -0800
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
In-Reply-To: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>; from mwh21@cam.ac.uk on Fri, Jan 26, 2001 at 04:40:47PM +0000
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <20010126015556.A4215@glacier.fnational.com>

I don't see what's wrong with:

    def f(a):
        print 1
    f.publish = 1

It's perfectly clear to me.  As a bonus it works already.  I'm -1
on inventing more syntax.

  Neil
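
For comparison, here is the plain-assignment form in present-day Python,
next to a helper that gets close to the proposed `having` spelling without
new syntax. The `with_attrs` helper is hypothetical, and the decorator
syntax it relies on did not exist at the time of this thread.

```python
def f(a):
    print(1)


f.publish = 1  # plain attribute assignment after the def


def with_attrs(**attrs):
    """Hypothetical helper: attach attributes to the decorated function."""
    def decorate(func):
        for name, value in attrs.items():
            setattr(func, name, value)
        return func
    return decorate


@with_attrs(publish=1)
def g(a):
    print(1)


print(f.publish, g.publish)
```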


From evan@digicool.com  Fri Jan 26 17:12:43 2001
From: evan@digicool.com (Evan Simpson)
Date: Fri, 26 Jan 2001 12:12:43 -0500
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <00c001c087bb$322a9720$3e48a4d8@digicool.com>

From: Michael Hudson <mwh21@cam.ac.uk>
> >>> def f(a) having (publish=1):
> ...  print 1

This doesn't really need special syntax.  I would much rather have this (or
something like it) as a way of spelling initialized local variables.  That
is, when I want static local variables, instead of corrupting the function
signature by writing:

def f(x, marker=[], foo=foo)

...I could write:

def f(x) having (marker=[], foo)

Cheers,

Evan @ digicool
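
The default-argument idiom Evan refers to, and a closure-based alternative
that keeps the signature clean, might be sketched as follows in current
Python. All names are invented for the example.

```python
def count_calls(x, _seen=[]):
    """'_seen' is evaluated once, at def time, so it persists across calls."""
    _seen.append(x)
    return len(_seen)


def make_count_calls():
    """Same state, kept in an enclosing scope instead of the signature."""
    seen = []

    def count_calls_closure(x):
        seen.append(x)
        return len(seen)

    return count_calls_closure


print(count_calls("a"), count_calls("b"))

closure_version = make_count_calls()
print(closure_version("a"), closure_version("b"))
```

The first form exposes `_seen` to callers, who can override it by accident;
the second hides the state but costs an extra function definition.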



From jeremy@alum.mit.edu  Fri Jan 26 17:58:24 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 12:58:24 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010125050753.A1573@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
 <14960.24223.599357.388059@localhost.localdomain>
 <20010125050753.A1573@glacier.fnational.com>
Message-ID: <14961.47808.315324.734238@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas@arctrix.com> writes:

  >> When I was working the nested scopes, building was tedious at
  >> times because a change to funcobject.h meant that, e.g.,
  >> newmodule.c needed to be recompiled.  The Makefiles didn't
  >> capture that information, so I had been adding it to the
  >> individual Makefiles, e.g.
  >>
  >> newmodule.o: newmodule.c ../Include/funcobject.h
  >>
  >> (I think this worked.)

  NS> Hmm, I don't think so.  Which makefile did you add this to?

Just to clarify: I added this line to the old Makefile before you
checked the new one in.

  NS> Hmm, I don't think so.  Which makefile did you add this to?  Are
  NS> you using the new makefile?  The Makefile.pre.in file contains a
  NS> line like:

  NS>     $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

  NS> but newmodule.o is not in LIBRARY_OBJS.  By default it's not
  NS> compiled by make but with distutils.  If you add newmodule to
  NS> Setup then a line like:

  NS>     Modules/newmodule.o: $(PYTHON_HEADERS)

  NS> would do the trick.  I think I will add a line like:

  NS>     $(MODOBJS): $(PYTHON_HEADERS)

  NS> to fix the problem.

  NS> I could easily restore the mkdep target but my feeling right now
  NS> is that explicitly including the header dependencies is better.
  NS> What do you think?

Isn't it overkill to have every .o file depend on all the .h files?
If I change cobject.h, there are very few .o files that depend on this
change.  I suppose, however, it's not worth the effort to get it right
at a finer granularity, e.g. that the only files that depend on
cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
and unicodeobject.

Jeremy
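
A rough idea of what finer-grained tracking would involve is sketched below
in Python. The file contents are made up, and the scan finds only direct
`#include "..."` lines; headers pulled in indirectly through Python.h are
exactly what it misses, which is the hard part of doing this properly.

```python
import re

# Matches lines such as:  #include "cobject.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def direct_includers(sources, header):
    """Return sorted names of sources that directly #include the header."""
    return sorted(
        name for name, text in sources.items()
        if header in INCLUDE_RE.findall(text)
    )


# Made-up file contents, for illustration only.
sources = {
    "cobject.c": '#include "Python.h"\n#include "cobject.h"\n',
    "listobject.c": '#include "Python.h"\n',
    "cStringIO.c": '#include "Python.h"\n#include "cobject.h"\n',
}

print(direct_includers(sources, "cobject.h"))
```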





From fdrake@acm.org  Fri Jan 26 20:36:18 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Jan 2001 15:36:18 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>
References: <20010124073155.B32266@glacier.fnational.com>
 <14960.24223.599357.388059@localhost.localdomain>
 <20010125050753.A1573@glacier.fnational.com>
 <14961.47808.315324.734238@localhost.localdomain>
Message-ID: <14961.57282.880552.358709@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 > Isn't it overkill to have every .o file depend on all the .h files?
 > If I change cobject.h, there are very few .o files that depend on this
 > change.  I suppose, however, it's not worth the effort to get it right

  Perhaps.  It's definitely easier to maintain than tracking it more
specifically and better than what we had, so I'll live with it.  ;)

 > at a finer granularity, e.g. that the only files that depend on
 > cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
 > and unicodeobject.

  And py_curses.h, which is also used in _curses_panel.c.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From nas@arctrix.com  Fri Jan 26 13:58:50 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 05:58:50 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>; from jeremy@alum.mit.edu on Fri, Jan 26, 2001 at 12:58:24PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain> <20010125050753.A1573@glacier.fnational.com> <14961.47808.315324.734238@localhost.localdomain>
Message-ID: <20010126055850.C4918@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:58:24PM -0500, Jeremy Hylton wrote:
> Isn't it overkill to have every .o file depend on all the .h files?

Maybe, but Python compiles pretty fast anyhow.  I'd rather err
on the safe side (ie. compiling too much).  Trying to figure out
which of the subheaders a .c file uses when it imports Python.h
would be a lot of work and error prone.  More power to you if you
want to do it.  ;-)

  Neil


From dgoodger@atsautomation.com  Fri Jan 26 21:46:13 2001
From: dgoodger@atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 16:46:13 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>

[CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]

I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
rusty (long live Python!), I don't know my way around configure, and am not
familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
tweaks), but I'm getting caught by the new way of building things. Please
help if you can! Many thanks in advance.

Here's an excerpt of my efforts:

    # cd /tmp/py
    # gunzip -c < python-2.1a1.tgz | tar -rf -
    # cd Python-2.1a1
    # ./configure 2>&1 | tee ../configure.1
    # make 2>&1 | tee ../make.1
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    'import site' failed; use -v for traceback
    Traceback (most recent call last):
      File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
        import sys, os, string, getopt
    ImportError: No module named string

Running ./python results in stack overflow. The old QNX instructions in
README recommend editing Modules/Makefile:
    LDFLAGS=    -N 64k

    # make 2>&1 | tee ../make.2

Same error as first make. But now the stack doesn't overflow.

    # python
    'import site' failed; use -v for traceback
    Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
    Type "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.path
    ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
    '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
    '/tmp/py/Python-2.1a1/Modules']
    >>> ^D

    # fullpath .
    . is //5/tmp/py/Python-2.1a1

The QNX node number prefix '//5' (machine or host number, equivalent to a
'hostname:' prefix for network paths) is being reduced somehow (path
normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
required at the head of the path. Is this something that can be fixed?

I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
to correct this:

    # prefix -A /5=//5

Now /5 points to //5, similar to a link.

    # make 2>&1 | tee ../make.3
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    unable to execute ld: No such file or directory
    running build
    running build_ext
    building 'struct' extension
    creating build
    creating build/temp.qnx-J-PCI-2.1
    cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
        -I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
        build/temp.qnx-J-PCI-2.1/structmodule.o
    creating build/lib.qnx-J-PCI-2.1
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
        build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

QNX doesn't have an 'ld' command. Is configure not getting its info to
setup.py? (Is it supposed to?)

What should I check? I have logs of each of the configure & make runs.
Should I submit this as a bug on SourceForge?

Hope to hear from somebody soon.


David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger@atsautomation.com


From guido@digicool.com  Fri Jan 26 21:52:47 2001
From: guido@digicool.com (Guido van Rossum)
Date: Fri, 26 Jan 2001 16:52:47 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 16:46:13 EST."
 <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <200101262152.QAA26624@cj20424-a.reston1.va.home.com>

> [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> 
> I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> rusty (long live Python!), I don't know my way around configure, and am not
> familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> tweaks), but I'm getting caught by the new way of building things. Please
> help if you can! Many thanks in advance.
> 
> Here's an excerpt of my efforts:
> 
>     # cd /tmp/py
>     # gunzip -c < python-2.1a1.tgz | tar -rf -
>     # cd Python-2.1a1
>     # ./configure 2>&1 | tee ../configure.1
>     # make 2>&1 | tee ../make.1
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     'import site' failed; use -v for traceback
>     Traceback (most recent call last):
>       File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
>         import sys, os, string, getopt
>     ImportError: No module named string
> 
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2
> 
> Same error as first make. But now the stack doesn't overflow.
> 
>     # python
>     'import site' failed; use -v for traceback
>     Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
>     Type "copyright", "credits" or "license" for more information.
>     >>> import sys
>     >>> sys.path
>     ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
>     '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
>     '/tmp/py/Python-2.1a1/Modules']
>     >>> ^D
> 
>     # fullpath .
>     . is //5/tmp/py/Python-2.1a1
> 
> The QNX node number prefix '//5' (machine or host number, equivalent to a
> 'hostname:' prefix for network paths) is being reduced somehow (path
> normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> required at the head of the path. Is this something that can be fixed?

Aha -- you may need QNX-specific path manipulation functions.  What's
going on is that site.py normalizes the entries in sys.path, using
this function:

    def makepath(*paths):
        dir = os.path.join(*paths)
        return os.path.normcase(os.path.abspath(dir))

I've got a feeling that os.path.abspath(dir) here is the culprit in
posixpath.py:

def abspath(path):
    """Return an absolute path."""
    if not isabs(path):
        path = join(os.getcwd(), path)
    return normpath(path)

And here I think that normpath(path) is the routine that actually gets
rid of the double leading /.

Feel free to submit a patch that leaves double leading slashes in if
on QNX.

> I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
> to correct this:
> 
>     # prefix -A /5=//5
> 
> Now /5 points to //5, similar to a link.
> 
>     # make 2>&1 | tee ../make.3
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     unable to execute ld: No such file or directory
>     running build
>     running build_ext
>     building 'struct' extension
>     creating build
>     creating build/temp.qnx-J-PCI-2.1
>     cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
> -I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
> build/temp.qnx-J-PCI-2.1/structmodule.o
>     creating build/lib.qnx-J-PCI-2.1
>     ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
> build/lib.qnx-J-PCI-2.1/struct.so
>     error: command 'ld' failed with exit status 1
>     make: *** [sharedmods] Error 1
> 
> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)
> 
> What should I check? I have logs of each of the configure & make runs.
> Should I submit this as a bug on SourceForge?
> 
> Hope to hear from somebody soon.

This is probably in the realm of the distutils.  I have no idea how to
teach it to build on QNX, sorry!

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Fri Jan 26 22:01:01 2001
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:01:01 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126170101.B2762@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
>    ImportError: No module named string

The 'import string' in setup.py actually seems to be redundant now,
since nothing seems to actually refer to the string module.  I've
removed it from CVS.

>The QNX node number prefix '//5' (machine or host number, equivalent to a
>'hostname:' prefix for network paths) is being reduced somehow (path
>normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
>required at the head of the path. Is this something that can be fixed?

Ooh, very likely:
>>> os.path.normpath('//5/foo/bar')
'/5/foo/bar'

Isn't // at the root a Unix convention of some sort for some 
network filesystems?  Probably normpath() should just leave it alone.

>QNX doesn't have an 'ld' command. Is configure not getting its info to
>setup.py? (Is it supposed to?)

setup.py should be parsing the Makefile.  The old QNX instructions say
Modules/Makefile should be edited, but with Neil's non-recursive
Makefile patch (committed after alpha1's release), editing
Modules/Makefile will have no effect.  Try editing just the top-level
Makefile, which should affect setup.py.

--amk
 


From mal@lemburg.com  Fri Jan 26 22:15:09 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:15:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us>
Message-ID: <3A71F6ED.D6D642A7@lemburg.com>

"Andrew M. Kuchling" wrote:
> >The QNX node number prefix '//5' (machine or host number, equivalent to a
> >'hostname:' prefix for network paths) is being reduced somehow (path
> >normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> >required at the head of the path. Is this something that can be fixed?
> 
> Ooh, very likely:
> >>> os.path.normpath('//5/foo/bar')
> '/5/foo/bar'
> 
> Isn't // at the root a Unix convention of some sort for some
> network filesystems?  Probably normpath() should just leave it alone.

Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
should probably leave the leading '//' untouched (having too
many of those in the path doesn't do any harm, AFAIK).
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From nas@arctrix.com  Fri Jan 26 15:26:12 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:26:12 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126072611.A5345@glacier.fnational.com>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2

The README should be changed to say edit the toplevel Makefile.
Should those flags be the default?  If you can give me the
MACHDEP from your Makefile I can add it to configure.in.

> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)

I'm not sure how distutils figures out what to use for ld.  It
doesn't appear in the Makefile.  I think this is probably some
distutils thing.  Andrew?

  Neil


From fredrik@effbot.org  Fri Jan 26 22:25:34 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Fri, 26 Jan 2001 23:25:34 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com>
Message-ID: <001a01c087e6$ec3b9710$e46940d5@hagrid>

mal wrote:
> > Ooh, very likely:
> > >>> os.path.normpath('//5/foo/bar')
> > '/5/foo/bar'
> > 
> > Isn't // at the root a Unix convention of some sort for some
> > network filesystems?  Probably normpath() should just leave it alone.
> 
> Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> should probably leave the leading '//' untouched (having too
> many of those in the path doesn't do any harm, AFAIK).

from 1.5.2's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    import string
    # Treat initial slashes specially
    slashes = ''
    while path[:1] == '/':
        slashes = slashes + '/'
        path = path[1:]
    ...
    return slashes + string.joinfields(comps, '/')

from 2.0's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    if path == '':
        return '.'
    import string
    initial_slash = (path[0] == '/')
    ...
    if initial_slash:
        path = '/' + path
    return path or '.'

interesting...

Cheers /F



From akuchlin@mems-exchange.org  Fri Jan 26 22:28:03 2001
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:28:03 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126072611.A5345@glacier.fnational.com>; from nas@arctrix.com on Fri, Jan 26, 2001 at 07:26:12AM -0800
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com>
Message-ID: <20010126172803.A2817@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
>I'm not sure how distutils figures out what to use for ld.  It
>doesn't appear in the Makefile.  I think this is probably some
>distutils thing.  Andrew?

It looks at LDSHARED.  See customize_compiler in
Lib/distutils/sysconfig.py.  Looking in Modules/Makefile, LDFLAGS is
only used for the final link to produce a Python executable, so I
think this is up to the Makefile, not setup.py.
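
To make the "it looks at LDSHARED" point concrete, here is a rough
sketch of pulling a variable such as LDSHARED out of Makefile text.
This is not the distutils code (that lives in customize_compiler, as
noted above); read_makefile_vars and the sample Makefile contents are
made up for illustration, and the parser deliberately ignores line
continuations and $(VAR) expansion.

```python
import re

# Matches simple "NAME = value" or "NAME=value" assignments only.
_ASSIGNMENT = re.compile(r'^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$')


def read_makefile_vars(text):
    """Return a dict of the simple variable assignments in Makefile text.

    Hypothetical helper: no continuation lines, no $(VAR) expansion,
    no conditionals.
    """
    variables = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = _ASSIGNMENT.match(line)
        if match:
            variables[match.group(1)] = match.group(2).strip()
    return variables


# Made-up excerpt; the LDSHARED value is a placeholder, not a QNX setting.
SAMPLE = """\
# sample only
CC=         cc
LDSHARED=   my-linker --some-flag
LDFLAGS=    -N 64k
"""

makefile_vars = read_makefile_vars(SAMPLE)
print(makefile_vars.get('LDSHARED'))
```

If LDSHARED names a command that does not exist on the platform, the
extension link step is where the failure would show up, which matches
the 'ld' error in the original report.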

--amk


From nas@arctrix.com  Fri Jan 26 15:56:41 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:56:41 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126172803.A2817@amarok.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Fri, Jan 26, 2001 at 05:28:03PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com> <20010126172803.A2817@amarok.cnri.reston.va.us>
Message-ID: <20010126075641.A5534@glacier.fnational.com>

On Fri, Jan 26, 2001 at 05:28:03PM -0500, Andrew M. Kuchling wrote:
> On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
> >I'm not sure how distutils figures out what to use for ld.
> 
> It looks at LDSHARED.

Okay.  David, what should LDSHARED say for QNX?  I can add the
magic to configure.in.

  Neil


From mal@lemburg.com  Fri Jan 26 22:51:09 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:51:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>
Message-ID: <3A71FF5D.DC609775@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > > Ooh, very likely:
> > > >>> os.path.normpath('//5/foo/bar')
> > > '/5/foo/bar'
> > >
> > > Isn't // at the root a Unix convention of some sort for some
> > > network filesystems?  Probably normpath() should just leave it alone.
> >
> > Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> > should probably leave the leading '//' untouched (having too
> > many of those in the path doesn't do any harm, AFAIK).
> 
> from 1.5.2's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     import string
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     ...
>     return slashes + string.joinfields(comps, '/')
> 
> from 2.0's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     if path == '':
>         return '.'
>     import string
>     initial_slash = (path[0] == '/')
>     ...
>     if initial_slash:
>         path = '/' + path
>     return path or '.'
> 
> interesting...

Here's the log message:

revision 1.34
date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
added rewritten normpath from Moshe Zadka that does the right thing with
paths containing ..

and the diff:

diff -r1.34 -r1.33
349,350d348
<     if path == '':
<         return '.'
352,367c350,372
<     initial_slash = (path[0] == '/')
<     comps = string.split(path, '/')
<     new_comps = []
<     for comp in comps:
<         if comp in ('', '.'):
<             continue
<         if (comp != '..' or (not initial_slash and not new_comps) or 
<              (new_comps and new_comps[-1] == '..')):
<             new_comps.append(comp)
<         elif new_comps:
<             new_comps.pop()
<     comps = new_comps
<     path = string.join(comps, '/')
<     if initial_slash:
<         path = '/' + path
<     return path or '.'
---
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     comps = string.splitfields(path, '/')
>     i = 0
>     while i < len(comps):
>         if comps[i] == '.':
>             del comps[i]
>             while i < len(comps) and comps[i] == '':
>                 del comps[i]
>         elif comps[i] == '..' and i > 0 and comps[i-1] not in ('', '..'):
>             del comps[i-1:i+1]
>             i = i-1
>         elif comps[i] == '' and i > 0 and comps[i-1] <> '':
>             del comps[i]
>         else:
>             i = i+1
>     # If the path is now empty, substitute '.'
>     if not comps and not slashes:
>         comps.append('.')
>     return slashes + string.joinfields(comps, '/')

Revision 1.33 clearly leaves initial slashes untouched.
I guess we should restore this...
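
A minimal sketch of what restoring that behaviour might look like,
assuming we keep the component loop from revision 1.34 and only bring
back the leading-slash handling from 1.33.  This is not the patch that
was committed; normpath_keep_slashes is a hypothetical name, and the
code uses string methods rather than the string module so that it also
runs on later Python versions.

```python
def normpath_keep_slashes(path):
    """Normalize a POSIX-style path but keep every leading slash.

    Hypothetical helper: the loop follows the revision 1.34 code quoted
    above; the leading-slash handling follows revision 1.33.
    """
    if path == '':
        return '.'
    # Set the leading slashes aside untouched (1.33 behaviour).
    stripped = path.lstrip('/')
    slashes = path[:len(path) - len(stripped)]
    rooted = bool(slashes)
    new_comps = []
    for comp in stripped.split('/'):
        if comp in ('', '.'):
            continue
        if (comp != '..' or (not rooted and not new_comps) or
                (new_comps and new_comps[-1] == '..')):
            new_comps.append(comp)
        elif new_comps:
            new_comps.pop()
    # An empty result means the current directory.
    return (slashes + '/'.join(new_comps)) or '.'


print(normpath_keep_slashes('//5/foo/bar'))
```

With this variant the QNX node prefix survives, while '..' and
repeated interior slashes are still collapsed as in 1.34.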

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From nas@arctrix.com  Fri Jan 26 16:12:15 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 08:12:15 -0800
Subject: [Python-Dev] LINKCC defaults to CXX
Message-ID: <20010126081215.B5534@glacier.fnational.com>

Dear lord why?  So people can develop extensions using C++?  It's
not worth the pain inflicted on everyone else.  Let them
recompile with LINKCC=CXX.

Linking with CXX opens a huge can of stinky worms.  First of all,
just because configure found a value for CXX doesn't mean it
works.  Even if it does that doesn't mean that using it is a good
idea.  Linking with CXX will bring in the C++ runtime.  There are
a large number of platforms where the C++ ABI has not been
standardized; for example, anything that used g++.

Can we please leave LINKCC default to CXX?  It's easy enough for
the crazies to override if they like.  I'll even create a
configure option for them.

  Neil


From barry@digicool.com  Fri Jan 26 23:09:57 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Fri, 26 Jan 2001 18:09:57 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
References: <20010126081215.B5534@glacier.fnational.com>
Message-ID: <14962.965.464326.794431@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas@arctrix.com> writes:

    NS> Can we please leave LINKCC default to CXX?

I think you mean default it to CC, eh?  +1


From mal@lemburg.com  Sat Jan 27 00:16:01 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 01:16:01 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <3A721341.3F348E51@lemburg.com>

I just got a request from someone who wants to test the latest
CVS version but unfortunately can't because he's behind a 
firewall.

Is there any chance of reactivating the nightly tarball generation
that was once in place ?

	http://www.python.org/download/cvs.html

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From dgoodger@atsautomation.com  Sat Jan 27 00:30:21 2001
From: dgoodger@atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 19:30:21 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>

Thank you all for your prompt replies. (Guido's was within seconds! Well,
minutes, certainly.)

I'll give it another go on Monday. I've got renovations to fill my weekend.

/David


From thomas@xs4all.net  Sat Jan 27 00:35:41 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sat, 27 Jan 2001 01:35:41 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 07:30:21PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>
Message-ID: <20010127013541.N962@xs4all.nl>

On Fri, Jan 26, 2001 at 07:30:21PM -0500, Goodger, David wrote:

> Thank you all for your prompt replies. (Guido's was within seconds! Well,
> minutes, certainly.)

Oh, the wonderful things one can do with a time machine....

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From jeremy@alum.mit.edu  Fri Jan 26 22:14:26 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 17:14:26 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A721341.3F348E51@lemburg.com>
References: <3A721341.3F348E51@lemburg.com>
Message-ID: <14961.63170.394043.790610@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:

  MAL> I just got a request from someone who wants to test the latest
  MAL> CVS version but unfortunately can't because he's behind a
  MAL> firewall.

  MAL> Is there any chance of reactivating the nightly tarball
  MAL> generation that was once in place ?

  MAL> 	http://www.python.org/download/cvs.html

I plan to set up nightly cvs snapshots soon.  We should be moving into
our new office next week; I hope to have a machine that is on the net
24x7 shortly after that.

Jeremy


From bckfnn@worldonline.dk  Sat Jan 27 07:58:38 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Sat, 27 Jan 2001 07:58:38 GMT
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <14961.63170.394043.790610@localhost.localdomain>
References: <3A721341.3F348E51@lemburg.com> <14961.63170.394043.790610@localhost.localdomain>
Message-ID: <3a727e79.835771@smtp.worldonline.dk>

>>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:
>
>  MAL> I just got a request from someone who wants to test the latest
>  MAL> CVS version but unfortunately can't because he's behind a
>  MAL> firewall.
>
>  MAL> Is there any chance of reactivating the nightly tarball
>  MAL> generation that was once in place ?
>
>  MAL> 	http://www.python.org/download/cvs.html

[Jeremy]

>I plan to set up nightly cvs snapshots soon.  We should be moving into
>our new office next week; I hope to have a machine that is on the net
>24x7 shortly after that.

FWIW, I have been using this cron and shell script running on
shell.sourceforge.net. This way I don't need 24x7 in order to make a cvs
tarball (and .zip) available.


22 2 * * * $HOME/bin/jython-snap



SHOTLABEL=`date +%Y%m%d`
LOGLABEL=log.`date +%Y%m%d`
cd /home/groups/jython/htdocs/cvssnaps
(cvs -Qd :pserver:anonymous@cvs1:/cvsroot/jython checkout -d \
  jython-$SHOTLABEL jython && \
  tar zcf jython-nightly.tar.gz jython-$SHOTLABEL && \
  rm -fr jython-nightly.zip && \
  zip -qr9 jython-nightly.zip jython-$SHOTLABEL && \
  rm -fr jython-$SHOTLABEL) >$LOGLABEL 2>&1


regards,
finn


From tim.one@home.com  Sat Jan 27 09:35:14 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 27 Jan 2001 04:35:14 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <20010126092559.A5623@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>

[Eric S. Raymond]
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Excellent, Eric!  You're batting 1%.  Here's how to boost it to 93%:
whenever a new idea comes up, just grumble "no".  You'll be right 92% of the
time <wink>.

Reminds me of a friend who got sucked into working at a neural-net startup
trying to build a black box to predict whether the daily close of the S&P
500 would be above or below the previous day's.  He was greatly impressed by
the research they had done, showing that the prototype got the right answer
more than half the time when fed historical data, and at a very high
significance level (i.e., it almost certainly did better than flipping a
coin).  What he didn't realize at the time is that if they had written the
prototype in Python:

    # S&P close daily direction predictor
    print "higher"

it would have been right about 2/3rds the time <0.33 wink>.

never-ascribe-to-insight-what-can-be-explained-by-idiocy-ly y'rs  - tim



From martin@mira.cs.tu-berlin.de  Sat Jan 27 09:38:41 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 27 Jan 2001 10:38:41 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>

> Is there any chance of reactivating the nightly tarball generation
> that was once in place ?

What's wrong with

http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

?

Regards,
Martin


From fredrik@effbot.org  Sat Jan 27 10:43:50 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Sat, 27 Jan 2001 11:43:50 +0100
Subject: [Python-Dev] setup.py
References: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>
Message-ID: <008c01c0884e$09bd2030$e46940d5@hagrid>

tim wrote:
> Reminds me of a friend who got sucked into working at a neural-net startup
> trying to build a black box to predict whether the daily close of the S&P
> 500 would be above or below the previous day's.  /.../
> 
>     # S&P close daily direction predictor
>     print "higher"

replace "higher" with "same", and you have a pretty
decent weather predictor.

Cheers /F



From mal@lemburg.com  Sat Jan 27 12:01:30 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 13:01:30 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
Message-ID: <3A72B89A.E03C1912@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Is there any chance of reactivating the nightly tarball generation
> > that was once in place ?
> 
> What's wrong with
> 
> http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> ?

I didn't realize that SF does this automagically. Could someone
please redirect the link on the python.org cvs page to the
above address (David Ascher's tarball generation stopped in
February 2000 !).

Thanks for the hint, Martin.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fdrake@acm.org  Sat Jan 27 13:16:01 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Sat, 27 Jan 2001 08:16:01 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A72B89A.E03C1912@lemburg.com>
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
 <3A72B89A.E03C1912@lemburg.com>
Message-ID: <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>

"Martin v. Loewis" wrote:
 > What's wrong with
 > 
 > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

M.-A. Lemburg writes:
 > I didn't realize that SF does this automagically. Could someone
 > please redirect the link on the python.org cvs page to the
 > above address (David Ascher's tarball generation stopped in
 > February 2000 !).

  Did you want a "snapshot" or a copy of the repository?  What SF
produces is a tarball of the repository, not a snapshot.  We still
need to do something to create snapshots.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From mal@lemburg.com  Sat Jan 27 13:28:40 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 14:28:40 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
 <3A72B89A.E03C1912@lemburg.com> <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>
Message-ID: <3A72CD08.F47DAA69@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> "Martin v. Loewis" wrote:
>  > What's wrong with
>  >
>  > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> M.-A. Lemburg writes:
>  > I didn't realize that SF does this automagically. Could someone
>  > please redirect the link on the python.org cvs page to the
>  > above address (David Ascher's tarball generation stopped in
>  > February 2000 !).
> 
>   Did you want a "snapshot" or a copy of the repository?  What SF
> produces is a tarball of the repository, not a snapshot. 

I meant a copy of what you get when you check out the Python
CVS tree wrapped into a .tar.gz file. The size of the above
archive (16MB) suggests that a lot more is going into the
.tar.gz file. A .tar.gz of the CVS checkout is around 4MB in
size. Looks like we still need to do something after all ;)
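
For what it's worth, the "checkout wrapped into a .tar.gz" step could
be sketched roughly like this.  It assumes a CVS checkout already
exists on disk; make_snapshot and its skip_cvs_dirs option are made up
for illustration (the option to leave out the CVS/ bookkeeping
directories is my own addition, not something requested above).

```python
import os
import tarfile


def make_snapshot(checkout_dir, archive_path, skip_cvs_dirs=False):
    """Pack checkout_dir into a gzip-compressed tarball.

    Hypothetical helper.  Returns the member names stored in the
    archive.  With skip_cvs_dirs=True the CVS/ directories and their
    contents are left out.
    """
    def member_filter(tarinfo):
        # Returning None excludes the entry (and, for a directory,
        # everything below it).
        if skip_cvs_dirs and 'CVS' in tarinfo.name.split('/'):
            return None
        return tarinfo

    top = os.path.basename(os.path.normpath(checkout_dir))
    with tarfile.open(archive_path, 'w:gz') as archive:
        archive.add(checkout_dir, arcname=top, filter=member_filter)
    with tarfile.open(archive_path, 'r:gz') as archive:
        return archive.getnames()
```

Run nightly against a fresh checkout, something along these lines
would produce the small snapshot archive rather than the full
repository tarball.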

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From armin@steinhoff.de  Sat Jan 27 16:24:57 2001
From: armin@steinhoff.de (Armin Steinhoff)
Date: Sat, 27 Jan 2001 17:24:57 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <4.3.2.7.2.20010127170125.00b2ee80@mail.secureweb.de>

Hello Guido,

nice to see the first 2.1 version :)

At 16:52 26.01.01 -0500, you wrote:
> > [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> >
> > I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> > rusty (long live Python!), I don't know my way around configure, and am not
> > familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> > tweaks), but I'm getting caught by the new way of building things. Please
> > help if you can! Many thanks in advance.
> >
> > Here's an excerpt of my efforts:
> >
> >     # cd /tmp/py
> >     # gunzip -c < python-2.1a1.tgz | tar -rf -
> >     # cd Python-2.1a1
> >     # ./configure 2>&1 | tee ../configure.1

I did a fast hack with the new 2.1 version:

CC=cc LINKCC=cc configure --without-gcc --shared=no --without-threads

(Hope '--shared=no' works ... QNX4 doesn't support dynamic loading)
Please replace all references to g++ by cc -> in the main Makefile and the 
Modules/Makefile.
In the Modules/Makefile set LDFLAGS=250K  ... the default stacksize of 32K 
seems to be too small.

> >     # make 2>&1 | tee ../make.1
> >     ...
> >     ./python //5/tmp/py/Python-2.1a1/setup.py build
> >     'import site' failed; use -v for traceback

'python -v' shows that the module 'distutils.util' isn't there ....  it 
seems to be not included in the source distribution.

'import site' failed; traceback:
Traceback (most recent call last):
  File "//1/Python-2.1a1/Lib/site.py", line 85, in ?
    from distutils.util import get_platform
ImportError: No module named distutils.util
                             ^^^^^^^^^^^^^^
[ clip ..]

>This is probably in the realm of the distutils.  I have no idea how to
>teach it to build on QNX, sorry!

IMHO ... it is not a path problem.

In the moment there is no time left for me to go into these details. A 
clean port will happen in a few weeks. Please check out PyQNX for news 
regarding QNX4.25 and QNX6.0  (aka  QNX Neutrino).

Greetings

Armin Steinhoff

Life-Demo of PyDACHS
http://www.dachs.net/PyDACHS_python-tilcon.htm
in our booth at
Embedded Systems 2001, Nuremberg, GER
http://www.embedded-systems-messe.de
Febr. 14-16, 2000            Hall 11, Booth P 04





From guido@digicool.com  Sat Jan 27 16:50:50 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:50:50 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
In-Reply-To: Your message of "Fri, 26 Jan 2001 08:12:15 PST."
 <20010126081215.B5534@glacier.fnational.com>
References: <20010126081215.B5534@glacier.fnational.com>
Message-ID: <200101271650.LAA30720@cj20424-a.reston1.va.home.com>

> Dear lord why?  So people can develop extensions using C++?  It's
> not worth the pain inflicted on everyone else.  Let them
> recompile with LINKCC=CXX.
> 
> Linking with CXX opens a huge can of stinky worms.  First of all,
> just because configure found a value for CXX doesn't mean it
> works.  Even if it does that doesn't mean that using it is a good
> idea.  Linking with CXX will bring in the C++ runtime.  There are
> a large number of platforms where the C++ ABI has not been
> standarized; for example, anything that used g++.
> 
> Can we please leave LINKCC default to CC?  It's easy enough for
> the crazies to override if they like.  I'll even create a
> configure option for them.

Arg.  My bad.  I did this as an experiment; it didn't break on my
machine, but I didn't intend this to become the standard!  Thanks for
changing it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sat Jan 27 16:52:23 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:52:23 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 23:51:09 +0100."
 <3A71FF5D.DC609775@lemburg.com>
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>
 <3A71FF5D.DC609775@lemburg.com>
Message-ID: <200101271652.LAA30750@cj20424-a.reston1.va.home.com>

> revision 1.34
> date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> added rewritten normpath from Moshe Zadka that does the right thing with
> paths containing ..
[...]
> Revision 1.33 clearly leaves initial slashes untouched.
> I guess we should restore this...

Yes, please!  (Just the "leading extra slashes stay" behavior.)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Sat Jan 27 16:57:40 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:57:40 -0500
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: Your message of "Fri, 26 Jan 2001 17:02:09 EST."
 <list-760656@digicool.com>
References: <list-760656@digicool.com>
Message-ID: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>

Barry noticed:

> Anyway, did you know that you can use functions as keys to a
> dictionary, but that you can mutate them to "lose" the element?
> 
> -------------------- snip snip --------------------
> Python 2.0 (#13, Jan 10 2001, 13:06:39) 
> [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> d = {}
> >>> def foo(): pass
> ... 
> >>> def bar(): pass
> ... 
> >>> d[foo] = 1
> >>> d[foo]
> 1
> >>> foocode = foo.func_code
> >>> foo.func_code = bar.func_code
> >>> d[foo]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: <function foo at 0x81ef474>
> >>> d[bar] = 2
> >>> d[bar]
> 2
> >>> d[foo]
> 2
> >>> foo.func_code = foocode
> >>> d[foo]
> 1
> -------------------- snip snip --------------------
> 
> It's because a function's func_code attribute is used in its hash
> calculation, but func_code is writable!

Clearly, something changed.  I'm pretty sure it's the function
attributes.  Either the function attributes shouldn't be used in
comparing function objects, or hash() on functions should be
unimplemented, or comparison on functions should use simple pointer
compares.

What's the right solution?  Do people use functions as dict keys?  If
not, we can remove the hash() implementation.  But I suspect they
*are* used as dict keys.  Not using the __dict__ on comparisons
appears ugly, so probably the best solution is to change function
comparisons to use simple pointer compares.  That removes the
possibility to see whether two different functions implement the same
code -- but does anybody really use that?
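
The pitfall Barry demonstrated is not specific to functions: any object
whose hash depends on mutable state can be "lost" in a dict.  A minimal
sketch in present-day Python 3, using a hypothetical class CodeKeyed (not
from the thread) whose hash and equality follow a writable attribute, much
as func_hash() followed func_code.  The "lost" lookup relies on how
CPython's dict probes by hash, so treat it as an illustration rather than
a language guarantee:

```python
class CodeKeyed:
    """Hypothetical stand-in for a 2.0-era function object."""

    def __init__(self, code):
        self.code = code  # writable, like func_code

    def __hash__(self):
        return hash(self.code)  # hash follows mutable state

    def __eq__(self, other):
        return isinstance(other, CodeKeyed) and self.code == other.code


foo = CodeKeyed(1)
table = {foo: "entry"}

foo.code = 2              # mutate the key after it has been stored
lost = foo not in table   # the lookup now probes with the new hash

foo.code = 1              # restore the original state
found_again = table[foo] == "entry"
```

The entry never left the dict; it only became unreachable while the key's
hash disagreed with the hash recorded at insertion time.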

--Guido van Rossum (home page: http://www.python.org/~guido/)


From moshez@zadka.site.co.il  Sat Jan 27 17:17:50 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 19:17:50 +0200 (IST)
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
References: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>, <list-760656@digicool.com>
Message-ID: <20010127171750.91412A840@darjeeling.zadka.site.co.il>

On Sat, 27 Jan 2001 11:57:40 -0500, Guido van Rossum <guido@digicool.com> wrote:

(about function hash doing the wrong thing)
> What's the right solution?

I have no idea...

>  Do people use functions as dict keys?  If
> not, we can remove the hash() implementation.

...but this ain't it.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From gvwilson@ca.baltimore.com  Sat Jan 27 17:23:42 2001
From: gvwilson@ca.baltimore.com (Greg Wilson)
Date: Sat, 27 Jan 2001 12:23:42 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1119 - 17 msgs
In-Reply-To: <20010127170103.DA6DEEA44@mail.python.org>
Message-ID: <000001c08885$e5418c40$770a0a0a@nevex.com>

> Guido wrote:
> What's the right solution?  Do people use functions as dict keys?

Yup --- even use this as an example in the course (part of drumming
home to students that functions are just a special kind of data).

Greg


From barry@digicool.com  Sat Jan 27 17:43:43 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:43:43 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
 <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
Message-ID: <14963.2255.268933.615456@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:

    GvR> Clearly, something changed.  I'm pretty sure it's the
    GvR> function attributes.

Actually no.  func_code is used in func_hash() but somewhere in the
Python 1.6 cycle, func_code was made assignable.
    
    GvR> Either the function attributes shouldn't be used in comparing
    GvR> function objects, or hash() on functions should be
    GvR> unimplemented, or comparison on functions should use simple
    GvR> pointer compares.

    GvR> What's the right solution?

We should definitely continue to allow functions as keys to
dictionaries, but probably just remove func_code as an input to the
function's hash.
    
-Barry


From barry@digicool.com  Sat Jan 27 17:48:33 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:48:33 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
 <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
 <14963.2255.268933.615456@anthem.wooz.org>
Message-ID: <14963.2545.14600.667505@anthem.wooz.org>

    Me> We should definitely continue to allow functions as keys to
    Me> dictionaries, but probably just remove func_code as an input
    Me> to the function's hash.
    
But of course, func_globals won't be sufficient as a hash for
functions.  Probably changing the hash to a pointer compare is the
best thing after all.

-Barry


From guido@digicool.com  Sat Jan 27 17:49:16 2001
From: guido@digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 12:49:16 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
In-Reply-To: Your message of "Sat, 27 Jan 2001 12:43:43 EST."
 <14963.2255.268933.615456@anthem.wooz.org>
References: <list-760656@digicool.com> <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
 <14963.2255.268933.615456@anthem.wooz.org>
Message-ID: <200101271749.MAA32025@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:
> 
>     GvR> Clearly, something changed.  I'm pretty sure it's the
>     GvR> function attributes.
> 
> Actually no.  func_code is used in func_hash() but somewhere in the
> Python 1.6 cycle, func_code was made assignable.

Argh!  You're right.

>     GvR> Either the function attributes shouldn't be used in comparing
>     GvR> function objects, or hash() on functions should be
>     GvR> unimplemented, or comparison on functions should use simple
>     GvR> pointer compares.
> 
>     GvR> What's the right solution?
> 
> We should definitely continue to allow functions as keys to
> dictionaries, but probably just remove func_code as an input to the
> function's hash.

OK, that settles it.  There's not much point in having a function
compare do anything besides a pointer comparison when the code objects
aren't compared.  (Two completely different functions could compare
equal, e.g. if they have the same attribute dict.)  So we should just
punt, and compare functions by object pointer.

The proper way to do this is to *delete* func_hash and func_compare
from funcobject.c -- the default comparison will take care of this.
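
The effect of comparing and hashing functions by identity can be sketched
in present-day Python 3, where func_code is spelled __code__ (that
spelling, and the names foo, bar and table, are assumptions for
illustration, not part of the patch under discussion):

```python
def foo():
    return "foo"


def bar():
    return "bar"


table = {foo: 1}
hash_before = hash(foo)

# Swap in bar's code object; foo remains the same object.
foo.__code__ = bar.__code__

hash_after = hash(foo)
still_there = table[foo]   # lookup is unaffected by the code swap
same_function = foo == bar  # equality is identity, so still different
result = foo()              # foo now runs bar's code
```

Because neither the hash nor the comparison looks at the code object, the
dict entry stays reachable whatever is assigned to it.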

--Guido van Rossum (home page: http://www.python.org/~guido/)


From akuchlin@mems-exchange.org  Sat Jan 27 18:58:30 2001
From: akuchlin@mems-exchange.org (A.M. Kuchling)
Date: Sat, 27 Jan 2001 13:58:30 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org>
Message-ID: <200101271858.NAA04898@mira.erols.com>

On Sat, 27 Jan 2001 18:28:02 +0100, 
	Andreas Jung <andreas@andreas-jung.com> wrote:
>Is there a reason why 2.1 runs significantly slower ?
>Both Python versions were compiled with -g -O2 only.

[CC'ing to python-dev]  Confirmed:

[amk@mira Python-2.0]$ ./python Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.14
This machine benchmarks at 3184.71 pystones/second
[amk@mira Python-2.0]$ python2.1 Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.81
This machine benchmarks at 2624.67 pystones/second

The ceval.c changes seem a likely candidate to have caused this.
Anyone want to run Marc-Andre's microbenchmarks and see how the
numbers have changed?
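
For reference, the pystones/second figure is simply passes divided by
elapsed seconds, so the two runs above can be compared directly.  A small
sketch in present-day Python 3 (variable names are illustrative only):

```python
passes = 10000
seconds_20 = 3.14   # Python 2.0 run quoted above
seconds_21 = 3.81   # Python 2.1 run quoted above

rate_20 = passes / seconds_20
rate_21 = passes / seconds_21
slowdown = seconds_21 / seconds_20 - 1.0

print(f"2.0: {rate_20:.2f} pystones/second")
print(f"2.1: {rate_21:.2f} pystones/second")
print(f"2.1 takes {slowdown:.1%} longer per run")
```

On these numbers the 2.1 build needs roughly a fifth more time for the
same 10000 passes.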

--amk



From moshez@zadka.site.co.il  Sat Jan 27 19:14:28 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 21:14:28 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
Message-ID: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>

Attached is an example Python session after I patched the interpreter.
The test-suite passes all right.

I want an OK to check this in.

Here is the patch:
Index: Objects/funcobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/funcobject.c,v
retrieving revision 2.33
diff -c -r2.33 funcobject.c
*** Objects/funcobject.c        2001/01/25 20:06:59     2.33
--- Objects/funcobject.c        2001/01/27 19:13:08
***************
*** 347,358 ****
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       (cmpfunc)func_compare, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       (hashfunc)func_hash, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/
--- 347,358 ----
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       0, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       0, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/

Python 2.1a1 (#1, Jan 27 2001, 21:01:24)
[GCC 2.95.3 20010111 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> def foo():
...     pass
...
>>> def bar():
...     pass
...
>>> hash(foo)
135484636
>>> hash(bar)
135481676
>>> foo == bar
0
>>> d = {}
>>> d[foo] =1
>>> def temp():
...     print "baz"
...
>>> foo.func_code = temp.func_code
>>> d[foo]
1

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From tim.one@home.com  Sat Jan 27 20:06:20 2001
From: tim.one@home.com (Tim Peters)
Date: Sat, 27 Jan 2001 15:06:20 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <200101271858.NAA04898@mira.erols.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGELGILAA.tim.one@home.com>

[A.M. Kuchling]
> [CC'ing to python-dev]  Confirmed:
>
> [amk@mira Python-2.0]$ ./python Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.14
> This machine benchmarks at 3184.71 pystones/second
> [amk@mira Python-2.0]$ python2.1 Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.81
> This machine benchmarks at 2624.67 pystones/second
>
> The ceval.c changes seem a likely candidate to have caused this.
> Anyone want to run Marc-Andre's microbenchmarks and see how the
> numbers have changed?

Want to, yes, but it looks hopeless on my box:

**** 2.0

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.851013
This machine benchmarks at 11750.7 pystones/second

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.24279
This machine benchmarks at 8046.41 pystones/second

**** 2.1a1

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.823313
This machine benchmarks at 12146 pystones/second

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.27046
This machine benchmarks at 7871.15 pystones/second

**** CVS

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.836391
This machine benchmarks at 11956.1 pystones/second

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 1.3055
This machine benchmarks at 7659.9 pystones/second


That's after a reboot:  no matter which Python I use, it gets about 12000 on
the first run with a given python.exe, and about 8000 on the second.  Not
shown is that it *stays* at about 8000 until the next reboot.

So there's a Windows (W98SE) Mystery, but also no evidence that timings have
changed worth spit under the MS compiler.  The eval loop is very touchy, and
I suspect you won't track this down on your box until staring at the code
gcc (I presume you're using gcc) generates.  May be sensitive to which
release of gcc you're using too.
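
Run-to-run swings like the 12000-versus-8000 one above are a reason to
repeat a measurement and look at the best time and the spread before
blaming a code change.  A minimal sketch using the standard timeit module
in present-day Python 3; the workload function is a made-up stand-in, not
pystone itself:

```python
import timeit


def workload():
    """Hypothetical stand-in for a benchmark body."""
    total = 0
    for i in range(1000):
        total += i * i
    return total


# Five independent measurements of 200 calls each.
timings = timeit.repeat(workload, repeat=5, number=200)
best = min(timings)
spread = max(timings) / best

print(f"best of 5: {best:.4f}s, worst/best ratio: {spread:.2f}")
```

A large worst/best ratio suggests the machine, not the interpreter, is
responsible for most of the difference between runs.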

switch-to-windows-and-you'll-have-easier-things-to-worry-about<wink>-ly
    y'rs  - tim



From fredrik@pythonware.com  Sun Jan 28 09:37:45 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Sun, 28 Jan 2001 10:37:45 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>              <3A71FF5D.DC609775@lemburg.com>  <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>

guido wrote:

> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

just looked this up in the specs, and POSIX seems to
require that leading slashes are preserved only if there
are exactly two of them:

    A pathname that begins with two successive slashes
    may be interpreted in an implementation-dependent
    manner, although more than two leading slashes are
    treated as a single slash.
    (from susv2)

maybe we should add a if len(slashes) > 2: slashes = "/"
test to the patch?
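
The rule quoted from susv2 can be sketched as a standalone helper.  The
function name collapse_leading_slashes is hypothetical and this is not the
actual posixpath patch, only an illustration of the "exactly two stay,
three or more become one" behaviour:

```python
def collapse_leading_slashes(path):
    """Keep '/' and '//' prefixes; collapse three or more slashes to one."""
    stripped = path.lstrip("/")
    count = len(path) - len(stripped)
    if count > 2:
        count = 1
    return "/" * count + stripped


for raw in ("usr", "/usr", "//5/tmp", "///usr", "////usr"):
    print(raw, "->", collapse_leading_slashes(raw))
```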

Cheers /F



From thomas@xs4all.net  Sun Jan 28 17:39:58 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Sun, 28 Jan 2001 18:39:58 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>; from fredrik@pythonware.com on Sun, Jan 28, 2001 at 10:37:45AM +0100
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid> <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com> <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>
Message-ID: <20010128183958.Q962@xs4all.nl>

On Sun, Jan 28, 2001 at 10:37:45AM +0100, Fredrik Lundh wrote:
> guido wrote:

> > > Revision 1.33 clearly leaves initial slashes untouched.
> > > I guess we should restore this...
> > 
> > Yes, please!  (Just the "leading extra slashes stay" behavior.)

> just looked this up in the specs, and POSIX seems to
> require that leading slashes are preserved only if there
> are exactly two of them:

>     A pathname that begins with two successive slashes
>     may be interpreted in an implementation-dependent
>     manner, although more than two leading slashes are
>     treated as a single slash.
>     (from susv2)

> maybe we should add a if len(slashes) > 2: slashes = "/"
> test to the patch?

How strictly do we need (or want, for that matter) to follow POSIX here ?
I'm aware the module is called 'posixpath', but it's used in a bit more than
just POSIX environments (or POSIX behaviours) so it might make sense to
ignore this particular tidbit. What if there is a system that attaches a
special meaning to ///, should we create a new path module for it ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From martin@mira.cs.tu-berlin.de  Sun Jan 28 20:50:35 2001
From: martin@mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 28 Jan 2001 21:50:35 +0100
Subject: [Python-Dev] XSLT parser interface
Message-ID: <200101282050.f0SKoZr08809@mira.informatik.hu-berlin.de>

Based on my previous IDL interface for XPath parsers, I've defined an
API for a parser that parses XSLT pattern expressions. It is an
extension to the XPath API, so I attach only the additional functions.

Any comments are appreciated.

Martin

module XPath{
  // XSLT exprType values
  const unsigned short PATTERN = 17;
  const unsigned short LOCATION_PATTERN = 18;
  const unsigned short RELATIVE_PATH_PATTERN = 19;
  const unsigned short STEP_PATTERN = 20;

  interface Pattern;
  interface LocationPathPattern;
  interface RelativePathPattern;
  interface StepPattern;

  interface PatternFactory:ExprFactory{
    Pattern createPattern(in LocationPathPattern first);
    // idkey may be null, represents IdKeyPattern
    // if parent is true, it is '/', else '//'
    // rel may be null
    LocationPathPattern createLocationPathPattern(in FunctionCall idkey,
						  in boolean parent,
						  in RelativePathPattern rel);
    // if parent is true, it is /, else //
    RelativePathPattern createRelativePathPattern(in RelativePathPattern rel,
						  in boolean parent,
						  in StepPattern step);
    StepPattern createStepPattern(in AxisSpecifier axis,
				  in NodeTest test,
				  in PredicateList predicates);
  };

  typedef sequence<LocationPathPattern> LocationPathPatterns;
  interface Pattern:Expr{
    readonly attribute LocationPathPatterns patterns;
    void append(in LocationPathPattern pattern);
  };

  interface LocationPathPattern:Expr{
    readonly attribute FunctionCall idkey;
    readonly attribute boolean parent;
    readonly attribute RelativePathPattern relative_pattern;
  };

  interface RelativePathPattern:Expr{
    readonly attribute RelativePathPattern relative;
    readonly attribute boolean parent;
    readonly attribute StepPattern step;
  };

  interface StepPattern:Expr{
    readonly attribute AxisSpecifier axis;
    readonly attribute NodeTest test;
    readonly attribute PredicateList predicates;
  };

  interface XSLTParser:Parser{
    Pattern parsePattern(in DOMString pattern);
  };
};
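
To show how the parent flag and the nested RelativePathPattern fit
together, here is a rough Python 3 sketch of two of the interfaces as
plain data classes.  The render() methods, the string-typed axis, test and
predicates fields, and the sample pattern are assumptions made for
illustration; they are not part of the IDL above:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepPattern:
    axis: str   # e.g. "child" or "attribute"
    test: str   # node test, e.g. "para" or "*"
    predicates: List[str] = field(default_factory=list)

    def render(self):
        prefix = "@" if self.axis == "attribute" else ""
        preds = "".join(f"[{p}]" for p in self.predicates)
        return f"{prefix}{self.test}{preds}"


@dataclass
class RelativePathPattern:
    relative: Optional["RelativePathPattern"]  # may be None for the first step
    parent: bool                               # True means '/', False means '//'
    step: StepPattern

    def render(self):
        if self.relative is None:
            return self.step.render()
        separator = "/" if self.parent else "//"
        return self.relative.render() + separator + self.step.render()


chapter = RelativePathPattern(None, True, StepPattern("child", "chapter"))
para = RelativePathPattern(chapter, False, StepPattern("child", "para", ["1"]))
print(para.render())
```

Each RelativePathPattern holds the pattern to its left plus one step, so a
path is built up left to right and the separator is chosen per step.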


From skip@mojam.com (Skip Montanaro)  Sun Jan 28 21:40:28 2001
From: skip@mojam.com (Skip Montanaro)
Date: Sun, 28 Jan 2001 15:40:28 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
Message-ID: <14964.37324.642566.602319@beluga.mojam.com>

I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
been zeroed out.  I then copied a version of it over from another machine
and reran make a couple times.  Makesetup ran but nothing mentioned in
Setup.local got built.

I don't think 2.1 can be released without providing a way for users to
recover from this change.  I didn't see anything obvious in setup.py.  Am I
missing something?

Skip



From thomas@xs4all.net  Mon Jan 29 00:39:04 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 01:39:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20001104001415.A2093@53b.hoffleit.de>; from gregor@hoffleit.de on Sat, Nov 04, 2000 at 12:14:15AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de>
Message-ID: <20010129013904.R962@xs4all.nl>

On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> has been fixed in glibc 2.96.

Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
versioning for glibc that I was unaware of ? :)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From barry@digicool.com  Mon Jan 29 05:03:45 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:03:45 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63921.966960.445548@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

    MZ> Attached is an example Python session after I patched the
    MZ> interpreter.  The test-suite passes all right.

    MZ> I want an OK to check this in.

Moshe, please remove the func_hash() and func_compare() functions, and
if the patch passes the test suite, go ahead and check it all in.
Please also check in a test case.

Thanks,
-Barry


From barry@digicool.com  Mon Jan 29 05:04:12 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:04:12 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63948.492662.775413@anthem.wooz.org>

Oh yeah, please also add an entry to the NEWS file.

Thanks,
-Barry


From moshez@zadka.site.co.il  Mon Jan 29 06:26:25 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 08:26:25 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <14964.63948.492662.775413@anthem.wooz.org>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 00:04:12 -0500, barry@digicool.com (Barry A. Warsaw) wrote:
 
> Oh yeah, please also add an entry to the NEWS file.

Done. The checkin to the NEWS file will be done in about a million years,
when my antique of a modem finishes sending the data.
I had to change test_opcodes since it tested that functions with the
same code compare equal.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From gregor@hoffleit.de  Mon Jan 29 11:13:39 2001
From: gregor@hoffleit.de (Gregor Hoffleit)
Date: Mon, 29 Jan 2001 12:13:39 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20010129013904.R962@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 29, 2001 at 01:39:04AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de> <20010129013904.R962@xs4all.nl>
Message-ID: <20010129121339.A1166@mediasupervision.de>

On Mon, Jan 29, 2001 at 01:39:04AM +0100, Thomas Wouters wrote:
> On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> > FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> > has been fixed in glibc 2.96.
> 
> Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
> versioning for glibc that I was unaware of ? :)

Sorry, it was fixed in glibc 2.1.96.

    Gregor
    


From mal@lemburg.com  Mon Jan 29 11:31:11 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 12:31:11 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>
 <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <3A75547F.A601E219@lemburg.com>

Guido van Rossum wrote:
> 
> > revision 1.34
> > date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> > added rewritten normpath from Moshe Zadka that does the right thing with
> > paths containing ..
> [...]
> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

Checked in a patch which preserves '/' and '//' but converts
three or more initial slashes into one (see Fredrik's note about
the POSIX standard on this).
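
As far as I know, the posixpath module in present-day Python 3 still
behaves this way, which can be checked with a few sample paths (the paths
themselves are made up, modelled on the QNX-style //5/tmp/... names
earlier in this thread):

```python
import posixpath

samples = ("/tmp/py", "//5/tmp/py", "///5/tmp/py", "////5/tmp/py")
normalized = {raw: posixpath.normpath(raw) for raw in samples}

for raw, clean in normalized.items():
    print(raw, "->", clean)
```

Only the path with exactly two leading slashes keeps them; the others end
up with a single leading slash.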

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 29 12:24:15 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:24:15 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com>
Message-ID: <3A7560EF.39D6CF@lemburg.com>

Here are the results of my micro benchmark, pybench 0.7:

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1102.30 ms    8.65 us   +7.56%
           BuiltinMethodLookup:     966.75 ms    1.84 us   +4.56%
                 ConcatStrings:    1198.55 ms    7.99 us  +11.63%
                 ConcatUnicode:    1835.60 ms   12.24 us  +19.29%
               CreateInstances:    1556.40 ms   37.06 us   +2.49%
       CreateStringsWithConcat:    1396.70 ms    6.98 us   +5.44%
       CreateUnicodeWithConcat:    1895.80 ms    9.48 us  +31.61%
                  DictCreation:    1760.50 ms   11.74 us   +2.43%
                      ForLoops:    1426.90 ms  142.69 us   -7.51%
                    IfThenElse:    1155.25 ms    1.71 us   -6.24%
                   ListSlicing:     555.40 ms  158.69 us   -4.14%
                NestedForLoops:     784.55 ms    2.24 us   -6.33%
          NormalClassAttribute:    1052.80 ms    1.75 us  -10.42%
       NormalInstanceAttribute:    1053.80 ms    1.76 us   +0.89%
           PythonFunctionCalls:    1127.50 ms    6.83 us  +12.56%
             PythonMethodCalls:     909.10 ms   12.12 us   +9.70%
                     Recursion:     942.40 ms   75.39 us  +23.74%
                  SecondImport:     924.20 ms   36.97 us   +3.98%
           SecondPackageImport:     951.10 ms   38.04 us   +6.16%
         SecondSubmoduleImport:    1211.30 ms   48.45 us   +7.69%
       SimpleComplexArithmetic:    1635.30 ms    7.43 us   +5.58%
        SimpleDictManipulation:     963.35 ms    3.21 us   -0.57%
         SimpleFloatArithmetic:     877.00 ms    1.59 us   -2.92%
      SimpleIntFloatArithmetic:     851.10 ms    1.29 us   -5.89%
       SimpleIntegerArithmetic:     850.05 ms    1.29 us   -6.41%
        SimpleListManipulation:    1168.50 ms    4.33 us   +8.14%
          SimpleLongArithmetic:    1231.15 ms    7.46 us   +1.52%
                    SmallLists:    2153.35 ms    8.44 us  +10.77%
                   SmallTuples:    1314.65 ms    5.48 us   +3.80%
         SpecialClassAttribute:    1050.80 ms    1.75 us   +1.48%
      SpecialInstanceAttribute:    1248.75 ms    2.08 us   -2.32%
                StringMappings:    1702.60 ms   13.51 us  +19.69%
              StringPredicates:    1024.25 ms    3.66 us  -25.49%
                 StringSlicing:    1093.35 ms    6.25 us   +4.35%
                     TryExcept:    1584.85 ms    1.06 us  -10.90%
                TryRaiseExcept:    1239.50 ms   82.63 us   +4.64%
                  TupleSlicing:     983.00 ms    9.36 us   +3.36%
               UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
             UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
             UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
                UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
------------------------------------------------------------------------
            Average round time:   58001.00 ms              +3.30%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)

The benchmark is available here in case someone wants to verify
the results on different platforms:

	http://www.lemburg.com/python/pybench-0.7.zip

The above tests were done on a Linux 2.2 system, AMD K6 233MHz. 
The figures shown compare CVS Python (2.1a1) against stock
Python 2.0. 

As you can see, Python function calls have suffered
a lot for some reason. Unicode mappings and other Unicode database
related methods show the effect of the compression of the Unicode
database -- a clear space/speed tradeoff. 

I can't really explain why Unicode concatenation has had a 
slowdown -- perhaps the new coercion logic has something to
do with this ?!

On the nice side: attribute lookups are faster; probably due to
the string key optimizations in the dictionary implementation.
Loops and exceptions are also a tad faster.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From fredrik@pythonware.com  Mon Jan 29 12:30:32 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Mon, 29 Jan 2001 13:30:32 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com>
Message-ID: <01fc01c089ef$48072230$0900a8c0@SPIFF>

mal wrote:
>                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
>              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
>              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
>                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
>
> Unicode mappings and other Unicode database related methods
> show the effect of the compression of the Unicode database -- a
> clear space/speed tradeoff.

umm.  the tests don't seem to test the "\N{name}" escapes, so the
only thing that has changed in 2.1 is the "decomposition" method
(used in the UnicodeProperties test).

are you sure you're comparing against 2.0 final?

Cheers /F



From mal@lemburg.com  Mon Jan 29 12:52:12 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:52:12 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF>
Message-ID: <3A75677C.E4FA82A0@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> >
> > Unicode mappings and other Unicode database related methods
> > show the effect of the compression of the Unicode database -- a
> > clear space/speed tradeoff.
> 
> umm.  the tests don't seem to test the "\N{name}" escapes, so the
> only thing that has changed in 2.1 is the "decomposition" method
> (used in the UnicodeProperties test).

The mappings figure surprised me too: the code has not changed,
but the unicodetype_db.h looks different. Don't know how this
affects performance though.

The differences could also be explained by an increase in Unicode
object creation time (the concatenation is also a lot slower),
so perhaps that's where we should look...

> are you sure you're comparing against 2.0 final?

Yes... after a check of the Makefile I found that I had compiled
Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
a difference w/r to inlining of code. I'll recompile and rerun
the benchmark.
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one@home.com  Mon Jan 29 12:56:49 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 07:56:49 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>

[Ping]
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

[Guido]
> No chance of a time-machine escape, but I *can* say that I agree that
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)
>
> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Thomas since submitted a patch to do the "if key in dict" part (which I
reviewed and accepted, pending resolution of doc issues).

It does not do the "for key in dict" part.  It's not entirely clear whether
you intended to approve that part too (I've simplified away many layers of
quoting in the above <wink>).  In any case, nobody is working on that part.

WRT that part, Ping produced some stats in:

http://mail.python.org/pipermail/python-dev/2001-January/012106.html

> How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> How often do you write 'for x in dict.keys()'?     (std lib says: 49)
>
> How often do you write 'x in dict.values()'?       (std lib says: 0)
> How often do you write 'for x in dict.values()'?   (std lib says: 3)

However, he did not report on occurrences of

    for k, v in dict.items()

I'm not clear exactly which files he examined in the above, or how the
counts were obtained.  So I don't know how this compares:  I counted 188
instances of the string ".items(" in 122 .py files, under the dist/ portion
of current CVS.  A number of those were assignment and return stmts, others
were dict.items() in an arglist, and at least one was in a comment.  After
weeding those out, I was left with 153 legit "for" loops iterating over
x.items().  In all:

    153 iterating over x.items()
    118     "     over x.keys()
     17     "     over x.values()

So I conclude that iterating over .items() is significantly more common
than iterating over .keys().

On c.l.py about an hour ago, Thomas complained that two (out of two) of his
coworkers guessed wrong about what

    for x in dict:

would do, but didn't say what they *did* think it would do.  Since Thomas
doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
over either values or the lines of a freshly-opened file named "dict"
<wink>.

So if you did intend to approve "for x in dict" iterating over dict.keys(),
maybe you want to call me out on that "approval post" I forged under your
name.

falls-on-swords-so-often-there's-nothing-left-to-puncture<wink>-ly y'rs
    - tim



From mal@lemburg.com  Mon Jan 29 13:18:52 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:18:52 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com>
Message-ID: <3A756DBC.8EAC42F5@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Fredrik Lundh wrote:
> >
> > mal wrote:
> > >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> > >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> > >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> > >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> > >
> > > Unicode mappings and other Unicode database related methods
> > > show the effect of the compression of the Unicode database -- a
> > > clear space/speed tradeoff.
> >
> > umm.  the tests don't seem to test the "\N{name}" escapes, so the
> > only thing that has changed in 2.1 is the "decomposition" method
> > (used in the UnicodeProperties test).
> 
> The mappings figure surprised me too: the code has not changed,
> but the unicodetype_db.h look different. Don't know how this
> affects performance though.
> 
> The differences could also be explained by a increase in Unicode
> object creation time (the concatenation is also a lot slower),
> so perhaps that's where we should look...
> 
> > are you sure you're comparing against 2.0 final?
> 
> Yes... after a check of the Makefile I found that I had compiled
> Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
> a difference w/r to inlining of code. I'll recompile and rerun
> the benchmark.

Looks like there is an effect of choosing -O3 over -O2 (even though
not necessarily positive all the way); what results do you get on
Windows ?

--

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1065.10 ms    8.35 us   +3.93%
           BuiltinMethodLookup:    1286.30 ms    2.45 us  +39.12%
                 ConcatStrings:    1243.30 ms    8.29 us  +15.80%
                 ConcatUnicode:    1449.10 ms    9.66 us   -5.83%
               CreateInstances:    1639.25 ms   39.03 us   +7.95%
       CreateStringsWithConcat:    1453.45 ms    7.27 us   +9.73%
       CreateUnicodeWithConcat:    1558.45 ms    7.79 us   +8.19%
                  DictCreation:    1869.35 ms   12.46 us   +8.77%
                      ForLoops:    1526.85 ms  152.69 us   -1.03%
                    IfThenElse:    1381.00 ms    2.05 us  +12.09%
                   ListSlicing:     547.40 ms  156.40 us   -5.52%
                NestedForLoops:     824.50 ms    2.36 us   -1.56%
          NormalClassAttribute:    1233.55 ms    2.06 us   +4.96%
       NormalInstanceAttribute:    1215.50 ms    2.03 us  +16.37%
           PythonFunctionCalls:    1107.30 ms    6.71 us  +10.55%
             PythonMethodCalls:    1047.00 ms   13.96 us  +26.34%
                     Recursion:     940.35 ms   75.23 us  +23.47%
                  SecondImport:     894.05 ms   35.76 us   +0.59%
           SecondPackageImport:     915.05 ms   36.60 us   +2.14%
         SecondSubmoduleImport:    1131.10 ms   45.24 us   +0.56%
       SimpleComplexArithmetic:    1652.05 ms    7.51 us   +6.67%
        SimpleDictManipulation:    1150.25 ms    3.83 us  +18.72%
         SimpleFloatArithmetic:     889.65 ms    1.62 us   -1.52%
      SimpleIntFloatArithmetic:     900.80 ms    1.36 us   -0.40%
       SimpleIntegerArithmetic:     901.75 ms    1.37 us   -0.72%
        SimpleListManipulation:    1125.40 ms    4.17 us   +4.15%
          SimpleLongArithmetic:    1305.15 ms    7.91 us   +7.62%
                    SmallLists:    2102.85 ms    8.25 us   +8.18%
                   SmallTuples:    1329.55 ms    5.54 us   +4.98%
         SpecialClassAttribute:    1234.60 ms    2.06 us  +19.23%
      SpecialInstanceAttribute:    1422.55 ms    2.37 us  +11.28%
                StringMappings:    1585.55 ms   12.58 us  +11.46%
              StringPredicates:    1241.35 ms    4.43 us   -9.69%
                 StringSlicing:    1206.20 ms    6.89 us  +15.12%
                     TryExcept:    1764.35 ms    1.18 us   -0.81%
                TryRaiseExcept:    1217.40 ms   81.16 us   +2.77%
                  TupleSlicing:     933.00 ms    8.89 us   -1.90%
               UnicodeMappings:    1137.35 ms   63.19 us   -0.49%
             UnicodePredicates:    1632.05 ms    7.25 us   +7.43%
             UnicodeProperties:    1244.05 ms    6.22 us   +5.44%
                UnicodeSlicing:    1252.10 ms    7.15 us   +9.27%
------------------------------------------------------------------------
            Average round time:   58804.00 ms              +4.73%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 29 13:28:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:28:24 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
Message-ID: <3A756FF8.B7185FA2@lemburg.com>

Tim Peters wrote:
> 
> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .values() is significantly more common
> than iterating over .keys().
> 
> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.
> 
> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

Dictionaries are not sequences. I wonder what order a user of
for k,v in dict: (or whatever other variant of this proposal you choose)
will expect...

Please also take into account that dictionaries are *mutable*
and their internal state is not defined to e.g. not change due to
lookups (take the string optimization for example...), so exposing
PyDict_Next() in any way to Python will cause trouble. In the end,
you will need to create a list or tuple to iterate over one way
or another, so why bother overloading for-loops w/r to dictionaries ?
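
The "create a list to iterate over" approach can be sketched as follows.
This is a minimal illustration in present-day Python, not the 2.1 alpha
under discussion, and the dictionary `counts` with its contents is made up
for the example. Looping over a snapshot of the keys means the loop is
unaffected by whatever happens to the dictionary inside it:

```python
# Hypothetical data, for illustration only.
counts = {"spam": 1, "eggs": 0, "ham": 2}

# list(counts) copies the keys up front, so the loop walks the copy
# and the dictionary itself may be changed freely inside the loop.
for key in list(counts):
    if counts[key] == 0:
        del counts[key]                 # safe: we iterate over the snapshot
    else:
        counts[key + "_seen"] = True    # inserting is safe as well

remaining = sorted(counts)
```

The price is one extra list per loop, which is the cost being weighed
against exposing the dictionary's internal traversal directly.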

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From bckfnn@worldonline.dk  Mon Jan 29 13:48:44 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 13:48:44 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <3a75747e.17414620@smtp.worldonline.dk>

On Mon, 29 Jan 2001 08:26:25 +0200 (IST), you wrote:

>I had to change test_opcodes since it tested that functions with the
>same code compare equal.

Thanks. With this change, Jython too can complete the test_opcodes. In
Jython a code object can never compare equal to anything but itself.
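
A minimal sketch of the distinction, assuming present-day CPython (the
thread concerns 2.1 and Jython, whose behaviour may differ). The names
`make` and `answer` are made up for illustration. Two functions created
from the same `def` share one code object but are still distinct function
objects:

```python
def make():
    def answer():
        return 42
    return answer

first = make()
second = make()

same_code = first.__code__ is second.__code__   # one shared code object
same_func = first == second                     # functions compare by identity
```

A test that expects `first == second` to be true is therefore checking one
implementation's behaviour rather than something every implementation
must provide.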

regards,
finn


From moshez@zadka.site.co.il  Mon Jan 29 14:04:47 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 16:04:47 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <3a75747e.17414620@smtp.worldonline.dk>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn@worldonline.dk (Finn Bock) wrote:
 
> Thanks. With this change, Jython too can complete the test_opcodes. In
> Jython a code object can never compare equal to anything but itself.

Great! I'm happy to have helped.
I'm starting to wonder what the tests really test: the language definition,
or accidents of the implementation?
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From MarkH@ActiveState.com  Mon Jan 29 14:35:25 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Tue, 30 Jan 2001 01:35:25 +1100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEGHDAAA.MarkH@ActiveState.com>

"M.-A. Lemburg" wrote:
> what results do you get on Windows ?

Win2k, dual 800, relatively quiet!

Python 2.0

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.847605
This machine benchmarks at 11798 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.845104
This machine benchmarks at 11832.9 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.846069
This machine benchmarks at 11819.4 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.849447
This machine benchmarks at 11772.4 pystones/second

Python from CVS today:

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.885801
This machine benchmarks at 11289.2 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.889048
This machine benchmarks at 11248 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.892422
This machine benchmarks at 11205.5 pystones/second


Although I deleted Tim's earlier mail, from memory this is pretty similar in
terms of performance lost.  I'm afraid I have no idea what your benchmarks
are or how to build them <wink>, but did check that the optimizer is set for
"mazimize for speed" (/O2).  Other compiler options gave significantly
smaller results (no optimizations around 8500, and "optimize for space"
(/O1) at around 10000).  Other fiddling with the optimizer couldn't get
better results than the existing settings.
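
The size of the loss can be worked out from the figures above. A small
sketch in present-day Python; the variable names are made up, and the
numbers are copied from the runs listed in this message. It averages each
set of runs and reports the relative slowdown, which comes out a bit under
5 percent:

```python
# Pystone figures copied from the runs above (pystones/second).
py20 = [11798.0, 11832.9, 11819.4, 11772.4]
cvs = [11289.2, 11248.0, 11205.5]

def mean(values):
    return sum(values) / len(values)

# pystones/second is simply passes divided by elapsed seconds.
first_run = 10000 / 0.847605

# Relative slowdown of the CVS build against 2.0, as a fraction.
slowdown = 1.0 - mean(cvs) / mean(py20)

print("first 2.0 run: %.0f pystones/second" % first_run)
print("CVS build is about %.1f%% slower" % (slowdown * 100))
```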

Mark.



From guido@digicool.com  Mon Jan 29 14:48:22 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 09:48:22 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 07:56:49 EST."
 <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
Message-ID: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>

> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .values() is significantly more common
> than iterating over .keys().

I did a less sophisticated count but come to the same conclusion:
iterations over items() are (somewhat) more common than over keys(),
and values() are 1-2 orders of magnitude less common.  My numbers:

$ cd python/src/Lib
$ grep 'for .*items():' *.py | wc -l
     47
$ grep 'for .*keys():' *.py | wc -l
     43
$ grep 'for .*values():' *.py | wc -l
      2
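
The same count can be sketched in Python. This is only an illustration in
present-day Python: the `sample` lines are hypothetical stand-ins for the
contents of Lib/*.py, and `count_loops` is a made-up helper. Like the grep
commands, it counts matching lines and would also count matches inside
comments or strings:

```python
import re

def count_loops(lines, method):
    """Count lines matching the grep pattern 'for .*METHOD():'."""
    pattern = re.compile(r"for .*" + re.escape(method) + r"\(\):")
    return sum(1 for line in lines if pattern.search(line))

# Hypothetical sample standing in for the contents of Lib/*.py.
sample = [
    "for k, v in d.items():",
    "    for name in table.keys():",
    "x = d.items()",                      # not a loop, so not counted
    "for v in cache.values():",
    "for key, val in self.env.items():",
]

tally = {m: count_loops(sample, m) for m in ("items", "keys", "values")}
```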

> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.

I don't attach much value to the readability argument: typically, one will
write "for key in dict" or "for name in dict" and then it's obvious
what is meant.

> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
has even asked me for a has_item() method).  I can live with "x in
list" checking the values and "x in dict" checking the keys.  But I
can *not* live with "x in dict" equivalent to "dict.has_key(x)" if
"for x in dict" would mean "for x in dict.items()".  I also think that
defining "x in dict" but not "for x in dict" will be confusing.

So we need to think more.

How about:

    for key in dict: ...		# ... over keys

    for key:value in dict: ...		# ... over items

This is syntactically unambiguous (a colon is currently illegal in
that position).

This also suggests:

    for index:value in list: ...	# ... over zip(range(len(list)), list)

which doesn't strike me as bad or ugly, and would fulfill my brother's
dearest wish.
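
For comparison, here is what the two proposed loops are meant to express,
written with constructs that already exist. A minimal sketch in
present-day Python with made-up sample data; the colon forms above are
only a proposal and are not valid syntax:

```python
# Hypothetical sample data.
stock = {"spam": 1, "eggs": 2}
menu = ["spam", "eggs"]

# What "for key:value in dict" is meant to express.
pairs = []
for key, value in stock.items():
    pairs.append((key, value))

# What "for index:value in list" is meant to express.
indexed = []
for index, value in zip(range(len(menu)), menu):
    indexed.append((index, value))
```

Both spellings build a list of tuples before the loop starts; the colon
form would let the loop skip that step.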

(And why didn't we think of this before?)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Mon Jan 29 14:58:16 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 15:58:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:48:22AM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <20010129155816.T962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:48:22AM -0500, Guido van Rossum wrote:

> How about:

>     for key in dict: ...		# ... over keys

>     for key:value in dict: ...		# ... over items

> This is syntactically unambiguous (a colon is currently illegal in
> that position).

I won't comment on the syntax right now, I need to look at it for a while
first :-) However, what about MAL's point about dict ordering, internally ?
Wouldn't FOR_LOOP be forced to generate a list of keys anyway, to avoid
skipping keys ? I know currently the dict implementation doesn't do any
reordering except during adds/deletes, but there is nothing in the language
ref that supports that -- it's an implementation detail. Would we make a
future enhancement where (some form of) gc would 'clean up' large
dictionaries impossible ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Mon Jan 29 15:00:38 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:00:38 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 14:28:24 +0100."
 <3A756FF8.B7185FA2@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
Message-ID: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>

> Dictionaries are not sequences. I wonder what order a user of
> for k,v in dict: (or whatever other of this proposal you choose)
> will expect...

The same order that for k,v in dict.items() will yield, of course.

> Please also take into account that dictionaries are *mutable*
> and their internal state is not defined to e.g. not change due to
> lookups (take the string optimization for example...), so exposing
> PyDict_Next() in any to Python will cause trouble. In the end,
> you will need to create a list or tuple to iterate over one way
> or another, so why bother overloading for-loops w/r to dictionaries ?

Actually, I was going to propose to play dangerously here: the

    for k:v in dict: ...

syntax I proposed in my previous message should indeed expose
PyDict_Next().  It should be a big speed-up, and I'm expecting (though
don't have much proof) that most loops over dicts don't mutate the
dict.

Maybe we could add a flag to the dict that issues an error when a new
key is inserted during such a for loop?  (I don't think the key order
can be affected when a key is *deleted*.)
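
A rough sketch of the flag idea, written as a dict subclass in present-day
Python rather than inside the C implementation. `GuardedDict` and
`guarded_items` are hypothetical names, and only plain `d[key] = value`
assignment is guarded here; methods such as update() bypass the check:

```python
class GuardedDict(dict):
    """Hypothetical dict that refuses new keys while a guarded loop runs."""

    def __init__(self, *args, **kwargs):
        self._loops = 0          # number of guarded loops in progress
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        # Replacing the value of an existing key adds nothing new,
        # so only brand-new keys are rejected.
        if self._loops and key not in self:
            raise RuntimeError("new key inserted during iteration")
        super().__setitem__(key, value)

    def guarded_items(self):
        """Yield (key, value) pairs with the insertion flag set."""
        self._loops += 1
        try:
            for key, value in dict.items(self):
                yield key, value
        finally:
            self._loops -= 1


d = GuardedDict(a=1, b=2)
try:
    for key, value in d.guarded_items():
        d[key] = value + 1   # allowed: the key already exists
        d["new"] = 0         # rejected: this would add a key
except RuntimeError as exc:
    message = str(exc)
```

Using a counter instead of a single boolean lets nested loops over the
same dictionary work; whether that is worth the extra state is a design
choice this sketch does not settle.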

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan 29 15:30:17 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:30:17 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: Your message of "Mon, 29 Jan 2001 16:04:47 +0200."
 <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
 <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>

> I'm starting to wonder what the tests really test: the language definition,
> or accidents of the implementation?

It's good to test conformance to the language definition, but this is
also a regression test for the implementation.  The "accidents of the
implementation" definitely need to be tested.  E.g. if we decide that
repr(s) uses \n rather than \012 or \x0a, this should be tested too.
The language definition gives the implementer a choice here; but once
the implementer has made a choice, it's good to have a test that tests
that this choice is implemented correctly.
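
The two kinds of test might be kept apart along these lines. A minimal
sketch in present-day Python using unittest; the class names are made up,
and treating the round trip through eval() as a language-level property is
an assumption made for the example:

```python
import unittest

class LanguageConformance(unittest.TestCase):
    def test_repr_round_trips(self):
        # Assumed language-level property: eval(repr(s)) gives s back.
        s = "line one\nline two"
        self.assertEqual(eval(repr(s)), s)

class ImplementationChoice(unittest.TestCase):
    def test_repr_spells_newline_as_backslash_n(self):
        # One particular choice among \n, \012 and \x0a.
        self.assertEqual(repr("\n"), "'\\n'")
```

Another implementation could skip the second class wholesale and still
run the first.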

Perhaps there should be several parts to the regression test,
e.g. language conformance, library conformance, platform-specific
features, and implementation conformance?

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Mon Jan 29 15:57:12 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:57:12 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: Your message of "Sun, 28 Jan 2001 15:40:28 CST."
 <14964.37324.642566.602319@beluga.mojam.com>
References: <14964.37324.642566.602319@beluga.mojam.com>
Message-ID: <200101291557.KAA12347@cj20424-a.reston1.va.home.com>

> I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
> been zeroed out.  I then copied a version of it over from another machine
> and reran make a couple times.  Makesetup ran but nothing mentioned in
> Setup.local got built.
> 
> I don't think 2.1 can be released without providing a way for users to
> recover from this change.  I didn't see anything obvious in setup.py.  Am I
> missing something?

Well, Modules/Setup is still used, so it should be trivial to add
Setup.local back too.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From nas@arctrix.com  Mon Jan 29 09:23:55 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:23:55 -0800
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <14964.37324.642566.602319@beluga.mojam.com>; from skip@mojam.com on Sun, Jan 28, 2001 at 03:40:28PM -0600
References: <14964.37324.642566.602319@beluga.mojam.com>
Message-ID: <20010129012355.A14763@glacier.fnational.com>

On Sun, Jan 28, 2001 at 03:40:28PM -0600, Skip Montanaro wrote:
> Makesetup ran but nothing mentioned in Setup.local got built.

I believe Setup.local should still work.  One possibility is that
the modules in Setup.local were marked as shared.  Shared modules
from Setup* don't get built by default.  You have to do "make
oldsharedmods".  I'm not sure why oldsharedmods is not included
in the all target.  Andrew, can you think of any reason why it
shouldn't be added?

  Neil


From dgoodger@atsautomation.com  Mon Jan 29 16:19:12 2001
From: dgoodger@atsautomation.com (Goodger, David)
Date: Mon, 29 Jan 2001 11:19:12 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>

Marc-Andre Lemburg's patch to posixpath.py clears up the path problem.
Thanks!

MACHDEP is qnxJ for QNX 4.25, qnxG for QNX 4.23. I don't know what it is for
QNX 6 (Neutrino). Perhaps test for MACHDEP[:3]=='qnx'?

I'm still stuck at 'python setup.py build':

    unable to execute ld: no such file or directory
    running build
    running build_ext
    building 'struct' extension
    skipping //5/tmp/py/Python-2.1a1/Modules/structmodule.c
(build/temp.qnx-J-PCI-2.1/structmodule.o up-to-date)
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
compatible with distutils? If not, is there a workaround?

Neil Schemenauer asked, "what should LDSHARED say for QNX?". I don't know.
Python 2.0 compiled OK, and its makefile says LDSHARED=ld. However,
Modules/Setup has no uncommented "*shared*" line.

Those of us who rely on Python to get our work done, and who don't have the
bandwidth for the implementation complexities, owe a lot to everyone who
makes it possible to compile Python out-of-the-box. Very much appreciated.
Thank you!

David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger@atsautomation.com


From nas@arctrix.com  Mon Jan 29 09:40:07 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:40:07 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>; from dgoodger@atsautomation.com on Mon, Jan 29, 2001 at 11:19:12AM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>
Message-ID: <20010129014007.C14763@glacier.fnational.com>

On Mon, Jan 29, 2001 at 11:19:12AM -0500, Goodger, David wrote:
> I'm still stuck at 'python setup.py build':
...
> Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
> compatible with distutils? If not, is there a workaround?

The setup.py script only builds shared modules.  You're going to
have to enable modules using the old Setup file.  I think
Setup.dist should go back to including all the modules
(commented out of course).  This would make it easier for people
who can't or don't want to build shared modules.

  Neil


From akuchlin@mems-exchange.org  Mon Jan 29 16:50:31 2001
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 11:50:31 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 29, 2001 at 01:23:55AM -0800
References: <14964.37324.642566.602319@beluga.mojam.com> <20010129012355.A14763@glacier.fnational.com>
Message-ID: <20010129115031.B4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 01:23:55AM -0800, Neil Schemenauer wrote:
>from Setup* don't get build by default.  You have to do "make
>oldsharedmods".  I'm not sure why oldsharedmods is not included
>in the all target.  Andrew, can you think of any reason why it
>shouldn't be added.

That's an excellent idea, particularly if we add back Setup.dist, too,
and comment out all but the required modules.  

I'll try to do that today.  Note that I'm leaving on vacation
tomorrow, and will be back next Monday.  Everyone, feel free to check
in changes to setup.py that are required.

--amk



From jeremy@alum.mit.edu  Mon Jan 29 16:48:11 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 11:48:11 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A75677C.E4FA82A0@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
 <200101271858.NAA04898@mira.erols.com>
 <3A7560EF.39D6CF@lemburg.com>
 <01fc01c089ef$48072230$0900a8c0@SPIFF>
 <3A75677C.E4FA82A0@lemburg.com>
Message-ID: <14965.40651.233438.311104@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:

  MAL> Yes... after a check of the Makefile I found that I had
  MAL> compiled Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this
  MAL> makes a difference w/r to inlining of code. I'll recompile and
  MAL> rerun the benchmark.
 
When I was working on the CALL_FUNCTION revision, I compared 2.0 final
with my development working copy using -O3.  At that time, I saw no
significant performance difference between the two.  And I did notice
a difference between -O2 and -O3.

The strange thing is that I notice a difference between -O2 and -O3
with 2.1a1, but in the opposite direction.  On pystone, python -O2
runs consistently faster than -O3; the difference is .05 sec on my
machine.  

Jeremy


From esr@thyrsus.com  Mon Jan 29 17:12:05 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 12:12:05 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 11:48:11AM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain>
Message-ID: <20010129121205.A8337@thyrsus.com>

Jeremy Hylton <jeremy@alum.mit.edu>:
> The strange thing is that I notice a difference between -O2 and -O3
> with 2.1a1, but in the opposite direction.  On pystone, python -O2
> runs consistently faster than -O3; the difference is .05 sec on my
> machine.  

Bizarre.  Makes me wonder if we have a C compiler problem.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In every country and in every age, the priest has been hostile to
liberty. He is always in alliance with the despot, abetting his abuses
in return for protection to his own.
	-- Thomas Jefferson, 1814


From jeremy@alum.mit.edu  Mon Jan 29 17:27:08 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:27:08 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <20010129121205.A8337@thyrsus.com>
References: <mailman.980616572.26954.python-list@python.org>
 <200101271858.NAA04898@mira.erols.com>
 <3A7560EF.39D6CF@lemburg.com>
 <01fc01c089ef$48072230$0900a8c0@SPIFF>
 <3A75677C.E4FA82A0@lemburg.com>
 <14965.40651.233438.311104@localhost.localdomain>
 <20010129121205.A8337@thyrsus.com>
Message-ID: <14965.42988.362288.154254@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy@alum.mit.edu>:
  >> The strange thing is that I notice a difference between -O2 and
  >> -O3 with 2.1a1, but in the opposite direction.  On pystone,
  >> python -O2 runs consistently faster than -O3; the difference is
  >> .05 sec on my machine.

  ESR> Bizarre.  Makes me wonder if we have a C compiler problem.

Depends on your definition of "compiler problem" <wink>.  If you mean,
it compiles our code so it runs slower, then, yes, we've got one :-).

One of the differences between -O2 and -O3, according to the man page,
is that -O3 will perform optimizations that involve a space-speed
tradeoff.  It also includes -finline-functions.  I can imagine that
some of these optimizations hurt memory performance enough to make a
difference. 

not-really-understanding-but-not-really-expecting-too-ly y'rs,
Jeremy


From jeremy@alum.mit.edu  Mon Jan 29 17:39:05 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:39:05 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>
References: <mailman.980616572.26954.python-list@python.org>
 <200101271858.NAA04898@mira.erols.com>
 <3A7560EF.39D6CF@lemburg.com>
 <01fc01c089ef$48072230$0900a8c0@SPIFF>
 <3A75677C.E4FA82A0@lemburg.com>
 <14965.40651.233438.311104@localhost.localdomain>
Message-ID: <14965.43705.367236.994786@localhost.localdomain>

The recursion test in pybench is testing the performance of the nested
scopes changes, which must do some extra bookkeeping to reference the
recursive function in a nested scope.  To some extent, a performance
hit is a necessary consequence for nested functions with free
variables.

Nonetheless, there are two interesting things to say about this
situation.

First, there is a bug in the current implementation of nested scopes
that the benchmark tickles.  The problem is with code like this:

def outer():
    global f
    def f(x):
        if x > 0:
            return f(x - 1)

The compiler determines that f is free in f.  (It's recursive.)  If f
is free in f, in the absence of the global decl, the body of outer
must allocate fresh storage (a cell) for f each time outer is called
and add a reference to that cell to f's closure.

If f is declared global in outer, then it ought to be treated as a
global in nested scopes, too.  In general terms, a free variable
should use the binding found in the nearest enclosing scope.  If the
nearest enclosing scope has a global binding, then the reference is
global. 

If I fix this problem, the recursion benchmark shouldn't be any slower
than a normal function call.
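
To make the free-variable distinction above concrete, here is a
minimal sketch.  It assumes a current CPython 3 rather than the 2.1
alpha under discussion, and the names outer_closure, outer_global and
g are made up for illustration.  It inspects the resulting function
objects to see whether the recursive reference goes through a closure
cell or through the module globals.

```python
def outer_closure():
    # f refers to itself, so f is free in f and outer_closure
    # must allocate a cell to hold it on every call.
    def f(x):
        if x > 0:
            return f(x - 1)
        return 0
    return f


def outer_global():
    # g is declared global here, so the nested reference to g
    # resolves to the module-level binding and needs no cell.
    global g
    def g(x):
        if x > 0:
            return g(x - 1)
        return 0
    return g


closure_fn = outer_closure()
global_fn = outer_global()

closure_freevars = closure_fn.__code__.co_freevars
global_freevars = global_fn.__code__.co_freevars
print(closure_freevars, closure_fn.__closure__)
print(global_freevars, global_fn.__closure__)
```

On such an interpreter the first function carries a one-element
closure and the second carries none, which is the treatment argued
for above.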

The second interesting thing to say is that frame allocation and
dealloc is probably more expensive than it needs to be in the current
implementation.  The frame object has a new f_closure slot that holds
a tuple that is freshly allocated every time the frame is allocated.
(Unless the closure is empty, then f_closure is just NULL.)

The extra tuple allocation can probably be done away with by using the
same allocation strategy as locals & stack.  If the f_localsplus array
holds cells + frees + locals + stack, then a new frame will never
require more than a single malloc (and often not even that).

Jeremy


From akuchlin@mems-exchange.org  Mon Jan 29 17:54:37 2001
From: akuchlin@mems-exchange.org (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 12:54:37 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 12:27:08PM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain> <20010129121205.A8337@thyrsus.com> <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <20010129125437.E4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 12:27:08PM -0500, Jeremy Hylton wrote:
>Depends on your definition of "compiler problem" <wink>.  If you mean,
>it compiles our code so it runs slower, then, yes, we've got one :-).

Compiling with gcc and -g, with no optimization, 2.0 and 2.1cvs seem
to be very close, with 2.1 slightly slower:

2.0:
Pystone(1.1) time for 10000 passes = 1.04
This machine benchmarks at 9615.38 pystones/second
This machine benchmarks at 9345.79 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9523.81 pystones/second

2.1cvs:
Pystone(1.1) time for 10000 passes = 1.09
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second
This machine benchmarks at 9259.26 pystones/second
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second

Would it be worth experimenting with platform-specific compiler
options to try to squeeze out the last bit of performance?  (That can
wait for the betas, probably.)
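
For readers checking the figures: pystone reports passes divided by
elapsed seconds, so the first line of each run above can be reproduced
from the two timings.  A small sketch (the variable and function names
are illustrative only, not part of pystone):

```python
PASSES = 10000  # number of passes reported in the output above


def pystones_per_second(seconds):
    # pystone's headline figure is simply passes / elapsed time
    return PASSES / seconds


rate_20 = pystones_per_second(1.04)      # 2.0 timing
rate_21 = pystones_per_second(1.09)      # 2.1cvs timing
extra_time = (1.09 / 1.04 - 1.0) * 100.0  # percent more time for 2.1cvs

print("2.0: %.2f  2.1cvs: %.2f  extra time: %.1f%%"
      % (rate_20, rate_21, extra_time))
```

That works out to a bit under 5% more time for 2.1cvs on this
particular pair of runs.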

--amk


From jeremy@alum.mit.edu  Mon Jan 29 18:04:28 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 13:04:28 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
 <200101271858.NAA04898@mira.erols.com>
 <3A7560EF.39D6CF@lemburg.com>
 <01fc01c089ef$48072230$0900a8c0@SPIFF>
 <3A75677C.E4FA82A0@lemburg.com>
 <3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <14965.45228.197778.579989@localhost.localdomain>

I hope another set of benchmarks isn't overkill for the list.  I see
different results comparing 2.1 with 2.0 (both -O3) using pybench
0.6. 

The interesting differences I see in this benchmark that I didn't see
in MAL's are:

DictCreation +15.87%
SecondImport +20.29%

Other curious differences, which show up in both benchmarks, include:
SpecialClassAttribute +17.91%     (private variables)
SpecialInstanceAttribute +15.34%  (__methods__)

Jeremy

PYBENCH 0.6

Benchmark: py21 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     305.05 ms    2.39 us   +4.77%
           BuiltinMethodLookup:     319.65 ms    0.61 us   +2.55%
                 ConcatStrings:     383.70 ms    2.56 us   +1.27%
               CreateInstances:     463.85 ms   11.04 us   +1.96%
       CreateStringsWithConcat:     381.20 ms    1.91 us   +2.39%
                  DictCreation:     508.85 ms    3.39 us  +15.87%
                      ForLoops:     577.60 ms   57.76 us   +5.65%
                    IfThenElse:     443.70 ms    0.66 us   +1.02%
                   ListSlicing:     207.50 ms   59.29 us   -4.18%
                NestedForLoops:     315.75 ms    0.90 us   +3.54%
          NormalClassAttribute:     379.80 ms    0.63 us   +7.39%
       NormalInstanceAttribute:     385.45 ms    0.64 us   +8.04%
           PythonFunctionCalls:     400.00 ms    2.42 us  +13.62%
             PythonMethodCalls:     306.25 ms    4.08 us   +5.13%
                     Recursion:     337.25 ms   26.98 us  +19.00%
                  SecondImport:     301.20 ms   12.05 us  +20.29%
           SecondPackageImport:     298.20 ms   11.93 us  +18.15%
         SecondSubmoduleImport:     339.15 ms   13.57 us  +11.40%
       SimpleComplexArithmetic:     392.70 ms    1.79 us  -10.52%
        SimpleDictManipulation:     350.40 ms    1.17 us   +3.87%
         SimpleFloatArithmetic:     300.75 ms    0.55 us   +2.04%
      SimpleIntFloatArithmetic:     347.95 ms    0.53 us   +9.01%
       SimpleIntegerArithmetic:     356.40 ms    0.54 us  +12.01%
        SimpleListManipulation:     351.85 ms    1.30 us  +11.33%
          SimpleLongArithmetic:     309.00 ms    1.87 us   -5.81%
                    SmallLists:     584.25 ms    2.29 us  +10.20%
                   SmallTuples:     442.00 ms    1.84 us  +10.33%
         SpecialClassAttribute:     406.50 ms    0.68 us  +17.91%
      SpecialInstanceAttribute:     557.40 ms    0.93 us  +15.34%
                 StringSlicing:     336.45 ms    1.92 us   +9.56%
                     TryExcept:     650.60 ms    0.43 us   +1.40%
                TryRaiseExcept:     345.95 ms   23.06 us   +2.70%
                  TupleSlicing:     266.35 ms    2.54 us   +4.70%
------------------------------------------------------------------------
            Average round time:   14413.00 ms              +7.07%

*) measured against: py20 (rounds=10, warp=20)



From skip@mojam.com (Skip Montanaro)  Mon Jan 29 18:07:26 2001
From: skip@mojam.com (Skip Montanaro) (Skip Montanaro)
Date: Mon, 29 Jan 2001 12:07:26 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>
References: <14964.37324.642566.602319@beluga.mojam.com>
 <20010129012355.A14763@glacier.fnational.com>
Message-ID: <14965.45406.933528.53857@beluga.mojam.com>

    Neil> You have to do "make oldsharedmods".  

This did the trick.  This should be emblazoned in big red letters somewhere
if the decision is made to not include oldsharedmods as a dependency for the
all target.

Thx,

Skip



From gvwilson@ca.baltimore.com  Mon Jan 29 18:19:21 2001
From: gvwilson@ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 13:19:21 -0500
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <20010129162012.32158ED49@mail.python.org>
Message-ID: <001501c08a20$00dca2a0$770a0a0a@nevex.com>

> > > [Ping]
> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...

> "Tim Peters" <tim.one@home.com>
> "if (k, v) in dict" is clearly useless...
> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean "for x in dict.items()".
> I also think that defining "x in dict" but not "for x in dict" will be
> confusing.

[Greg]
Quick poll (four people): if the expression "if a in b" works,
then all four expected "for a in b" to work as well.  This is
also my intuition; are there any exceptions in really existing
Python?

> [Guido]
>     for key in dict: ...		# ... over keys
>     for key:value in dict: ...	# ... over items

[Greg]
I'm probably revealing my ignorance of Python's internals here,
but can the iteration protocol be extended so that the object
(in this case, the dict) is told the number and type(s) of the
values the loop is expecting?  With:

    for key in dict: ...

the dict would be asked for one value; with:

    for (key, value) in dict:

the dict would be told that a two-element tuple was expected,
and so on.  This would allow multi-dimensional structures
(e.g. NumPy arrays) to do things like:

    for (i, j, k) in array:		# please give me three indices

and:

    for ((i, j, k), v) in array:	# three indices and value

> [Guido]
>     for index:value in list: ...	# ... over zip(range(len(list)), list)

How do you feel about:

    for i in seq.keys():		# strings, tuples, etc.

"keys()" is kind of strange ("indices" or something would be
more natural), *but* this allows uniform iteration over all
built-in collections:

    def showem(c):
        for i in c.keys():
            print i, c[i]
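
Built-in sequences have no keys() method today, so showem() as written
only works for mappings.  A rough sketch of the same idea that runs
without changing the built-in types, using a hypothetical helper
called pairs() (not an existing library function):

```python
def pairs(collection):
    """Yield (index_or_key, value) for a mapping or a sequence."""
    if hasattr(collection, "keys"):
        # mapping: walk its keys
        for key in collection.keys():
            yield key, collection[key]
    else:
        # sequence: the "keys" are the integer indices
        for index in range(len(collection)):
            yield index, collection[index]


def showem(c):
    # Same loop as above, but returning the lines instead of printing.
    return ["%r %r" % (i, v) for i, v in pairs(c)]


print(showem("ab"))
print(showem({"x": 1}))
```

The duck-typing test on "keys" is a design shortcut; a real proposal
would need to say what counts as a mapping.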

Greg



From bckfnn@worldonline.dk  Mon Jan 29 18:31:48 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 18:31:48 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il> <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <3a75aba9.31537178@smtp.worldonline.dk>

On Mon, 29 Jan 2001 16:04:47 +0200 (IST), you wrote:

>On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn@worldonline.dk (Finn Bock) wrote:
> 
>> Thanks. With this change, Jython too can complete the test_opcodes. In
>> Jython a code object can never compare equal to anything but itself.
>
>Great! I'm happy to have helped.
>I'm starting to wonder what the tests really test: the language definition,
>or accidents of the implementation?

Based on the amount of code in test_opcodes dedicated to code
comparison, I doubt this particular situation was an accident.

The problems I have had with the test suite are better described as
accidents of the tests themselves. From test_extcall:

  We expected (repr): "g() got multiple values for keyword argument 'b'"
  But instead we got: "g() got multiple values for keyword argument 'a'"

This is caused by a difference in iteration over a dictionary.

Or from test_import:

  test test_import crashed -- java.lang.ClassFormatError:
  java.lang.ClassFormatError: @test$py (Illegal Class name "@test$py")

where '@' isn't allowed in java classnames.

These are failures that have very little to do with the things the tests
are about and nothing at all to do with the language definition.

regards,
finn


From cgw@alum.mit.edu  Mon Jan 29 18:35:58 2001
From: cgw@alum.mit.edu (Charles G Waldman)
Date: Mon, 29 Jan 2001 12:35:58 -0600 (CST)
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <001501c08a20$00dca2a0$770a0a0a@nevex.com>
References: <20010129162012.32158ED49@mail.python.org>
 <001501c08a20$00dca2a0$770a0a0a@nevex.com>
Message-ID: <14965.47118.135246.700571@sirius.net.home>

Greg Wilson writes:

 > This would allow multi-dimensional structures
 > (e.g. NumPy arrays) to do things like:
 > 
 >     for (i, j, k) in array:		# please give me three indices
 > 
 > and:
 > 
 >     for ((i, j, k), v) in array:	# three indices and value

And what if I had, for example, a 3-dimensional array where the values
are 3-tuples?  Would "for (i,j,k) in array" refer to the indices or the
values?
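
One way to sidestep that ambiguity is to make the caller name the view
explicitly.  The sketch below assumes a plain nested list standing in
for the array, and a made-up helper indexed(); it is not NumPy's API.

```python
from itertools import product


def indexed(grid, shape):
    """Yield ((i, j, k), value) for a nested list with the given shape."""
    for i, j, k in product(*(range(n) for n in shape)):
        yield (i, j, k), grid[i][j][k]


# A 1 x 2 x 2 "array" whose values are themselves 3-tuples.
grid = [[[(i, j, k + 10) for k in range(2)] for j in range(2)]
        for i in range(1)]

entries = list(indexed(grid, (1, 2, 2)))
for (i, j, k), value in entries:
    print((i, j, k), value)
```

Because the loop target is always ((i, j, k), value), there is no
question whether a bare 3-tuple means indices or a value.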



From mal@lemburg.com  Mon Jan 29 19:03:41 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:03:41 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75BE8D.1B7673EE@lemburg.com>

With all this confusion about how to actually write the
iteration on dictionary items, wouldn't it make more sense
to implement an extension module which then provides a __getitem__
style iterator for dictionaries by interfacing to PyDict_Next() ?

The module could have three different iterators:

1. iterate over items
2.     ... over keys
3.     ... over values

The reasoning behind this is that the __getitem__ interface
is well established and this doesn't introduce any new
syntax while still providing speed and flexibility.
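
A pure-Python sketch of the interface, for illustration only: the
class names are invented here, and the snapshot list stands in for
what a C extension would do incrementally with PyDict_Next(), so it
shows the shape of the idea but none of the speed.

```python
class DictItems:
    """__getitem__-style iteration over (key, value) pairs."""

    def __init__(self, mapping):
        # Snapshot of the items; a C version would walk the dict lazily.
        self._items = list(mapping.items())

    def __getitem__(self, index):
        # IndexError past the end is what terminates the for-loop.
        return self._items[index]


class DictKeys(DictItems):
    def __getitem__(self, index):
        return self._items[index][0]


class DictValues(DictItems):
    def __getitem__(self, index):
        return self._items[index][1]


d = {"a": 1, "b": 2}
seen = [(k, v) for k, v in DictItems(d)]
print(seen, list(DictKeys(d)), list(DictValues(d)))
```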

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Mon Jan 29 18:08:16 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 19:08:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75B190.3FD2A883@lemburg.com>

Guido van Rossum wrote:
> 
> > Dictionaries are not sequences. I wonder what order a user of
> > for k,v in dict: (or whatever other of this proposal you choose)
> > will expect...
> 
> The same order that for k,v in dict.items() will yield, of course.

And then people find out that the order has some sorting
properties and start to use it... "how to sort a dictionary?"
comes up again, every now and then.
 
> > Please also take into account that dictionaries are *mutable*
> > and their internal state is not defined to e.g. not change due to
> > lookups (take the string optimization for example...), so exposing
> > PyDict_Next() in any way to Python will cause trouble. In the end,
> > you will need to create a list or tuple to iterate over one way
> > or another, so why bother overloading for-loops w/r to dictionaries ?
> 
> Actually, I was going to propose to play dangerously here: the
> 
>     for k:v in dict: ...
> 
> syntax I proposed in my previous message should indeed expose
> PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> don't have much proof) that most loops over dicts don't mutate the
> dict.
> 
> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

You mean: mark it read-only ? That would be a "nice to have"
property for a lot of mutable types indeed -- sort of like
low-level locks. This would be another candidate for an object flag
(much like the one Fred wants to introduce for weak referenced
objects).
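
For comparison, current CPython 3 releases behave much like the flag
suggested above, at least where the size of the dict changes: adding a
key while a for-loop is walking the dict raises RuntimeError.  A short
sketch, assuming such an interpreter:

```python
d = {"a": 1, "b": 2}
message = None
try:
    for key in d:
        # Inserting a new key while the loop is running.
        d[key + "_copy"] = d[key]
except RuntimeError as exc:
    message = str(exc)

print(message)
```

This is a check on iteration, not a general read-only flag on the
object.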

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From guido@digicool.com  Mon Jan 29 19:22:07 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 14:22:07 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 19:08:16 +0100."
 <3A75B190.3FD2A883@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
Message-ID: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>

> > > Dictionaries are not sequences. I wonder what order a user of
> > > for k,v in dict: (or whatever other of this proposal you choose)
> > > will expect...
> > 
> > The same order that for k,v in dict.items() will yield, of course.
> 
> And then people find out that the order has some sorting
> properties and start to use it... "how to sort a dictionary?"
> comes up again, every now and then.

I don't understand why you bring this up.  We're not revealing
anything new here, the random order of dict items has always been part
of the language.  The answer to "how to sort a dict" should be "copy
it into a list and sort that."
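
Spelled out, with sample data invented for the example (the sorted()
call with a key function is a convenience of later Python versions):

```python
d = {"pear": 3, "apple": 7, "fig": 1}

# Copy the items into a list and sort that; tuples sort by key first.
items = list(d.items())
items.sort()

# The same copy-then-sort idea, ordered by value instead.
by_value = sorted(d.items(), key=lambda pair: pair[1])

print(items)
print(by_value)
```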

Or am I missing something?

> > > Please also take into account that dictionaries are *mutable*
> > > and their internal state is not defined to e.g. not change due to
> > > lookups (take the string optimization for example...), so exposing
> > > PyDict_Next() in any way to Python will cause trouble. In the end,
> > > you will need to create a list or tuple to iterate over one way
> > > or another, so why bother overloading for-loops w/r to dictionaries ?
> > 
> > Actually, I was going to propose to play dangerously here: the
> > 
> >     for k:v in dict: ...
> > 
> > syntax I proposed in my previous message should indeed expose
> > PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> > don't have much proof) that most loops over dicts don't mutate the
> > dict.
> > 
> > Maybe we could add a flag to the dict that issues an error when a new
> > key is inserted during such a for loop?  (I don't think the key order
> > can be affected when a key is *deleted*.)
> 
> You mean: mark it read-only ? That would be a "nice to have"
> property for a lot of mutable types indeed -- sort of like
> low-level locks. This would be another candidate for an object flag
> (much like the one Fred wants to introduce for weak referenced
> objects).

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From gvwilson@ca.baltimore.com  Mon Jan 29 19:38:50 2001
From: gvwilson@ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 14:38:50 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1124 - 13 msgs
In-Reply-To: <20010129193101.7BF83EF62@mail.python.org>
Message-ID: <001a01c08a2b$1ba5a040$770a0a0a@nevex.com>

> Greg Wilson writes:
>  > This would allow multi-dimensional structures
>  > (e.g. NumPy arrays) to do things like:
>  >     for (i, j, k) in array:
>  > and:
>  >     for ((i, j, k), v) in array:	# three indices and value

> Charles Waldman asks:
> And what if I had, for example, a 3-dimensional array where the values
> are 3-tuples?  Would "for (i,j,k) in array" refer to the 
> indices or the values?

Greg Wilson writes:
That would be up to the module's implementer --- my idea was to have
the 'for' loop provide more information to the object being iterated
over, so that it could "do the right thing" (just as objects do right
now with "x[i]").

Greg


From mal@lemburg.com  Mon Jan 29 19:45:46 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:45:46 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <3A75C86A.3A4236E8@lemburg.com>

Guido van Rossum wrote:
> 
> > > > Dictionaries are not sequences. I wonder what order a user of
> > > > for k,v in dict: (or whatever other of this proposal you choose)
> > > > will expect...
> > >
> > > The same order that for k,v in dict.items() will yield, of course.
> >
> > And then people find out that the order has some sorting
> > properties and start to use it... "how to sort a dictionary?"
> > comes up again, every now and then.
> 
> I don't understand why you bring this up.  We're not revealing
> anything new here, the random order of dict items has always been part
> of the language.  The answer to "how to sort a dict" should be "copy
> it into a list and sort that."
> 
> Or am I missing something?

I just wanted to hint at a problem which iterating over items
in an unordered set can cause. Especially new Python users will find 
it confusing that the order of the items in an iteration can change
from one run to the next.

Not much of an argument, but I like explicit programming more
than magic under the cover. What we really want is iterators for
dictionaries, so why not implement these instead of tweaking
for-loops.

If you are looking for speedups w/r to for-loops, applying a
different indexing technique in for-loops would go a lot further
and provide better performance not only to dictionary loops,
but also to other sequences.

I have had good experiences with a special counter object 
(sort of like a mutable integer) which is used instead of the 
iteration index integer in the current implementation. 

Using an iterator object instead of the integer + __getitem__
call machinery would allow more flexibility for all kinds of
sequences or containers. There could be an iterator type for
dictionaries, one for generic __getitem__ style sequences,
one for lists and tuples, etc. All of these could include
special logic to get the most out of the targeted datatype.

Well, just a thought...
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From esr@thyrsus.com  Mon Jan 29 20:02:47 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:02:47 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 02:22:07PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <20010129150247.B10191@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> > > Maybe we could add a flag to the dict that issues an error when a new
> > > key is inserted during such a for loop?  (I don't think the key order
> > > can be affected when a key is *deleted*.)
> > 
> > You mean: mark it read-only ? That would be a "nice to have"
> > property for a lot of mutable types indeed -- sort of like
> > low-level locks. This would be another candidate for an object flag
> > (much like the one Fred wants to introduce for weak referenced
> > objects).
> 
> Yes.

For different reasons, I'd like to be able to set a constant flag on an
object instance.  Simple semantics: if you try to assign to a
member or method, it throws an exception.

Application?  I have a large Python program that goes to a lot of effort
to build elaborate context structures in core.  It would be nice to know
they can't be even inadvertently trashed without throwing an exception I 
can watch for.
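
Something close to this can be approximated in pure Python by
overriding attribute assignment.  The class below is a hypothetical
sketch (Freezable and freeze() are invented names), and it is shallow:
it guards the instance's own attributes, not mutable objects they
refer to.

```python
class Freezable:
    _frozen = False  # class-level default; set per instance by freeze()

    def freeze(self):
        # Bypass our own __setattr__ so the flag itself can be set.
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError("cannot assign %r: instance is frozen" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise AttributeError("cannot delete %r: instance is frozen" % name)
        object.__delattr__(self, name)


ctx = Freezable()
ctx.depth = 3
ctx.freeze()
try:
    ctx.depth = 4
    blocked = False
except AttributeError:
    blocked = True

print(blocked, ctx.depth)
```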
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256


From esr@thyrsus.com  Mon Jan 29 20:09:14 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:09:14 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>; from mal@lemburg.com on Mon, Jan 29, 2001 at 08:45:46PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <20010129150914.C10191@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.

Which reminds me...

There's not much I miss from C these days, but one thing I wish Python
had is a more general for-loop.  The C semantics that let you have 
any initialization, any termination test, and any iteration you like
are rather cool.

Yes, I realize that

	for (<init>; <test>; <step>) {<body>}

can be simulated with:

	<init>
	while 1:
		if not <test>:
			break
		<body>
		<step>

Still, having them spatially grouped the way a C for does it is nice.
Makes it easier to see invariants, I think.
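
With generators, which later Python versions provide, the three parts
can at least be grouped on one line.  A sketch using a made-up helper
c_for(); it is not a built-in and only handles a single loop variable:

```python
def c_for(init, test, step):
    """Yield values the way for (v = init; test(v); v = step(v)) would."""
    value = init
    while test(value):
        yield value
        value = step(value)


# for (n = 1; n < 100; n *= 2) { ... }
powers = [n for n in c_for(1, lambda n: n < 100, lambda n: n * 2)]
print(powers)
```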
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Rightful liberty is unobstructed action, according to our will, within limits
drawn around us by the equal rights of others."
	-- Thomas Jefferson


From moshez@zadka.site.co.il  Mon Jan 29 20:29:53 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 22:29:53 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>
References: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>, <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
 <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <20010129202953.D1498A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 10:30:17 -0500, Guido van Rossum <guido@digicool.com> wrote:

> It's good to test conformance to the language definition, but this is
> also a regression test for the implementation.  The "accidents of the
> implementation" definitely need to be tested.  E.g. if we decide that
> repr(s) uses \n rather than \012 or \x0a, this should be tested too.
> The language definition gives the implementer a choice here; but once
> the implementer has made a choice, it's good to have a test that tests
> that this choice is implemented correctly.

I agree.

> Perhaps there should be several parts to the regression test,
> e.g. language conformance, library conformance, platform-specific
> features, and implementation conformance?

This sounds like a good idea...probably for the 2.2 timeline.
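
A toy illustration of what such a split might look like.  The function
names are invented, and treating the eval(repr(s)) round trip as a
cross-implementation property is an assumption made for the example,
not a quotation from the reference manual.

```python
def check_portable_property():
    # Intended to hold on any implementation: repr() of a string
    # evaluates back to an equal string.
    for s in ["plain", "line\nbreak", "tab\there"]:
        assert eval(repr(s)) == s


def check_implementation_choice():
    # CPython's particular choice: newline is spelled \n in repr(),
    # not \012 or \x0a.
    assert repr("\n") == "'\\n'"


check_portable_property()
check_implementation_choice()
results = "both checks passed"
print(results)
```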
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From tim.one@home.com  Mon Jan 29 21:51:56 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 16:51:56 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>

[Moshe Zadka]
> ...
> I'm starting to wonder what the tests really test: the language
> definition, or accidents of the implementation?

You'd be amazed (appalled?) at how hard it is to separate them.

In two previous lives as a Big Iron compiler hacker, we routinely had to get
our compilers validated by a govt agency before any US govt account would be
allowed to buy our stuff; e.g.,

    http://www.itl.nist.gov/div897/ctg/vpl/language.htm

This usually *started* as a two-day process, flying the inspector to our
headquarters, taking perhaps 2 minutes of machine time to run the test
suite, then sitting around that day and into the next arguing about whether
the "failures" were due to non-standard assumptions in the tests, or
compiler bugs.  It was almost always the former, but sometimes that didn't
get fully resolved for months (if the inspector was being particularly
troublesome, it could require getting an Official Interpretation from the
relevant stds body -- not swift!).  (BTW, this is one reason huge customers
are often very reluctant to move to a new release:  the validation process
can be very expensive and drag on for months)

>>> def f():
...     global g
...     g += 1
...     return g
...
>>> g = 0
>>> d = {f(): f()}
>>> d
{2: 1}
>>>

The Python Lang Ref doesn't really say whether {2: 1} or {1: 2} "should be"
the result, nor does it say it's implementation-defined.  If you *asked*
Guido what he thought it should do, he'd probably say {1: 2} (not much of a
guess:  I asked him in the past, and that's what he did say <wink>).
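
A less cryptic way to probe the same question is to record the order of
evaluation directly.  This is only a sketch; tick() and calls are names made
up for illustration, and which order you see depends on the interpreter you
run it under (the session above shows value-before-key):

```python
calls = []

def tick(label):
    """Record which operand is being evaluated and return a running count."""
    calls.append(label)
    return len(calls)

# One key expression, one value expression, each with a visible side effect.
d = {tick("key"): tick("value")}

# calls now holds the evaluation order; d is {1: 2} if the key went first,
# {2: 1} if the value went first.
print(calls, d)
```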

Something "like that" can show up in the test suite, but buried under layers
of obfuscating accidents.  Nobody is likely to realize it in the absence of
a failure motivating people to search for it.

Which is a trap:  sometimes ours was the only compiler (of dozens and
dozens) that had *ever* "failed" a particular test.  This was most often the
case at Cray Research, which had bizarre (but exceedingly fast -- which is
what Cray's customers valued most) floating-point arithmetic.  I recall one
test in particular that failed because Cray's was the only box on earth that
set I to 1 in

    INTEGER I
    I = 6.0/3.0

Fortran doesn't define that the result must be 2.  But-- you guessed
it --neither does Python.

Cute:  at KSR, INT(6.0/3.0) did return 2 -- but INT(98./49.) did not <wink>.

then-again-the-python-test-suite-is-still-shallow-ly y'rs  - tim



From hughett@mercur.uphs.upenn.edu  Mon Jan 29 22:05:22 2001
From: hughett@mercur.uphs.upenn.edu (Paul Hughett)
Date: Mon, 29 Jan 2001 17:05:22 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
Message-ID: <200101292205.RAA18790@mercur.uphs.upenn.edu>

tim says:

> Cray's was the only box on earth that set I to 1 in

>    INTEGER I
>    I = 6.0/3.0

> Fortran doesn't define that the result must be 2.  But-- you guessed
> it --neither does Python.

I would _guess_ that the IEEE 754 floating point standard does require
that, but I haven't actually gotten my hands on a copy of the standard
yet.  If it doesn't, I may have to stop writing code that depends on
the assumption that floating point computation is exact for exactly
representable integers.  If so, then we're reasonably safe; there
aren't many non-IEEE machines left these days.

Un-lurking-ly yours,

Paul Hughett


From tim.one@home.com  Mon Jan 29 22:53:43 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 17:53:43 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101292205.RAA18790@mercur.uphs.upenn.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBEIMAA.tim.one@home.com>

[Paul Hughett]
> I would _guess_ that the IEEE 754 floating point standard does require
> that [6./3. == 2.],

It does, but 754 is silent on how languages may or may not *bind* to its
semantics.  The C99 std finally addresses that (15 years after 754), and
Java does too (albeit in a way Kahan despises), but that's about it for
"name brand" <wink> languages.

> ...
> If it doesn't, I may have to stop writing code that depends on
> the assumption that floating point computation is exact for exactly
> representable integers.  If so, then we're reasonably safe; there
> aren't many non-IEEE machines left these days.

I'm afraid you've got no guarantees even on a box with 100% conforming 754
hardware.  One of the last "mystery bugs" I helped track down at my
previous employer only showed up under Intel's C++ compiler.  It turned out
the compiler was looking for code of the form:

    double *a, *b, scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] / scale;
    }

and rewriting it as:

    double __temp = 1./scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] * __temp;
    }

for speed.  As time goes on, PC compilers are becoming more and more like
Cray's and KSR's in this respect:  float division is much more expensive
than float mult, and so variations of "so multiply by the reciprocal
instead" are hard for vendors to resist.  And, e.g., under 754 double rules,

   (17. * 123.) * (1./123.)

must *not* yield exactly 17.0 if done wholly in 754 double (but then 754
says nothing about how any language maps that string to 754 operations).
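
To see whether a given box is affected, the two strategies can be compared
side by side.  A rough sketch, assuming Python floats are 754 doubles on your
platform; the function names are invented here and are not from any compiler
or library:

```python
def divide_each(values, scale):
    """What the source code asks for: one division per element."""
    return [v / scale for v in values]

def multiply_by_reciprocal(values, scale):
    """What the optimizer substitutes: one division, then only multiplies."""
    recip = 1.0 / scale
    return [v * recip for v in values]

def mismatches(values, scale):
    """Return the inputs for which the two strategies give different results."""
    exact = divide_each(values, scale)
    fast = multiply_by_reciprocal(values, scale)
    return [v for v, e, f in zip(values, exact, fast) if e != f]

# Exact multiples of 123, so true division yields exact small integers.
multiples = [float(k * 123) for k in range(1, 50)]
print(mismatches(multiples, 123.0))
```

Any value printed is an input where "multiply by the reciprocal" fails to
return the integer that honest division gives.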

if-you-like-logic-chopping-you'll-love-arguing-stds<wink>-ly y'rs  - tim



From guido@digicool.com  Mon Jan 29 23:59:34 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 18:59:34 -0500
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:30:56 PST."
 <20010123003056.A28309@glacier.fnational.com>
References: <20010123003056.A28309@glacier.fnational.com>
Message-ID: <200101292359.SAA20364@cj20424-a.reston1.va.home.com>

> Why is the configure.in file set to always use "install-sh"?
> There is a comment that says:
> 
>     # Install just never works :-(
> 
> I don't think that statement is accurate.  /usr/bin/install works
> quite well on my machine.  The only comments I can find in the
> changelog are:
> 
>     revision 1.16
>     date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
>     add INSTALL_PROGRAM and INSTALL_DATA; check for getopt
> 
> and:
> 
>     revision 1.5
>     date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
>     Simplify value of INSTALL (always 'cp').
> 
> Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
> documentation seems to indicate that it does what we want.

Neil,

It's too long ago for me to remember, and I bet this was before
AC_PROG_INSTALL.  If there's a reason to prefer a working "install"
over install-sh, feel free to do the right thing!  (You're in charge
of the Makefile anyway now, it seems. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From skip@mojam.com (Skip Montanaro)  Tue Jan 30 00:17:25 2001
From: skip@mojam.com (Skip Montanaro)
Date: Mon, 29 Jan 2001 18:17:25 -0600 (CST)
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
Message-ID: <14966.2069.950895.627663@beluga.mojam.com>

After reading through this thread and noticing (but not paying close
attention to) all the related posts on c.l.py (subject: "in for dicts"), it
seems to me that the whole "if/for something in dict" thing needs to be
hashed out in a PEP.  There were a fair amount of "Python's changing too
fast" rants when 2.0 was released.  Adding a major feature such as this at
the 2.1 stage is only going to generate that many more rants.  The fact that
it was easy for Thomas to implement "if key in dict" doesn't make the
overall concept less controversial.  There are apparently lots of varying
opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
well warrant its own.

That said, I have plenty enough on my plate trying to keep Mojam afloat
these days, so I can't step into the crevasse, just observe that it looks to
me like a very long ways to the bottom... ;-)

Skip


From guido@digicool.com  Tue Jan 30 00:22:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 19:22:58 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Mon, 29 Jan 2001 18:17:25 CST."
 <14966.2069.950895.627663@beluga.mojam.com>
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <200101300022.TAA21244@cj20424-a.reston1.va.home.com>

> After reading through this thread and noticing (but not paying close
> attention to) all the related posts on c.l.py (subject: "in for dicts"), it
> seems to me that the whole "if/for something in dict" thing needs to be
> hashed out in a PEP.  There were a fair amount of "Python's changing too
> fast" rants when 2.0 was released.  Adding a major feature such as this at
> the 2.1 stage is only going to generate that many more rants.  The fact that
> it was easy for Thomas to implement "if key in dict" doesn't make the
> overall concept less controversial.  There are apparently lots of varying
> opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
> Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
> well warrant its own.

Excellent.  Good reminder also that this shouldn't go into 2.1 --
clearly the design space is too complicated for a quick decision.

> That said, I have plenty enough on my plate trying to keep Mojam afloat
> these days, so I can't step into the crevasse, just observe that it looks to
> me like a very long ways to the bottom... ;-)

I'm not able to lead such a PEP effort myself either, but I hope
*someone* will be.  This PEP has a good chance for 2.2 though (what
with BDFL approval and all :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one@home.com  Tue Jan 30 01:39:17 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:39:17 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>

[Guido]
> I did a less sophisticated count but come to the same conclusion:
> iterations over items() are (somewhat) more common than over keys(),
> and values() are 1-2 orders of magnitude less common.  My numbers:
>
> $ cd python/src/Lib
> $ grep 'for .*items():' *.py | wc -l
>      47
> $ grep 'for .*keys():' *.py | wc -l
>      43
> $ grep 'for .*values():' *.py | wc -l
>       2

I like my larger sample and anal methodology better <wink>.  A closer look
showed that it may have been unduly biased by the mass of files in
Lib/encodings/, where

encoding_map = {}
for k,v in decoding_map.items():
    encoding_map[v] = k

is at the end of most files (btw, MAL, that's the answer to your question:
people would expect "the same" ordering you expected there, i.e. none in
particular).

> ...
> I don't attach much value to the readability argument: typically, one will
> write "for key in dict" or "for name in dict" and then it's obvious
> what is meant.

Well, "fiddlesticks" comes to mind <0.9 wink>.  If I've got a dict mapping
phone numbers to names, "for name in dict" is dead backwards.

    for vevent in keydefs.keys():
    for x in self.subdirs.keys():
    for name in lsumdict.keys():
    for locale in self.descriptions.keys():
    for name in attrs.keys():
    for func in other.top_level.keys():
    for func in target.keys():
    for i in u2.keys():
    for s in d.keys():
    for url in self.bad.keys():

are other cases in the CVS tree where I don't think the name makes it
obvious in the absence of ".keys()".

But I don't personally give any weight to whether people can guess what
something does at first glance.  My rule is that it doesn't matter, provided
it's (a) easy to learn; and (especially), (b) hard to *forget* once you've
learned it.  A classic example is Python's "points between elements"
treatment of slice indices:  few people guess right what that does at first
glance, but once they "get it" they're delighted and rarely mess up again.
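
For anyone who hasn't "got it" yet, a short illustration of indices naming
the gaps between elements rather than the elements themselves (the string is
arbitrary):

```python
s = "python"

# Indices name the gaps between elements:
#   | p | y | t | h | o | n |
#   0   1   2   3   4   5   6
middle = s[1:4]               # everything between gap 1 and gap 4
assert middle == "yth"

# Two consequences that make the rule hard to forget once learned:
assert len(s[1:4]) == 4 - 1   # slice length is just the difference
for i in range(len(s) + 1):
    assert s[:i] + s[i:] == s  # cutting at any gap loses nothing
```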

And I think this is "like that".

> ...
> But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
> has even asked me for a has_item() method).

Yup.

> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean
> "for x in dict.items()".

That's why I brought it up -- it's not entirely clear what's to be done
here.

> I also think that defining "x in dict" but not "for x in dict" will
> be confusing.
>
> So we need to think more.

The hoped-for next step indeed.

> How about:
>
>     for key in dict: ...		# ... over keys
>
>     for key:value in dict: ...		# ... over items
>
> This is syntactically unambiguous (a colon is currently illegal in
> that position).

Cool!  Can we resist adding

    if key:value in dict

for "parallelism"?  (I know I can ...)  2/3rd of these are marginally more
attractive:

    for key: in dict:    # over dict.keys()
    for :value in dict:  # over dict.values()
    for : in dict:       # a delay loop

> This also suggests:
>
>     for index:value in list: ...	# ... over zip(range(len(list)), list)
>
> which doesn't strike me as bad or ugly, and would fulfill my brother's
> dearest wish.

You mean besides the one that you fry in hell for not adding "for ...
indexing"?  Ya, probably.
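
For comparison, here is what the two colon forms quoted above would stand
for, spelled with things that already exist.  A sketch only: indexed() and
pairs() are names invented for illustration, and list() is there because
zip() and items() return lazy objects in current Pythons:

```python
def indexed(seq):
    """What "for index:value in list" would loop over."""
    return list(zip(range(len(seq)), seq))

def pairs(mapping):
    """What "for key:value in dict" would loop over."""
    return list(mapping.items())

for index, value in indexed(["spam", "eggs"]):
    print(index, value)

for number, name in pairs({"555-1234": "Guido"}):
    print(number, name)
```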

> (And why didn't we think of this before?)

Best guess:  we were focused exclusively on sequences, and a colon just
didn't suggest itself in that context.  Second-best guess:  having finally
approved one of these gimmicks, you finally got desperate enough to make it
work <wink>.

ponderingly y'rs  - tim



From tim.one@home.com  Tue Jan 30 01:58:59 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBPIMAA.tim.one@home.com>

[Guido]
> ...
> I'm expecting (though don't have much proof) that most loops over
> dicts don't mutate the dict.

Safe bet!  I do recall writing one once:  it del'ed keys for which the
associated count was 1, because the rest of the algorithm was only
interested in duplicates.
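
A sketch of that kind of loop, with a made-up function name.  It walks a
snapshot of the keys so the dict itself is free to shrink; the list() call
makes the snapshot explicit:

```python
def keep_duplicates(counts):
    """Delete every key whose count is 1, leaving only the duplicates."""
    # Iterate over a copy of the keys, not over the dict being mutated.
    for key in list(counts.keys()):
        if counts[key] == 1:
            del counts[key]
    return counts

print(keep_duplicates({"a": 1, "b": 2, "c": 1, "d": 3}))
```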

> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

That latter is true but specific to this implementation.  "Can't mutate the
dict period" is easier to keep straight, and probably harmless in practice
(if not, it could be relaxed later).  Recall that a similar trick is played
during list.sort(), replacing the list's type pointer for the duration (to
point to an internal "immutable list" type, same as the list type except the
"dangerous" slots point to a function that raises an "immutable list"
TypeError).  Then no runtime expense is incurred for regular lists to keep
checking flags.  I thought of this as an elegant use for switching types at
runtime; you may still be appalled by it, though!
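
The same trick can be imitated at the Python level by reassigning __class__.
This is a sketch of the idea only, not the C code behind list.sort(); the
class names are invented, and it assumes an interpreter that permits
__class__ assignment between these two list subclasses:

```python
class FreezableList(list):
    """A list frozen by swapping its class instead of checking a flag."""

    def freeze(self):
        self.__class__ = _FrozenList

    def unfreeze(self):
        self.__class__ = FreezableList


def _refuse(self, *args, **kwargs):
    raise TypeError("immutable list")


class _FrozenList(FreezableList):
    # Only the "dangerous" slots are replaced; reads are inherited untouched,
    # so an unfrozen list pays nothing for the feature.
    append = extend = insert = remove = pop = sort = reverse = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse


x = FreezableList([3, 1, 2])
x.freeze()       # x.append(4) would now raise TypeError
x.unfreeze()
x.append(4)
```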



From tim.one@home.com  Tue Jan 30 02:07:36 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:07:36 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75B190.3FD2A883@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECAIMAA.tim.one@home.com>

[Guido]
> The same order that for k,v in dict.items() will yield, of course.

[MAL]
> And then people find out that the order has some sorting
> properties and start to use it...

Except that it has none.  dict insertion has never used any comparison
outcome beyond "equal"/"not equal", so any ordering you think you see is--
and always was --an illusion.



From guido@digicool.com  Tue Jan 30 02:06:35 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:06:35 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 20:39:17 EST."
 <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>
Message-ID: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>

This is all PEP material now.  Tim, do you want to own the PEP?  It
seems just up your alley!

> Cool!  Can we resist adding
> 
>     if key:value in dict
> 
> for "parallelism"?  (I know I can ...)

That's easy to resist because, unlike ``for key:value in dict'', it's
not unambiguous: ``if key:value in dict'' is already legal syntax
currently, with 'key' as the condition and 'value in dict' as the (not
particularly useful) body of the if statement.
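
The difference is easy to check mechanically by asking the compiler.  A small
sketch; is_legal() is a helper invented here, and nothing is executed, only
parsed:

```python
def is_legal(source):
    """Return True if the running Python's grammar accepts the source text."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True

# Parses today as:  if key: (value in dict)
already_legal = is_legal("if key:value in dict\n")

# A colon in the for target has no existing meaning to collide with.
colon_in_for = is_legal("for key:value in dict: pass\n")

print(already_legal, colon_in_for)
```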

> > (And why didn't we think of this before?)
> 
> Best guess:  we were focused exclusively on sequences, and a colon just
> didn't suggest itself in that context.  Second-best guess:  having finally
> approved one of these gimmicks, you finally got desperate enough to make it
> work <wink>.

I'm certainly more comfortable with just ``for key in dict'' than with
the whole slew of extensions using colons.

But, again, that's for the PEP to fight over.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 30 02:15:04 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:15:04 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:09:14 EST."
 <20010129150914.C10191@thyrsus.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>
 <20010129150914.C10191@thyrsus.com>
Message-ID: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>

[ESR]
> There's not much I miss from C these days, but one thing I wish Python
> had is a more general for-loop.  The C semantics that let you have 
> any initialization, any termination test, and any iteration you like
> are rather cool.
> 
> Yes, I realize that
> 
> 	for (<init>; <test>; <step>) {<body>}
> 
> can be simulated with:
> 
> 	<init>
> 	while 1:
> 		if not <test>:
> 			break
> 		<body>
> 		<step>
> 
> Still, having them spatially grouped the way a C for does it is nice.
> Makes it easier to see invariants, I think.
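
A C for loop leaves when <test> becomes false and runs <step> after every
pass, so a while-based simulation has to break on the negated test and put
the step last.  A runnable sketch of that shape, with the three clauses
passed in as functions (c_style_for is a name invented here):

```python
def c_style_for(init, test, step, body):
    """Run body the way C's for (<init>; <test>; <step>) {<body>} would."""
    state = init()
    while True:
        if not test(state):    # leave as soon as the test fails
            break
        body(state)
        state = step(state)    # the step runs after every pass

seen = []
# for (i = 10; i > 0; i -= 3) seen.append(i)
c_style_for(lambda: 10, lambda i: i > 0, lambda i: i - 3, seen.append)
print(seen)
```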

Hm, I've seen too many ugly C for loops to have much appreciation for
it.  I can recognize and appreciate the few common forms that clearly
iterate over an array; most other forms look rather contorted to me.
Check out the Python C sources; if you find anything more complicated
than ``for (i = n; i > 0; i--)'' I probably didn't write
it. :-)

Common abominations include:

- writing a while loop as for(;<test>;)

- putting arbitrary initialization code in <init>

- having an empty condition, so the <step> becomes an arbitrary
  extension of the body that's written out-of-sequence

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Jan 30 02:19:12 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:19:12 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>

[MAL]
> I just wanted to hint at a problem which iterating over items
> in an unordered set can cause.  Especially new Python users will find
> it confusing that the order of the items in an iteration can change
> from one run to the next.

Do they find "for k, v in dict.items()" confusing now?  Would be the same.

> ...
> What we really want is iterators for dictionaries, so why not
> implement these instead of tweaking for-loops.

Seems an unrelated topic:  would "iterators for dictionaries" solve the
supposed problem with iteration order?

> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.
>
> I have had some good experience with a special counter object
> (sort of like a mutable integer) which is used instead of the
> iteration index integer in the current implementation.

Please quantify, if possible.  My belief (based on past experiments) is that
in loops fancier than

    for i in range(n):
        pass

the loop overhead quickly falls into the noise even now.
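
One way to put numbers on that, as a sketch using the standard timeit module;
time_loop() and the loop bodies are arbitrary choices made for illustration,
and the timings will vary by machine and interpreter:

```python
import timeit

def time_loop(body, n=1000, repeat=3):
    """Best-of-`repeat` time for 100 runs of a for loop over range(n)."""
    stmt = "for i in range(%d):\n    %s" % (n, body)
    return min(timeit.repeat(stmt, number=100, repeat=repeat))

empty = time_loop("pass")              # loop overhead alone
busy = time_loop("x = (i * i) % 7")    # overhead plus a modest body

print("empty: %.6f s   busy: %.6f s" % (empty, busy))
```

The gap between the two figures is the cost of the body; the first figure is
the most a faster loop could ever save.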

> Using an iterator object instead of the integer + __getitem__
> call machinery would allow more flexibility for all kinds of
> sequences or containers. ...

This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
iteration *protocol* could have major attractions.



From guido@digicool.com  Tue Jan 30 02:17:27 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:17:27 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:02:47 EST."
 <20010129150247.B10191@thyrsus.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
Message-ID: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>

[ESR]
> For different reasons, I'd like to be able to set a constant flag on an
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.
> 
> Application?  I have a large Python program that goes to a lot of effort
> to build elaborate context structures in core.  It would be nice to know
> they can't be even inadvertently trashed without throwing an exception I 
> can watch for.

Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:

- How to spell it?  x.freeze()?  x.readonly()?

- Should this be reversible?  I.e. should there be an x.unfreeze()?

- Should we support something like this for instances too?  Sometimes
  it might be cool to be able to freeze changing attribute values...

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Tue Jan 30 02:29:25 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:29:25 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECBIMAA.tim.one@home.com>

Check out SETL's loop statement.  I think Perl5 is a subset of it <0.9
wink>.



From esr@thyrsus.com  Tue Jan 30 02:34:01 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:34:01 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:15:04PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <20010129213401.A17235@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Common abominations include:
> 
> - writing a while loop as for(;<test>;)

Agreed. Bletch.
 
> - putting arbitrary initialization code in <init>

Not sure what's "arbitrary", unless you mean unrelated to the 
iteration variable.

> - having an empty condition, so the <step> becomes an arbitrary
>   extension of the body that's written out-of-sequence

Again agreed.  Double bletch.

I guess my archetype of the cute C for-loop is the idiom for 
pointer-list traversal:

	struct foo {int data; struct foo *next;} *ptr, *head; 

	for (ptr = head; ptr; ptr = ptr->next)
		do_something_with(ptr->data);

This is elegant.  It separates the logic for list traversal from the
operation on the list element.
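
The same separation can be had in Python by hiding the traversal in a
generator.  A sketch assuming a Python that has generators; Node and walk()
are names invented for illustration:

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next

def walk(head):
    """Yield each node's data; the traversal logic lives only here."""
    node = head
    while node is not None:
        yield node.data
        node = node.next

head = Node(1, Node(2, Node(3)))
for data in walk(head):
    print(data)           # the per-element operation, kept separate
```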

Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
a patch for that once, and the discussion sort of died.  Were you dead
set against it, or should I revive this proposal?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"The bearing of arms is the essential medium through which the
individual asserts both his social power and his participation in
politics as a responsible moral being..."
        -- J.G.A. Pocock, describing the beliefs of the founders of the U.S.


From esr@thyrsus.com  Tue Jan 30 02:49:59 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:49:59 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:17:27PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <20010129214959.B17235@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I like "freeze", it's a clear imperative where "readonly()" sounds
like a test (e.g. "is this readonly()?")
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Moshe Zadka sent me a hack that handles instances:

> class MarkableAsConstant:
> 
> 	def __init__(self):
> 		self.mark_writable()
> 
> 	def __setattr__(self, name, value):
> 		if self._writable:
> 			self.__dict__[name] = value
> 		else:
> 			raise ValueError, "object is read only"
> 
> 	def mark_writable(self):
> 		self.__dict__['_writable'] = 1
> 
> 	def mark_readonly(self):
> 		self.__dict__['_writable'] = 0

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

I gave this some thought earlier today.  There are advantages to either
way.  Making freeze a one-way operation would make it possible to use
freezing to get certain kinds of security and integrity guarantees that
you can't have if freezing is reversible.

Fortunately, there's a semantics that captures both.  If we allow
freeze to take an optional key argument, and require that an unfreeze
call must supply the same key or fail, we get both worlds.  We can
even one-way-hash the keys so they don't have to be stored in the
bytecode.

Want to lock a structure permanently?  Pick a random long key.  Freeze
with it.  Then throw that key away...
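
A rough sketch of the keyed variant, in the same spirit as the hack above.
Everything here is hypothetical: KeyedFreeze is not a proposed API, the key
is required rather than optional to keep it short, and SHA-256 from hashlib
merely stands in for "some one-way hash":

```python
import hashlib

class KeyedFreeze:
    """Hypothetical freeze(key)/unfreeze(key); only the key's hash is kept."""

    def __init__(self):
        self.__dict__["_lock"] = None       # None means "not frozen"

    def __setattr__(self, name, value):
        if self._lock is not None:
            raise TypeError("object is frozen")
        self.__dict__[name] = value

    @staticmethod
    def _digest(key):
        return hashlib.sha256(repr(key).encode("utf-8")).hexdigest()

    def freeze(self, key):
        # Store the one-way hash, never the key itself.
        self.__dict__["_lock"] = self._digest(key)

    def unfreeze(self, key):
        if self._lock is not None and self._digest(key) != self._lock:
            raise ValueError("wrong key")
        self.__dict__["_lock"] = None

obj = KeyedFreeze()
obj.x = 1
obj.freeze("secret")      # obj.x = 2 would now raise TypeError
obj.unfreeze("secret")
obj.x = 2
```

Nothing stops code from reaching into __dict__ directly, so this guards
against accidents, not against a determined caller.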
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Strict gun laws are about as effective as strict drug laws...It pains
me to say this, but the NRA seems to be right: The cities and states
that have the toughest gun laws have the most murder and mayhem.
        -- Mike Royko, Chicago Tribune


From tim.one@home.com  Tue Jan 30 02:57:59 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:57:59 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECDIMAA.tim.one@home.com>

[Guido]
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
>
> - How to spell it?  x.freeze()?  x.readonly()?

See below.

> - Should this be reversible?

Of course.  Or x.freeze(solid=1) to default to permanent rigidity, but not
require it.

>  I.e. should there be an x.unfreeze()?

That conveniently answers the first question, since x.unreadonly() reads
horribly <wink>.

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

"Should be" supported for every mutable object.  Next step:  as in endless
C++ debates, endless Python debates about "representation freeze" vs
"logical freeze" ("well, yes, I'm changing this member, but it's just an
invisible cache so I *should* be able to tag the object as const anyway
..."; etc etc etc).

keep-it-simple-ly y'rs  - tim



From guido@digicool.com  Tue Jan 30 02:57:24 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:57:24 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:34:01 EST."
 <20010129213401.A17235@thyrsus.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
 <20010129213401.A17235@thyrsus.com>
Message-ID: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>

> > - putting arbitrary initialization code in <init>
> 
> Not sure what's "arbitrary", unless you mean unrelated to the 
> iteration variable.

Yes, that.

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:
> 
> 	struct foo {int data; struct foo *next;} *ptr, *head; 
> 
> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);
> 
> This is elegant.  It separates the logic for list traversal from the
> operation on the list element.

And it rarely happens in Python, because sequences are rarely
represented as linked lists.

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Not dead set against something like it, but dead set against the ?:
syntax because then : becomes too overloaded for the human reader, e.g.:

    if foo ? bar : bletch : spam = eggs

If you want to revive this, I strongly suggest writing a PEP first
before posting here.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 30 02:59:17 2001
From: guido@digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:59:17 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:49:59 EST."
 <20010129214959.B17235@thyrsus.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <20010129214959.B17235@thyrsus.com>
Message-ID: <200101300259.VAA22208@cj20424-a.reston1.va.home.com>

> > - How to spell it?  x.freeze()?  x.readonly()?
> 
> I like "freeze", it's a clear imperative where "readonly()" sounds
> like a test (e.g. "is this readonly()?")

Agreed.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> Moshe Zadka sent me a hack that handles instances:
[...]

OK, so no special support needed there.

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> I gave this some thought earlier today.  There are advantages to either
> way.  Making freeze a one-way operation would make it possible to use
> freezing to get certain kinds of security and integrity guarantees that
> you can't have if freezing is reversible.
> 
> Fortunately, there's a semantics that captures both.  If we allow
> freeze to take an optional key argument, and require that an unfreeze
> call must supply the same key or fail, we get both worlds.  We can
> even one-way-hash the keys so they don't have to be stored in the
> bytecode.
> 
> Want to lock a structure permanently?  Pick a random long key.  Freeze
> with it.  Then throw that key away...

Way too cute.  My suggestion: freeze(0) freezes forever, freeze(1)
can be unfrozen.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From esr@thyrsus.com  Tue Jan 30 03:06:19 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 22:06:19 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:57:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com> <200101300257.VAA22186@cj20424-a.reston1.va.home.com>
Message-ID: <20010129220619.A17713@thyrsus.com>

Guido van Rossum <guido@digicool.com>:
> Not dead set against something like it, but dead set against the ?:
> syntax because then : becomes too overloaded for the human reader, e.g.:
> 
>     if foo ? bar : bletch : spam = eggs
> 
> If you want to revive this, I strongly suggest writing a PEP first
> before posting here.

Noted.  Will do.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.


From tim.one@home.com  Tue Jan 30 03:18:47 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:18:47 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <20010129214959.B17235@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>

Note that even adding a "frozen" flag would add 4 bytes to every freezable
object on most machines.  That's why I'd rather .freeze() replace the type
pointer and .unfreeze() restore it.  No time or space overhead; no
cluttering up the normal-case (i.e., unfrozen) type implementations with new
tests.
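
The type-pointer idea can be approximated in pure Python by reassigning
__class__ between two layout-compatible list subclasses.  This is only a
sketch: ThawedList and FrozenList are hypothetical names, and the real
proposal would swap the C-level type pointer of the built-in types
instead.

```python
def _refuse(self, *args, **kwargs):
    raise TypeError("object is frozen")


class ThawedList(list):
    """Normal case: no flag, no extra tests in the mutating methods."""

    def freeze(self):
        self.__class__ = FrozenList  # swap the type, keep the data


class FrozenList(list):
    """Same data layout, but every mutating method refuses to run."""

    append = extend = insert = pop = remove = clear = _refuse
    sort = reverse = __setitem__ = __delitem__ = _refuse
    __iadd__ = __imul__ = _refuse

    def unfreeze(self):
        self.__class__ = ThawedList  # restore the original type


x = ThawedList([1, 2])
x.append(3)
x.freeze()
# x.append(4) would raise TypeError here
x.unfreeze()
x.append(4)
```

The unfrozen class carries no per-object flag and no extra checks, which
is the point of the suggestion; the cost moves entirely to the frozen
type.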



From tim.one@home.com  Tue Jan 30 03:57:07 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:57:07 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>

Note that optimizing compilers use a pile of linear-time heuristics to
attempt to solve exponential-time optimization problems (from optimal
register assignment to optimal instruction scheduling, they're all formally
intractable even in isolation).

When code gets non-trivial, not even a compiler's chief designer can
reliably outguess what optimization may do.  It's really not unusual for a
higher optimization level to yield slower code, and especially not when the
source code is pushing or exceeding machine limits (# of registers, # of
instruction pipes, size of branch-prediction buffers; I-cache structure;
dynamic restrictions on execution units; ...).

[Jeremy]
> ...
> One of the differences between -O2 and -O3, according to the man page,
> is that -O3 will perform optimizations that involve a space-speed
> tradeoff.  It also includes -finline-functions.  I can imagine that
> some of these optimizations hurt memory performance enough to make a
> difference.

One of the time-consuming ongoing tasks at my last employer was running
profiles and using them to override counterproductive compiler inlining
decisions (in both directions).  It's not just memory that excessive
inlining can screw up, but also things like running out of registers and so
inserting gobs of register spill/restore code, and inlining so much code
that the instruction scheduler effectively gives up (under many compilers, a
sure sign of this is when you look at the generated code for a function, and
it looks beautiful "at the top" but terrible "at the bottom"; some clever
optimizers tried to get around that by optimizing "bottom-up", and then it
looks beautiful at the bottom but terrible at the top <0.5 wink>; others
work middle-out or burn the candle at both ends, with visible consequences
you should be able to recognize now!).

optimization-is-easier-than-speech-recog-but-the-latter-doesn't-work-
    all-that-well-either-ly y'rs  - tim



From barry@digicool.com  Tue Jan 30 04:13:24 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:13:24 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <14966.16228.548177.112853@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip@mojam.com> writes:

    SM> it seems to me that the whole "if/for something in dict" thing
    SM> needs to be hashed out in a PEP.
    
    SM> There are apparently lots of varying opinions about what's
    SM> reasonable.  This topic seems related to PEP 212 (Loop Counter
    SM> Iteration) and PEP 218 (Adding a Built-In Set Object Type),
    SM> but may well warrant its own.

As keeper of PEP0, I have to agree.  I personally would vastly prefer
a new iterator protocol to syntax such as "for key:value in dict".
I'd really like to see a PEP on an iterator protocol for Python, but
like Skip, I'm too busy at the moment to do it myself.  If nobody
takes it on before then, I might be willing to champion such a PEP for
the 2.2 time frame.  Until then, I'm decidedly -1 on "for/if in dict".

-Barry


From barry@digicool.com  Tue Jan 30 04:25:09 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:25:09 -0500
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <14966.16933.209494.214183@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido@digicool.com> writes:

    GvR> Yes, this is a good thing.  Easy to do on lists and dicts.
    GvR> Questions:

    GvR> - How to spell it?  x.freeze()?  x.readonly()?

    GvR> - Should this be reversible?  I.e. should there be an
    GvR> x.unfreeze()?

    GvR> - Should we support something like this for instances too?
    GvR> Sometimes it might be cool to be able to freeze changing
    GvR> attribute values...

lock(x) ...? :)

-Barry


From barry@digicool.com  Tue Jan 30 04:26:50 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:26:50 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <20010129214959.B17235@thyrsus.com>
Message-ID: <14966.17034.721204.305315@anthem.wooz.org>

>>>>> "ESR" == Eric S Raymond <esr@thyrsus.com> writes:

    ESR> Fortunately, there's a semantics that captures both.  If we
    ESR> allow freeze to take an optional key argument, and require
    ESR> that an unfreeze call must supply the same key or fail, we
    ESR> get both worlds.  We can even one-way-hash the keys so they
    ESR> don't have to be stored in the bytecode.

    ESR> Want to lock a structure permanently?  Pick a random long
    ESR> key.  Freeze with it.  Then throw that key away...

Clever!


From esr@thyrsus.com  Tue Jan 30 04:32:16 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 23:32:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <14966.16933.209494.214183@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 29, 2001 at 11:25:09PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <14966.16933.209494.214183@anthem.wooz.org>
Message-ID: <20010129233215.A18533@thyrsus.com>

Barry A. Warsaw <barry@digicool.com>:
> lock(x) ...? :)

I was thinking that myself, Barry.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Boys who own legal firearms have much lower rates of delinquency and
drug use and are even slightly less delinquent than nonowners of guns."
	-- U.S. Department of Justice, National Institute of
	   Justice, Office of Juvenile Justice and Delinquency Prevention,
	   NCJ-143454, "Urban Delinquency and Substance Abuse," August 1995.


From tim.one@home.com  Tue Jan 30 04:56:09 2001
From: tim.one@home.com (Tim Peters)
Date: Mon, 29 Jan 2001 23:56:09 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
Message-ID: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com>

I tried to open an SF bug for the following msg from c.l.py, but SF balked:

    ERROR
    ERROR getting bug_id

Logged out, logged in, tried it again, same outcome.

Intended bug report content:

Good question from c.l.py, assigned to Guido cuz he's a Socket Guy:

From: Clarence Gardner <clarence@netlojix.com>
Subject: RE: Thread Safety
Date: Mon, 29 Jan 2001 09:51:03 -0800

...

I'm going to repeat a question that I posted about a week ago that passed
without comment on the newsgroup. The issue is the SSL support in the socket
module, which raises an exception when the reading socket is at EOF, rather
than returning an empty string. I'm hesitant to call it a "bug", but I
wouldn't have implemented it this way.  There are the names of two people
mentioned at the top of socketmodule.c, but no contact information, so I'm
suggesting here that it be changed to conform to normal file/socket
practice. (SSL was actually added at 2.0, so I'm late to the party with
this; mea culpa, mea culpa.  I delayed trying Python2 because of the
extension rebuilding.)
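
For comparison, the "normal file/socket practice" mentioned above can be
seen with a plain (non-SSL) socket in current Python 3: at EOF, recv()
returns an empty bytes object rather than raising.  This sketch uses a
local socketpair and says nothing about how the SSL wrapper behaves.

```python
import socket

left, right = socket.socketpair()
left.sendall(b"bye")
left.close()  # the writer goes away, so the reader will hit EOF

chunks = []
while True:
    data = right.recv(16)
    if not data:  # empty result signals EOF; no exception is raised
        break
    chunks.append(data)
right.close()

received = b"".join(chunks)
```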



From thomas@xs4all.net  Tue Jan 30 06:14:20 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:14:20 +0100
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <20010129213401.A17235@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 29, 2001 at 09:34:01PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com>
Message-ID: <20010130071420.U962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:34:01PM -0500, Eric S. Raymond wrote:

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:

> 	struct foo {int data; struct foo *next;} *ptr, *head; 

> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);

Note two things: in Python, you would use a list, so 'for x in list' does
exactly what you want here ;) And if you really need it, you could use
iterators for exactly this (once we have them, of course): you are inventing
a new storage type. Quite common in C, since the only one it has is useless
for anything other than strings<wink>, but not so common in Python.
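
A rough sketch of what "use iterators for exactly this" could look like,
written against the iterator protocol and generators that later Python
versions gained.  Node and LinkedList are hypothetical names chosen for
the example, not anything from the thread.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    def __init__(self, head=None):
        self.head = head

    def __iter__(self):
        # Same walk as the C idiom: start at head, follow next until None.
        ptr = self.head
        while ptr is not None:
            yield ptr.data
            ptr = ptr.next


items = LinkedList(Node(1, Node(2, Node(3))))
collected = [data for data in items]
```

The traversal logic lives in one place, and callers use an ordinary
for-loop over the custom storage type.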

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Triple blech. Guido will never go for it! (There, increased your chance of
getting it approved! :) Seriously though, I wouldn't like it much, it's too
cryptic a syntax. I notice I use it less and less in C, too.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Tue Jan 30 06:18:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:18:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 29, 2001 at 08:39:17PM -0500
References: <200101291448.JAA11473@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>
Message-ID: <20010130071825.V962@xs4all.nl>

On Mon, Jan 29, 2001 at 08:39:17PM -0500, Tim Peters wrote:

>     for key: in dict:    # over dict.keys()
>     for :value in dict:  # over dict.values()
>     for : in dict:       # a delay loop

Wot's the last one supposed to do ? 'for unused_var in range(len(dict)):' ?

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From tim.one@home.com  Tue Jan 30 06:25:51 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 30 Jan 2001 01:25:51 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010130071825.V962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECLIMAA.tim.one@home.com>

>>     for key: in dict:    # over dict.keys()
>>     for :value in dict:  # over dict.values()
>>     for : in dict:       # a delay loop

[Thomas Wouters]
> Wot's the last one supposed to do ? 'for unused_var in
> range(len(dict)):' ?

Well, as the preceding line said in the original:

>>    2/3rd of these are marginally more attractive [than
>>    "if key:value in dict"]:

I think you've guessed which 2/3 those are <wink>.  I don't see that the
last line has any visible semantics whatsoever, so Python can do whatever it
likes, provided it doesn't do anything visible.

You still hang out on c.l.py!  So you gotta know that if something of the
form

    x:y

is suggested, people will line up to suggest meanings for the 3 obvious
variations, along with

    x::y

and

    x:-:y

and

    x lambda y

too <0.9 wink>.



From thomas@xs4all.net  Tue Jan 30 06:26:48 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:26:48 +0100
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14966.2069.950895.627663@beluga.mojam.com>; from skip@mojam.com on Mon, Jan 29, 2001 at 06:17:25PM -0600
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <20010130072648.W962@xs4all.nl>

On Mon, Jan 29, 2001 at 06:17:25PM -0600, Skip Montanaro wrote:

> The fact that it was easy for Thomas to implement "if key in dict" doesn't
> make the overall concept less controversial.

Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
posts on python-list.) In fact, *while implementing it*, I grew from +0 to
-0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
subject of the patch was a weak attempt at 5AM humour, not a venting of an
ancient desire :)

More-5AM-humour-ly y'rs,
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Tue Jan 30 06:55:16 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:55:16 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Mon, Jan 29, 2001 at 05:27:30PM -0800
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010130075515.X962@xs4all.nl>

On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:

> add note about two kinds of illegal imports that are now checked

> + - The compiler will report a SyntaxError if "from ... import *" occurs
> +   in a function or class scope or if a name bound by the import
> +   statement is declared global in the same scope.  The language
> +   reference has also documented that these cases are illegal, but
> +   they were not enforced.

Woah. Is this really a good idea ? I have seen 'from ... import *' in a
function scope put to good (relatively -- we're talking 'import *' here)
use. I also thought of 'import' as yet another assignment statement, so to
me it's both logical and consistent if 'import' would listen to 'global'.
Otherwise we have to re-invent 'import spam; eggs = spam' if we want eggs to
be global. 

Is there really a reason to enforce this, or are we enforcing the wording of
the language reference for the sake of enforcing the wording of the language
reference ? When writing 'import as' for 2.0, I fixed some of the
inconsistencies in import, making it adhere to 'global' statements in as
many cases as possible (all except 'from ... import *') but I was apparently
not aware of the wording of the language reference. I'd suggest updating the
wording in the language reference, not the implementation, unless there is a
good reason to disallow this.

I also have another issue with your recent patches, Jeremy, also in the
backwards-compatibility department :) You gave new.code two new,
non-optional arguments, in the middle of the long argument list. I sent a
note about it to python-checkins instead of python-dev by accident, but Fred
seemed to agree with me there.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From mwh21@cam.ac.uk  Tue Jan 30 08:30:15 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 30 Jan 2001 08:30:15 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "Tim Peters"'s message of "Mon, 29 Jan 2001 22:57:07 -0500"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>
Message-ID: <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>

In the interest of generating some numbers (and filling up my hard
drive), last night I wrote a script to build lots & lots of versions
of python (many of which turned out to be redundant - eg. -O6 didn't
seem to do anything different to -O3 and pybench doesn't work with
1.5.2), and then run pybench with them.  Summarised results below;
first a key:

src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
        (only built this with -O3)
src: CVS from yesterday afternoon
src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc 
        patch applied.  More on this later...
Python-2.0: you can guess what this is.

All runs are compared against Python-2.0-O2:

Benchmark: src-n-O3 (rounds=10, warp=20)
            Average round time:   49029.00 ms              -0.86%
Benchmark: src (rounds=10, warp=20)
            Average round time:   67141.00 ms             +35.76%
Benchmark: src-O (rounds=10, warp=20)
            Average round time:   50167.00 ms              +1.44%
Benchmark: src-O2 (rounds=10, warp=20)
            Average round time:   49641.00 ms              +0.37%
Benchmark: src-O3 (rounds=10, warp=20)
            Average round time:   49104.00 ms              -0.71%
Benchmark: src-O6 (rounds=10, warp=20)
            Average round time:   49131.00 ms              -0.66%
Benchmark: src-obmalloc (rounds=10, warp=20)
            Average round time:   63276.00 ms             +27.94%
Benchmark: src-obmalloc-O (rounds=10, warp=20)
            Average round time:   46927.00 ms              -5.11%
Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
            Average round time:   46146.00 ms              -6.69%
Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
            Average round time:   46456.00 ms              -6.07%
Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
            Average round time:   46450.00 ms              -6.08%
Benchmark: Python-2.0 (rounds=10, warp=20)
            Average round time:   68933.00 ms             +39.38%
Benchmark: Python-2.0-O (rounds=10, warp=20)
            Average round time:   49542.00 ms              +0.17%
Benchmark: Python-2.0-O3 (rounds=10, warp=20)
            Average round time:   48262.00 ms              -2.41%
Benchmark: Python-2.0-O6 (rounds=10, warp=20)
            Average round time:   48273.00 ms              -2.39%

My conclusion?  Python 2.1 is slower than Python 2.0, but not by
enough to care about.

Interestingly, adding obmalloc speeds things up.  Let's take a closer
look:

$ python pybench.py -c src-obmalloc-O3 -s src-O3      
PYBENCH 0.7

Benchmark: src-O3 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
           BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
                 ConcatStrings:    1068.80 ms    7.13 us   -1.22%
                 ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
               CreateInstances:    1433.55 ms   34.13 us   +9.06%
       CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
       CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
                  DictCreation:    1275.80 ms    8.51 us  +44.22%
                      ForLoops:    1415.90 ms  141.59 us   -0.64%
                    IfThenElse:    1152.70 ms    1.71 us   -0.15%
                   ListSlicing:     397.40 ms  113.54 us   -0.53%
                NestedForLoops:     789.75 ms    2.26 us   -0.37%
          NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
       NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
           PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
             PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
                     Recursion:     838.50 ms   67.08 us   -0.00%
                  SecondImport:     741.20 ms   29.65 us  +25.57%
           SecondPackageImport:     744.25 ms   29.77 us  +18.66%
         SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
       SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
        SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
         SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
      SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
       SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
        SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
          SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
                    SmallLists:    1657.65 ms    6.50 us   +6.63%
                   SmallTuples:    1143.95 ms    4.77 us   +2.90%
         SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
      SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
                StringMappings:    1161.00 ms    9.21 us   +7.30%
              StringPredicates:    1069.65 ms    3.82 us   -5.30%
                 StringSlicing:     846.30 ms    4.84 us   +8.61%
                     TryExcept:    1590.40 ms    1.06 us   -0.49%
                TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
                  TupleSlicing:     681.10 ms    6.49 us   -3.13%
               UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
             UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
             UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
                UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
------------------------------------------------------------------------
            Average round time:   49104.00 ms              +5.70%

*) measured against: src-obmalloc-O3 (rounds=10, warp=20)

Words fail me slightly, but maybe some tuning of the memory allocation
of longs & complex numbers would be in order?

Time for lectures - I don't think algebraic geometry is going to make
my head hurt as much as trying to explain benchmarks...

Cheers,
M.

-- 
  ARTHUR:  But which is probably incapable of drinking the coffee.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 6



From ping@lfw.org  Tue Jan 30 08:38:12 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:38:12 -0800 (PST)
Subject: [Python-Dev] Read-only function attributes
Message-ID: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>

Hi there.

I see that the function attribute feature specifically allows
assignment to func_code and func_defaults, but no other special
attributes.  This seems really suspect to me.  Why would we want
to allow the reassignment of special attributes at all?

Functions have always been immutable objects, and i can see some
motivation for attaching mutable dictionaries to them, but it's
a more serious move to make the functions mutable themselves.

I don't recall any discussion about changing special attributes;
i don't see a clear purpose to them; and i do see a danger in
making it harder to be certain that a program is safe and predictable.

(Yes, i did notice that function attributes can't be set in
restricted mode, but the addition of extra features requiring
extra security checks makes me uneasy.)


-- ?!ng



From ping@lfw.org  Tue Jan 30 08:52:43 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:52:43 -0800 (PST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org>

Eric S. Raymond wrote:
> For different reasons, I'd like to be able to set a constant flag on an
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.

Guido van Rossum wrote:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I'm not so sure.  There seem to be many issues here.  More questions:

What's the difference between a frozen list and a tuple?

Is a frozen list hashable?

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

What if two threads lock and then unlock the same structure?

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

If you do this, i bet people will immediately want to freeze
individual attributes.  Some might be confused by

    a.x = [1, 2, 3]
    lock(a.x)        # intend to lock the attribute, not the list
    a.x = 3          # hey, why is this allowed?
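
The distinction between freezing an attribute binding and freezing the
object it refers to can be sketched as follows.  Freezable is a
hypothetical class that blocks attribute assignment through __setattr__;
it is one possible reading of "freeze for instances", not the hack
mentioned elsewhere in the thread.

```python
class Freezable:
    _frozen = False

    def freeze(self):
        # Bypass our own __setattr__ so the flag itself can be set.
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError("instance is frozen; cannot set %r" % name)
        object.__setattr__(self, name, value)


a = Freezable()
a.x = [1, 2, 3]
a.freeze()

a.x.append(4)  # allowed: the list object itself is not frozen

try:
    a.x = 3    # refused: the attribute binding is frozen
except AttributeError:
    rebinding_refused = True
else:
    rebinding_refused = False
```

Freezing the instance and freezing the list are independent operations
here, which is exactly where the confusion would come from.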

What does locking an extension object do?

What happens when you lock an object that implements list or dict
semantics?  Do we care that locking a UserList accomplishes nothing?

Should unfreeze/unlock() be disallowed in restricted mode?


-- ?!ng

No software is totally secure, but using [Microsoft] Outlook is like
hanging a sign on your back that reads "PLEASE MESS WITH MY COMPUTER."
    -- Scott Rosenberg, Salon Magazine



From fredrik@effbot.org  Tue Jan 30 09:05:47 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 10:05:47 +0100
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <01d701c08a9b$d7a9fe60$e46940d5@hagrid>

Ka-Ping Yee wrote:
> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

to allow an IDE to "patch" a running program?

</F>



From gvwilson@ca.baltimore.com  Tue Jan 30 13:08:42 2001
From: gvwilson@ca.baltimore.com (Greg Wilson)
Date: Tue, 30 Jan 2001 08:08:42 -0500 (EST)
Subject: [Python-Dev] re: Making mutable objects readonly
In-Reply-To: <20010130085202.18E71EAC4@mail.python.org>
Message-ID: <Pine.LNX.4.10.10101300804330.14867-100000@akbar.nevex.com>

> Barry Warsaw:
> lock(x) ...? :)

Greg Wilson:

-1 --- everyone will assume it's mutual exclusion, rather than immutability.





From guido@digicool.com  Tue Jan 30 14:01:15 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 09:01:15 -0500
Subject: [Python-Dev] Read-only function attributes
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:38:12 PST."
 <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <200101301401.JAA25600@cj20424-a.reston1.va.home.com>

> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

As Effbot said, this is useful in certain circumstances where a
development environment wants to implement a "better reload".  For
this same reason you can assign to a class's __bases__ and __dict__
and to an instance's __class__ and __dict__.

--Guido van Rossum (home page: http://www.python.org/~guido/)
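
A small sketch of the "better reload" use case, using the Python 3
spelling __code__ for what this thread calls func_code.  The function
names are made up for the example.

```python
def greet():
    return "old behaviour"


def _patched():
    return "new behaviour"


alias = greet  # some other part of the program holds a reference

# Swap the code object in place; every existing reference sees the change.
greet.__code__ = _patched.__code__

result = alias()
```

Rebinding the name greet would not help here, because alias would still
point at the old function object; patching the code object updates the
function that everyone already holds.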



From guido@digicool.com  Tue Jan 30 15:00:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:00:58 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:52:43 PST."
 <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org>
Message-ID: <200101301500.KAA25733@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> > 
> > - How to spell it?  x.freeze()?  x.readonly()?

Ping:
> I'm not so sure.  There seem to be many issues here.  More questions:
> 
> What's the difference between a frozen list and a tuple?

A frozen list can be unfrozen (maybe)?

> Is a frozen list hashable?

Yes -- that's what started this thread (using dicts as dict keys,
actually).
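
As a stand-in for a frozen dict used as a dict key, current Python's
built-in frozenset over the items gives an immutable, hashable snapshot.
This only illustrates the use case; it is not the freeze() API under
discussion, and it assumes the values are themselves hashable.

```python
settings = {"host": "localhost", "port": 8080}

# Immutable, hashable snapshot of the dict's contents.
frozen = frozenset(settings.items())

cache = {frozen: "result for these settings"}

# An equal snapshot built in a different order finds the same entry.
same = frozenset({"port": 8080, "host": "localhost"}.items())
lookup = cache[same]
```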

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> What if two threads lock and then unlock the same structure?

That's up to the threads -- it's no different than other concurrent
access.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> If you do this, i bet people will immediately want to freeze
> individual attributes.  Some might be confused by
> 
>     a.x = [1, 2, 3]
>     lock(a.x)        # intend to lock the attribute, not the list
>     a.x = 3          # hey, why is this allowed?

That's a matter of API.  I wouldn't make this a built-in, but rather a
method on freezable objects (please don't call it lock()!).

> What does locking an extension object do?

What does adding 1 to an extension object do?

> What happens when you lock an object that implements list or dict
> semantics?  Do we care that locking a UserList accomplishes nothing?

Who says it doesn't?

> Should unfreeze/unlock() be disallowed in restricted mode?

I don't see why not.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 30 15:06:57 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:06:57 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:55:16 +0100."
 <20010130075515.X962@xs4all.nl>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>
 <20010130075515.X962@xs4all.nl>
Message-ID: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>

> On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> 
> > add note about two kinds of illegal imports that are now checked
> 
> > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > +   in a function or class scope or if a name bound by the import
> > +   statement is declared global in the same scope.  The language
> > +   reference has also documented that these cases are illegal, but
> > +   they were not enforced.

> Woah. Is this really a good idea ? I have seen 'from ... import *'
> in a function scope put to good (relatively -- we're talking 'import
> *' here) use. I also thought of 'import' as yet another assignment
> statement, so to me it's both logical and consistent if 'import'
> would listen to 'global'.  Otherwise we have to re-invent 'import
> spam; eggs = spam' if we want eggs to be global.

Note that Jeremy is only raising errors for "from M import *".

> Is there really a reason to enforce this, or are we enforcing the
> wording of the language reference for the sake of enforcing the
> wording of the language reference ? When writing 'import as' for
> 2.0, I fixed some of the inconsistencies in import, making it adhere
> to 'global' statements in as many cases as possible (all except
> 'from ... import *') but I was apparently not aware of the wording
> of the language reference. I'd suggest updating the wording in the
> language reference, not the implementation, unless there is a good
> reason to disallow this.

I think Jeremy has an excellent reason.  Compilers want to do analysis
of name usage at compile time.  The value of * cannot be determined at
compile time (you cannot know what module will actually be imported at
run time).  Up till now, we were able to fudge this, but Jeremy's new
compiler needs to know exactly which names are defined in all local
scopes, in order to do nested scopes right.
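
A small sketch of the compile-time restriction described above, assuming a
current CPython 3 interpreter, which rejects the form outright.  The module
names are arbitrary stand-ins and `compiles` is a hypothetical helper, not
anything from the patch under discussion:

```python
# Sketch: ask the compiler (without running anything) which import
# forms it accepts.  Assumes a current CPython 3 interpreter.

def compiles(source):
    """Return True if `source` compiles, False if it is a SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

results = {
    # The set of names bound by '*' is only known at run time, so the
    # compiler cannot tell which locals the function would have.
    "star_in_function": compiles("def f():\n    from os import *\n"),
    # At module level there are no fast locals to work out.
    "star_at_module_level": compiles("from os import *\n"),
    # A named import binds exactly one name that is known up front.
    "named_in_function": compiles("def f():\n    from os import path\n"),
}
```

Only the first form fails to compile; the other two are accepted.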

> I also have another issue with your recent patches, Jeremy, also in
> the backwards-compatibility department :) You gave new.code two
> new, non-optional arguments, in the middle of the long argument
> list. I sent a note about it to python-checkins instead of
> python-dev by accident, but Fred seemed to agree with me there.

(Tim will love this. :-)

I don't know what those new arguments represent.  If they can
reasonably be assumed to be empty for code that doesn't use the new
features, I'd say move them to the end and default them properly.  If
they must be specified, I'd say too bad, the new module is an accident
of the implementation anyway, and its users should update their code.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Tue Jan 30 15:08:39 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:08:39 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:26:48 +0100."
 <20010130072648.W962@xs4all.nl>
References: <14966.2069.950895.627663@beluga.mojam.com>
 <20010130072648.W962@xs4all.nl>
Message-ID: <200101301508.KAA25825@cj20424-a.reston1.va.home.com>

> Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
> posts on python-list.) In fact, *while implementing it*, I grew from +0 to
> -0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
> subject of the patch was a weak attempt at 5AM humour, not a venting of an
> ancient desire :)

Can you say "PEP time"? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)


From barry@digicool.com  Tue Jan 30 15:29:43 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 10:29:43 -0500
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <14966.56807.288840.7850@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@lfw.org> writes:

    KY> I see that the function attribute feature specifically allows
    KY> assignment to func_code and func_defaults, but no other
    KY> special attributes.  This seems really suspect to me.  Why
    KY> would we want to allow the reassignment of special attributes
    KY> at all?

... and actually, none of that changed w/ the function attribute
patch.  You've been able to assign to func_code and func_defaults
since Python 1.6!

-Barry


From thomas@xs4all.net  Tue Jan 30 15:52:04 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 16:52:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 10:06:57AM -0500
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>
Message-ID: <20010130165204.I962@xs4all.nl>

On Tue, Jan 30, 2001 at 10:06:57AM -0500, Guido van Rossum wrote:
> > On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> > 
> > > add note about two kinds of illegal imports that are now checked
> > 
> > > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > > +   in a function or class scope or if a name bound by the import
> > > +   statement is declared global in the same scope.  The language
> > > +   reference has also documented that these cases are illegal, but
> > > +   they were not enforced.

> > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > in a function scope put to good (relatively -- we're talking 'import
> > *' here) use. I also thought of 'import' as yet another assignment
> > statement, so to me it's both logical and consistent if 'import'
> > would listen to 'global'.  Otherwise we have to re-invent 'import
> > spam; eggs = spam' if we want eggs to be global.

> Note that Jeremy is only raising errors for "from M import *".

No, he says he's also raising errors for 'import spam' if 'spam' is declared
global, like so:

def viking():
    global spam
    import spam

> > Is there really a reason to enforce this, or are we enforcing the
> > wording of the language reference for the sake of enforcing the
> > wording of the language reference ? When writing 'import as' for
> > 2.0, I fixed some of the inconsistencies in import, making it adhere
> > to 'global' statements in as many cases as possible (all except
> > 'from ... import *') but I was apparently not aware of the wording
> > of the language reference. I'd suggest updating the wording in the
> > language reference, not the implementation, unless there is a good
> > reason to disallow this.

> I think Jeremy has an excellent reason.  Compilers want to do analysis
> of name usage at compile time.  The value of * cannot be determined at
> compile time (you cannot know what module will actually be imported at
> run time).  Up till now, we were able to fudge this, but Jeremy's new
> compiler needs to know exactly which names are defined in all local
> scopes, in order to do nested scopes right.

Hrrmm.... I guess I have to agree with that. None the less, I wish we could
have an "ack! this is stupid code! it uses 'from larch import *'! All bets
are off, we do a lot of slow complicated runtime checking now!" mode. The
thing I still enjoy most about Python is that it always does what I want,
and though I'd never want to do 'from different import *' in a local scope,
I do want other, less wise people to have the same experience, where
possible :)

And I also want to be able to do:

def fill_me(with):
    global me
    if with == 1:
        import me
    elif with == 2:
        import me_too as me
    elif with == 3:
        from me.Tools import me_me as me
    elif with == 4:
        me = FakeModule()
        sys.modules['me'] = me
    else:
        raise ValueError

And I can't quite argue that away with 'the compiler needs to know ...' --
it's all there!

> > I also have another issue with your recent patches, Jeremy, also in
> > the backwards-compatibility department :) You gave new.code two
> > new, non-optional arguments, in the middle of the long argument
> > list. I sent a note about it to python-checkins instead of
> > python-dev by accident, but Fred seemed to agree with me there.

> (Tim will love this. :-)

> I don't know what those new arguments represent.  If they can
> reasonably be assumed to be empty for code that doesn't use the new
> features, I'd say move them to the end and default them properly.  If
> they must be specified, I'd say too bad, the new module is an accident
> of the implementation anyway, and its users should update their code.

Okay, I can live with that. It's sure to cause some gripes though. Then
again, from looking at the code I'd say those arguments (freevars and
cellvars) can easily default to empty tuples.
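
To illustrate why empty tuples are a natural default, here is a minimal
sketch assuming a current CPython 3 interpreter, where the corresponding
code-object attributes are named co_freevars and co_cellvars (the function
names are made up for the example):

```python
# Sketch: code that does not use nested scopes has nothing to record
# in either tuple, so "empty" is the obvious default.

def plain(x):
    return x + 1

def outer():
    n = 1
    def inner():
        return n      # 'n' is a free variable of inner
    return inner

inner = outer()

plain_vars = (plain.__code__.co_freevars, plain.__code__.co_cellvars)
outer_cells = outer.__code__.co_cellvars   # names captured by a nested scope
inner_frees = inner.__code__.co_freevars   # names taken from an enclosing scope
```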

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From bckfnn@worldonline.dk  Tue Jan 30 17:34:10 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Tue, 30 Jan 2001 17:34:10 GMT
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>   <3A756FF8.B7185FA2@lemburg.com>  <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3a76df10.22007715@smtp.worldonline.dk>

[Guido]

>Maybe we could add a flag to the dict that issues an error when a new
>key is inserted during such a for loop?  

FWIW, some of the java2 collections decided to throw a
ConcurrentModificationException in the iterator if the collection was
modified during the iteration. Generally, none of the java2 collections
can be modified while being iterated over (the exception is calling
.remove() on the iterator object, and not all collections support that).
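
For comparison, a minimal sketch of the same idea on the Python side,
assuming a current CPython 3 interpreter (this is today's behaviour, not
something that existed when this was written):

```python
# Sketch: inserting a new key while looping over a dict is detected by
# the dict iterator on its next step and reported as a RuntimeError.

d = {"a": 1, "b": 2}
error = None
try:
    for key in d:
        d[key + "x"] = 0   # insert a new key mid-iteration
except RuntimeError as exc:
    error = exc
```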

>(I don't think the key order can be affected when a key is *deleted*.)

Probably also true for the Hashtable that backs our PyDictionary,
but I'd rather not depend too much on it being true.

[Tim]

>That latter is true but specific to this implementation.  "Can't mutate the
>dict period" is easier to keep straight, and probably harmless in practice
>(if not, it could be relaxed later).  

Agree.

>Recall that a similar trick is played
>during list.sort(), replacing the list's type pointer for the duration (to
>point to an internal "immutable list" type, same as the list type except the
>"dangerous" slots point to a function that raises an "immutable list"
>TypeError).  Then no runtime expense is incurred for regular lists to keep
>checking flags.  I thought of this as an elegant use for switching types at
>runtime; you may still be appalled by it, though!

Changing the type of a type? Yuck! 

I might very likely be reading the CPython sources wrongly, but it seems
this trick will cause a BadInternalCall if some other C extension is
trying to modify a list while it is frozen by the type-switching trick.
I imagine this would happen if the extension called:

  PyList_SetItem(myList, 0, aValue);

I guess Jython could support this from the python side, but it's hard to
ensure from the java side without adding an additional PyList_Check(..)
to all list methods. It just doesn't feel like the right thing to do,
since it would cause slower access to all mutable objects.

regards,
finn


From guido@digicool.com  Tue Jan 30 20:42:58 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 15:42:58 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 16:52:04 +0100."
 <20010130165204.I962@xs4all.nl>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>
 <20010130165204.I962@xs4all.nl>
Message-ID: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>

> > > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > > in a function scope put to good (relatively -- we're talking 'import
> > > *' here) use. I also thought of 'import' as yet another assignment
> > > statement, so to me it's both logical and consistent if 'import'
> > > would listen to 'global'.  Otherwise we have to re-invent 'import
> > > spam; eggs = spam' if we want eggs to be global.
> 
> > Note that Jeremy is only raising errors for "from M import *".
> 
> No, he says he's also raising errors for 'import spam' if 'spam' is declared
> global, like so:
> 
> def viking():
>     global spam
>     import spam

Yeah, this was just brought to my attention at our group meeting
today.  I'm with you on this one -- there really isn't a good reason
why this shouldn't work.  (I wonder why that constraint was ever added
to the reference manual; maybe I was just upset that someone would
*do* something as ugly as that, or maybe there was a J[P]ython
reason???.)
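
A minimal sketch of the case in question, assuming a current CPython 3
interpreter, where this combination is accepted.  The module json is only
a stand-in for 'spam':

```python
# Sketch: 'import' is a binding operation like assignment, so it
# honours a 'global' declaration in the same function.

def viking():
    global json      # stand-in for 'spam'; any importable module works
    import json      # binds the module-level name, not a local

bound_before = "json" in viking.__globals__
viking()
bound_after = "json" in viking.__globals__

# The compiler knows statically that 'json' is not a local of viking().
local_names = viking.__code__.co_varnames
```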

> > I think Jeremy has an excellent reason.  Compilers want to do analysis
> > of name usage at compile time.  The value of * cannot be determined at
> > compile time (you cannot know what module will actually be imported at
> > run time).  Up till now, we were able to fudge this, but Jeremy's new
> > compiler needs to know exactly which names are defined in all local
> > scopes, in order to do nested scopes right.
> 
> Hrrmm.... I guess I have to agree with that. None the less, I wish we could
> have an "ack! this is stupid code! it uses 'from larch import *'! All bets
> are off, we do a lot of slow complicated runtime checking now!" mode. The
> thing I still enjoy most about Python is that it always does what I want,
> and though I'd never want to do 'from different import *' in a local scope,
> I do want other, less wise people to have the same experience, where
> possible :)

Hm, maybe, just *maybe* Jeremy can do this if there are no nested
scopes in sight.  But I don't think it's a big deal as long as the
error message is clear -- it's bad style.

> And I also want to be able to do:
> 
> def fill_me(with):
>     global me
>     if with == 1:
>         import me
>     elif with == 2:
>         import me_too as me
>     elif with == 3:
>         from me.Tools import me_me as me
>     elif with == 4:
>         me = FakeModule()
>         sys.modules['me'] = me
>     else:
>         raise ValueError
> 
> And I can't quite argue that away with 'the compiler needs to know ...' --
> it's all there!

Sort of, although I would prefer to do a two-stager here: first some
variation of "import me as meohmy", and then "global me; me = meohmy" .

> > > I also have another issue with your recent patches, Jeremy, also in
> > > the backwards-compatibility department :) You gave new.code two
> > > new, non-optional arguments, in the middle of the long argument
> > > list. I sent a note about it to python-checkins instead of
> > > python-dev by accident, but Fred seemed to agree with me there.
> 
> > (Tim will love this. :-)
> 
> > I don't know what those new arguments represent.  If they can
> > reasonably be assumed to be empty for code that doesn't use the new
> > features, I'd say move them to the end and default them properly.  If
> > they must be specified, I'd say too bad, the new module is an accident
> > of the implementation anyway, and its users should update their code.
> 
> Okay, I can live with that. It's sure to cause some gripes though. Then
> again, from looking at the code I'd say those arguments (freevars and
> cellvars) can easily default to empty tuples.

OK.  I hope Jeremy can fix this when he gets home.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Tue Jan 30 22:30:25 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 23:30:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 30, 2001 at 05:34:10PM +0000
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <20010130233025.J962@xs4all.nl>

On Tue, Jan 30, 2001 at 05:34:10PM +0000, Finn Bock wrote:

> >Recall that a similar trick is played during list.sort(), replacing the
> >list's type pointer for the duration (to point to an internal "immutable
> >list" type, same as the list type except the "dangerous" slots point to a
> >function that raises an "immutable list" TypeError).  Then no runtime
> >expense is incurred for regular lists to keep checking flags.  I thought
> >of this as an elegant use for switching types at runtime; you may still
> >be appalled by it, though!

> Changing the type of a type? Yuck! 

No, the typeobject itself isn't changed -- that would freeze *all*
dicts/lists/whatever, not just the one we want. We'd be changing the type of
an object (or 'type instance', if you want, but not "type 'instance'"), not
the type of a type.

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause a BadInternalCall if some other C extension is
> trying to modify a list while it is frozen by the type-switching trick.
> I imagine this would happen if the extension called:

>   PyList_SetItem(myList, 0, aValue);

Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
(or whatever), and methods/operations that modify the listobject would have
to check if the list is frozen, and raise an appropriate error if so. This
might throw 'unexpected' errors, but only in situations that can't happen
right now!

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From fredrik@effbot.org  Tue Jan 30 22:45:16 2001
From: fredrik@effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 23:45:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl>
Message-ID: <003501c08b0e$51f975c0$e46940d5@hagrid>

> Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
> 'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
> (or whatever), and methods/operations that modify the listobject would have
> to check if the list is frozen, and raise an appropriate error if so. This
> might throw 'unexpected' errors.

did someone just subscribe me to the perl-porters list?

-1 on "modal freeze" (it's madness)
-0 on an "immutable dictionary" type in the core



From tim.one@home.com  Tue Jan 30 23:53:45 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 30 Jan 2001 18:53:45 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>

[Guido]
> This is all PEP material now.

Yup.

> Tim, do you want to own the PEP?

Not really.  Available time is finite, and this isn't at the top of the list
of things I'd like to see (resuming the discussion of generators +
coroutines + iteration protocol comes to mind first).

>> Cool!  Can we resist adding
>>
>>     if key:value in dict
>>
>> for "parallelism"?  (I know I can ...)

> That's easy to resist because, unlike ``for key:value in dict'', it's
> not unambiguous:

But

    if (key:value) in dict

is.  Just trying to help whoever *does* want the PEP <wink>.

> ...
> I'm certainly more comfortable with just ``for key in dict'' than with
> the whole slew of extensions using colons.

What about just the

    for key:value in dict
    for index:value in sequence

extensions?  The degenerate forms (omitting x or y or both in x:y) are
mechanical variations so are likely to get raised.
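
For orientation, a sketch of what the two loops are meant to do, written
in current Python 3 spelling rather than the proposed colon syntax
(enumerate() did not exist at the time of this thread; the sample data is
made up):

```python
# Sketch: the behaviour the proposed 'for key:value in dict' and
# 'for index:value in sequence' forms are after.

d = {"spam": 1, "eggs": 2}
pairs = []
for key, value in d.items():          # 'for key:value in dict'
    pairs.append((key, value))

seq = ["a", "b"]
indexed = []
for index, value in enumerate(seq):   # 'for index:value in sequence'
    indexed.append((index, value))
```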

> But, again, that's for the PEP to fight over.

PEPs are easier if you Pronounce on things you hate early so that those can
get recorded in the "BDFL Pronouncements" section without further ado.

whatever-this-may-look-like-it's-not-a-pep-discussion<wink>-ly y'rs  - tim



From nas@arctrix.com  Tue Jan 30 17:12:15 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:12:15 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <003501c08b0e$51f975c0$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 30, 2001 at 11:45:16PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl> <003501c08b0e$51f975c0$e46940d5@hagrid>
Message-ID: <20010130091215.C18319@glacier.fnational.com>

On Tue, Jan 30, 2001 at 11:45:16PM +0100, Fredrik Lundh wrote:
> did someone just subscribe me to the perl-porters list?
> 
> -1 on "modal freeze" (it's madness)
> -0 on an "immutable dictionary" type in the core

I'm glad I'm not the only one who had that feeling.  I agree with
your votes too.

  Neil


From nas@arctrix.com  Tue Jan 30 17:24:54 2001
From: nas@arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:24:54 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 30, 2001 at 06:53:45PM -0500
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
Message-ID: <20010130092454.D18319@glacier.fnational.com>

[Tim Peters on adding yet more syntactic sugar]
> Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of
> generators + coroutines + iteration protocol comes to mind
> first).

What's the chances of getting generators into 2.2?  The
implementation should not be hard.  Didn't Steven Majewski have
something years ago?  Why do we always get sidetracked on trying
to figure out how to do coroutines and continuations?

Generators would add real power to the language and are simple
enough that most users could benefit from them.  Also, it should be
possible to design an interface that does not preclude the
addition of coroutines or continuations later.

I'm not volunteering to champion the cause just yet.  I just want
to know if there is some issue I'm missing.

  Neil


From barry@digicool.com  Wed Jan 31 00:24:05 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 19:24:05 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
 <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
 <20010130092454.D18319@glacier.fnational.com>
Message-ID: <14967.23333.57259.347222@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas@arctrix.com> writes:

    NS> What's the chances of getting generators into 2.2?  The
    NS> implementation should not be hard.  Didn't Steven Majewski
    NS> have something years ago?  Why do we always get sidetracked on
    NS> trying to figure out how to do coroutines and continuations?

I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
renaming it just "Generators" and filling it out for the 2.2 time
frame.  If we want to address coroutines and continuations later, we
can write separate PEPs for them.

Send me a draft.

-Barry


From guido@digicool.com  Wed Jan 31 00:28:44 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:28:44 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 18:53:45 EST."
 <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
Message-ID: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>

> Not really.  Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of generators +
> coroutines + iteration protocol comes to mind first).

OK, get going on that one then!

> >> Cool!  Can we resist adding
> >>
> >>     if key:value in dict
> >>
> >> for "parallelism"?  (I know I can ...)
> 
> > That's easy to resist because, unlike ``for key:value in dict'', it's
> > not unambiguous:
> 
> But
> 
>     if (key:value) in dict
> 
> is.  Just trying to help whoever *does* want the PEP <wink>.

OK, I'll pronounce -1 on this one.  It looks ugly to me -- too
reminiscent of C's if (...) required parentheses.  Also it suggests
that (key:value) is a new tuple notation that might be useful in other
contexts -- which it's not.

> > ...
> > I'm certainly more comfortable with just ``for key in dict'' than with
> > the whole slew of extensions using colons.
> 
> What about just the
> 
>     for key:value in dict
>     for index:value in sequence
> 
> extensions?

I'm not against these -- I'd say +0.5.

> The degenerate forms (omitting x or y or both in x:y) are
> mechanical variations so are likely to get raised.

For those, +0.2.

> > But, again, that's for the PEP to fight over.
> 
> PEPs are easier if you Pronounce on things you hate early so that those can
> get recorded in the "BDFL Pronouncements" section without further ado.

At your service -- see above.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 31 00:49:24 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:49:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 09:24:54 PST."
 <20010130092454.D18319@glacier.fnational.com>
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
 <20010130092454.D18319@glacier.fnational.com>
Message-ID: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>

> [Tim Peters on adding yet more syntactic sugar]
> > Available time is finite, and this isn't at the top of the list
> > of things I'd like to see (resuming the discussion of
> > generators + coroutines + iteration protocol comes to mind
> > first).
> 
> What's the chances of getting generators into 2.2?  The
> implementation should not be hard.  Didn't Steven Majewski have
> something years ago?  Why do we always get sidetracked on trying
> to figure out how to do coroutines and continuations?

I think there's a very good chance of getting them into 2.2.  But it
*is* true that coroutines are a very attractive piece of land "just
next door".  On the other hand, continuations are a mirage, so don't
try to go there. :-)

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.
> 
> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

There are different ways to do iterators.

Here is a very "tame" proposal (and definitely in the realm of 2.2),
that doesn't require any coroutine-like tricks.  Let's propose that

    for var in expr:
        ...do something with var...

will henceforth be translated into

    __iter = iterator(expr)
    while __iter.more():
        var = __iter.next()
        ...do something with var...

-- or some variation that combines more() and next() (I don't care).

Then a new built-in function iterator() is needed that creates an
iterator object.  It should try two things:

(1) If the object implements __iterator__() (or a C API equivalent),
    call that and be done; this way arbitrary iterators can be
    created.

(2) If the object smells like a sequence (how to test???), use an
    iterator sort of like this:

    class Iterator:

        def __init__(self, sequence):
            self.sequence = sequence
            self.index = 0

        def more(self):
            # Store the item so that each index is tried exactly once
            try:
                self.item = self.sequence[self.index]
            except IndexError:
                return 0
            else:
                self.index = self.index + 1
                return 1

        def next(self):
            return self.item

    (I don't necessarily mean that all those instance variables should
    be publicly available.)
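
Putting the pieces above together, here is a runnable sketch in current
Python 3 spelling.  It is only one reading of the proposal: the names
SequenceIterator, for_each and Countdown are invented for the example, and
the "smells like a sequence" test is simply skipped by treating everything
without __iterator__() as a sequence.

```python
# Sketch of the proposed more()/next() protocol and the loop translation.

class SequenceIterator:
    """Fallback iterator for anything indexable from 0 upwards."""

    def __init__(self, sequence):
        self._sequence = sequence
        self._index = 0
        self._item = None

    def more(self):
        # Fetch and store the item so each index is tried exactly once.
        try:
            self._item = self._sequence[self._index]
        except IndexError:
            return False
        self._index += 1
        return True

    def next(self):
        return self._item


def iterator(obj):
    """Rule (1): use __iterator__() if present; rule (2): assume a sequence."""
    make = getattr(obj, "__iterator__", None)
    if make is not None:
        return make()
    return SequenceIterator(obj)


def for_each(expr, body):
    """What 'for var in expr: body(var)' would be translated into."""
    it = iterator(expr)
    while it.more():
        body(it.next())


class Countdown:
    """An arbitrary (non-sequence) object supplying its own iterator."""

    def __init__(self, start):
        self.start = start

    def __iterator__(self):
        return SequenceIterator(list(range(self.start, 0, -1)))


from_list = []
for_each([10, 20, 30], from_list.append)

from_custom = []
for_each(Countdown(3), from_custom.append)
```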

The built-in sequence types can use a very fast built-in iterator type
that uses a C int for the index and doesn't store the item in the
iterator.  (This should be as fast as Marc-Andre's for loop
optimization using a C counter.)

Dictionaries can define an appropriate iterator that uses
PyDict_Next().

If the argument to iterator() is itself an iterator (how to test???),
it returns the argument unchanged, so that one can also write

    for var in iterator(obj):
        ...do something with var...

Files of course should have iterators that return the next input line.

We could build filtering and mapping iterators that take an iterator
argument and do certain manipulations with the elements; this would
effectively introduce the notion of lazy evaluation on sequences.
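
A minimal sketch of such wrappers, again using the more()/next() protocol
in current Python 3 spelling.  The class names MapIterator and
FilterIterator are assumptions made for illustration, and the sequence
iterator is repeated so the example stands alone:

```python
# Sketch: lazy mapping and filtering built on the more()/next() protocol.

class SequenceIterator:
    def __init__(self, sequence):
        self._sequence = sequence
        self._index = 0
        self._item = None

    def more(self):
        try:
            self._item = self._sequence[self._index]
        except IndexError:
            return False
        self._index += 1
        return True

    def next(self):
        return self._item


class MapIterator:
    """Applies func to each element, only when the element is requested."""

    def __init__(self, func, inner):
        self._func = func
        self._inner = inner

    def more(self):
        return self._inner.more()

    def next(self):
        return self._func(self._inner.next())


class FilterIterator:
    """Skips elements of the inner iterator that fail the predicate."""

    def __init__(self, pred, inner):
        self._pred = pred
        self._inner = inner
        self._item = None

    def more(self):
        while self._inner.more():
            candidate = self._inner.next()
            if self._pred(candidate):
                self._item = candidate
                return True
        return False

    def next(self):
        return self._item


calls = []                      # records when square() actually runs

def square(x):
    calls.append(x)
    return x * x

evens = FilterIterator(lambda x: x % 2 == 0, SequenceIterator([1, 2, 3, 4]))
squares = MapIterator(square, evens)
calls_before_loop = list(calls)  # building the pipeline computes nothing

out = []
while squares.more():
    out.append(squares.next())
```

Nothing is squared until the loop asks for an element, which is the lazy
behaviour the paragraph describes.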

Etc., etc.

This does not come close to Icon generators -- but it doesn't require
any coroutine-like capabilities, unlike those.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From tim.one@home.com  Wed Jan 31 00:55:10 2001
From: tim.one@home.com (Tim Peters)
Date: Tue, 30 Jan 2001 19:55:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEJIMAA.tim.one@home.com>

[Finn Bock]
> Changing the type of a type? Yuck!

No, it temporarily changes the type of the single list being sorted, like
so, where "self" is a pointer to a PyListObject (which is a list, not a list
*type* object):

	self->ob_type = &immutable_list_type;
	err = samplesortslice(self->ob_item,
			      self->ob_item + self->ob_size,
			      compare);
	self->ob_type = &PyList_Type;

immutable_list_type is "just like" PyList_Type, except that the slots for
mutating methods point to a function that raises a TypeError.
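
The same trick can be mimicked at the Python level.  This is only an
analogy, not the C implementation: it assumes a current CPython 3
interpreter, where __class__ may be reassigned between layout-compatible
classes, and the names GuardedList, _FrozenDuringSort and guarded_sort are
invented for the sketch.

```python
# Sketch: swap an object's type for the duration of a sort so that
# mutating methods raise instead of corrupting the sort.

class GuardedList(list):
    def guarded_sort(self, key=None):
        self.__class__ = _FrozenDuringSort      # like self->ob_type = &immutable_list_type
        try:
            list.sort(self, key=key)
        finally:
            self.__class__ = GuardedList        # like self->ob_type = &PyList_Type


def _refuse(self, *args, **kwargs):
    raise TypeError("immutable list: cannot modify during sort")


class _FrozenDuringSort(GuardedList):
    # Same type as GuardedList except that the "dangerous" slots refuse.
    append = extend = insert = pop = remove = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse


data = GuardedList([3, 1, 2])

def sneaky_key(x):
    data.append(0)      # attempted mutation from inside the sort
    return x

try:
    data.guarded_sort(key=sneaky_key)
except TypeError:
    outcome = "TypeError"
else:
    outcome = "sorted"

type_restored = type(data) is GuardedList
contents_after_failure = sorted(data)

data.guarded_sort()     # a well-behaved sort still works
data.append(4)          # and mutation is allowed again afterwards
```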

Before this drastic step came years of increasingly ugly hacks trying to
stop core dumps when people mutated a list during the sort.  Python's sort
is very complex, and lots of pointers are tucked away -- having the size of
the array, or its position in memory, or the set of objects it contains,
change as a side effect of doing a compare, would be difficult and expensive
to recover from -- and by "difficult" read "nobody ever managed to get it
right before this" <0.5 wink>.

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause a BadInternalCall if some other C extension is
> trying to modify a list while it is frozen by the type-switching trick.
> I imagine this would happen if the extension called:
>
>   PyList_SetItem(myList, 0, aValue);

Well, in CPython it's not "legal" for any other thread to use the C API
while the sort is in progress, because the thread doing the sort holds the
global interpreter lock for the duration.  So this could happen "legally"
only if a comparison function called by the sort called out to a C extension
attempting to mutate the list.  In that case, fine, it *is* a bad call:
mutation is not allowed during list sorting, so they deserve whatever they
get -- and far better a "bad internal call" than a core dump.

If the immutable_list_type were used more generally, it would require more
general support (but I see Thomas already talked about that -- thanks).



From guido@digicool.com  Wed Jan 31 00:55:19 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:55:19 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 19:24:05 EST."
 <14967.23333.57259.347222@anthem.wooz.org>
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <20010130092454.D18319@glacier.fnational.com>
 <14967.23333.57259.347222@anthem.wooz.org>
Message-ID: <200101310055.TAA30250@cj20424-a.reston1.va.home.com>

> I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
> renaming it just "Generators" and filling it out for the 2.2 time
> frame.  If we want to address coroutines and continuations later, we
> can write separate PEPs for them.

I think it's better not to re-use PEP 220 for that.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From thomas@xs4all.net  Wed Jan 31 00:58:32 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 01:58:32 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 07:28:44PM -0500
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <20010131015832.K962@xs4all.nl>

On Tue, Jan 30, 2001 at 07:28:44PM -0500, Guido van Rossum wrote:

> > What about just the

> >     for key:value in dict
> >     for index:value in sequence

> > extensions?

> I'm not against these -- I'd say +0.5.

What, fractions ? Isn't that against the whole idea of (+|-)(0|1) ? :)
But since we are voting, I'm -0 on this right now, and might end up -1 or
+0, depending on the implementation; I still can't *see* this, though I
wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
into some fairly mind-boggling issues. The worst bit is 'how the f*ck
does FOR_LOOP know if something's a dict or a list'. And the
almost-as-bad bit is 'WTF to do for user classes, extension types and
almost-list/almost-dict practically-builtin types (arrays, the *dbm's,
etc.)'. After some sleep-deprived consideration I gave up and decided we
need an iteration/generator protocol first.
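
What the two proposed forms would mean can be approximated with existing constructs. This is a minimal sketch in present-day Python 3; the helper name pairs() is hypothetical, and the hasattr() test is only one possible answer to the dispatch question, not something this thread settled on:

```python
def pairs(obj):
    """Yield (key, value) for mappings and (index, value) for sequences."""
    if hasattr(obj, "items"):
        # Mapping-like: let the object produce its own pairs.
        yield from obj.items()
    else:
        # Anything else is treated as a sequence and numbered from 0.
        yield from enumerate(obj)


for key, value in pairs({"spam": 1}):
    print(key, value)
for index, value in pairs(["a", "b"]):
    print(index, value)
```

User classes and extension types that provide neither convention are exactly the cases this guess cannot handle.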

However, my life's been busy (or rather, my work has been) with all kinds
of small and not so small details, and I haven't been getting much sleep
in the last week or so, so I might be overlooking something very simple.
That's why I can go either way based on implementation -- it might prove
me wrong :) Until my boss is back and I stop being 'responsible' (end of
this week, start of next week) and I get a chance to get rid of about 2
months of work backlog (the time he was away) I won't have time to
champion or even contribute to such a PEP. Then again, by that time I
might be preparing for IPC9 (_if_ my boss sends me there) or even my
ApacheCon US presentation (which got accepted today, yay!)

So, if that other message was an attempt to drop the PEP on me, Guido,
the answer is the same as I tend to give to suits that show up next to my
desk wanting to discuss something important (to them) right away:
"b'gg'r 'ff" :)

I'll-save-my-answer-to-PR-officers-doing-the-same-for-when-you-do-something-
 -*really*-offensive-ly <wink> y'rs
-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From guido@digicool.com  Wed Jan 31 01:16:51 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:16:51 -0500
Subject: [Python-Dev] Let's release 2.1a2 Thursday night
Message-ID: <200101310116.UAA30386@cj20424-a.reston1.va.home.com>

Things look good for a release of 2.1a2 this week; we're aiming for
Thursday night.  I won't be in town (speaking to the press at
LinuxWorld Expo in New York) but Jeremy will handle the release
process and the other PythonLabs folks will assist him.

Tomorrow Fred will check in his weak references after making some
changes (mostly making it more Spartan :-) that I suggested in a code
review.

After that, I think we're good for the second (and last!) alpha
release; and enough has changed (e.g. nested scopes, lots of setup.py
changes, flat Makefile) to warrant going ahead now.

Now is the time for those last-minute bugfixes that you're all so
famous for!

I propose a checkin freeze for non-PythonLabs folks Wednesday midnight
US west coast time, to give Jeremy c.s. enough time to build the
release and give it a good work-out.  (An internal freeze is up to
Jeremy to declare, but should probably take Tim's sleep cycle into
account.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

PS. I'll be out of reach from noon US east coast time tomorrow
(Wednesday), traveling to New York by train.  I probably won't check
my email while out there; I'll be back Friday night.


From guido@digicool.com  Wed Jan 31 01:35:25 2001
From: guido@digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:35:25 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
In-Reply-To: Your message of "Mon, 29 Jan 2001 23:56:09 EST."
 <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com>
Message-ID: <200101310135.UAA30629@cj20424-a.reston1.va.home.com>

> I'm going to repeat a question that I posted about a week ago that passed
> without comment on the newsgroup. The issue is the SSL support in the socket
> module, which raises an exception when the reading socket is at EOF, rather
> than returning an empty string. I'm hesitant to call it a "bug", but I
> wouldn't have implemented it this way.  There are the names of two people
> mentioned at the top of socketmodule.c, but no contact information, so I'm
> suggesting here that it be changed to conform to normal file/socket
> practice. (SSL was actually added at 2.0, so I'm late to the party with
> this; mea culpa, mea culpa.  I delayed trying Python2 because of the
> extension rebuilding.)

I agree that it makes more sense if a read at EOF returns an empty
string, since that's what other file-like objects in Python do.  I
can't do much about this right now, but I'd love to see a patch.  It
could go into 2.1a2 if small enough.

Note that input() and raw_input() are specifically excepted because
they are intended for use in interactive mode by newbies mostly; and
because "" as return value for EOF would be ambiguous for these.
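
The convention can be sketched with an in-memory stream. This is a minimal example in present-day Python 3, using io.BytesIO as a stand-in for a socket or file; the helper name read_all and the chunk size are assumptions made for illustration:

```python
import io


def read_all(stream, chunk_size=4):
    """Read until read() returns an empty bytes object, the usual EOF signal."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # empty result means end of file, not an error
            break
        chunks.append(chunk)
    return b"".join(chunks)


data = read_all(io.BytesIO(b"hello world"))
print(data)
```

A reader written this way needs no special case for end of file, which is why an exception at EOF is awkward for callers.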

--Guido van Rossum (home page: http://www.python.org/~guido/)


From greg@cosc.canterbury.ac.nz  Wed Jan 31 04:12:23 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jan 2001 17:12:23 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <200101310412.RAA03140@s454.cosc.canterbury.ac.nz>

<someone whose attribution has been lost>:

>     for index:value in sequence

-1, because we only construct dicts using that
notation, not sequences.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@digicool.com  Wed Jan 31 05:21:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 00:21:37 -0500
Subject: [Python-Dev] codecity.com
Message-ID: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>

Should I spread this word, or is this a joke?  The Python quiz
category is laughable.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Sat, 27 Jan 2001 23:16:02 -0800
From:    "Jeff Cordova" <jeffc@codecity.com>
To:      <guido@python.org>
Subject: New, fun way to learn Python.

Hi Guido,

  I wanted to let you know about www.codecity.com After several years of
managing large software projects in Silicon Valley, I realized that I was
spending a lot of time teaching jr. programmers how to write code. So, I
created CodeCity to help me automate some of that. If you go to the site,
you'll see that I've created a category for Python. There's not much depth
to the Python content yet (the site is only a week old) but I'm expecting
the Python community to add their wisdom over a period of time. If you could
spread the word, it would be highly appreciated.

Thankyou,

Jeff C.

------- End of Forwarded Message



From tim.one@home.com  Wed Jan 31 06:16:48 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:16:48 -0500
Subject: [Python-Dev] codecity.com
In-Reply-To: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFJIMAA.tim.one@home.com>

[Guido, on www.codecity.com]
> Should I spread this word, or is this a joke?  The Python quiz
> category is laughable.

While the Python section still seems to have only one question, the first
day this was announced the third choice wasn't today's:

    Python is Open Source code, so it doesn't have a creator

but:

    Martha Stewart

I liked it better before <0.9 wink>.



From moshez@zadka.site.co.il  Wed Jan 31 06:30:07 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 08:30:07 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>
References: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>, <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
 <20010130092454.D18319@glacier.fnational.com>
Message-ID: <20010131063007.536ACA83E@darjeeling.zadka.site.co.il>

On Tue, 30 Jan 2001 19:49:24 -0500, Guido van Rossum <guido@digicool.com> wrote:

> There are different ways to do iterators.
> 
> Here is a very "tame" proposal (and definitely in the realm of 2.2),
> that doesn't require any coroutine-like tricks.  Let's propose that
> 
>     for var in expr:
> 	...do something with var...
> 
> will henceforth be translated into
> 
>     __iter = iterator(expr)
>     while __iter.more():
>         var = __iter.next()
>         ...do something with var...

I'm +1 on that...but Tim's "try to use that to write something that
will return the nodes of a binary tree" still haunts me.

Personally, though, I'd thin down the interface to

while 1:
	try:
		var = __iter.next()
	except NoMoreError:
		break # pseudo-break?

With the usual caveat that this is a lie as far as "else" is concerned
(IOW, pseudo-break gets into the else)
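
The thinned-down expansion, including the "else" caveat, can be sketched in present-day Python 3. The built-in iter(), next() and StopIteration are used here purely as stand-ins for the hypothetical iterator(), __iter.next() and NoMoreError; the function name expanded_for is an assumption:

```python
def expanded_for(iterable):
    """Hand-expand `for var in iterable: ... else: ...` into a while loop."""
    it = iter(iterable)           # stand-in for the proposed iterator(expr)
    body_saw = []
    else_ran = False
    while True:
        try:
            var = next(it)        # stand-in for __iter.next()
        except StopIteration:     # stand-in for NoMoreError
            else_ran = True       # exhaustion is a pseudo-break: else suite runs
            break
        body_saw.append(var)      # the loop body
    return body_saw, else_ran


print(expanded_for([1, 2, 3]))
```

A real "break" in the loop body would skip the else suite, which is why the break in the expansion is only a pseudo-break.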

> Then a new built-in function iterator() is needed that creates an
> iterator object.  It should try two things:
> 
> (1) If the object implements __iterator__() (or a C API equivalent),
>     call that and be done; this way arbitrary iterators can be
>     created.
 
> (2) If the object smells like a sequence (how to test???), use an
>     iterator sort of like this:

Why not, "if the object doesn't have __iterator__, try this. If it 
won't work, we'll find out by the exception that will be thrown in
our face".

class Iterator:

	def __init__(self, seq):
		self.seq = seq
		self.index = 0

	def next(self):
		try:
			try:
				return self.seq[self.index] # <- smells like
			except IndexError:
				raise NoMoreError(self.index)
		finally:
			self.index += 1
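
To try the class above, NoMoreError has to exist. This is a runnable sketch in present-day Python 3 that keeps the same structure; the exception definition and the drain() helper are assumptions added for illustration:

```python
class NoMoreError(Exception):
    """Signals that the wrapped sequence is exhausted (name taken from the thread)."""


class Iterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def next(self):
        try:
            try:
                # Anything that supports integer indexing "smells like" a sequence.
                return self.seq[self.index]
            except IndexError:
                raise NoMoreError(self.index)
        finally:
            # Runs on both the return and the raise paths.
            self.index += 1


def drain(seq):
    """Collect every item the iterator yields before it signals exhaustion."""
    it = Iterator(seq)
    items = []
    while True:
        try:
            items.append(it.next())
        except NoMoreError:
            break
    return items


print(drain("abc"))
```

An object without indexing fails on the first next() call with its own exception, which is the "find out by the exception" behaviour described above.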

>     (I don't necessarily mean that all those instance variables should
>     be publicly available.)

But what about your poor brother? <wink> Er....I mean, this would make
implementing "indexing" really about just getting the index from the
iterator.

> If the argument to iterator() is itself an iterator (how to test???),

No idea, and this looks problematic. I see your point -- but it's
still problematic.

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From tim.one@home.com  Wed Jan 31 06:57:26 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:57:26 -0500
Subject: [Python-Dev] Can't enter new Python bugs on SourceForge?
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFLIMAA.tim.one@home.com>

Reported this earlier.  Still can't create a new bug.  Guido either.  Here's
the SF Support request opened on this:

http://sourceforge.net/support/
    index.php?func=detailsupport&support_id=113100&group_id=1

The good(?) news is that Python isn't the only project to report this
problem.



From tim.one@home.com  Wed Jan 31 07:50:18 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 02:50:18 -0500
Subject: [Python-Dev] FW: Python programmer needed (addition to urllib2 and HTTPS support)
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFPIMAA.tim.one@home.com>

Get rich quick!

-----Original Message-----
From: python-list-admin@python.org
[mailto:python-list-admin@python.org]On Behalf Of Albert Chin-A-Young
Sent: Wednesday, January 31, 2001 2:31 AM
To: python-list@python.org
Subject: Python programmer needed (addition to urllib2 and HTTPS
support)


We're in need of a contract Python programmer for the following:
  1. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy (via urllib2.py).
     This should address bug #125217:
     http://sourceforge.net/bugs/?func=detailbug&bug_id=125217&group_id=5470
  2. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy that requires
     BASIC HTTP authentication (via urllib2.py).
  3. Support for non-authenticated clients to connect to a
     HTTPS server
  4. Support for a client to authenticate the HTTPS host (to
     verify that its certificate is valid)

What we might consider adding (depends on cost):
  1. Support for authenticated clients to connect to a HTTPS server.

Please note that solutions to the four items above must be rolled back
into the main Python distribution (implies the "community" and the
Python developers need to agree on the adopted solution).

--
albert chin (china at thewrittenword dot com)
--
http://mail.python.org/mailman/listinfo/python-list



From ping@lfw.org  Wed Jan 31 09:47:10 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 01:47:10 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>

On Tue, 30 Jan 2001, Guido van Rossum wrote:
> 
> Can you say "PEP time"? :-)

Okay, i have written a draft PEP that tries to combine the
"elt in dict", custom iterator, and "for k:v" issues into a
coherent proposal.  Have a look:

    http://www.lfw.org/python/pep-iterators.txt
    http://www.lfw.org/python/pep-iterators.html

Could i get a number for this please?


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces



From moshez@zadka.site.co.il  Wed Jan 31 10:14:49 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 12:14:49 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>

On Wed, 31 Jan 2001 01:47:10 -0800 (PST), Ka-Ping Yee <ping@lfw.org> wrote:

> Okay, i have written a draft PEP that tries to combine the
> "elt in dict", custom iterator, and "for k:v" issues into a
> coherent proposal.  Have a look:
> 
>     http://www.lfw.org/python/pep-iterators.txt
>     http://www.lfw.org/python/pep-iterators.html

Er....one problem on first reading: you forgot to mention in the
while loop description that 'else:' would be executed if the exception
is raised, so the 'break' is a 'pseudo-break'.

Basic response: I *love* the iter(), sq_iter and __iter__ parts.
I tremble at seeing the rest.
Why not add a method to dictionaries .iteritems() and do

for (k, v) in dict.iteritems():
	pass

(dict.iteritems() would return an iterator over the items)
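
The idea can be sketched as a free function in present-day Python 3. Generators did not exist when this was written, so this illustrates the intended behaviour only; the function name iteritems mirrors the proposed method and is not an existing dict method in Python 3:

```python
def iteritems(mapping):
    """Yield (key, value) pairs one at a time instead of building a list."""
    for key in mapping:
        yield key, mapping[key]


total = 0
for (k, v) in iteritems({"a": 1, "b": 2}):
    total += v
print(total)
```

The for statement itself stays unchanged; only the object being looped over is new.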

-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From MarkH@ActiveState.com  Wed Jan 31 10:34:01 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 21:34:01 +1100
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>

Hi all,
	In an attempt to solve "[ Bug #129293 ] zlib library used for binary win32
distribution can crash"
(https://sourceforge.net/bugs/?func=detailbug&group_id=5470&bug_id=129293),
Tim and I have decided that we should fix the build process of zlib.pyd on
windows.

The current process requires that the builder download _2_ zlib archives - a
binary distribution for zlib.lib, and the source archive for the headers.
We believe that slight differences between the 2 are causing the above bug.
A particular warning-light is that the current process defines ZLIB_DLL even
though we are _not_ currently using the DLL but the static lib.  Removing
this #define generates linker errors.

The new process is very simple, but may break some people's build.  In theory
it _should_ still work for everyone, but if it fails to build, please check
your directory structure.

From the comments I just added to zlib.c:

/* *** Notes for Windows Users ***
   * Download the source distribution as referenced above.
   * Unpack the distribution such that a "..\..\zlib-1.1.3" directory is
     created relative to the "pcbuild" directory.
   * Build this "zlib" project.  Via MSVC magic, the correct zlib
     makefile will be run, and "..\..\zlib-1.1.3\zlib.lib" will be built
     before zlib.pyd.
   *** End of notes for Windows users ***
*/

Specifically, MSVC has a "pre-link step" setup that runs the zlib makefile
from the "..\..\zlib-1.1.3" directory.  The reason this _should_ not break
your build is that you _probably_ already have a "..\..\zlib-1.1.3"
directory installed in the right place so the header files can be located.

Once you have a successful build, you can delete the old "zlib113"
directory, which was the binary-only distribution.

Please let me know if this causes too much pain, or if it is in some way
broken for you.

The relevant checkins are Rev 1.15 of PCbuild/zlib.dsp and Rev 2.37 of
Modules/zlibmodule.c.

Thanks,

Mark.



From ping@lfw.org  Wed Jan 31 11:00:48 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 03:00:48 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010131015832.K962@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Thomas Wouters wrote:
> I still can't *see* this, though I
> wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
> into some fairly mind-boggling issues. The worst bit is 'how the f*ck
> does FOR_LOOP know if something's a dict or a list'.

I believe the Pythonic answer to that is "see if the appropriate
method is available".

The best definition of "sequence-like" or "mapping-like" i can
come up with is:

    x is sequence-like if it provides __getitem__() but not keys()
    x is mapping-like if it provides __getitem__() and keys()

But in our case, since we need iteration, we can look for specific
methods that have to do with just what we need for iteration and
nothing else.  Thus, e.g. a mapping-like class without a values()
method is no problem if we never ask to iterate over values.
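
The two definitions above translate directly into attribute checks. This is a minimal sketch in present-day Python 3; the function names are hypothetical:

```python
def is_mapping_like(x):
    """True if x provides __getitem__() and keys()."""
    return hasattr(x, "__getitem__") and hasattr(x, "keys")


def is_sequence_like(x):
    """True if x provides __getitem__() but not keys()."""
    return hasattr(x, "__getitem__") and not hasattr(x, "keys")


print(is_mapping_like({}), is_sequence_like([]), is_sequence_like(42))
```

Note that the checks look only at what is provided, not at the type, so a user class qualifies as soon as it defines the right methods.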

> And the
> almost-as-bad bit is 'WTF to do for user classes, extension types and
> almost-list/almost-dict practically-builtin types

I think it can be done; the draft PEP at

    http://www.lfw.org/python/pep-iterators.html

is a best-attempt at supporting everything just as you would expect.
Let me know if you think there are important cases it doesn't cover.

I know, the table

    mp_iteritems    __iteritems__, __iter__, items, __getitem__
    mp_iterkeys     __iterkeys__, __iter__, keys, __getitem__
    mp_itervalues   __itervalues__, __iter__, values, __getitem__
    sq_iter         __iter__, __getitem__

might look a little frightening, but it's not so bad, and i think
it's about as simple as you can make it while continuing to support
existing pseudo-lists and pseudo-dictionaries.  No instance should
ever provide __iter__ at the same time as any of the other __iter*__
methods anyway.


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces



From mal@lemburg.com  Wed Jan 31 11:56:12 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 12:56:12 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A77FD5C.DE8729DC@lemburg.com>

> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv17061/Python
> 
> Modified Files:
>         compile.c 
> Log Message:
> Enforce two illegal import statements that were outlawed in the
> reference manual but not checked: Names bound by import statemants may
> not occur in global statements in the same scope. The from ... import *
> form may only occur in a module scope.
> 
> I guess these changes could break code, but the reference manual
> warned about them.

Jeremy, your code breaks all uses of "from package import submodule"
inside packages.

Try distutils for example or setup.py....
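
The distinction between the two forms can be checked with compile(). This sketch assumes present-day CPython 3, where only the star form is rejected inside a function; the helper name compiles and the choice of the os module are for illustration only:

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


star_in_function = "def f():\n    from os import *\n"
named_in_function = "def f():\n    from os import path\n"
star_at_module = "from os import *\n"

print(compiles(star_in_function))
print(compiles(named_in_function))
print(compiles(star_at_module))
```

A check of this kind should leave the named form alone wherever it appears.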

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 12:01:24 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:01:24 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <3A77FE94.E5082136@lemburg.com>

Guido van Rossum wrote:
> 
> [ESR]
> > For different reasons, I'd like to be able to set a constant flag on a
> > object instance.  Simple semantics: if you try to assign to a
> > member or method, it throws an exception.
> >
> > Application?  I have a large Python program that goes to a lot of effort
> > to build elaborate context structures in core.  It would be nice to know
> > they can't be even inadvertently trashed without throwing an exception I
> > can watch for.
> 
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

How about .lock() and .unlock() ?
 
> - Should this be reversible?  I.e. should there be an x.unfreeze()?

Yes. These low-level locks could be used in thread programming
since the above calls are C level functions and thus thread safe
w/r to the global interpreter lock.
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Sure :)

Eric, could you write a PEP for this ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 12:08:15 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:08:15 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>
Message-ID: <3A78002F.DC8F0582@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > ...
> > What we really want is iterators for dictionaries, so why not
> > implement these instead of tweaking for-loops.
> 
> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> supposed problem with iteration order?

No, but it would solve the problem in a more elegant and
generalized way. Besides, it also allows writing code which
is thread safe, since the iterator can take special actions
to assure that the dictionary doesn't change during the
iteration phase (see the other thread about "making mutable objects
readonly").
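
One form such a "special action" can take is to detect the change and refuse to continue. This is a minimal sketch assuming present-day CPython 3, whose dictionary iterators behave this way; the function name is hypothetical:

```python
def mutate_while_iterating():
    """Grow a dict while looping over it and report what the iterator does."""
    d = {"a": 1, "b": 2}
    try:
        for key in d:
            d[key + "_copy"] = d[key]  # changes the dict's size mid-iteration
    except RuntimeError as exc:
        return str(exc)
    return None


print(mutate_while_iterating())
```

This detects a modification after the fact; it does not prevent one, which is what a read-only flag would add.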
 
> > If you are looking for speedups w/r to for-loops, applying a
> > different indexing technique in for-loops would go a lot further
> > and provide better performance not only to dictionary loops,
> > but also to other sequences.
> >
> > I have made some good experience with a special counter object
> > (sort of like a mutable integer) which is used instead of the
> > iteration index integer in the current implementation.
> 
> Please quantify, if possible.  My belief (based on past experiments) is that
> in loops fancier than
> 
>     for i in range(n):
>         pass
> 
> the loop overhead quickly falls into the noise even now.

I don't remember the figures, but these micro-optimizations do
speed up loops by a noticeable amount. Just compare the performance
of stock Python 1.5 against my patched version.
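
One way to put numbers on the question is to time an empty loop against a loop with a small body. This is a sketch using the standard timeit module in present-day Python 3; the function name, loop sizes and loop body are arbitrary assumptions, and no particular result is claimed:

```python
import timeit


def loop_overhead(n=1000, number=100, repeat=5):
    """Return best-of-`repeat` timings for an empty loop and a loop with a body."""
    empty = min(timeit.repeat("for i in range(n): pass",
                              globals={"n": n}, number=number, repeat=repeat))
    busy = min(timeit.repeat("for i in range(n): x = i * i + 1",
                             globals={"n": n}, number=number, repeat=repeat))
    return empty, busy


empty, busy = loop_overhead()
print("empty loop: %.6fs, with body: %.6fs" % (empty, busy))
```

The gap between the two figures indicates how much of the total time is loop machinery rather than body.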
 
> > Using an iterator object instead of the integer + __getitem__
> > call machinery would allow more flexibility for all kinds of
> > sequences or containers. ...
> 
> This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
> iteration *protocol* could have major attractions.

Not really... the counter object is just a special case of
an iterator -- in this case iteration is over the IN.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 12:10:43 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:10:43 +0100
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>
Message-ID: <3A7800C3.B5D3203F@lemburg.com>

Tim Peters wrote:
> 
> Note that even adding a "frozen" flag would add 4 bytes to every freezable
> object on most machines.  That's why I'd rather .freeze() replace the type
> pointer and .unfreeze() restore it.  No time or space overhead; no
> cluttering up the normal-case (i.e., unfrozen) type implementations with new
> tests.

Note that Fred's weak ref implementation also needs a flag on every
weakly referenceable object (at least last time I looked at his patches).

Why not add a flag byte or word to these objects -- then we'd have
8 or 16 choices of what to do with them ;-)
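
The type-pointer approach quoted above can be imitated at the Python level by reassigning __class__. This is a minimal sketch assuming present-day CPython 3; the class names are hypothetical, and C code that calls the list API directly would bypass these overrides:

```python
class FreezableList(list):
    """A list whose type can be switched to a read-only variant."""

    def freeze(self):
        self.__class__ = FrozenList  # swap the type; no per-object flag needed

    def unfreeze(self):
        pass  # already mutable


class FrozenList(FreezableList):
    """Same data layout, but every mutating operation is refused."""

    def _refuse(self, *args, **kwargs):
        raise TypeError("list is frozen")

    append = extend = insert = pop = remove = sort = reverse = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse

    def freeze(self):
        pass  # already frozen

    def unfreeze(self):
        self.__class__ = FreezableList  # restore the original type


items = FreezableList([1, 2])
items.freeze()
try:
    items.append(3)
    frozen_refused = False
except TypeError:
    frozen_refused = True
items.unfreeze()
items.append(3)
print(frozen_refused, items)
```

The unfrozen class carries no extra state and no extra tests, which is the point made in the quoted paragraph.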

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From MarkH@ActiveState.com  Wed Jan 31 12:18:12 2001
From: MarkH@ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 23:18:12 +1100
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>

MAL writes:

> > - How to spell it?  x.freeze()?  x.readonly()?
>
> How about .lock() and .unlock() ?

I'm with Greg here - lock() and unlock() imply an operation similar to
threading.Lock() - ie, exclusivity rather than immutability.

I don't have a strong opinion on the other names, but definitely prefer any
of the others over lock() for this operation.

Mark.



From mal@lemburg.com  Wed Jan 31 12:26:07 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:26:07 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>
Message-ID: <3A78045F.7DB50871@lemburg.com>

Mark Hammond wrote:
> 
> MAL writes:
> 
> > > - How to spell it?  x.freeze()?  x.readonly()?
> >
> > How about .lock() and .unlock() ?
> 
> I'm with Greg here - lock() and unlock() imply an operation similar to
> threading.Lock() - ie, exclusivity rather than immutability.
> 
> I don't have a strong opinion on the other names, but definitely prefer any
> of the others over lock() for this operation.

Funny, I thought that .lock() and .unlock() could be used to
implement exactly what threading.Lock() does...

Anyway, names really don't matter much, so how about: 

.mutable([flag]) -> integer

  If called without argument, returns 1/0 depending on whether
  the object is mutable or not. When called with a flag argument,
  sets the mutable state of the object to the value indicated
  by flag and returns the previous flag state.

The semantics of this interface would be in sync with many other
state APIs in Python and C (e.g. setlocale()).

The advantage of making this a method should be clear: it allows
writing polymorphic code.
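
The interface can be sketched on a plain class. This is a minimal example in present-day Python 3; the Record class and its attribute handling are assumptions made for illustration, not a proposed implementation:

```python
class Record:
    """Attribute container with a query-and-set .mutable([flag]) method."""

    def __init__(self, **fields):
        object.__setattr__(self, "_mutable", 1)
        for name, value in fields.items():
            setattr(self, name, value)

    def mutable(self, flag=None):
        """Return the current state; if flag is given, set it and return the old state."""
        previous = self._mutable
        if flag is not None:
            object.__setattr__(self, "_mutable", 1 if flag else 0)
        return previous

    def __setattr__(self, name, value):
        if not self._mutable:
            raise TypeError("object is read-only")
        object.__setattr__(self, name, value)


r = Record(x=1)
old = r.mutable(0)   # freeze, remembering the previous state
try:
    r.x = 2
    write_refused = False
except TypeError:
    write_refused = True
r.mutable(old)       # restore whatever state was there before
r.x = 2
print(write_refused, r.x)
```

Returning the previous state lets a caller freeze an object temporarily and put it back afterwards without knowing its type.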

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From Samuele Pedroni <pedroni@inf.ethz.ch>  Wed Jan 31 12:34:32 2001
From: Samuele Pedroni <pedroni@inf.ethz.ch> (Samuele Pedroni)
Date: Wed, 31 Jan 2001 13:34:32 +0100 (MET)
Subject: [Python-Dev] weak refs and jython
Message-ID: <200101311234.NAA24584@core.inf.ethz.ch>

Hi.

I have read the weak ref PEP, maybe too late.
I don't know whether portability of code using weak refs between Python and
Jython was a goal or could be one, or to what extent the actual implementation
will correspond to the PEP.

But about

    The callbacks registered with weak references must accept a single
    parameter, which will be the weak-ly referenced object itself.
    The object can be resurrected by creating some other reference to
    the object in the callback, in which case the weak reference
    generating the callback will still be cleared but no remaining
    weak references to the object will be cleared.
    
AFAIK using java weak refs (which I think is a natural choice) I see
no way (at least no worth-the-effort way) to implement this in jython.
Java weak refs cannot be resurrected.
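
For comparison, the weakref module in present-day CPython passes the weak reference object to the callback, not the referent, so the referent cannot be resurrected from there. This is a minimal sketch assuming CPython 3; the class and callback names are hypothetical:

```python
import gc
import weakref


class Node:
    pass


events = []


def on_finalize(ref):
    # The callback receives the weak reference itself; by now it is dead,
    # so calling it returns None instead of the original object.
    events.append(ref() is None)


node = Node()
ref = weakref.ref(node, on_finalize)
del node        # drop the only strong reference
gc.collect()    # not required on CPython, but harmless
print(events, ref())
```

With these semantics a Java-style weak reference is a plausible basis for an implementation.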

regards, Samuele Pedroni.


PS: Mr. X  is a jython developer.



From bckfnn@worldonline.dk  Wed Jan 31 12:49:22 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 12:49:22 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>   <20010130165204.I962@xs4all.nl>  <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
Message-ID: <3a7809c0.14839067@smtp.worldonline.dk>

>> > Note that Jeremy is only raising errors for "from M import *".
>> 
>> No, he says he's also raising errors for 'import spam' if 'spam' is declared
>> global, like so:
>> 
>> def viking():
>>     global spam
>>     import spam
>
>Yeah, this was just brought to my attention at our group meeting
>today.  I'm with you on this one -- there really isn't a good reason
>why this shouldn't work.  (I wonder why that constraint was ever added
>to the reference manual; maybe I was just upset that someone would
>*do* something as ugly as that, or maybe there was a J[P]ython
>reason???.)

Previously Jython has had problems with "from .. import *" in function
scope, and it still has problems when used with the python -> java
compiler:

http://sourceforge.net/bugs/?func=detailbug&bug_id=122834&group_id=12867

Using global on an import name is currently ignored by Jython because
the name assignment is done by the runtime, not the compiler.

regards,
finn


From thomas@xs4all.net  Wed Jan 31 12:59:14 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 13:59:14 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <3a7809c0.14839067@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Wed, Jan 31, 2001 at 12:49:22PM +0000
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk>
Message-ID: <20010131135914.N962@xs4all.nl>

On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:

> Using global on an import name is currently ignored by Jython because
> the name assignment is done by the runtime, not the compiler.

So it's impossible to do, in Jython, something like:

def fillme():
    global me
    import me

but it is possible to do:

def fillme():
    global me
    import me as _me
    me = _me

? I have to say I don't like that; we're always claiming 'import' (and
'def' and 'class' for that matter) are 'just another way of writing
assignment'. All these special cases break that.
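
A minimal sketch of the two spellings, assuming present-day CPython
(not the Jython of this thread).  The stdlib modules fractions and
decimal, and the helper bound_at_module_level, are stand-ins picked
only for illustration:

```python
import types


def fill_direct():
    # 'global' makes the import statement bind the module-level name.
    global fractions
    import fractions


def fill_via_alias():
    # Workaround from the thread: import under a local alias, then
    # assign the alias to the global name explicitly.
    global decimal_mod
    import decimal as _decimal
    decimal_mod = _decimal


def bound_at_module_level(name):
    # Hypothetical helper: is `name` a module object in this namespace?
    return isinstance(globals().get(name), types.ModuleType)


fill_direct()
fill_via_alias()
print(bound_at_module_level("fractions"), bound_at_module_level("decimal_mod"))
```

Under that assumption both spellings leave a module object bound at
module level, which is the "import is just assignment" behaviour.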

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From bckfnn@worldonline.dk  Wed Jan 31 13:35:36 2001
From: bckfnn@worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 13:35:36 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <20010131135914.N962@xs4all.nl>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>
Message-ID: <3a780eda.16144995@smtp.worldonline.dk>

On Wed, 31 Jan 2001 13:59:14 +0100, you wrote:

>On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:
>
>> Using global on an import name is currently ignored by Jython because
>> the name assignment is done by the runtime, not the compiler.
>
>So it's impossible to do, in Jython, something like:
>
>def fillme():
>    global me
>    import me
>
>but it is possible to do:
>
>def fillme():
>    global me
>    import me as _me
>    me = _me
>
>?

Yes, only the second example will make a global variable.

> I have to say I don't like that; we're always claiming 'import' (and
>'def' and 'class' for that matter) are 'just another way of writing
>assignment'. All these special cases break that.

I don't like it either; I was only reporting what Jython currently does.
The current design used by Jython doesn't lend itself directly towards a
solution, but I don't see anything that makes it impossible to solve.

regards,
finn


From mal@lemburg.com  Wed Jan 31 14:34:19 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:34:19 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A78226B.2E177EFE@lemburg.com>

Michael Hudson wrote:
> 
> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them.  Summarised results below;
> first a key:
> 
> src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
>         (only built this with -O3)
> src: CVS from yesterday afternoon
> src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc
>         patch applied.  More on this later...
> Python-2.0: you can guess what this is.
> 
> All runs are compared against Python-2.0-O2:
> 
> Benchmark: src-n-O3 (rounds=10, warp=20)
>             Average round time:   49029.00 ms              -0.86%
> Benchmark: src (rounds=10, warp=20)
>             Average round time:   67141.00 ms             +35.76%
> Benchmark: src-O (rounds=10, warp=20)
>             Average round time:   50167.00 ms              +1.44%
> Benchmark: src-O2 (rounds=10, warp=20)
>             Average round time:   49641.00 ms              +0.37%
> Benchmark: src-O3 (rounds=10, warp=20)
>             Average round time:   49104.00 ms              -0.71%
> Benchmark: src-O6 (rounds=10, warp=20)
>             Average round time:   49131.00 ms              -0.66%
> Benchmark: src-obmalloc (rounds=10, warp=20)
>             Average round time:   63276.00 ms             +27.94%
> Benchmark: src-obmalloc-O (rounds=10, warp=20)
>             Average round time:   46927.00 ms              -5.11%
> Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
>             Average round time:   46146.00 ms              -6.69%
> Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
>             Average round time:   46456.00 ms              -6.07%
> Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
>             Average round time:   46450.00 ms              -6.08%
> Benchmark: Python-2.0 (rounds=10, warp=20)
>             Average round time:   68933.00 ms             +39.38%
> Benchmark: Python-2.0-O (rounds=10, warp=20)
>             Average round time:   49542.00 ms              +0.17%
> Benchmark: Python-2.0-O3 (rounds=10, warp=20)
>             Average round time:   48262.00 ms              -2.41%
> Benchmark: Python-2.0-O6 (rounds=10, warp=20)
>             Average round time:   48273.00 ms              -2.39%
> 
> My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> enough to care about.

What compiler did you use and on which platform ?

I have had similar experiences with -On with n>3 compared to -O2
using pgcc (gcc optimized for PC processors). BTW, the Linux
kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
as CFLAGS -- perhaps Python should too on Linux ?!
 
Does anybody know about the effect of -fomit-frame-pointer ?
Would it cause problems or produce code which is not compatible
with code compiled without this flag ?

> Interestingly, adding obmalloc speeds things up.  Let's take a closer
> look:
> 
> $ python pybench.py -c src-obmalloc-O3 -s src-O3
> PYBENCH 0.7
> 
> Benchmark: src-O3 (rounds=10, warp=20)
> 
> Tests:                              per run    per oper.  diff *
> ------------------------------------------------------------------------
>           BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
>            BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
>                  ConcatStrings:    1068.80 ms    7.13 us   -1.22%
>                  ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
>                CreateInstances:    1433.55 ms   34.13 us   +9.06%
>        CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
>        CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
>                   DictCreation:    1275.80 ms    8.51 us  +44.22%
>                       ForLoops:    1415.90 ms  141.59 us   -0.64%
>                     IfThenElse:    1152.70 ms    1.71 us   -0.15%
>                    ListSlicing:     397.40 ms  113.54 us   -0.53%
>                 NestedForLoops:     789.75 ms    2.26 us   -0.37%
>           NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
>        NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
>            PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
>              PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
>                      Recursion:     838.50 ms   67.08 us   -0.00%
>                   SecondImport:     741.20 ms   29.65 us  +25.57%
>            SecondPackageImport:     744.25 ms   29.77 us  +18.66%
>          SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
>        SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
>         SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
>          SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
>       SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
>        SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
>         SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
>           SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
>                     SmallLists:    1657.65 ms    6.50 us   +6.63%
>                    SmallTuples:    1143.95 ms    4.77 us   +2.90%
>          SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
>       SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
>                 StringMappings:    1161.00 ms    9.21 us   +7.30%
>               StringPredicates:    1069.65 ms    3.82 us   -5.30%
>                  StringSlicing:     846.30 ms    4.84 us   +8.61%
>                      TryExcept:    1590.40 ms    1.06 us   -0.49%
>                 TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
>                   TupleSlicing:     681.10 ms    6.49 us   -3.13%
>                UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
>              UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
>              UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
>                 UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
> ------------------------------------------------------------------------
>             Average round time:   49104.00 ms              +5.70%
> 
> *) measured against: src-obmalloc-O3 (rounds=10, warp=20)
> 
> Words fail me slightly, but maybe some tuning of the memory allocation
> of longs & complex numbers would be in order?
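
The overall +5.70% figure looks like a plain relative difference of
the two -O3 round times listed above (49104.00 ms for src-O3 against
46456.00 ms for src-obmalloc-O3).  A small sketch under that
assumption; percent_diff is a made-up helper, not part of pybench:

```python
def percent_diff(measured_ms, baseline_ms):
    # Relative difference of `measured_ms` against `baseline_ms`, in percent.
    return (measured_ms - baseline_ms) / baseline_ms * 100.0


# src-O3 measured against src-obmalloc-O3, using the round times quoted above.
print("%+.2f%%" % percent_diff(49104.00, 46456.00))
```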

AFAIR, Vladimir's malloc implementation favours small objects.
All number objects (except longs) fall into this category.

Perhaps we should think about adding his lib to the core ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 14:39:01 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:39:01 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A782385.5B544CD5@lemburg.com>

> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them. 

FYI, I've just updated the archive to also work under Python 1.5.x:

	http://www.lemburg.com/python/pybench-0.7.zip

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mwh21@cam.ac.uk  Wed Jan 31 15:52:23 2001
From: mwh21@cam.ac.uk (Michael Hudson)
Date: 31 Jan 2001 15:52:23 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 31 Jan 2001 15:34:19 +0100"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>

"M.-A. Lemburg" <mal@lemburg.com> writes:

> > My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> > enough to care about.
> 
> What compiler did you use and on which platform ?

Argh, sorry; I meant to put this in!

$ uname -a
Linux atrus.jesus.cam.ac.uk 2.2.14-1.1.0 #1 Thu Jan 6 05:12:58 EST 2000 i686 unknown
$ gcc --version
2.95.1

It's a Dell Dimension XPS D233 (a 233MHz PII) with a reasonably fast
hard drive (two year old 10G IBM 7200rpm thingy) and quite a lot of
RAM (192Mb).

[snip]
 
> AFAIR, Vladimir's malloc implementation favours small objects.
> All number objects (except longs) fall into this category.

Well, longs & complex numbers don't do any free list handling (like
floats and int do), so I see two conclusions:

1) Don't add obmalloc to the core, but do simple free list stuff for
   longs (might be tricky) and complex numbers (this should be a
   no-brainer).
2) Integrate obmalloc - then maybe we can ditch all of that icky
   freelist stuff.
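
For readers who have not met the term, a free list keeps released
objects around so the next allocation can reuse one instead of going
back to malloc.  The following is only a rough Python sketch of the
idea; the real free lists live in C, and the class name and max_size
parameter here are hypothetical:

```python
class FreeList:
    """Toy free list: recycle released objects instead of building new ones."""

    def __init__(self, factory, max_size=100):
        self._factory = factory    # callable that builds a fresh object
        self._free = []            # released objects waiting for reuse
        self._max_size = max_size  # cap so the pool cannot grow forever

    def acquire(self):
        # Reuse a pooled object when there is one, else build a new one.
        if self._free:
            return self._free.pop()
        return self._factory()

    def release(self, obj):
        # Keep the object for later unless the pool is already full.
        if len(self._free) < self._max_size:
            self._free.append(obj)


pool = FreeList(list)
first = pool.acquire()
pool.release(first)
print(pool.acquire() is first)
```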

> Perhaps we should think about adding his lib to the core ?!

Strikes me as the better solution.  Can anyone try this on Windows?
Seeing as windows malloc reputedly sucks, maybe the differences would
be bigger.

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)



From barry@digicool.com  Wed Jan 31 16:42:28 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:42:28 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
 <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.16500.594486.613828@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping@lfw.org> writes:

    KY> Could i get a number for this please?

Looks like you beat Eric to PEP 234. :)

I'll update PEP 0 and let you check in your txt file.  I may want to
do an editorial pass over it.

-Barry


From barry@digicool.com  Wed Jan 31 16:50:10 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:50:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
 <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
Message-ID: <14968.16962.830739.920771@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez@zadka.site.co.il> writes:

    MZ> Basic response: I *love* the iter(), sq_iter and __iter__
    MZ> parts.  I tremble at seeing the rest.  Why not add a method to
    MZ> dictionaries .iteritems() and do

    | for (k, v) in dict.iteritems():
    | 	pass

    MZ> (dict.iteritems() would return an iterator to the items)

Moshe, I had exactly the same reaction and exactly the same idea.  I'm
a strong -1 on introducing new syntax for this when new methods can
handle it in a much more readable way (IMO).

Another idea would be to allow the iterator() method to take an
argument:

    for key in dict.iterator()

a.k.a.

    for key in dict.iterator(KEYS)

and also

    for value in dict.iterator(VALUES)
    for key, value in dict.iterator(ITEMS)

One problem is that the constants KEYS, VALUES, and ITEMS would either
have to be defined some place, or you'd just use values like 0, 1, 2,
which is less readable perhaps than just having iteratoritems(),
iteratorkeys(), and iteratorvalues() methods.  Alternative spellings:

    itemsiter(), keysiter(), valsiter()
    itemsiterator(), keysiterator(), valuesiterator()
    iiterator(), kiterator(), viterator()
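
A sketch of what the method-based spelling could look like, written
as a dict subclass in present-day Python.  The class IterDict is
hypothetical, and the method names are just one of the spellings
floated above:

```python
class IterDict(dict):
    """dict plus iterator-returning methods, as floated in this thread."""

    def iterkeys(self):
        return iter(self.keys())

    def itervalues(self):
        return iter(self.values())

    def iteritems(self):
        # Yields (key, value) pairs one at a time; no list is built.
        return iter(self.items())


survey = IterDict(spam=14, eggs=42)
for key, value in survey.iteritems():
    print(key, value)
```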

ad-nauseum-ly y'rs,
-Barry


From skip@mojam.com (Skip Montanaro)  Wed Jan 31 16:11:19 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 10:11:19 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <3A77FE94.E5082136@lemburg.com>
Message-ID: <14968.14631.419491.440774@beluga.mojam.com>

What stimulated this thread about making mutable objects (temporarily)
immutable?  Can someone give me an example where this is actually useful and
can't be handled through some existing mechanism?  I'm definitely with
Fredrik on this one.  Sounds like madness to me.

I'm just guessing here, but since the most common need for immutable objects
is as dictionary keys, I can envision having to test the lock state of a list
or dict that someone wants to use as a key everywhere you would normally
call has_key:

    if l.islocked() and d.has_key(l):
       ...

If you want immutable dicts or lists in order to use them as dictionary
keys, just serialize them first:

    survey_says = {"spam": 14, "eggs": 42}
    sl = marshal.dumps(survey_says)
    dict[sl] = "spam"
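
The same "turn it into something hashable first" idea can be sketched
without marshal.  This swaps in frozenset from present-day Python,
which is not what the message above uses; the helper name freeze is
made up, and it assumes the values are themselves hashable:

```python
def freeze(mapping):
    # A frozenset of (key, value) pairs is hashable and ignores item order.
    return frozenset(mapping.items())


survey_says = {"spam": 14, "eggs": 42}
index = {}
index[freeze(survey_says)] = "spam"

# An equal dict built in a different order finds the same entry.
print(index[freeze({"eggs": 42, "spam": 14})])
```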

Here's another pitfall I can envision.

    survey_says = {"spam": 14, "eggs": 42}
    survey_says.lock()
    dict[survey_says] = "Richard Dawson"
    survey_says.unlock()

At this point can I safely iterate over the keys in the dictionary or not?
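
The hazard can be sketched in present-day Python with a key whose
hash follows its contents.  MutableKey is a hypothetical stand-in for
an "unlocked" object, not anything proposed in the thread:

```python
class MutableKey:
    """Hashable wrapper whose hash changes when its contents change."""

    def __init__(self, items):
        self.items = list(items)

    def __hash__(self):
        return hash(tuple(self.items))

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.items == other.items


table = {}
key = MutableKey(["spam", 14])
table[key] = "Richard Dawson"

# "Unlock" and mutate the object while it is still stored as a key.
key.items.append("eggs")

# The entry is still in the table, but a lookup by the original
# value no longer finds it.
print(len(table), MutableKey(["spam", 14]) in table)
```

Iterating over the keys still works, but the table now holds an entry
that can no longer be found under the value it was stored with.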

Skip



From skip@mojam.com (Skip Montanaro)  Wed Jan 31 15:57:30 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 09:57:30 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
References: <20010131015832.K962@xs4all.nl>
 <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.13802.22823.702114@beluga.mojam.com>

    Ping>     x is sequence-like if it provides __getitem__() but not keys()

So why does this barf?

    >>> [].__getitem__
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: __getitem__

(Obviously, lists *do* understand __getitem__ at some level.  Why isn't it
exposed in the method table?)

Skip


From fredrik@pythonware.com  Wed Jan 31 17:19:44 2001
From: fredrik@pythonware.com (Fredrik Lundh)
Date: Wed, 31 Jan 2001 18:19:44 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <007301c08baa$02908220$e46940d5@hagrid>

barry wrote:
> Alternative spellings:
> 
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

shouldn't that be xitems, xkeys, xvalues?

</F>



From mal@lemburg.com  Wed Jan 31 17:21:02 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:21:02 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <3A77FE94.E5082136@lemburg.com> <14968.14631.419491.440774@beluga.mojam.com>
Message-ID: <3A78497E.8BCF197E@lemburg.com>

Skip Montanaro wrote:
> 
> What stimulated this thread about making mutable objects (temporarily)
> immutable?  Can someone give me an example where this is actually useful and
> can't be handled through some existing mechanism?  I'm definitely with
> Fredrik on this one.  Sounds like madness to me.

This thread is an offspring of the "for something in dict:" thread.
The problem we face when iterating over mutable objects is that
the underlying objects can change. By marking them read-only we can
safely iterate over their contents.

Another advantage of being able to mark mutable objects as read-only is
that they may become usable as dictionary keys. Optimizations such
as self-reorganizing read-only dictionaries would also become
possible (e.g. attribute dictionaries which are read-only could
calculate a second hash value to make the hashing perfect).
 
> I'm just guessing here, but since the most common need for immutable objects
> is as dictionary keys, I can envision having to test the lock state of a list
> or dict that someone wants to use as a key everywhere you would normally
> call has_key:
> 
>     if l.islocked() and d.has_key(l):
>        ...
> 
> If you want immutable dicts or lists in order to use them as dictionary
> keys, just serialize them first:
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     sl = marshal.dumps(survey_says)
>     dict[sl] = "spam"

Sure and that's what .items(), .keys() and .values() do. The idea
was to avoid the extra step of creating lists or tuples first.
 
> Here's another pitfall I can envision.
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     survey_says.lock()
>     dict[survey_says] = "Richard Dawson"
>     survey_says.unlock()
>
> At this point can I safely iterate over the keys in the dictionary or not?

Tim already pointed out that we will need two different read-only
states:

	a) temporary
	b) permanent

For dictionaries to become usable as keys in another dictionary,
they'd have to be marked permanently read-only.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From jeremy@alum.mit.edu  Wed Jan 31 04:35:58 2001
From: jeremy@alum.mit.edu (Jeremy Hylton)
Date: Tue, 30 Jan 2001 23:35:58 -0500 (EST)
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
In-Reply-To: <3A77FD5C.DE8729DC@lemburg.com>
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
 <3A77FD5C.DE8729DC@lemburg.com>
Message-ID: <14967.38446.700271.122029@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:

  >> Modified Files: compile.c Log Message: Enforce two illegal import
  >> statements that were outlawed in the reference manual but not
  >> checked: Names bound by import statements may not occur in global
  >> statements in the same scope. The from ... import * form may only
  >> occur in a module scope.
  >>
  >> I guess these changes could break code, but the reference manual
  >> warned about them.

  MAL> Jeremy, your code breaks all uses of "from package import
  MAL> submodule" inside packages.

  MAL> Try distutils for example or setup.py....

Quite aside from whether the changes should be preserved, I don't see
how "from package import submodule" is affected.  I ran setup.py
without any problem; I wouldn't have been able to build Python
otherwise.  I wrote some simple test cases and didn't have any trouble
with the form you describe.

Can you provide a concrete example?  It may be that something other
than the changes mentioned above is causing you problems.

Jeremy


From barry@digicool.com  Wed Jan 31 17:20:24 2001
From: barry@digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 12:20:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
 <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
 <14968.16962.830739.920771@anthem.wooz.org>
 <007301c08baa$02908220$e46940d5@hagrid>
Message-ID: <14968.18776.644453.903217@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik@pythonware.com> writes:

    FL> shouldn't that be xitems, xkeys, xvalues?

Or iitems(), ikeys(), ivalues()?

Personally, I don't much care.  If we get consensus on the more
important issue of going with methods instead of new syntax, I'm sure
Guido will pick whatever method names appeal to him most.

-Barry


From ping@lfw.org  Wed Jan 31 17:14:15 2001
From: ping@lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 09:14:15 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101310903380.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Skip Montanaro wrote:
> Ping> x is sequence-like if it provides __getitem__() but not keys()
> 
> So why does this barf?
> 
>     >>> [].__getitem__

I was describing how to tell if instances are sequence-like.  Before
we get to make that judgement, first we have to look at the C method
table.  So:

    x is sequence-like if it has tp_as_sequence;
        all instances have tp_as_sequence;
            an instance is sequence-like if it has __getitem__() but not keys()

    x is mapping-like if it has tp_as_mapping;
        all instances have tp_as_mapping;
            an instance is mapping-like if it has both __getitem__() and keys()

The "in" operator is implemented this way.

    x customizes "in" if it has sq_contains;
        all instances have sq_contains;
            an instance customizes "in" if it has __contains__()

If sq_contains is missing, or if an instance has no __contains__ method,
we supply the default behaviour by comparing the operand to each member
of x in turn.  This default behaviour is implemented twice: once in
PyObject_Contains, and once in instance_contains.

So i proposed this same structure for sq_iter and __iter__.

    x customizes "for ... in x" if it has sq_iter;
        all instances have sq_iter;
            an instance customizes "for ... in x" if it has __iter__()

If sq_iter is missing, or if an instance has no __iter__ method,
we supply the default behaviour by calling PyObject_GetItem on x
and incrementing the index until IndexError.
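
The default behaviour described above can be sketched in Python
roughly as follows.  The function names default_iter and
default_contains are made up for this sketch, and the Squares class
is just a test subject that offers only __getitem__:

```python
def default_iter(x):
    # Fallback iteration: index from 0 upward until IndexError.
    index = 0
    while True:
        try:
            item = x[index]
        except IndexError:
            return
        yield item
        index += 1


def default_contains(x, target):
    # Fallback "in": compare the operand to each member of x in turn.
    for item in default_iter(x):
        if item == target:
            return True
    return False


class Squares:
    """Sequence-like: provides __getitem__() and nothing else."""

    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)
        return index * index


print(list(default_iter(Squares())), default_contains(Squares(), 9))
```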


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces



From mal@lemburg.com  Wed Jan 31 17:57:20 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:57:20 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
 <3A77FD5C.DE8729DC@lemburg.com> <14967.38446.700271.122029@localhost.localdomain>
Message-ID: <3A785200.FFB37CAD@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal@lemburg.com> writes:
> 
>   >> Modified Files: compile.c Log Message: Enforce two illegal import
>   >> statements that were outlawed in the reference manual but not
>   >> checked: Names bound by import statements may not occur in global
>   >> statements in the same scope. The from ... import * form may only
>   >> occur in a module scope.
>   >>
>   >> I guess these changes could break code, but the reference manual
>   >> warned about them.
> 
>   MAL> Jeremy, your code breaks all uses of "from package import
>   MAL> submodule" inside packages.
> 
>   MAL> Try distutils for example or setup.py....
> 
> Quite aside from whether the changes should be preserved, I don't see
> how "from package import submodule" is affected.  I ran setup.py
> without any problem; I wouldn't have been able to build Python
> otherwise.  I wrote some simple test cases and didn't have any trouble
> with the form you describe.

Perhaps you still had old .pyc files in your installation dir ?
 
> Can you provide a concrete example?  It may be that something other
> than the changes mentioned above is causing you problems.

The distutils code is full of imports like these (and other
code I'm running is too):

distutils/cmd.py:

    def __init__ (self, dist):
        """Create and initialize a new Command object.  Most importantly,
        invokes the 'initialize_options()' method, which is the real
        initializer and depends on the actual command being
        instantiated.
        """
        # late import because of mutual dependence between these classes
        from distutils.dist import Distribution

This is the report I got from Benjamin Collar:

> I've gotten the newest CVS tarball, but setup.py is still not
> working; this time with a different error. I will resubmit a bug on
> sourceforge if that's the proper way to handle this. Here's the error:
> 
> ./python ./setup.py build
> Traceback (most recent call last):
>   File "./setup.py", line 12, in ?
>     from distutils.core import Extension, setup
>   File "/usr/src/python/dist/src/Lib/distutils/core.py", line 20, in ?
>     from distutils.cmd import Command
>   File "/usr/src/python/dist/src/Lib/distutils/cmd.py", line 15, in ?
>     from distutils import util, dir_util, file_util, archive_util,
> dep_util
> SyntaxError: 'from ... import *' may only occur in a module scope
> make: *** [sharedmods] Error 1
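
Note that the statement shown in the traceback names its imports
explicitly, so it is not a 'from ... import *' at all.  For
comparison, here is how the two forms behave inside a function in
present-day Python 3 (not the CVS tree discussed here); the helper
compiles is hypothetical:

```python
def compiles(source):
    # True if `source` compiles as a module, False on SyntaxError.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A late import of one name inside a function, as in distutils/cmd.py.
late_import = (
    "def setup():\n"
    "    from os import path\n"
    "    return path\n"
)

# The star form inside a function, which is what the check is meant to reject.
star_import = (
    "def setup():\n"
    "    from os import *\n"
)

print(compiles(late_import), compiles(star_import))
```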

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From skip@mojam.com (Skip Montanaro)  Wed Jan 31 18:33:56 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 12:33:56 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78497E.8BCF197E@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <3A77FE94.E5082136@lemburg.com>
 <14968.14631.419491.440774@beluga.mojam.com>
 <3A78497E.8BCF197E@lemburg.com>
Message-ID: <14968.23188.573257.392841@beluga.mojam.com>

    MAL> This thread is an offspring of the "for something in dict:" thread.
    MAL> The problem we face when iterating over mutable objects is that the
    MAL> underlying objects can change. By marking them read-only we can
    MAL> safely iterate over their contents.

I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
(And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
this concept in a reasonable fashion to cover (most of) the other objects
that smell like dictionaries, I think you'll just be adding needless
complications for a feature that can't be used where it's really needed.

I see no problem asking for the items() of an in-memory dictionary in order
to get a predictable list to iterate over, but doing that for disk-based
mappings would be next to impossible.  So, I'm stuck iterating over
something that can change out from under me.  In the end, the programmer will
still have to handle border cases specially.  Besides, even if you *could*
lock your disk-based mapping, are you really going to do that in situations
where it's sharable (that's what databases are there for, after all)?  I
suspect you're going to keep the database mutable and work around any
resulting problems.

If you want to implement "for key in dict:", why not just have the VM call
keys() under the covers and use that list?  It would be no worse than the
situation today where you call "for key in dict.keys():", and with the same
caveats.  If you're dumb enough to do that for an on-disk mapping object,
well, you get what you asked for.
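
A minimal sketch of that idea, written for present-day Python rather than
the 2.1-era interpreter under discussion.  The helper name
iter_keys_snapshot is hypothetical; it only illustrates "call keys() under
the covers and use that list", which lets the loop body mutate the mapping
because the loop walks a copy.

```python
# Hypothetical helper: emulate the proposed "for key in d:" by walking a
# snapshot of the keys, the way "for key in d.keys():" behaved in 2001.
def iter_keys_snapshot(mapping):
    return iter(list(mapping.keys()))

inventory = {"spam": 14, "eggs": 42}

for key in iter_keys_snapshot(inventory):
    if inventory[key] > 20:
        # Safe: the loop walks the snapshot, not the live dictionary.
        del inventory[key]
        inventory[key + "_sold_out"] = 0

print(sorted(inventory))
```

The cost is the one named above: the whole key list is materialized up
front, which is cheap for an in-memory dict and painful for a large on-disk
mapping.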

Skip


From esr@thyrsus.com  Wed Jan 31 17:55:00 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:55:00 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78045F.7DB50871@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:26:07PM +0100
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com>
Message-ID: <20010131125500.C5151@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> Anyway, names really don't matter much, so how about: 
> 
> .mutable([flag]) -> integer
> 
>   If called without argument, returns 1/0 depending on whether
>   the object is mutable or not. When called with a flag argument,
>   sets the mutable state of the object to the value indicated
>   by flag and returns the previous flag state.
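
For illustration only, here is one way the proposed method could behave,
sketched in present-day Python.  The class FlaggedList is hypothetical, and
it guards only append(); a real implementation would have to guard every
mutating operation.

```python
class FlaggedList(list):
    """Hypothetical list subclass carrying the proposed .mutable([flag])."""

    def __init__(self, *args):
        super().__init__(*args)
        self._mutable = 1

    def mutable(self, flag=None):
        # Without an argument: report the current state as 1/0.
        # With an argument: set the state and return the previous one.
        previous = self._mutable
        if flag is not None:
            self._mutable = 1 if flag else 0
        return previous

    def append(self, item):
        # Only append() is guarded in this sketch.
        if not self._mutable:
            raise TypeError("object is read-only")
        super().append(item)


items = FlaggedList([1, 2])
was_mutable = items.mutable(0)   # freeze; returns the previous state
try:
    items.append(3)
except TypeError as exc:
    print("refused:", exc)
items.mutable(1)                 # thaw again
items.append(3)
```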

I'll bear this in mind if things progress to the point where a PEP is
indicated.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



From tim.one@home.com  Wed Jan 31 19:49:34 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 14:49:34 -0500
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHBIMAA.tim.one@home.com>

[Mark Hammond]
> ...
> The new process is very simple, but may break some people's build.
> ...
> The reason this _should_ not break your build is that you
> _probably_ already have a "..\..\zlib-1.1.3" directory installed
> in the right place so the header files can be located.

Actually, it's certain to break the build for anyone who read
PCbuild\readme.txt.  But I *want* it to break:  changing the directory name
is a strong hint that they should download the zlib source code from the
same place you did (and which is now explained in PCbuild\readme.txt, and
mentioned in the 2.1a2 NEWS file).

Other than that, worked first time, and-- even better --the second time too
<wink>.



From esr@thyrsus.com  Wed Jan 31 17:53:16 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:53:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:01:24PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <3A77FE94.E5082136@lemburg.com>
Message-ID: <20010131125316.B5151@thyrsus.com>

M.-A. Lemburg <mal@lemburg.com>:
> Eric, could you write a PEP for this ?

Not yet.  I'm about (at Guido's suggestion) to submit a revised ternary-select
proposal.  Let's process that first.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy


From tim.one@home.com  Wed Jan 31 20:28:00 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:28:00 -0500
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIMAA.tim.one@home.com>

[Samuele Pedroni]
> I have read weak ref PEP, maybe too late.
> I don't know if portability of code using weak refs between
> python and jython was a goal or could be one,

CPython generally doesn't want to do anything impossible for Jython, if it
can help it.

> and up to which extent actual impl. will correspond to the PEP.

Don't care about that.

> ...
> AFAIK using java weak refs (which I think is a natural choice) I
> see no way (at least no worth-the-effort way) to implement this
> in jython.  Java weak refs cannot be resurrected.

Thanks for bringing this up!  Fred is looking into it.



From fdrake@acm.org  Wed Jan 31 20:25:51 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Wed, 31 Jan 2001 15:25:51 -0500 (EST)
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
References: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>

Samuele Pedroni writes:
 > AFAIK using java weak refs (which I think is a natural choice) I see
 > no way (at least no worth-the-effort way) to implement this in jython.
 > Java weak refs cannot be resurrected.

  This is certainly annoying.
  How about this: the callback receives the weak reference object or
proxy which it was registered on as a parameter.  Since the reference
has already been cleared, there's no way to get the object back, so we
don't need to get it from Java either.
  Would that be workable?  (I'm adjusting my patch now.)
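
A small sketch of the callback shape described above, using the weakref
module as it exists in present-day CPython.  The class name Resource and
the events list are placeholders for illustration.

```python
import gc
import weakref


class Resource:
    """Placeholder class; bare object() instances cannot be weakly referenced."""


events = []


def on_collect(ref):
    # The callback is handed the weak reference itself.  By this point the
    # reference has been cleared, so calling it yields None.
    events.append(ref() is None)


target = Resource()
ref = weakref.ref(target, on_collect)

del target     # CPython frees the object as soon as its refcount hits zero
gc.collect()   # harmless here; other implementations may need a collection

print(events)
```

Because the callback never sees the referent, nothing in this shape depends
on being able to resurrect the object.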


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations



From tim.one@home.com  Wed Jan 31 20:56:52 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:56:52 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>

[Ping]
> x is sequence-like if it provides __getitem__() but not keys()

[Skip]
> So why does this barf?
>
>     >>> [].__getitem__
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: __getitem__
>
> (Obviously, lists *do* understand __getitem__ at some level.  Why
> isn't it exposed in the method table?)

The old type/class split:  list is a type, and types spell their "method
tables" in ways that have little in common with how classes do it.

See PyObject_GetItem in abstract.c for gory details (e.g., dicts spell their
version of getitem via ->tp_as_mapping->mp_subscript(...), while lists spell
it ->tp_as_sequence->sq_item(...); neither has any binding to the attr
"__getitem__"; instance objects fill in both the tp_as_mapping and
tp_as_sequence slots, then map both the mp_subscript and sq_item slots to
classobject.c's instance_item, which in turn looks up "__getitem__").

bet-you're-sorry-you-asked<wink>-ly y'rs  - tim



From tim.one@home.com  Wed Jan 31 21:24:53 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:24:53 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHFIMAA.tim.one@home.com>

[M.-A. Lemburg]
> AFAIR, Vladimir's malloc implementation favours small objects.

It favors the memory alloc/dealloc patterns Vlad recorded while running an
instrumented Python.  Which is mostly good news.  The flip side is that it
favors the specific programs he ran, and who knows whether those are
"typical".  OTOH, vendor mallocs favor the programs *they* ran, which
probably didn't include Python at all <wink>.

> ...
> Perhaps we should think about adding his lib to the core ?!

It's patch 101104 on SF.  I pushed Vlad to push this for 2.0, but he wisely
decided it was too big a change at the time.  It's certainly too much a
change to slam into 2.1 at this late stage too.  There are many reasons to
want this (e.g., list.append() calls realloc every time today, because,
despite over-allocating, it has no idea how much storage *has* already been
allocated; any malloc has to know this info under the covers, but there's no
way for us to know that too unless we add another N bytes to every list
object to record it, or use our own malloc which *can* tell us that info).

list.append()-behavior-varies-wildly-across-platforms-today-
    when-the-list-gets-large-because-of-that-ly y'rs  - tim



From tim.one@home.com  Wed Jan 31 21:49:31 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:49:31 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A78002F.DC8F0582@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>

[Tim]
>> Seems an unrelated topic:  would "iterators for dictionaries" solve the
>> supposed problem with iteration order?

[MAL]
> No, but it would solve the problem in a more elegant and
> generalized way.

I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
problem], but it would solve the problem ...".  Can only assume we're
switching topics within single sentences now <wink>.

> Besides, it also allows writing code which is thread safe, since
> the iterator can take special actions to assure that the dictionary
> doesn't change during the iteration phase (see the other thread
> about "making mutable objects readonly").

Sorry, but immutability has nothing to do with thread safety (the latter has
to do with "doing a right thing" in the presence of multiple threads, to
keep data structures internally consistent; raising an exception is never "a
right thing" unless the user is violating the advertised semantics, and if
mutation during iteration is such a violation, the presence or absence of
multiple threads has nothing to do with that).  IOW, perhaps, a critical
section is an area of non-exceptional serialization, not a landmine that
makes other threads *blow up* if they touch it.
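
To make the distinction concrete, here is a sketch of a critical section in
the "non-exceptional serialization" sense, using the standard threading
module.  The names counts, counts_lock and record are illustrative only.

```python
import threading

counts = {}
counts_lock = threading.Lock()


def record(word, times):
    for _ in range(times):
        # A thread that finds the lock held simply waits its turn;
        # nobody gets an exception for arriving at the wrong moment.
        with counts_lock:
            counts[word] = counts.get(word, 0) + 1


threads = [threading.Thread(target=record, args=("spam", 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts["spam"])
```

The dictionary stays mutable throughout; consistency comes from taking
turns, not from freezing the data.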

> ...
> I don't remember the figures, but these micro optimizations

That's plural, but I thought you were talking specifically about the mutable
counter object.  I don't know which, but the two statements don't jibe.

> do speed up loops by a noticeable amount. Just compare the performance
> of stock Python 1.5 against my patched version.

No time now, but after 2.1 is out, sure, wrt it (not 1.5).



From tim.one@home.com  Wed Jan 31 22:10:12 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:10:12 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>

[Michael Hudson]
> ...
> Can anyone try this on Windows?  Seeing as windows malloc
> reputedly sucks, maybe the differences would be bigger.

No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
Windows.  Helped significantly.  Unclear how much was simply due to
exploiting the global interpreter lock, though.  "Windows" is also a
multiheaded beast (e.g., NT has very different memory performance
characteristics than 95).



From tim.one@home.com  Wed Jan 31 22:43:59 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:43:59 -0500
Subject: generators (was RE: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <20010130092454.D18319@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHIIMAA.tim.one@home.com>

[Neil Schemenauer]
> What's the chances of getting generators into 2.2?

Unknown.  IMO it has more to do with generalizing the iteration protocol
than with generators per se (a generator object that doesn't play nice with
"for" is unpleasant to use; otoh, a generator object that can't be used
divorced from "for" is frustrating too (like when comparing the fringes of
two trees efficiently, which requires interleaving two distinct traversals,
each naturally recursive on its own)).
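
The fringe comparison can be sketched with the generator syntax Python
adopted later (yield arrived in 2.2, "yield from" in 3.3), so this is not
something the interpreters discussed here could run.  Trees are modelled as
nested tuples, which is an assumption made for the example.

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-tuple tree, left to right."""
    if isinstance(tree, tuple):
        for child in tree:
            yield from fringe(child)   # each traversal is naturally recursive
    else:
        yield tree


def same_fringe(a, b):
    """Compare two fringes leaf by leaf, stopping at the first mismatch."""
    missing = object()                 # marks the shorter fringe running out
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=missing)
    return all(x == y for x, y in pairs)


print(same_fringe((1, (2, 3)), ((1, 2), 3)))
print(same_fringe((1, (2, 3)), (1, (3, 2))))
```

The two traversals are advanced in lockstep by zip_longest rather than by a
"for" loop over either one, which is the "divorced from for" use mentioned
above.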

> The implementation should not be hard.  Didn't Steven Majewski have
> something years ago?

Yes, but Guido also sketched out a nearly complete implementation within the
last year or so.

> Why do we always get sidetracked on trying to figure out how to
> do coroutines and continuations?

Sorry, I've been failing to find a good answer to that question for a decade
<0.4 wink>.  I should note, though, that Guido's current notion of
"generator" is stronger than Icon/CLU/Sather's (which are "strictly
stack-like"), and requires machinery more elaborate than StevenM (or Guido)
sketched before.

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.

Agreed.

> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

microthreads have an enthusiastic and possibly growing audience.  That gets
into (C) stacklessness, though, as do coroutines.  I'm afraid that once you
go beyond "simple" (Icon) generators, a whole world of other stuff gets
pulled in.  The key trick to implementing simple generators in current
Python is simply to decline decrementing the frame's refcount upon a
"suspend" (of course the full details are more involved than *just* that,
but they mostly follow *from* just that).

everything-is-the-enemy-of-something-ly y'rs  - tim



From skip@mojam.com (Skip Montanaro)  Wed Jan 31 22:27:38 2001
From: skip@mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 16:27:38 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
References: <14968.13802.22823.702114@beluga.mojam.com>
 <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
Message-ID: <14968.37210.886842.820413@beluga.mojam.com>

>>>>> "Tim" == Tim Peters <tim.one@home.com> writes:

    >> (Obviously, lists *do* understand __getitem__ at some level.  Why
    >> isn't it exposed in the method table?)

    Tim> The old type/class split: list is a type, and types spell their
    Tim> "method tables" in ways that have little in common with how classes
    Tim> do it.

The problem that rolls around in the back of my mind from time-to-time is
that since Python doesn't currently support interfaces, checking for
specific methods seems to be the only reasonable way to determine if an
object does what you want or not.

What would break if we decided to simply add __getitem__ (and other sequence
methods) to list object's method table?  Would they foul something up or
would they simply sit around quietly waiting for hasattr to notice them?
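
As a sketch of the hasattr-style check, here is Ping's rule of thumb from
earlier in the thread.  It assumes a Python in which built-in types expose
__getitem__ as an attribute, which became true with the type/class
unification in 2.2 and is not true of the interpreter being discussed.  The
function name is_sequence_like and the class Squares are illustrative.

```python
def is_sequence_like(obj):
    """Ping's rule: provides __getitem__() but not keys()."""
    return hasattr(obj, "__getitem__") and not hasattr(obj, "keys")


class Squares:
    """A user-defined class that only implements __getitem__."""

    def __getitem__(self, index):
        if not 0 <= index < 5:
            raise IndexError(index)
        return index * index


print(is_sequence_like([1, 2, 3]))
print(is_sequence_like({"a": 1}))
print(is_sequence_like(Squares()))
```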

Skip



From pedroni@inf.ethz.ch  Wed Jan 31 22:29:37 2001
From: pedroni@inf.ethz.ch (Samuele Pedroni)
Date: Wed, 31 Jan 2001 23:29:37 +0100
Subject: [Python-Dev] weak refs and jython
References: <200101311234.NAA24584@core.inf.ethz.ch> <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>
Message-ID: <001f01c08bd5$4c9c9900$7c5821c0@newmexico>

Hi.

[Fred L. Drake, Jr.]

>  > Java weak refs cannot be resurrected.
>
>   This is certainly annoying.
>   How about this: the callback receives the weak reference object or
> proxy which it was registered on as a parameter.  Since the reference
> has already been cleared, there's no way to get the object back, so we
> don't need to get it from Java either.
>   Would that be workable?  (I'm adjusting my patch now.)

Yes, it is workable: clearly we can implement weak refs only under java2, but
this is not (really) an issue.  We can register the refs in a java reference
queue, and poll it lazily or through a low-priority thread in order to invoke
the callbacks.

-- Some remarks
I have used java weak/soft refs to implement some of the internal tables of
jython in order to avoid memory leaks, at least under java2.

I imagine that the idea behind callbacks plus resurrection was to enable the
construction of sophisticated caches.

My intuition is that these features are not present under java because they
would interfere too much with gc and carry a performance penalty.
On the other hand java offers reference queues and soft references; the
latter cover the common case of caches that should be cleared when there is
little memory left. (Never tried them seriously, so I don't know if the
actual impl is fair, or will just wait too long before starting to discard
things => behavior like primitive gc).

The main difference I see between the callback and queue approaches is that
with queues it is left to the user when to do the actual cleanup of his
tables/caches, and handling queues internally has a "low" overhead.
With callbacks, what happens really depends on the collection times/patterns,
and the overhead is related to call overhead and to how non-trivial the code
is that the user puts in the callbacks. Clearly general performance will not
be easily predictable.
(From a theoretical viewpoint one can more or less simulate queues with
callbacks and the other way around).
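
As a rough illustration of simulating a queue with callbacks, the sketch
below uses present-day CPython's weakref module.  The deque stands in for a
Java reference queue, and the names cache, track and purge are hypothetical.
The callback does nothing but enqueue the dead reference, so the user
decides when the real cleanup runs.

```python
import gc
import weakref
from collections import deque


class Entry:
    """Placeholder for whatever objects the cache tracks."""


dead_refs = deque()   # plays the role of a reference queue
cache = {}


def track(key, obj):
    # The callback only enqueues; no user cleanup runs at collection time.
    cache[key] = weakref.ref(obj, dead_refs.append)


def purge():
    """Run by the user at a convenient moment: drop entries whose referent died."""
    removed = 0
    while dead_refs:
        dead = dead_refs.popleft()
        for key in [k for k, r in cache.items() if r is dead]:
            del cache[key]
            removed += 1
    return removed


a, b = Entry(), Entry()
track("a", a)
track("b", b)
del a
gc.collect()          # not needed on CPython, harmless elsewhere

removed = purge()
print(removed, sorted(cache))
```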

Resurrection makes little sense with queues, but I can easily see that
lacking both resurrection and soft refs limits what can be done with
weak-like refs.

Last thing: one of the things that is really missing in java's ref features
is that one cannot put conditions of the form "as long as A is not collected,
B should not be collected either". Clearly I'm referring to situations where
one cannot modify the class of A in order to add a field, which is quite
typical in java. This should not be a problem with python and its
open/dynamic way-of-life.

regards, Samuele Pedroni.



From mal@lemburg.com  Wed Jan 31 19:03:12 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 20:03:12 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
 <3A756FF8.B7185FA2@lemburg.com>
 <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
 <3A75B190.3FD2A883@lemburg.com>
 <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
 <20010129150247.B10191@thyrsus.com>
 <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
 <3A77FE94.E5082136@lemburg.com>
 <14968.14631.419491.440774@beluga.mojam.com>
 <3A78497E.8BCF197E@lemburg.com> <14968.23188.573257.392841@beluga.mojam.com>
Message-ID: <3A786170.CD65B8A4@lemburg.com>

Skip Montanaro wrote:
> 
>     MAL> This thread is an offspring of the "for something in dict:" thread.
>     MAL> The problem we face when iterating over mutable objects is that the
>     MAL> underlying objects can change. By marking them read-only we can
>     MAL> safely iterate over their contents.
> 
> I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
> (And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
> this concept in a reasonable fashion to cover (most of) the other objects
> that smell like dictionaries, I think you'll just be adding needless
> complications for a feature that can't be used where it's really needed.

We are currently only talking about Python dictionaries here, even though
other objects could also benefit from this.
 
> I see no problem asking for the items() of an in-memory dictionary in order
> to get a predictable list to iterate over, but doing that for disk-based
> mappings would be next to impossible.  So, I'm stuck iterating over
> something that can change out from under me.  In the end, the programmer will
> still have to handle border cases specially.  Besides, even if you *could*
> lock your disk-based mapping, are you really going to do that in situations
> where it's sharable (that's what databases are there for, after all)?  I
> suspect you're going to keep the database mutable and work around any
> resulting problems.
> 
> If you want to implement "for key in dict:", why not just have the VM call
> keys() under the covers and use that list?  It would be no worse than the
> situation today where you call "for key in dict.keys():", and with the same
> caveats.  If you're dumb enough to do that for an on-disk mapping object,
> well, you get what you asked for.

That's why iterators do a much better job here. In DB design
these are usually called cursors, which allow moving inside
large result sets. But this really is a different topic...

Readonlyness could be put to some good use in optimizing data
structures that you know won't change anymore.
Temporary readonlyness has the nice side effect of allowing low-level
lock implementations, and it makes thread-safe code easier
to write, because you can make assertions about the immutability
of an object during a certain period of time explicit in your
code.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 20:36:54 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 21:36:54 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com> <20010131125500.C5151@thyrsus.com>
Message-ID: <3A787766.35453597@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal@lemburg.com>:
> > Anyway, names really don't matter much, so how about:
> >
> > .mutable([flag]) -> integer
> >
> >   If called without argument, returns 1/0 depending on whether
> >   the object is mutable or not. When called with a flag argument,
> >   sets the mutable state of the object to the value indicated
> >   by flag and returns the previous flag state.
> 
> I'll bear this in mind if things progress to the point where a PEP is
> indicated.

Great :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From greg@cosc.canterbury.ac.nz  Wed Jan 31 23:21:04 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Feb 2001 12:21:04 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <200101312321.MAA03263@s454.cosc.canterbury.ac.nz>

barry@digicool.com (Barry A. Warsaw):

>    for key in dict.iterator(KEYS)
>    for value in dict.iterator(VALUES)
>    for key, value in dict.iterator(ITEMS)

Yuck. I don't like any of this "for x in y.iterator_something()"
stuff. The things you're after aren't "in" the iterator, they're
"in" the dict. I don't want to know that there are iterators
involved.

We seem to be coming up with more and more convoluted ways
to say things that should be very straightforward.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From guido@digicool.com  Wed Jan 31 16:23:37 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:23:37 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Wed, 31 Jan 2001 13:35:36 GMT."
 <3a780eda.16144995@smtp.worldonline.dk>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>
 <3a780eda.16144995@smtp.worldonline.dk>
Message-ID: <200101311623.LAA01774@cj20424-a.reston1.va.home.com>

[Finn]
> >> Using global on an import name is currently ignored by Jython because
> >> the name assignment is done by the runtime, not the compiler.

[Thomas]
> >So it's impossible to do, in Jython, something like:
> >
> >def fillme():
> >    global me
> >    import me
> >
> >but it is possible to do:
> >
> >def fillme():
> >    global me
> >    import me as _me
> >    me = _me
> >
> >?

[Finn again]
> Yes, only the second example will make a global variable.
> 
> > I have to say I don't like that; we're always claiming 'import' (and
> >'def' and 'class' for that matter) are 'just another way of writing
> >assignment'. All these special cases break that.
> 
> I don't like it either, I was only reporting what jython currently does.
> The current design used by Jython does not lend itself directly towards a
> solution, but I don't see anything that makes it impossible to solve.

Tentatively, I'd say that this should be documented as a Jython
difference and Jython should strive to fix this.  So I see no good
reason to rule it out in CPython.

That doesn't mean I like Thomas's example!  It should probably be
redesigned along the lines of

    def fillme():
        import me
        return me

    me = fillme()

to avoid needing side effects on globals.

--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido@digicool.com  Wed Jan 31 16:26:11 2001
From: guido@digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:26:11 -0500
Subject: [Python-Dev] The 2nd Korea Python Users Seminar
Message-ID: <200101311626.LAA01799@cj20424-a.reston1.va.home.com>

Wow...!

Way to go, Christian!

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 31 Jan 2001 22:46:06 +0900
From:    "Changjune Kim" <junaftnoon@yahoo.com>
To:      <guido@python.org>
Subject: The 2nd Korea Python Users Seminar

Dear Mr. Guido van Rossum,

First of all, I can't thank you more for your great contribution to the
presence of Python. It is not a mere computer programming language but a whole
culture, I think.

I am proud to tell you that we are having the 2nd Korea Python Users Seminar
which is wide open to the public. There are already more than 400 people who
registered ahead, and we expect a few more at the site. The seminar will be
held in Seoul, South Korea on Feb 2.

With the effort of Korea Python Users Group, there has been quite a boom or
phenomenon for Python among developers in Korea. Several magazines are
_competitively_ carrying regular articles about Python -- I'm one of the
authors -- and there was an article even on a _normal_ newspaper, one of the
major four big newspapers in Korea, which described the sprouting of Python in
Korea and pointed out how extremely easy it is to learn. (Moreover, it's the year of
the snake in the 12 zodiac animals)

The seminar is mainly about:

Python 2.0, intro for newbies, Python coding style, ZOPE, internationalization
of Zope for Korean, GUIs such as wxPython, PyQt, Internet programming in
Python, Python with UML, Python C/API, XML with Python, and Stackless Python.

Christian Tismer is coming for SPC presentation with me, and Hostway CEO Lucas
Roh will give a talk about how they are using Python, and one of the Python
evangelists, Brian Lee, CTO of Linuxkorea will give a brief intro to Python
and Python C/API.

I'm so excited and happy to tell you this great news. If there is any message
you want to give to Korea Python Users Group and the audience, it'd be
great -- I could translate it and post it at the site for all the audience.

Thank you again for your wonderful snake.

Best regards,

June from Korea.





------- End of Forwarded Message



From tim.one@home.com  Wed Jan 31 23:25:54 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 18:25:54 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101301500.KAA25733@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHMIMAA.tim.one@home.com>

[Ping]
> Is a frozen list hashable?

[Guido]
> Yes -- that's what started this thread (using dicts as dict keys,
> actually).

Except this doesn't actually work unless list.freeze() recursively ensures
that all elements in the list are frozen too:

>>> hash((1, 2))
219750523
>>> hash((1, [2]))
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unhashable type
>>>

That bothered me in Eric's original suggestion:  unless x.freeze() does a
traversal of all objects reachable from x, it doesn't actually make x safe
against modification (except at the very topmost level).  But doing such a
traversal isn't what *everyone* would want either (as with "const" in C, I
expect the primary benefit would be the chance to spend countless hours
worming around it in both directions <wink>).
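
The shallow-versus-deep point can be sketched as follows.  This is not a
freeze() implementation: the hypothetical helper deep_freeze builds an
immutable *copy* (lists become tuples, dicts become frozensets of pairs)
rather than freezing anything in place, and it handles only those container
types.

```python
def deep_freeze(obj):
    """Return a hashable copy of obj, converting containers recursively."""
    if isinstance(obj, (list, tuple)):
        return tuple(deep_freeze(item) for item in obj)
    if isinstance(obj, dict):
        return frozenset((key, deep_freeze(value))
                         for key, value in obj.items())
    return obj   # assumed already immutable (numbers, strings, ...)


nested = [1, [2]]

try:
    hash(tuple(nested))          # shallow conversion: inner list remains
except TypeError as exc:
    print("shallow:", exc)

frozen = deep_freeze(nested)     # every level converted
table = {frozen: "usable as a key"}
print(frozen, table[frozen])
```

The traversal visits everything reachable from the argument, which is the
cost that not everyone would want to pay.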

[Skip]
> If you want immutable dicts or lists in order to use them as
> dictionary keys, just serialize them first:
>
>     survey_says = {"spam": 14, "eggs": 42}
>     sl = marshal.dumps(survey_says)
>     dict[sl] = "spam"

marshal.dumps(dict) isn't canonical, though.  That is, it may well be that
d1 == d2 but dumps(d1) != dumps(d2).  Even materializing dict.values(), then
sorting it, then marshaling *that* isn't enough; e.g., consider {1: 1} and
{1: 1L}.  The latter example applies to marshaling lists too.
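
The same effect can be reproduced in present-day Python.  Since Python 3 has
no separate long type, this sketch substitutes 1.0 for 1L: the two
dictionaries compare equal but marshal to different byte strings.

```python
import marshal

d1 = {1: 1}
d2 = {1: 1.0}   # stands in for {1: 1L}, which Python 3 cannot express

print(d1 == d2)                                # equal as dictionaries
print(marshal.dumps(d1) == marshal.dumps(d2))  # but serialized differently
```

Using the dumped string as a dictionary key would therefore treat d1 and d2
as different keys even though they compare equal.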



From greg@cosc.canterbury.ac.nz  Wed Jan 31 23:34:50 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Feb 2001 12:34:50 +1300 (NZDT)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <14968.14631.419491.440774@beluga.mojam.com>
Message-ID: <200101312334.MAA03267@s454.cosc.canterbury.ac.nz>

Skip Montanaro <skip@mojam.com>:

> Can someone give me an example where this is actually useful and
> can't be handled through some existing mechanism?

I can envisage cases where you want to build a data structure
incrementally, and then treat it as immutable so you can use it as a
dict key, etc. There's currently no way to do that to a list
without copying it.

So, it could be handy to have a way of turning a list into a tuple
in-place. It would have to be a one-way transformation, otherwise
you could start using it as a dict key, make it mutable again, and
cause havoc.

Suggested implementation: When you allocate the space for the values
of a list, leave enough room for the PyObject_HEAD of a tuple at the
beginning. Then you can turn that memory block into a real tuple
later, and flag the original list object as immutable so you can't
change it later via that route.

Hmmm, would waste a bit of space for each list object. Maybe this
should be a special list-about-to-become-tuple type. (Tist?
Luple?)

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From tim.one@home.com  Wed Jan 31 23:36:48 2001
From: tim.one@home.com (Tim Peters)
Date: Wed, 31 Jan 2001 18:36:48 -0500
Subject: [Python-Dev] RE: [Patch #103203] PEP 205: weak references implementation
In-Reply-To: <E14O4Nc-0007gt-00@usw-sf-web1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHMIMAA.tim.one@home.com>

> Patch #103203 has been updated.
>
> Project: python
> Category: core (C code)
> Status: Open
> Submitted by: fdrake
> Assigned to : tim_one
> Summary: PEP 205: weak references implementation

Fred, just noticed the new "assigned to".  If you don't think it's a
disaster(*), check it in!  That will force more eyeballs on it quickly, and
the quicker the better.  I'm simply not going to do a decent review quickly
on something this large starting cold.  More urgently, I've been working
long hours every day for several weeks, and need a break so I don't screw up
last-second crises tomorrow.

has-12-hours-of-taped-professional-wrestling-to-catch-up-on-ly
    y'rs  - tim


(*) otoh, if you do think it's a disaster, withdraw it for 2.1.



From moshez@zadka.site.co.il  Wed Jan 31 20:32:45 2001
From: moshez@zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 22:32:45 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <007301c08baa$02908220$e46940d5@hagrid>
References: <007301c08baa$02908220$e46940d5@hagrid>, <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131203245.E813BA83E@darjeeling.zadka.site.co.il>

[Barry]
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

[/F]
> shouldn't that be xitems, xkeys, xvalues?

I'm so hoping I missed a <wink> there somewhere. Please, no more
of the dreaded 'x'.

thinking-of-ripping-x-from-my-keyboard-ly y'rs, Z.
-- 
Moshe Zadka <sig@zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6


From greg@cosc.canterbury.ac.nz  Wed Jan 31 23:54:45 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Feb 2001 12:54:45 +1300 (NZDT)
Subject: [Python-Dev] Generator protocol? (Re: Sets: elt in dict, lst.include)
In-Reply-To: <20010131063007.536ACA83E@darjeeling.zadka.site.co.il>
Message-ID: <200101312354.MAA03272@s454.cosc.canterbury.ac.nz>

Moshe Zadka <moshez@zadka.site.co.il>:

> Tim's "try to use that to write something that
> will return the nodes of a binary tree" still haunts me.

Instead of an iterator protocol, how about a generator
protocol? Now that we're getting nested scopes, it should
be possible to arrange it so that

  for x in thing:
    ...stuff...

gets compiled as something like

  def _body(x):
    ...stuff...
  thing.__generate__(_body)

(Actually it would be more complicated than that - for
backward compatibility you'd want a new bytecode that
would look for a __generate__ attribute and emulate
the old iteration protocol otherwise.)
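
To make the idea concrete, here is a small sketch of the binary-tree case
written against the proposed protocol. The Tree class is made up for
illustration, and the explicit _body function stands in for what the
compiler would generate from the loop body; break and return inside the
loop are not handled in this sketch.

```python
class Tree:
    """Minimal binary tree node (hypothetical, for illustration only)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

    def __generate__(self, body):
        # In-order traversal; plain recursion keeps track of position,
        # so no explicit iterator state is needed.
        if self.left is not None:
            self.left.__generate__(body)
        body(self.value)
        if self.right is not None:
            self.right.__generate__(body)


tree = Tree(2, Tree(1), Tree(3))
seen = []

# What "for x in tree: seen.append(x)" would be compiled into:
def _body(x):
    seen.append(x)

tree.__generate__(_body)
```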

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Jan 31 23:57:39 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Feb 2001 12:57:39 +1300 (NZDT)
Subject: [Python-Dev] codecity.com
In-Reply-To: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>
Message-ID: <200101312357.MAA03275@s454.cosc.canterbury.ac.nz>

> Should I spread this word, or is this a joke?

I'm not sure what answering trivia questions has to do
with the stated intention of "teaching jr. programmers how 
to write code".

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From greg@cosc.canterbury.ac.nz  Wed Jan 31 23:59:33 2001
From: greg@cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 01 Feb 2001 12:59:33 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>
Message-ID: <200101312359.MAA03278@s454.cosc.canterbury.ac.nz>

Guido van Rossum <guido@digicool.com>:

> But it *is* true that coroutines are a very attractive piece of land
> "just nextdoor".

Unfortunately there's a big high fence in between topped with
barbed wire and patrolled by vicious guard dogs. :-(

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg@cosc.canterbury.ac.nz	   +--------------------------------------+


From thomas@xs4all.net  Wed Jan 31 21:00:33 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 22:00:33 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 03:34:19PM +0100
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <20010131220033.O962@xs4all.nl>

On Wed, Jan 31, 2001 at 03:34:19PM +0100, M.-A. Lemburg wrote:

> I have had similar experiences with -On with n>3 compared to -O2
> using pgcc (gcc optimized for PC processors). BTW, the Linux
> kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
> as CFLAGS -- perhaps Python should too on Linux ?!

Maybe, but the Linux kernel can be quite specific in what version of gcc you
need, and knows in advance on what platform you are using it :) The
stability and actual speedup of gcc's optimization options can and does vary
across platforms. In the above example, -Wall and -Wstrict-prototypes are
just warnings, and -O3 is the same as "-O2 -finline-functions". As for
-fomit-frame-pointer....

> Does anybody know about the effect of -fomit-frame-pointer ?
> Would it cause problems or produce code which is not compatible
> with code compiled without this flag ?

The effect of -fomit-frame-pointer is that the compilation of frame-pointer
handling code is avoided. It doesn't have any effect on compatibility, since
it doesn't matter that other parts/functions/libraries do have such code,
but it does make debugging impossible (on most machines, in any case.) From
GCC's info docs:

`-fomit-frame-pointer'
     Don't keep the frame pointer in a register for functions that
     don't need one.  This avoids the instructions to save, set up and
     restore frame pointers; it also makes an extra register available
     in many functions.  *It also makes debugging impossible on some
     machines.*

     On some machines, such as the Vax, this flag has no effect, because
     the standard calling sequence automatically handles the frame
     pointer and nothing is saved by pretending it doesn't exist.  The
     machine-description macro `FRAME_POINTER_REQUIRED' controls
     whether a target machine supports this flag.  *Note Registers::.

Obviously, for the Linux kernel this is a very good thing, you don't debug
the Linux kernel like a normal program anyway (contrary to some other UNIX
kernels, I might add.) I believe -g turns off -fomit-frame-pointer itself,
but the docs for -g or -fomit-frame-pointer don't mention it.

One other thing I noted in the gcc docs is that gcc doesn't do loop
unrolling even with -O3, though I thought it would at -O2. You need to add
-funroll-loops to enable loop unrolling, and that might squeeze out some more
performance. This only works for loops with a fixed repetition, though, so
I'm not sure if it matters.

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From thomas@xs4all.net  Wed Jan 31 19:14:58 2001
From: thomas@xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 20:14:58 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14968.16962.830739.920771@anthem.wooz.org>; from barry@digicool.com on Wed, Jan 31, 2001 at 11:50:10AM -0500
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org> <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131201457.I922@xs4all.nl>

[ Trimming CC: line ]

On Wed, Jan 31, 2001 at 11:50:10AM -0500, Barry A. Warsaw wrote:

> Moshe, I had exactly the same reaction and exactly the same idea.  I'm
> a strong -1 on introducing new syntax for this when new methods can
> handle it in a much more readable way (IMO).

Same here. I *might* like it if iterators were given a format string (or
tuple object, or whatever) so they knew what the iterating code expected
(so something like this:

  for x,y,z in obj

would translate into 

  iterator(obj)("(x,y,z)")

or maybe just

  iterator(obj)((None,None,None))

or maybe even just

  iterator(obj)(3) # that is, number of elements

or so) but I suspect it might be too cute (and obfuscated) for Python,
especially if it was put to use to distinguish between 'for x:y in obj' and
'for x,y in obj'.
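
A minimal sketch of the last variant, where the iterating code passes the
number of loop targets. The iterator() function and its dict-specific
behaviour are assumptions made up for this illustration; nothing like this
exists in Python.

```python
def iterator(obj):
    """Return a callable that is told how many loop targets to expect."""
    def make(ntargets):
        if isinstance(obj, dict):
            if ntargets == 1:
                return iter(obj)            # 'for k in d': keys only
            if ntargets == 2:
                return iter(obj.items())    # 'for k, v in d': pairs
            raise TypeError(
                "a dict cannot be unpacked into %d targets" % ntargets)
        return iter(obj)                    # other objects ignore the hint
    return make


ages = {"spam": 1, "eggs": 2}
keys = list(iterator(ages)(1))    # what 'for k in ages' would see
pairs = list(iterator(ages)(2))   # what 'for k, v in ages' would see
```

The same object yields different things depending on the shape of the loop
header, which is exactly the part that may be too cute.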

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


From sjoerd@oratrix.nl  Wed Jan 31 20:05:06 2001
From: sjoerd@oratrix.nl (Sjoerd Mullender)
Date: Wed, 31 Jan 2001 21:05:06 +0100
Subject: [Python-Dev] python setup.py fails with illegal import (+ fix)
Message-ID: <20010131200507.A106931E1AD@bireme.oratrix.nl>

With the current CVS version, running python setup.py as part of the
build process fails with a syntax error:
Traceback (most recent call last):
  File "../setup.py", line 12, in ?
    from distutils.core import Extension, setup
  File "/usr/people/sjoerd/src/python/Lib/distutils/core.py", line 20, in ?
    from distutils.cmd import Command
  File "/usr/people/sjoerd/src/python/Lib/distutils/cmd.py", line 15, in ?
    from distutils import util, dir_util, file_util, archive_util, dep_util
SyntaxError: 'from ... import *' may only occur in a module scope

The fix is to change the from ... import * that the compiler complains
about:
Index: file_util.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/file_util.py,v
retrieving revision 1.7
diff -u -c -r1.7 file_util.py
*** file_util.py 2000/09/30 17:29:35	1.7
--- file_util.py 2001/01/31 20:01:56
***************
*** 106,112 ****
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import *
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):
--- 106,112 ----
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import ST_ATIME, ST_MTIME, ST_MODE, S_IMODE
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):

I didn't check this in because distutils is Greg Ward's baby.
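
For illustration, the same pattern in a small standalone form: the names
are imported explicitly inside the function instead of with a star import.
permission_bits() and star_import_in_function_compiles() are hypothetical
helpers, not distutils code, and the second one only reports what the
running interpreter does with a star import in a function body.

```python
import os
import tempfile


def permission_bits(path):
    # Explicit names instead of "from stat import *" inside a function.
    from stat import ST_MODE, S_IMODE
    return S_IMODE(os.stat(path)[ST_MODE])


def star_import_in_function_compiles():
    # Ask the compiler directly whether a function-level star import
    # is accepted by this interpreter.
    source = "def f():\n    from stat import *\n"
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True
```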

-- Sjoerd Mullender <sjoerd.mullender@oratrix.com>


From mal@lemburg.com  Wed Jan 31 22:24:43 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:24:43 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>
Message-ID: <3A7890AB.69B893F9@lemburg.com>

Tim Peters wrote:
> 
> [Michael Hudson]
> > ...
> > Can anyone try this on Windows?  Seeing as windows malloc
> > reputedly sucks, maybe the differences would be bigger.
> 
> No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
> Windows.  Helped significantly.  Unclear how much was simply due to
> exploiting the global interpreter lock, though.  "Windows" is also a
> multiheaded beast (e.g., NT has very different memory performance
> characteristics than 95).

We're still in alpha, no ?  

Adding pymalloc is not much of
a deal since it fits nicely with the Python malloc macros and
giving the package a nice spin by putting it into a Python alpha
release would sure create more confidence in this nice piece
of work. We can always take it out again before going into the 
beta phase.

Or do we have a 2.1 feature freeze already ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From mal@lemburg.com  Wed Jan 31 22:15:50 2001
From: mal@lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:15:50 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>
Message-ID: <3A788E96.AB823FAE@lemburg.com>

Tim Peters wrote:
> 
> [Tim]
> >> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> >> supposed problem with iteration order?
> 
> [MAL]
> > No, but it would solve the problem in a more elegant and
> > generalized way.
> 
> I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
> problem], but it would solve the problem ...".  Can only assume we're
> switching topics within single sentences now <wink>.

Sorry, not my brightest day today... what I wanted to say is that
iterators would solve the problem of defining "something" in
"for something in dict" nicely. 

Since iterators can define the order in which a data structure is 
traversed, this would also do away with the second (supposed) 
problem.

> > Besides, it also allows writing code which is thread safe, since
> > the iterator can take special actions to assure that the dictionary
> > doesn't change during the iteration phase (see the other thread
> > about "making mutable objects readonly").
> 
> Sorry, but immutability has nothing to do with thread safety (the latter has
> to do with "doing a right thing" in the presence of multiple threads, to
> keep data structures internally consistent; raising an exception is never "a
> right thing" unless the user is violating the advertised semantics, and if
> mutation during iteration is such a violation, the presence or absence of
> multiple threads has nothing to do with that).  IOW, perhaps, a critical
> section is an area of non-exceptional serialization, not a landmine that
> makes other threads *blow up* if they touch it.

Who said that an exception is raised ? The method I posted
on the mutability thread allows querying the current state just
like you would query the availability of a resource.

> > ...
> > I don't remember the figures, but these micro optimizations
> 
> That's plural, but I thought you were talking specifically about the mutable
> counter object.  I don't know which, but the two statements don't jibe.

The counter object patch is a micro-optimization and as such will
only give you a gain of a few percent. What makes the difference
is the sum of these micro optimizations.

Here's the patch for Python 1.5 which includes the optimizations:

	http://www.lemburg.com/python/mxPython-1.5.patch.gz
 
> > do speedup loops by a noticeable amount. Just compare the performance
> > of stock Python 1.5 against my patched version.
> 
> No time now, but after 2.1 is out, sure, wrt it (not 1.5).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/


From tim.one at home.com  Mon Jan  1 01:13:12 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 19:13:12 -0500
Subject: [Python-Dev] Re: Most everything is busted
In-Reply-To: <14926.34447.60988.553140@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIGAA.tim.one@home.com>

[Barry A. Warsaw]
> There's a stupid, stupid bug in Mailman 2.0, which I've just fixed
> and (hopefully) unjammed things on the Mailman end[1].  We're still
> probably subject to the Postfix delays unfortunately; I think those
> are DNS related, and I've gotten a few other reports of DNS oddities,
> which I've forwarded off to the DC sysadmins.  I don't think that
> particular problem will be fixed until after the New Year.
>
> relax-and-enjoy-the-quiet-ly y'rs,

I would have, except you appear to have ruined it:  hundreds of msgs
disgorged overnight and into the afternoon.  And echoes of email to c.l.py
now routinely come back in minutes instead of days.

Overall, ya, I liked it better when it was broken -- jerk <wink>.

typical-user-ly y'rs  - tim




From tim.one at home.com  Mon Jan  1 02:31:18 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 20:31:18 -0500
Subject: [Python-Dev] Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>

[Martin von Loewis]
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?

It's nigh unto impossible to get Guido to pay attention to these kinds of
issues until after it's too late -- guess who's still trying to get an FSF
approved license for Python 1.6 <wink>.

What I intend to push for is that nothing be accepted except under the
understanding that copyright is assigned to the Python Software Foundation;
but, since that doesn't exist yet, we're in limbo.

> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

Under U.S. law too.  The difference is that, without an explicit copyright
notice, it's a lot easier to get lawyers to ignore that reality <0.3 wink>.
When the PSF does come into being, the lawyers will doubtless make us hassle
everyone with an explicit copyright notice into signing reams of paperwork.
It's a drain on time and money for all concerned, IMO, with no real payback.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.

Understood, and with sympathy.  Since the status of JPython/Jython is still
muddy, I urged Finn Bock to put his own copyright notice on his Jython work
for exactly the same reason (i.e., to prevent CNRI claiming it later).

Seems to me, though, that it may simplify life down the road if, whenever an
author felt a similar need to assert copyright explicitly, they list Guido
as the copyright holder.  He's not going to screw Python!  And it's
inevitable that all Python copyrights will eventually be owned by him and/or
the PSF anyway.

But, for God's sake, whatever you do, *please* (anyone) don't make us look
at a unique license!  We're not lawyers, but we've been paying lawyers out
of our own pockets to do this crap, and it's expensive and time-consuming.
If you can't trust Guido to do a Right Thing with your code, Python is
better off without it over the long haul.

> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

It's no concern to me -- but then I'm not paranoid <wink>.

cnri-and-the-uc-can-fight-it-out-if-it-comes-to-that-ly y'rs  - tim




From moshez at zadka.site.co.il  Mon Jan  1 11:01:02 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon,  1 Jan 2001 12:01:02 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231105812.A12168@newcnri.cnri.reston.va.us>
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20010101100102.2360CA84F@darjeeling.zadka.site.co.il>

On Sun, 31 Dec 2000, Andrew Kuchling <akuchlin at cnri.reston.va.us> wrote:

> It also leads to one section of the FAQ (#3, I think) having something
> like 60 questions jumbled together.  IMHO the FAQ should be a text
> file, perhaps in the PEP format so it can be converted to HTML, and it
> should have an editor who'll arrange it into smaller sections.  Any
> volunteers?  (Must ... resist ...  urge to volunteer myself...  help
> me, Spock...)

Well, Andrew, I know if I leave you any more time, you won't be able
to resist the urge. OK, I'll volunteer. Can't do anything right now,
but expect to see an updated version posted on my site soon. If 
people think it's a good idea, I'll move it to Misc/.
Fred, if the some-xml-format-to-HTML you're working on is in any
sort of readiness, I'll use that to format the FAQ. Having used Perl
in the last couple of weeks, I learned to appreciate the fact that
the FAQ is a standard part of the documentation.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From loewis at informatik.hu-berlin.de  Mon Jan  1 12:43:34 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 1 Jan 2001 12:43:34 +0100 (MET)
Subject: [Python-Dev] Re: Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
Message-ID: <200101011143.MAA11550@pandora.informatik.hu-berlin.de>

> Seems to me, though, that it may simplify life down the road if, whenever an
> author felt a similar need to assert copyright explicitly, they list Guido
> as the copyright holder.  He's not going to screw Python!  

That's a good solution, which I'll implement in a revised patch.

Thanks for the advice, and Happy New Year,

Martin



From mal at lemburg.com  Mon Jan  1 18:56:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jan 2001 18:56:20 +0100
Subject: [Python-Dev] Re: Copyright statements ([Patch #103002] Fix for #116285: Properly
 raise UnicodeErrors)
References: <E14Bhs3-0007uf-00@usw-sf-web3.sourceforge.net> <200012290957.KAA17936@pandora.informatik.hu-berlin.de> <3A4C757D.F64E9CEF@lemburg.com>
Message-ID: <3A50C4C4.76A1C5B6@lemburg.com>

Martin von Loewis wrote:
> 
> > My only problem with it is your copyright notice. AFAIK, patches to
> > the Python core cannot contain copyright notices without proper
> > license information. OTOH, I don't think that these minor changes
> > really warrant adding a complete license paragraph.
> 
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?
> 
> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

True.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.
>
> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

The copyright for the files and changes needed for the Unicode 
support was indeed transferred to CNRI earlier this year. This
was part of the contract I had with CNRI.

I don't know why the copyright notice wasn't subsequently removed from
the files after final checkin of the changes, though, because, as
I remember, the copyright line was only added as "search&replace"
token to the files in question in the sign over period.

The codec files were part of the Unicode support patch, even though
they were created by the gencodec.py tool I wrote to create them
from the Unicode mapping files. That's why they also carry the
copyright token.

Note that with strict reading of the CNRI license, there's no
problem with removing the notice from the files in question:

"""
...provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2000 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6 alone or in any derivative
version prepared by Licensee...
"""

The copyright line in the Unicode files is
"(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.", so this
does not match the definition they gave in their license text.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan  1 19:58:36 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 13:58:36 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Fri, 29 Dec 2000 21:59:16 +0100."
             <20001229215915.L1281@xs4all.nl> 
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>  
            <20001229215915.L1281@xs4all.nl> 
Message-ID: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>

Thomas just checked this in, using Tim's words:

> *** ref7.tex	2000/07/16 19:05:38	1.20
> --- ref7.tex	2000/12/31 22:52:59	1.21
> ***************
> *** 243,249 ****
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when no exception occurs
> ! in the \keyword{try} clause.  Exceptions in the \keyword{else} clause are
> ! not handled by the preceding \keyword{except} clauses.
>   \kwindex{else}
>   
> --- 243,251 ----
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when the \keyword{try} clause
> ! terminates by any means other than an exception or executing a
> ! \keyword{return}, \keyword{continue} or \keyword{break} statement.  
> ! Exceptions in the \keyword{else} clause are not handled by the preceding
> ! \keyword{except} clauses.
>   \kwindex{else}

How is this different from "when control flow reaches the end of the
try clause", which is what I really had in mind?  Using the current
wording, this paragraph would have to be changed each time a new
control-flow keyword is added.  Based upon the historical record
that's not a grave concern ;-), but I think the new wording relies too
much on accidentals such as the fact that these are the only control
flow altering events.  It may be that control flow is not rigidly
defined -- but as it is what was really intended, maybe the fix should
be to explain the right concept rather than the current ad-hoc
solution.  This also avoids concerns of readers who are trying to read
too much into the words and might become worried that there are other
ways of altering the control flow that *would* cause the else clause
to be executed; and guides implementors of other Python-like languages
(like vyper) that might have more control-flow altering statements or
events.
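
A small sketch of the "control flow reaches the end of the try clause"
reading. The probe() helper and its action strings are made up for this
illustration; it records which clauses ran for each way of leaving the try
clause.

```python
def probe(action):
    """Return the clauses that ran when the try clause ends via `action`."""
    log = []
    for _ in range(1):
        try:
            if action == "raise":
                raise ValueError("boom")
            if action == "break":
                break
            if action == "continue":
                continue
            if action == "return":
                return log
            log.append("end of try")   # control reaches the end of the clause
        except ValueError:
            log.append("except")
        else:
            log.append("else")         # runs only after "end of try"
    return log


results = {name: probe(name)
           for name in ("fall", "raise", "break", "continue", "return")}
```

Only the "fall" case, where control runs off the end of the try clause,
reaches the else clause; the other four all bypass it.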

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Mon Jan  1 20:00:38 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 1 Jan 2001 20:00:38 +0100
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
Message-ID: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de>

> It appears that CNRI can only think about one thing at a time <0.5
> wink>.  For the last 6 months, that thing has been the license.  If
> they ever resolve the GPL compatibility issue, maybe they can be
> persuaded to think about the PSA.  In the meantime, I'd suggest you
> not renew <ahem>.

I think we need to find a better answer than that, and soon. While
everybody reading this list probably knows not to renew, the PSA is
the first thing that you see when selecting "Python Community" on
python.org. The first paragraph reads

# The continued, free existence of Python is promoted by the
# contributed efforts of many people. The Python Software Activity
# (PSA) supports those efforts by helping to coordinate them. The PSA
# operates web, ftp, and email services, organizes conferences, and
# engages in other activities that benefit the Python user
# community. In order to continue, the PSA needs the membership of
# people who value Python.

If you look at the current members list
(http://www.python.org/psa/Members.html), it appears that many
long-time members indeed have not renewed. This page was last updated
Nov 14 - so it appears that CNRI is still processing applications when
they come. It may well be that many of the newer members ask
themselves by now what happened to their money; it might not be easy
to get an answer to that question. However, there is clearly somebody
to blame here: The Python Community.

So I'd like to request that somebody with write permissions to these
pages changes the text, to something along the lines of replacing the
first paragraph with

# The Python community organizes itself in different ways; people
# interested in discussing development of and with Python usually
# participate in <a href="MailingLists.html">mailing lists</a>.
#
# <p>Organizations that wish to influence further directions of the
# Python language may join the <a href="/consortium">Python
# Consortium</a>.
#
# <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
# National Research Initiatives</a> hosts the Python Software
# Activity, which is described below. The PSA used to provide funding
# for the Python development; that is no longer the case.

If there is a factual error in this text, please let me
know.

Regards,
Martin



From tim.one at home.com  Mon Jan  1 20:20:53 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 14:20:53 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <E14D9Ev-0007ac-00@usw-sf-web3.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>

[gvanrossum, in an SF patch comment]
> Bah.  I don't like this one bit.  More complexity for a little
> bit of extra speed.
> I'm keeping this open but expect to be closing it soon unless I
> hear a really good argument why more speed is really needed in
> this area.  Down with code bloat and creeping featurism!

Without judging "the solution" here, "the problem" is that everyone's first
attempt to use line-at-a-time file input in Perl:

    while (<F>) {
        ... $_ ...;
    }

runs 2-5x faster than everyone's first attempt in Python:

    while 1:
        line = f.readline()
        if not line:
            break
        ... line ...

It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
little bit"; and by the time you walk a newbie thru

    while 1:
        lines = f.readlines(hintsize)
        if not lines:
             break
        for line in lines:
            ... line ...

they feel like maybe Perl isn't so obscure after all <wink>.

Does someone have an elegant way to address this?  I believe Jeff's shot at
elegance was the other part of the patch, using (his new) xreadlines under
the covers to speed the fileinput module.
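
One way to picture the idea is to hide the readlines(hintsize) dance behind
a helper, so the caller gets back to a plain loop. This is only a sketch in
present-day Python using a generator; chunked_lines() and its default
hintsize are invented here and are not Jeff's patch.

```python
import io


def chunked_lines(f, hintsize=8192):
    """Yield lines from f, reading them in batches of about hintsize bytes."""
    while True:
        lines = f.readlines(hintsize)
        if not lines:
            break
        for line in lines:
            yield line


sample = io.StringIO("spam\neggs\nham\n")
collected = [line.rstrip("\n") for line in chunked_lines(sample, hintsize=4)]
```

The newbie then writes "for line in chunked_lines(f):" and never sees the
buffering.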

reading-text-files-is-very-common-ly y'rs  - tim




From guido at digicool.com  Mon Jan  1 20:25:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:25:07 -0500
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
In-Reply-To: Your message of "Mon, 01 Jan 2001 20:00:38 +0100."
             <200101011900.UAA01672@loewis.home.cs.tu-berlin.de> 
References: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de> 
Message-ID: <200101011925.OAA09669@cj20424-a.reston1.va.home.com>

> > It appears that CNRI can only think about one thing at a time <0.5
> > wink>.  For the last 6 months, that thing has been the license.  If
> > they ever resolve the GPL compatibility issue, maybe they can be
> > persuaded to think about the PSA.  In the meantime, I'd suggest you
> > not renew <ahem>.
> 
> I think we need to find a better answer than that, and soon. While
> everybody reading this list probably knows not to renew, the PSA is
> the first thing that you see when selecting "Python Community" on
> python.org. The first paragraph reads
> 
> # The continued, free existence of Python is promoted by the
> # contributed efforts of many people. The Python Software Activity
> # (PSA) supports those efforts by helping to coordinate them. The PSA
> # operates web, ftp, and email services, organizes conferences, and
> # engages in other activities that benefit the Python user
> # community. In order to continue, the PSA needs the membership of
> # people who value Python.
> 
> If you look at the current members list
> (http://www.python.org/psa/Members.html), it appears that many
> long-time members indeed have not renewed. This page was last updated
> Nov 14 - so it appears that CNRI is still processing applications when
> they come. It may well be that many of the newer members ask
> themselves by now what happened to their money; it might not be easy
> to get an answer to that question. However, there is clearly somebody
> to blame here: The Python Community.

I don't know how many memberships CNRI has received, but it can't be
many, since we sent out no reminders.  I'll see if I can get an
answer.

> So I'd like to request that somebody with write permissions to these
> pages changes the text, to something along the lines of replacing the
> first paragraph with
> 
> # The Python community organizes itself in different ways; people
> # interested in discussing development of and with Python usually
> # participate in <a href="MailingLists.html">mailing lists</a>.
> #
> # <p>Organizations that wish to influence further directions of the
> # Python language may join the <a href="/consortium">Python
> # Consortium</a>.
> #
> # <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
> # National Research Initiatives</a> hosts the Python Software
> # Activity, which is described below. The PSA used to provide funding
> # for the Python development; that is no longer the case.
> 
> If there is a factual error in this text, please let me
> know.

I've done something slightly different -- see
http://www.python.org/psa/.  I've kept only your first paragraph, and
inserted a boldface note before that about the obsolescence (or
deprecation :-) of the PSA membership.

I've removed the references to the consortium, since that's also about
to collapse under its own inactivity; instead, the PSF will be formed,
independent from CNRI, to hold the IP rights (insofar they can be
assigned to the PSF) and for not much else.

I'll see if I can get some more news about the creation of the PSF
(which is supposed to be an initiative of ActiveState and Digital
Creations).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan  1 20:35:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:35:24 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 14:20:53 EST."
             <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> 
Message-ID: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>

> [gvanrossum, in an SF patch comment]
> > Bah.  I don't like this one bit.  More complexity for a little
> > bit of extra speed.
> > I'm keeping this open but expect to be closing it soon unless I
> > hear a really good argument why more speed is really needed in
> > this area.  Down with code bloat and creeping featurism!
> 
> Without judging "the solution" here, "the problem" is that everyone's first
> attempt to use line-at-a-time file input in Perl:
> 
>     while (<F>) {
>         ... $_ ...;
>     }
> 
> runs 2-5x faster than everyone's first attempt in Python:
> 
>     while 1:
>         line = f.readline()
>         if not line:
>             break
>         ... line ...

But is everyone's first thought to time the speed of Python vs. Perl?
Why does it hurt so much that this is a bit slow?

> It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
> little bit"; and by the time you walk a newbie thru
> 
>     while 1:
>         lines = f.readlines(hintsize)
>         if not lines:
>             break
>         for line in lines:
>             ... line ...
> 
> they feel like maybe Perl isn't so obscure after all <wink>.
> 
> Does someone have an elegant way to address this?  I believe Jeff's shot at
> elegance was the other part of the patch, using (his new) xreadlines under
> the covers to speed the fileinput module.

But of course suggesting fileinput is also not a great solution --
it's relatively obscure (since it's not taught by most tutorials,
certainly not by the standard tutorial).

> reading-text-files-is-very-common-ly y'rs  - tim

So is worrying about performance without a good reason...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan  1 20:49:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:49:24 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: Your message of "Mon, 01 Jan 2001 12:01:02 +0200."
             <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> 
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>  
            <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> 
Message-ID: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>

[Moshe]
> Well, Andrew, I know if I leave you any more time, you won't be able
> to resist the urge. OK, I'll volunteer. Can't do anything right now,
> but expect to see an updated version posted on my site soon. If 
> people will think it's a good idea, I'll move it to Misc/.
> Fred, if the some-xml-format-to-HTML you're working on is in any
> sort of readiness, I'll use that to format the FAQ.

Moshe, if your solution is to turn the FAQ into a document with a
single editor again, I think you're not doing the community a favor.
Granted, we could add some more sections (easy enough for me if
someone tells me the new section headings and which existing questions
go where) and there is a lot of obsolete information.

But I would be very hesitant to drop the notion of maintaining the FAQ
as a group collaboration project.  There's nothing wrong with the FAQ
wizard except that the password (Spam) should be made publicly known...

I've also noticed that Bjorn Pettersen has made a whole slew of useful
updates to various sections, mostly updates about new 2.0 features or
syntax.

> Having used Perl
> in the last couple of weeks, I learned to appreciate the fact that
> the FAQ is a standard part of the documentation.

Does that mean more than that it should be linked to from
http://www.python.org/doc/ ?  It's already there in the side bar; does
it need a more prominent position?  I used to include the FAQ in Misc/
(Ping's Misc/faq2html.py script is a last remnant of that), but gave
up after realizing that the on-line FAQ is much more useful than a
single text file.

In my eyes, the best thing you (and everyone else) could do, if you
find the time, would be to use the FAQ wizard to fix or delete
out-of-date entries.  To delete an entry, change its subject to
"Deleted" and remove its body; I'll figure out a way to delete them
from the index.  Because FAQ entries can refer to each other (and are
referred to from elsewhere) by number, it's not safe to simply
renumber entries.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan  1 21:27:37 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 15:27:37 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com>

[Guido]
> Thomas just checked this in, using Tim's words:

[   The optional \keyword{else} clause is executed when no
    exception occurs in the \keyword{try} clause.  Exceptions in
    the \keyword{else} clause are not handled by the preceding
    \keyword{except} clauses.

vs
    The optional \keyword{else} clause is executed when the
    \keyword{try} clause terminates by any means other than an
    exception or executing a \keyword{return}, \keyword{continue}
    or \keyword{break} statement.  Exceptions in the \keyword{else}
    clause are not handled by the preceding \keyword{except} clauses.
]

> How is this different from "when control flow reaches the end of the
> try clause", which is what I really had in mind?

Only in that it doesn't appeal to a new undefined phrase, and is (I think)
unambiguous in the eyes of a non-specialist reader (like Robin's friend).
Note that "reaching the end of the try clause" is at best ambiguous, because
you *really* have in mind "falling off the end" of the try clause.  It
wouldn't be unreasonable to say that in:

    try:
         x = 1
         y = 2
         return 1

"x=1" is the beginning of the try clause and "return 1" is the end.  So if
the reader doesn't already know what you mean, saying "the end" doesn't nail
it (or, if like me, the reader does already know what you mean, it doesn't
matter one whit what it says <wink>).

> Using the current wording, this paragraph would have to be
> changed each time a new control-flow keyword is added.  Based
> upon the historical record that's not a grave concern ;-),

It was sure no concern of mine ...

> but I think the new wording relies too much on accidentals such
> as the fact that these are the only control flow altering events.
>
> It may be that control flow is not rigidly defined -- but as it is
> what was really intended, maybe the fix should be to explain the
> right concept rather than the current ad-hoc solution.
> ...

OK, except I don't know how to do that succinctly.  For example, if Java had
an "else" clause, the Java spec would say:

    If present, the "else block" is executed if and only if execution
    of the "try block" completes normally, and then there is a choice:

        If the "else block" completes normally, then the
        "try" statement completes normally.

        If the "else block" completes abruptly for reason S,
        then the "try" statement completes abruptly for reason S.

That is, they deal with control-flow issues via appeal to "complete
normally" and "complete abruptly" (which latter comes in several flavors
("reasons"), such as returns and exceptions), and there are pages and pages
and pages of stuff throughout the spec inductively defining when these
conditions obtain.  It's clear, precise and readable; but it's also wordy,
and we don't have anything similar to build on.

As a compromise, given that we're not going to take the time to be precise
(well, I'm sure not ...):

    The optional \keyword{else} clause is executed if and
    when control flows off the end of the \keyword{try}
    clause.\footnote{In Python 2.0, control "flows off the
    end" except in case of exception, or executing a
    \keyword{return}, \keyword{continue} or \keyword{break}
    statement.}
    Exceptions in the \keyword{else} clause are not handled by
    the preceding \keyword{except} clauses.

Now it's all of imprecise, almost precise, specific to Python 2.0, and
robust against any future changes <wink>.
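
A minimal sketch of the behaviour the proposed wording describes, written
for a current Python interpreter.  The helper names run and else_raises are
made up for illustration; nothing here comes from the docs or the patch.
The first function records which clauses execute for each way of leaving the
try clause; the second shows that an exception raised in the else clause
skips the preceding except clause.

```python
def run(action):
    """Return the clauses that executed for one way of leaving the try clause."""
    trace = []
    for _ in range(1):              # a loop, so 'break' and 'continue' are legal
        try:
            trace.append("try")
            if action == "raise":
                raise ValueError("boom")
            if action == "return":
                return trace
            if action == "break":
                break
            if action == "continue":
                continue
            # any other action: control flows off the end of the try clause
        except ValueError:
            trace.append("except")
        else:
            trace.append("else")
    return trace


def else_raises():
    """An exception raised in 'else' is not caught by the preceding 'except'."""
    try:
        try:
            pass
        except ValueError:
            return "inner except"
        else:
            raise ValueError("raised in else")
    except ValueError:
        return "outer except"
```

Only the "fall off the end" case should reach the else clause; the other
four leave the try statement without touching it.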




From akuchlin at cnri.reston.va.us  Mon Jan  1 21:35:27 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 1 Jan 2001 15:35:27 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:49:24PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com>
Message-ID: <20010101153527.A14116@newcnri.cnri.reston.va.us>

On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
>But I would be very hesitant to drop the notion of maintaining the FAQ
>as a group collaboration project.  There's nothing wrong with the FAQ
>wizard except that the password (Spam) should be made publicly known...

Why multiply the number of mechanisms required to maintain things?  We
already use CVS for other documentation; why not use it for the FAQ as 
well?  

--amk



From tim.one at home.com  Mon Jan  1 22:00:36 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 16:00:36 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDLIGAA.tim.one@home.com>

[Andrew Kuchling]
> Why multiply the number of mechanisms required to maintain things?
> We already use CVS for other documentation; why not use it for the
> FAQ as well?

The search facilities of the FAQ wizard are invaluable, and so is the
ability for "just users" to update the info from within their browsers.
There are two problems with the FAQ in practice:

1. It doesn't get updated enough.  We can't fix that by making it harder to
update!

2. It's *only* available via the web interface.  We should ship a text or
HTML snapshot with releases; perhaps even do the usual Usenet periodic
FAQ-posting thing.




From tim.one at home.com  Mon Jan  1 23:34:03 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 17:34:03 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEECIGAA.tim.one@home.com>

[Guido]
> But is everyone's first thought to time the speed of Python vs. Perl?

It's few peoples' first thought.  It's impossible for bilingual programmers
(or dabblers, or evaluators) not to notice *soon*, though, because:

> Why does it hurt so much that this is a bit slow?

Factors of 2 to 5 aren't "a bit" -- they're obvious when they happen, but
the *cause* is not.  To judge from a decade of c.l.py gripes, most people
write it off to "huh -- guess Python is just slow"; the rest eventually
figure out that their text input is the bottleneck (Tom Christiansen never
got this far <0.5 wink>), but then don't know what to do about it.

At this point I'm going to insert two anonymized pvt emails from last year:

-----Original Message #1 -----

From: TTT
Sent: Monday, March 13, 2000 2:29 AM
To: GGG
Subject: RE: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison

GGG, note especially figure 4 in Lutz Prechelt's report:

>   http://wwwipd.ira.uka.de/~prechelt/Biblio/#jccpprtTR

The submitted Python programs had by far the largest variability in how long
it took to load the dictionary.  My input loop is probably typical of the
"fast" Python programs, which indeed beat most (but not all) of the fastest
Perl ones here:

class Dictionary:
    ...

    def fill_from_file(self, f, BUFFERSIZE=500000):
        """f, BUFFERSIZE=500000 -> fill dictionary from file f.

        f must be an open file, or other object with a readlines()
        method.  It must contain one word per line.  Optional arg
        BUFFERSIZE is used to chunk up input for efficiency, and is
        roughly the # of bytes read at a time.
        """

        addword = self.addword
        while 1:
            lines = f.readlines(BUFFERSIZE)
            if not lines:
                break
            for line in lines:
                addword(line[:-1])  # chop trailing newline

Comparable Perl may have been the one-liner:

    grep(&addword, chomp(<>));

which may account for why Perl's memory use was uniformly higher than
Python's.

Whatever, you really need to be a Python expert to dream up "the fast way"
to do Python input!  Hire me, and I'll fix that <wink>.

nothing-like-blackmail-before-going-to-bed-ly y'rs  - TTT


-----Original Message #2 -----

From: GGG
Sent: Monday, March 13, 2000 7:08 AM
To: TTT
Subject: Re: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison


Agreed.  readlines(BUFFERSIZE) is a crock.  In fact, ``for i in
f.readlines()'' should use lazy evaluation -- but that will have to wait for
Py3K unless we add hints so that readlines knows it is being called from a
for loop.

--GGG


-----Back to 2001 -----

I took TTT's advice and read Lutz's report <wink>.  I agree with GGG that
hiding this in .readlines() would be maximally elegant.  xreadlines supplies
most of the lazy machinery GGG favored.  I don't know how hard it would be
to supply the rest of it, but it's such a frequent bitching point that I
would prefer pointing people to an explicit .xreadlines() hack than either
(a) try to convince them that they "shouldn't" care about the speed as much
as they claim to; or, (b) try to explain the double-loop buffering method.
I'd personally rather use an explicit .xreadlines() hack than code the
double-loop buffering too, and don't see an obvious way to do better than
that right now.

>> reading-text-files-is-very-common-ly y'rs  - tim

> So is worrying about performance without a good reason...

Indeed it is.  I'm persuaded that many people making this specific complaint
have a legitimate need for more speed, though, and that many don't persist
with Python long enough to find out how to address this complaint (because
the double-loop method is too obscure for a newbie to dream up).  That makes
this hack score extraordinarily high on my benefit/harm ratio scale (in P3K
xreadlines can be deprecated in favor of readlines <0.9 wink>).

heck-it-doesn't-even-require-a-new-keyword-ly y'rs  - tim




From thomas at xs4all.net  Mon Jan  1 23:46:45 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 1 Jan 2001 23:46:45 +0100
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:35:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010101234645.B5435@xs4all.nl>

On Mon, Jan 01, 2001 at 02:35:24PM -0500, Guido van Rossum wrote:

[ Python lacks a One True Way of doing Perl's 'while(<>)' ]

> > Does someone have an elegant way to address this?  I believe Jeff's shot at
> > elegance was the other part of the patch, using (his new) xreadlines under
> > the covers to speed the fileinput module.

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Is fileinput really obscure ? I personally quite like it. It is enough like
the perl idiom to be very useful for people thinking that way, and it
doesn't require special syntax or considerations. If tutorialization is the
only problem, I'd be happy to fix that, provided Fred or Moshe can TeX my
fix up.

As for speed (which stays a secondary or tertiary consideration at best) do
we really need the xreadlines method to accomplish that ? Couldn't fileinput
get almost the same performance using readlines() with a sizehint ? I
personally don't like the xreadlines because it adds yet another function to
do the same, with a slight, subtle and to the untrained programmer unclear
distinction from the rest. (I don't really like the range/xrange difference
either -- I think Python code shouldn't care whether they're dealing with a
real list or a generator, and as much as possible should just be generators.
And in the case of simple (x)range()es, I have yet to see a case where a
'real' list had significantly better performance than a generator.)

If we *do* start adding methods to (the public API of) file objects, I think
we should consider more than just xreadlines() (I seem to recall other
proposals, but my memory is hazy at the moment -- I haven't slept since last
millennium), add whatever is necessary, and provide a UserFile in the std.
lib that 'emulates' all fileobject functionality using a single readline()
function.
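
One possible reading of the UserFile idea, as a rough sketch only: a base
class in which subclasses supply readline() and everything else is derived
from it.  The names ReadlineBase and ListFile are hypothetical, and the
sizehint handling is a simplification (stop once roughly that many
characters have been collected), not a copy of any real file object.

```python
class ReadlineBase:
    """Hypothetical base class: subclasses implement readline() only."""

    def readline(self):
        raise NotImplementedError

    def readlines(self, sizehint=0):
        # Collect lines until EOF, or until about sizehint characters were read.
        lines = []
        total = 0
        while True:
            line = self.readline()
            if not line:
                break
            lines.append(line)
            total += len(line)
            if sizehint > 0 and total >= sizehint:
                break
        return lines

    def read(self):
        # Everything that is left, as one string.
        return "".join(self.readlines())


class ListFile(ReadlineBase):
    """Toy file-like object backed by a list of lines (illustration only)."""

    def __init__(self, lines):
        self._lines = list(lines)
        self._pos = 0

    def readline(self):
        if self._pos >= len(self._lines):
            return ""               # empty string signals end of file
        line = self._lines[self._pos]
        self._pos += 1
        return line
```

With something like this, a new method added to the file API would need one
Python implementation in the base class instead of one per file-like class.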

Now, if you'll excuse me, I have a date with a soft bed I haven't seen in
about 40 hours, a pair of aspirin my head is killing for and probably a
hangover that I don't want to think about, right now ;)

Gelukkig-Nieuwjaar-iedereen-ly y'rs

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From jepler at inetnebr.com  Tue Jan  2 02:49:35 2001
From: jepler at inetnebr.com (Jeff Epler)
Date: Mon, 1 Jan 2001 19:49:35 -0600
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>; from Tim Peters on Mon, Jan 01, 2001 at 02:20:53PM -0500
Message-ID: <20010101194935.19672@falcon.inetnebr.com>

I'd like to speak up about this patch I've submitted on sourceforge.

I consider the xreadlines function/object to be the core of my proposal.
The addition of a method to file objects, as well as the modifications
to fileinput, are secondary in my opinion.

The desire is to iterate over file contents in a way that satisfies the
following criteria:
	* Uses the "for" syntax, because this clearly captures the
	  underlying operation. (files can be viewed as sequences of
	  lines when appropriate)
	* Consumes small amounts of memory even when the file contents
	  are large.
	* Has the lowest overhead that can reasonably be attained.

I think that it is agreed that the ability to use the "for" syntax is
important, since it was the impetus for the xrange function/object.
After all, there's a "while" statement which will give the same effect,
without introducing xrange.

The point under debate, as I see it, is the utility of speeding up the
"benchmarks" of folks who compare the speed of Python and another
language doing a very simple loop over the lines in a file.  Since this
advantage disappears once real work is being done on the file, maybe an
XReadLines class, written in Python, would be more suitable.  In fact,
I've written such a class since I didn't know about fileinput and in
any case I find it less useful to me because of all the weird stuff it
does. (parsing argv, opening files by name, etc)

One shortcoming of my current patch, aside from the ones already named
in another person's response to it, is that it fails when working
on a file-like class which implements .readline but not .readlines.
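
A sketch of how that shortcoming could be worked around in pure Python,
assuming a later interpreter that has generators.  This is not what the C
patch does; iter_lines and OnlyReadline are names invented here.  The idea
is to use readlines(hint) when the object offers it and to fall back to
calling readline() otherwise.

```python
import io


def iter_lines(f, hint=64 * 1024):
    """Yield lines from f, chunked via readlines(hint) when that is available."""
    readlines = getattr(f, "readlines", None)
    if readlines is None:
        # Fallback for file-like objects that only implement readline().
        while True:
            line = f.readline()
            if not line:
                return
            yield line
    else:
        while True:
            lines = readlines(hint)
            if not lines:
                return
            for line in lines:
                yield line


class OnlyReadline:
    """Toy file-like object exposing readline() and nothing else."""

    def __init__(self, text):
        self._f = io.StringIO(text)

    def readline(self):
        return self._f.readline()
```

The fallback path gives up the speed advantage, of course, but the loop
that uses it stays the same either way.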

In any case, I wrote xreadlines to learn how to write C extensions to
Python, and submitted it at the suggestion of a fellow Python user in a
private discussion.  I'd like to extinguish one of these eternal
comp.lang.python threads with it too, but maybe it's not to be.

Happy new year, all.

Jeff



From gstein at lyra.org  Tue Jan  2 04:34:31 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 1 Jan 2001 19:34:31 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Mon, Jan 01, 2001 at 03:35:27PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com> <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <20010101193431.M10567@lyra.org>

On Mon, Jan 01, 2001 at 03:35:27PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
> >But I would be very hesitant to drop the notion of maintaining the FAQ
> >as a group collaboration project.  There's nothing wrong with the FAQ
> >wizard except that the password (Spam) should be made publicly known...
> 
> Why multiply the number of mechanisms required to maintain things?  We
> already use CVS for other documentation; why not use it for the FAQ as 
> well?  

That would limit the updaters to just those with CVS access. As Guido just
pointed out, Bjorn made a bunch of updates. And he didn't need CVS to do
that...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Tue Jan  2 04:44:05 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 22:44:05 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101194935.19672@falcon.inetnebr.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEKIGAA.tim.one@home.com>

[Jeff Epler]
> I'd like to speak up about this patch I've submitted on sourceforge.

I'm not sure that's allowed <wink>.

> ...
> The point under debate, as I see it, is the utility of speeding
> up the "benchmarks" of folks who compare the speed of Python and
> another language doing a very simple loop over the lines in a file.

If that were true, I couldn't care less.

> Since this advantage disappears once real work is being done on
> the file, ...

I agree that's true, but submit it's rarely relevant. *Most* file-crunching
apps are dominated by I/O time, which is why this is so visible to so many;
e.g., chewing over massive log files looking for patterns appears to be the
growth industry of the 21st century <wink>.  Even in Lutz's report (see
reference from earlier mail), where the task to be solved was far from
trivial, input time exceeded processing time across all languages (with some
oddball exceptions, when the coder neglected to use a hash table to store
info).  That's thoroughly typical of real file-crunching applications, in my
experience:  Perl has a killer speed advantage in the single most
time-consuming portion of the app, and due to one implementation trick.
Take that advantage away, and Python holds its own in this domain.

Coincidentally, I got pvt email from a newbie today, reading in part:

> If Perl wasn't so gosh darn good and fast at text scrubbing, it
> wouldn't really be a consideration, its syntax is so clunky and
> hard to learn by comparison to both Python and Ruby.

This is just depressing, because I can predict every step of this dance.

> ...
> Happy new year, all.

And to you!  Just make sure it's a fast new year <wink>.





From moshez at zadka.site.co.il  Tue Jan  2 16:24:40 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  2 Jan 2001 17:24:40 +0200 (IST)
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>

On Mon, 1 Jan 2001, Thomas Wouters <thomas at xs4all.net> wrote:

> As for speed (which stays a secondary or tertiary consideration at best) do
> we really need the xreadlines method to accomplish that ? Couldn't fileinput
> get almost the same performance using readlines() with a sizehint ? I

<aol>me too</aol>
Adding xreadlines() to the interface would break half a dozen file-objects all
around the world (just the standard library has StringIO, cStringIO,
GzipFile and probably some others I can't remember)

Adding .readlines(sizehint) to fileinput, and adding a function
to create something similar to fileinput from a file object (as opposed
to a file name) would help everyone, and doesn't seem too hard.
Is there a gotcha I'm just not seeing?

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Tue Jan  2 09:06:32 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 03:06:32 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com>

[Thomas Wouters]
> ...
> As for speed (which stays a secondary or tertiary consideration
> at best) do we really need the xreadlines method to accomplish
> that ?  Couldn't fileinput get almost the same performance using
> readlines() with a sizehint ?

There was a long email discussion among Jeff, Paul Prescod, Neel
Krishnaswami, and Alex Martelli about this.  I started getting copied on it
somewhere midstream, but didn't have time to follow it then (like I do now
<wink>).

About two weeks ago Neel summarized all the approaches then under
discussion:

"""
[Neel Krishnaswami]

...

Quick performance summary of the current solutions:

Slowest: for line in fileinput.input('foo'):     # Time 100
       : while 1: line = file.readline()         # Time 75
       : for line in LinesOf(open('foo')):       # Time 25
Fastest: for line in file.readlines():           # Time 10
         while 1: lines = file.readlines(hint)   # Time 10
         for line in xreadlines(file):           # Time 10

The difference in speed between the slowest and fastest is about
a factor of 10.

LinesOf is Alex's Python wrapper class that takes a file and
uses readlines() with a size-hint to present a sequence interface.
It's around half as fast as the fastest idioms, and 3-4 times
faster than while 1:. Jeff's xreadlines is essentially the same
thing in C, and is indistinguishable in performance from the
other fast idioms.

...

"""

On his box, line-at-a-time is >7x slower than the fastest Python methods,
which latter are usually close (depending on the platform) to Perl
line-at-a-time speeds.  A factor of 7 is too large for most working
programmers to ignore in the interest of somebody else's notion of
theoretical purity <wink>.  Seriously, speed is not a secondary
consideration to me when the gap is this gross, and in an area so visible
and common.

Alex's LinesOf appears a good predictor for how adding
fileinput.readlines(hint) would perform, since it appears to *be* that
(except off on its own).  Then it buys a factor of 3 over line-at-a-time on
Neel's box but leaves a factor of 2.5 on the table.  The cause of the latter
appears mostly to be the overhead of getting a Python method call into the
equation for each line returned.
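
For readers who have not seen it, a wrapper in the spirit of LinesOf might
look roughly like the sketch below.  Alex's actual class is not reproduced
in this thread, so the details here are assumptions: the attribute names,
the default hint, and the choice to raise RuntimeError on out-of-order
access are all mine.  It presents the old-style sequence interface
(__getitem__ with increasing indices) that a for loop can drive, and
refills its buffer with readlines(hint).

```python
import io


class LinesOf:
    """Sequence-like view of a file's lines, read in chunks via readlines(hint)."""

    def __init__(self, f, hint=64 * 1024):
        self._f = f
        self._hint = hint
        self._chunk = []        # lines currently buffered
        self._start = 0         # index of the first buffered line

    def __getitem__(self, index):
        if index < self._start:
            # Earlier chunks were discarded; only forward access is supported.
            raise RuntimeError("LinesOf supports forward, sequential access only")
        while index >= self._start + len(self._chunk):
            self._start += len(self._chunk)
            self._chunk = self._f.readlines(self._hint)
            if not self._chunk:
                raise IndexError(index)     # end of file ends the for loop
        return self._chunk[index - self._start]
```

Each line still costs one Python-level __getitem__ call, which matches the
per-line method-call overhead described above.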

Note that Jeff added .xreadlines() as a file object method at Neel's urging.
The way he started this is shown on the last line:  a function.  If we threw
out the fileinput and file method aspects, and just added a new module
xreadlines with a function xreadlines, then what?  I bet it would become as
popular as the string module, and for good reason:  it's a specific approach
that works, to a specific and common problem.

> ...
> And in the case of simple (x)range()es, I have yet to see a case
> where a 'real' list had significantly better performance than
> a generator.)

It varies by platform, but I don't think I've heard of variations larger
than 20% in either direction.  20% is nothing, though; in *this* case we're
talking order of magnitude.  That's go/nogo territory.

> ...
> Gelukkig-Nieuwjaar-iedereen-ly y'rs

I understand people are passionate when reality clashes with the dream of a
wart-free language, but that's no reason to swear at me <wink>.

wishing-you-a-happy-new-year-like-a-civilized-man-ly y'rs  - tim




From paulp at ActiveState.com  Tue Jan  2 11:00:46 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:00:46 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <3A51A6CE.3B15371D@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> But is everyone's first thought to time the speed of Python vs. Perl?
> Why does it hurt so much that this is a bit slow?

I want to interject here that I asked Jeff to submit this patch because
I don't see it as "a little bit slow." When someone transliterates a
program from one scripting language to another and gets a program that
is two to five times slower, that is a big deal!

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Fileinput's primary problem is that IIRC, it is even slower than doing
readline yourself!

> > reading-text-files-is-very-common-ly y'rs  - tim
> 
> So is worrying about performance without a good reason...

I don't understand what constitutes good reason. We're talking about a
relatively minor change that will speed up thousands of programs, answer
a frequently asked question from comp.lang.python, obliterate an obscure
idiom and reduce the number of requests for a Python syntax change
(assignment expression) all in one bold sweep. It seemed to me as if it
was a "pure win."

 Paul Prescod



From paulp at ActiveState.com  Tue Jan  2 11:06:24 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:06:24 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com> <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>
Message-ID: <3A51A820.50365F02@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> 
> Adding .readlines(sizehint) to fileinput, and adding a function
> to create something similar to fileinput from a file object (as opposed
> to a file name) would help everyone, and doesn't seem too hard.
> Is there a gotcha I'm just not seeing?

Fileinput is inherently slow because there are too many layers of Python
code. I started to consider ways of inverting the logic so that it only
called into Python when it needed to switch files but it would have been
a much larger patch than Jeff's and I thought that a conservative
approach was important.

Fileinput should someday be optimized but we can easily get a
low-hanging fruit improvement with Jeff's patch.

 Paul Prescod



From guido at digicool.com  Tue Jan  2 15:56:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 09:56:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 03:06:32 EST."
             <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> 
Message-ID: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>

Tim's almost as good at convincing me as he is at channeling me!  The
timings he showed almost convinced me that fileinput is hopeless and
xreadlines should be added.  But then I wrote a little timer of my
own...

I am including the timer program below my signature.  The test input
was the current access_log of dinsdale.python.org, which has about 119
Mbytes and 1M lines (as counted by the test program).

I measure about a factor of 3 between readlines with a sizehint (of 1
MB) and fileinput; a change to fileinput that uses readlines with a
sizehint and in-lines the common case in __getitem__ (as suggested by
Moshe) didn't make a difference.

Output (the first time is realtime seconds, the second CPU seconds):

total 119808333 chars and 1009350 lines
count_chars_lines     7.944  7.890
readlines_sizehint    5.375  5.320
using_fileinput      15.861 15.740
while_readline        8.648  8.570

This was on a 600 MHz Pentium-III Linux box (RH 6.2).

Note that count_chars_lines and readlines_sizehint use the same
algorithm -- the difference is that readlines_sizehint uses 'pass' as
the inner loop body, while count_chars_lines adds two counters.

Given that very light per-line processing (counting lines and
characters) already increases the time considerably, I'm not sure I
buy the arguments that the I/O overhead is always considerable.  The
fact that my change to fileinput.py didn't make a difference suggests
that its lack of speed is purely caused by the Python code.

Now what to do?  I still don't like xreadlines very much, but I do see
that it can save some time.  But my test doesn't confirm Neel's times
as posted by Tim:

> Slowest: for line in fileinput.input('foo'):     # Time 100
>        : while 1: line = file.readline()         # Time 75
>        : for line in LinesOf(open('foo')):       # Time 25
> Fastest: for line in file.readlines():           # Time 10
>          while 1: lines = file.readlines(hint)   # Time 10
>          for line in xreadlines(file):           # Time 10

I only see a factor of 3 between fastest and slowest, and
readline is only about 60% slower than readlines_sizehint.
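
The ratios can be checked from the real-time column of the table above.
This is only a sketch of the arithmetic; the dictionary simply restates
the four measurements already given.

```python
# Real-time seconds copied from the timing output above.
timings = {
    "count_chars_lines": 7.944,
    "readlines_sizehint": 5.375,
    "using_fileinput": 15.861,
    "while_readline": 8.648,
}

fastest = min(timings.values())
# Express every approach as a multiple of the fastest one.
ratios = {name: secs / fastest for name, secs in timings.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print("%-20s %.2fx" % (name, ratio))
```

fileinput comes out a little under 3 times the fastest time, and plain
readline at roughly 1.6 times.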

--Guido van Rossum (home page: http://www.python.org/~guido/)

import time, fileinput, sys

def timer(func, *args):
    t0 = time.time()
    c0 = time.clock()
    func(*args)
    t1 = time.time()
    c1 = time.clock()
    print "%-20s %6.3f %6.3f" % (func.__name__, t1-t0, c1-c0)

def count_chars_lines(fn, bs=1024*1024):
    nl = 0
    nc = 0
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            nl += 1
            nc += len(line)
    f.close()
    print "total", nc, "chars and", nl, "lines"

def readlines_sizehint(fn, bs=1024*1024):
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            pass
    f.close()

def using_fileinput(fn):
    f = fileinput.FileInput(fn)
    for line in f:
        pass
    f.close()

def while_readline(fn):
    f = open(fn, "r")
    while 1:
        line = f.readline()
        if not line:
            break
        pass
    f.close()

fn = "/home/guido/access_log"
if sys.argv[1:]:
    fn = sys.argv[1]
timer(count_chars_lines, fn)
timer(readlines_sizehint, fn, 1024*1024)
timer(using_fileinput, fn)
timer(while_readline, fn)



From guido at digicool.com  Tue Jan  2 16:07:06 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:07:06 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Mon, 01 Jan 2001 15:27:37 EST."
             <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com> 
Message-ID: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>

> As a compromise, given that we're not going to take the time to be precise
> (well, I'm sure not ...):
> 
>     The optional \keyword{else} clause is executed if and
>     when control flows off the end of the \keyword{try}
>     clause.\footnote{In Python 2.0, control "flows off the
>     end" except in case of exception, or executing a
>     \keyword{return}, \keyword{continue} or \keyword{break}
>     statement.}
>     Exceptions in the \keyword{else} clause are not handled by
>     the preceding \keyword{except} clauses.
> 
> Now it's all of imprecise, almost precise, specific to Python 2.0, and
> robust against any future changes <wink>.

Sounds good to me.  The reference to 2.0 could be changed to
"Currently".

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan  2 16:20:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:20:11 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: Your message of "Thu, 28 Dec 2000 18:25:28 EST."
             <20001228182528.A10743@thyrsus.com> 
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>  
            <20001228182528.A10743@thyrsus.com> 
Message-ID: <200101021520.KAA13222@cj20424-a.reston1.va.home.com>

> What does being in the Python core mean?  There are two potential definitions:
> 
> 1. Documentation says it's available on all platforms.
> 
> 2. Documentation restricts it to one of the three platform groups 
>    (Unix/Windows/Mac) but implies that it will be available on any
>    OS in that group.  
> 
> I think the second one is closer to what application programmers
> thinking about which batteries are included expect.  But I could be
> persuaded otherwise by a good argument.

Actually, when *I* have used the term "core" I've typically thought of
this as referring to anything that's in the standard source
distribution, whether or not it is built on all platforms.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Tue Jan  2 09:42:30 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 00:42:30 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 02, 2001 at 09:56:40AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <20010102004230.A29700@glacier.fnational.com>

On Tue, Jan 02, 2001 at 09:56:40AM -0500, Guido van Rossum wrote:
> Now what to do?  I still don't like xreadlines very much, but I do see
> that it can save some time.  But my test doesn't confirm Neel's times
> as posted by Tim:
> 
> > Slowest: for line in fileinput.input('foo'):     # Time 100
> >        : while 1: line = file.readline()         # Time 75
> >        : for line in LinesOf(open('foo')):       # Time 25
> > Fastest: for line in file.readlines():           # Time 10
> >          while 1: lines = file.readlines(hint)   # Time 10
> >          for line in xreadlines(file):           # Time 10
> 
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

Could it be that you're using the CVS version of Python which
includes Andrew's cool glibc getline enhancement?

  Neil



From guido at digicool.com  Tue Jan  2 16:40:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:40:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 00:42:30 PST."
             <20010102004230.A29700@glacier.fnational.com> 
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>  
            <20010102004230.A29700@glacier.fnational.com> 
Message-ID: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>

[me]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

Bingo!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan  2 17:34:31 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 11:34:31 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFJIGAA.tim.one@home.com>

>>     The optional \keyword{else} clause is executed if and
>>     when control flows off the end of the \keyword{try}
>>     clause.\footnote{In Python 2.0, control "flows off the
>>     end" except in case of exception, or executing a
>>     \keyword{return}, \keyword{continue} or \keyword{break}
>>     statement.}
>>     Exceptions in the \keyword{else} clause are not handled by
>>     the preceding \keyword{except} clauses.

[Guido]
> Sounds good to me.  The reference to 2.0 could be changed to
> "Currently".

Cool.  See

http://sourceforge.net/bugs/?group_id=5470&func=detailbug&bug_id=127098
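
A small sketch of the behaviour the proposed wording describes, in modern
Python. The function names classify and else_raises are made up for this
illustration.

```python
def classify(action):
    """Record which clauses run for a given action inside the try clause."""
    ran = []
    for _ in range(1):
        try:
            ran.append("try")
            if action == "raise":
                raise ValueError("boom")
            if action == "break":
                break              # leaves the try clause without flowing off its end
        except ValueError:
            ran.append("except")
        else:
            ran.append("else")     # only reached when the try clause ran to its end
    return ran

def else_raises():
    """An exception raised in the else clause skips the preceding except."""
    try:
        pass
    except ValueError:
        return "handled"
    else:
        raise ValueError("raised in else")

print(classify("fall off"))
print(classify("raise"))
print(classify("break"))
```

Calling else_raises() lets the ValueError propagate to the caller rather
than returning "handled".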




From tim.one at home.com  Tue Jan  2 21:48:08 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:08 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>

test_compare is broken because the expected-output file has bizarre stuff in
it like:

    cmp(2, [1]) = -108
    cmp(2, (2,)) = -116
    cmp(2, None) = -78

What's up with that?

I'll leave test_minidom to someone who thinks they know what it's doing.

Both failures are very recent.




From tim.one at home.com  Tue Jan  2 21:48:09 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>

[Guido]
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

[Guido]
> Bingo!

It's a good thing I haven't yet had time to try any speed tests myself,
since I don't have a glibc-enabled platform so Guido and I may have been
tempted to disagree about numbers in public <wink>.

I checked out the source for glibc's getline.  It's pulling the same trick
Perl uses, copying directly from the stdio buffer when it can, instead of
(like Python, and like almost all vendor fgets implementations) doing
getc-in-a-loop.  The difference is that Perl can't do that without breaking
into the FILE* representation in platform-dependent ways.  It's a shame that
almost all vendors missed that fgets was defined as a primitive by the C
committee precisely so that vendors *could* pull this speed trick under the
covers.  It's also a shame that Perl did it for them <wink>.
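
To make the difference concrete, here is a rough Python model of the two
strategies over an in-memory buffer. It is an illustration only, not the
glibc or Perl code: a real implementation works on the stdio buffer in C
and must also refill it when it runs dry. Both function names are invented
for the sketch.

```python
def readline_char_loop(buf, pos):
    """Model of getc-in-a-loop: look at one byte per iteration."""
    out = bytearray()
    while pos < len(buf):
        ch = buf[pos]
        pos += 1
        out.append(ch)
        if ch == 0x0A:          # stop after the newline
            break
    return bytes(out), pos

def readline_buffer_scan(buf, pos):
    """Model of the buffer trick: locate the newline, then copy one slice."""
    end = buf.find(b"\n", pos)
    end = len(buf) if end == -1 else end + 1
    return buf[pos:end], end

data = b"abc\0def\nxyz\n"
print(readline_char_loop(data, 0))
print(readline_buffer_scan(data, 0))
```

Both return the same line and the same new position; the second does the
search and the copy as two bulk operations instead of a per-byte loop.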




From barry at digicool.com  Tue Jan  2 22:56:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 2 Jan 2001 16:56:10 -0500
Subject: [Python-Dev] testing, please ignore
Message-ID: <14930.20090.283107.799626@anthem.wooz.org>

Sorry folks, just making sure things are working again.

you-really-didn't-want-email-this-millennium-didja?-ly y'rs,
-Barry




From guido at python.org  Tue Jan  2 21:59:22 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 15:59:22 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 14:59:24 EST."
             <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com> 
Message-ID: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>

> [Guido]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.
> 
> [Neil]
> > Could it be that you're using the CVS version of Python which
> > includes Andrew's cool glibc getline enhancement?
> 
> [Guido]
> > Bingo!
> 
> It's a good thing I haven't yet had time to try any speed tests myself,
> since I don't have a glibc-enabled platform so Guido and I may have been
> tempted to disagree about numbers in public <wink>.
> 
> I checked out the source for glibc's getline.  It's pulling the same trick
> Perl uses, copying directly from the stdio buffer when it can, instead of
> (like Python, and like almost all vendor fgets implementations) doing
> getc-in-a-loop.  The difference is that Perl can't do that without breaking
> into the FILE* representation in platform-dependent ways.  It's a shame that
> almost all vendors missed that fgets was defined as a primitive by the C
> committee precisely so that vendors *could* pull this speed trick under the
> covers.  It's also a shame that Perl did it for them <wink>.

Quite apart from whether we should enable xreadlines(), could you look
into doing a similar thing for MSVC stdio?  For most Unix platforms, a
cop-out answer is "use glibc" -- but for Windows it may pay to do our
own hack.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan  2 22:06:05 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 2 Jan 2001 16:06:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:09PM -0500
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
Message-ID: <20010102160605.A5211@kronos.cnri.reston.va.us>

On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
>into the FILE* representation in platform-dependent ways.  It's a shame that
>almost all vendors missed that fgets was defined as a primitive by the C
>committee precisely so that vendors *could* pull this speed trick under the
>covers.  It's also a shame that Perl did it for them <wink>.

So, should Python be changed to use fgets(), available on all ANSI C
platforms, rather than the glibc-specific getline()?  That would be
more complicated than the brain-dead easy course of using getline(),
which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
complicated logic.

When this was discussed in comp.lang.python, someone also mentioned
getc_unlocked(), which saves the overhead of locking the stream every
time, but that didn't seem a fruitful avenue for exploration.

--amk




From tim.one at home.com  Tue Jan  2 23:00:37 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:00:37 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>

[Guido]
> Quite apart from whether we should enable xreadlines(), could you look
> into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> cop-out answer is "use glibc" -- but for Windows it may pay to do our
> own hack.

There's no question about whether it would pay on Windows, because it pays
big for Perl on Windows.  The question is about cost.  There's no way to
*do* it short of the way Perl does it, which is to write a large pile of
Windows-specific code (roughly the same size and complexity as the glibc
getline implementation -- check it out, it's not trivial, and glibc exploits
compiler inlining to make it bearable) relying on reverse-engineered
accidents of how MS happens to use all the fields from this undocumented
struct (from MS's stdio.h):

struct _iobuf {
        char *_ptr;
        int   _cnt;
        char *_base;
        int   _flag;
        int   _file;
        int   _charbuf;
        int   _bufsiz;
        char *_tmpfname;
        };
typedef struct _iobuf FILE;

in their stdio implementation.  Else it won't play correctly with MS's
stdio.  That's A Project.  Last year I tried extracting the relevant code
from Perl, but, as is usual, gave up after unraveling the third (whatever)
layer of mystery macros with no end in sight.  I bet it would take me a
week.  Is it worth that much to you and DC?  Since the real Windows experts
are hanging out at ActiveState, I bet one of them will volunteer to do it
tonight <wink>.




From tim.one at home.com  Tue Jan  2 23:17:14 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:17:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010102160605.A5211@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGNIGAA.tim.one@home.com>

[Tim]
> It's a shame that almost all vendors missed that fgets was defined
> as a primitive by the C committee precisely so that vendors *could*
> pull this speed trick under the covers.  It's also a shame that Perl
> did it for them <wink>.

[Andrew Kuchling]
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

The thrust of my original comment above is that fgets is almost never faster
than what Python is doing now, because vendors overwhelmingly do *not*
exploit the opportunity the std gave them.  So, no, switching to fgets()
wouldn't help.

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

Well, getc_unlocked isn't std (not even in C99).  Mentioning it did inspire
me to discover, however, that while the MS fgets() is the typical "getc in a
loop" thing, at least it locks/unlocks the stream once each at function
entry/exit, and uses a special MS flavor of getc ("_getc_lk") inside the
loop.  However, that this helps is an illusion, because the body of their
_getc_lk macro is identical to the body of their getc macro.  Smells like a
bug, or an unfinished project.




From paulp at ActiveState.com  Tue Jan  2 23:40:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:40:39 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>
Message-ID: <3A5258E7.D52CA2C@ActiveState.com>

Tim Peters wrote:
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code 

> ... Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Mark is busy tonight and the Perl guys are still recovering from
implementing it the first time. :)

 Paul



From guido at python.org  Tue Jan  2 23:46:00 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:46:00 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:06:05 EST."
             <20010102160605.A5211@kronos.cnri.reston.va.us> 
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>  
            <20010102160605.A5211@kronos.cnri.reston.va.us> 
Message-ID: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>

> On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
> >into the FILE* representation in platform-dependent ways.  It's a shame that
> >almost all vendors missed that fgets was defined as a primitive by the C
> >committee precisely so that vendors *could* pull this speed trick under the
> >covers.  It's also a shame that Perl did it for them <wink>.
> 
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

You mean get_line(), which indeed has a complicated API and
corresponding logic: the argument may be a max length, or 0 to
indicate arbitrary length, or negative to indicate raw_input()
semantics. :-(

Unfortunately we can't use fgets(), even if it were faster than
getline(), because it doesn't tell how many characters it read.  On
files containing null bytes, readline() is supposed to treat these
like any other character; if your input is "abc\0def\nxyz\n", the
first readline() call should return "abc\0def\n".  But with fgets(),
you're left to look in the returned buffer for a null byte, and
there's no way (in general) to distinguish this result from an input
file that only consisted of the three characters "abc".  getline()
doesn't seem to have this problem, since its size is also an output
parameter.
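
The ambiguity can be modelled from Python with ctypes. This is only a
sketch: the two buffers are filled by hand to stand in for what fgets()
would leave behind, and the buffer size of 16 is arbitrary.

```python
import ctypes

# What the buffer holds after reading the line "abc\0def\n" ...
with_nul = ctypes.create_string_buffer(b"abc\0def\n", 16)
# ... and after reading a file that consists only of "abc".
only_abc = ctypes.create_string_buffer(b"abc", 16)

# .value stops at the first null byte, as a C string scan would.
print(with_nul.value == only_abc.value)

# The bytes after the null are still in the buffer, but without a
# returned length the caller cannot tell that they are part of the line.
print(with_nul.raw[:8])
```

An interface that reports the number of characters read, as getline()
does, removes the guesswork.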

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

I've never heard of getc_unlocked; it's not in the (old) C standard.
If it's also a glibc thing, I doubt that using it would be faster than
getline().  If it's a new C standard (C9x) thing, we'll have to wait.

Fred reminded me that for e.g. Solaris, while everybody probably
compiles with GCC, that doesn't mean they are using glibc, so
in practice getline() will only help on Linux.

I'm slowly warming up to xreadlines(), although we must be careful to
consider the consequences (do other file-like objects need to support
it too?).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan  2 23:46:18 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:46:18 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::  xrange : range
In-Reply-To: <3A5258E7.D52CA2C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHBIGAA.tim.one@home.com>

[Tim]
> ... Since the real Windows experts are hanging out at ActiveState,
> I bet one of them will volunteer to do it tonight <wink>.

[Paul Prescod]
> Mark is busy tonight and the Perl guys are still recovering from
> implementing it the first time. :)

I'm delighted, then, that you have nothing better to do than tease the
decent, hard-working folks on Python-Dev!  I'll be up until about 4am --
feel free to submit your patch anytime before then.

in-a-pinch-i'll-even-accept-it-tomorrow-ly y'rs  - tim




From guido at python.org  Tue Jan  2 23:53:14 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:53:14 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 17:00:37 EST."
             <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com> 
Message-ID: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>

> [Guido]
> > Quite apart from whether we should enable xreadlines(), could you look
> > into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> > cop-out answer is "use glibc" -- but for Windows it may pay to do our
> > own hack.
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code (roughly the same size and complexity as the glibc
> getline implementation -- check it out, it's not trivial, and glibc exploits
> compiler inlining to make it bearable) relying on reverse-engineered
> accidents of how MS happens to use all the fields from this undocumented
> struct (from MS's stdio.h):
> 
> struct _iobuf {
>         char *_ptr;
>         int   _cnt;
>         char *_base;
>         int   _flag;
>         int   _file;
>         int   _charbuf;
>         int   _bufsiz;
>         char *_tmpfname;
>         };
> typedef struct _iobuf FILE;
> 
> in their stdio implementation.  Else it won't play correctly with MS's
> stdio.  That's A Project.  Last year I tried extracting the relevant code
> from Perl, but, as is usual, gave up after unraveling the third (whatever)
> layer of mystery macros with no end in sight.  I bet it would take me a
> week.  Is it worth that much to you and DC?  Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Yeah.  That's too much.  Too bad.  I'm not holding my breath for
ActiveState though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan  2 23:52:58 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 16:52:58 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
	<20010102160605.A5211@kronos.cnri.reston.va.us>
	<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <14930.23498.53540.401218@beluga.mojam.com>

    Guido> I'm slowly warming up to xreadlines(), ...

I haven't followed this thread closely, and my brain is a bit frazzled at
the moment, but is there some fundamental reason that the file object's
readlines method can't be made lazy, perhaps only when given a sizehint?

Skip



From paulp at ActiveState.com  Tue Jan  2 23:59:47 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:59:47 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
		<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
		<20010102160605.A5211@kronos.cnri.reston.va.us>
		<200101022246.RAA16384@cj20424-a.reston1.va.home.com> <14930.23498.53540.401218@beluga.mojam.com>
Message-ID: <3A525D63.17ABCC87@ActiveState.com>

Skip Montanaro wrote:
> 
>     Guido> I'm slowly warming up to xreadlines(), ...
> 
> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

I suggested this at one point but it was pointed out that there is
probably a lot of code that works with the resulting list *as a list*
i.e. as a random-access, writable sequence object. I really wasn't
thrilled with xreadlines at first either...it's the least of all
possible evils (including the status quo).

 Paul



From nas at arctrix.com  Tue Jan  2 17:09:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:09:15 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:08PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>
Message-ID: <20010102080915.A30892@glacier.fnational.com>

On Tue, Jan 02, 2001 at 03:48:08PM -0500, Tim Peters wrote:
> test_compare is broken because the expected-output file has bizarre stuff in
> it like:
> 
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
> 
> What's up with that?

My fault.  I only ran regrtest.py and not "make test".  I'm not
sure why you say bizarre stuff though.  Do you object to testing
that 2 is less than None (something that is not part of the
language spec) or do you think that the results from cmp() should
be clamped between -1 and 1?

  Neil



From guido at python.org  Wed Jan  3 00:06:16 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 18:06:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:52:58 CST."
             <14930.23498.53540.401218@beluga.mojam.com> 
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com> <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>  
            <14930.23498.53540.401218@beluga.mojam.com> 
Message-ID: <200101022306.SAA16684@cj20424-a.reston1.va.home.com>

> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

Yes -- readlines() is documented to return a list, and some people do
things to it that require it to be a real list (e.g. sort or reverse
it or modify it in place or concatenate it with other lists).
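
A short sketch of the kind of code that would break, in modern Python.
The lazy_lines generator is a hypothetical stand-in for a lazy
readlines(), not anything that was proposed.

```python
import io

text = "pear\napple\nfig\n"

# readlines() returns a real list, so all of these work.
lines = io.StringIO(text).readlines()
lines.sort()
lines.reverse()
lines[0] = "zucchini\n"           # modify in place
combined = lines + ["extra\n"]    # concatenate with another list

def lazy_lines(f):
    """Hypothetical lazy replacement: yields lines one at a time."""
    for line in f:
        yield line

lazy = lazy_lines(io.StringIO(text))
# A lazy object offers iteration only, none of the list operations above.
print(hasattr(lazy, "sort"), hasattr(lazy, "__getitem__"))
```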

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan  3 00:19:14 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 18:19:14 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102080915.A30892@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>

[Tim]
> test_compare is broken because the expected-output file has
> bizarre stuff in it like:
>
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
>
> What's up with that?

[Neil Schemenauer]
> My fault.  I only ran regrtest.py and not "make test".

Neil, my platform doesn't even *have* a "make":  are you saying the test
passes for you when you run regrtest.py?  That's what I did.

> I'm not sure why you say bizarre stuff though.  Do you object to
> testing that 2 is less than None (something that is not part of the
> language spec)

Only in part.  Lang Ref 2.1.3 (Comparisons) says you can compare them, and
guarantees they won't compare equal, but doesn't define it beyond that.  If
Python actually says "less", fine, we can test for that, although to
minimize maintenance down the road it would be better to test for no more
than we expect Python to guarantee across releases and implementations
(suppose Jython says 2 is greater than None:  that's fine too, and it would
be better if the test suite didn't say Jython was broken).
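
A test restricted to that guarantee might look like the sketch below.  It assumes a present-day Python 3 (where cmp() no longer exists), and check_never_equal is a hypothetical helper name, not something from the test suite:

```python
def check_never_equal(a, b):
    """Check only the documented guarantee: a and b do not compare equal."""
    assert not (a == b), (a, b)
    assert a != b, (a, b)
    return True

# The same operand pairs that appear in the expected-output file.
checked = [check_never_equal(2, other) for other in ([1], (2,), None)]
```

Such a test says nothing about which operand is "less", so an implementation that orders them the other way still passes.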

> or do you think that the results from cmp() should be clamped
> between -1 and 1?

Not that either <wink>; cmp() isn't documented that way.

They're "bizarre" simply because they're not what Python returns!

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Dec 17 2000, 01:39:08) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> cmp(2, [1])
-1
>>> cmp(2, (2,))
-1
>>> cmp(2, None)
-1
>>>

The expected-output file is supposed to match what Python actually does.  I
have no idea where things like "-108" came from.  So things like -108 look
bizarre to me.  So long as cmp(2, [1]) returns -1 in reality, an
expected-output file that claims it returns -108 will never work no matter
how you run the tests.

One of us is missing something obvious here <wink>.




From paulp at ActiveState.com  Wed Jan  3 00:26:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 15:26:39 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>  
	            <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <3A5263AF.CE6C8C81@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> I'm slowly warming up to xreadlines(), although we must be careful to
> consider the consequences (do other file-like objects need to support
> it too?).

The implementation is such that it is pretty easy to add the method to
other file-like objects. It is also easy to use the xreadlines module to
get the same behavior for objects that do not have the method. 
Essentially, file.xreadlines is implemented like this:

def xreadlines(self):
    import xreadlines
    return xreadlines.xreadlines(self)

Any object can add the method similarly.

 Paul Prescod



From nas at arctrix.com  Tue Jan  2 17:51:48 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:51:48 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 06:19:14PM -0500
References: <20010102080915.A30892@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>
Message-ID: <20010102085148.A30986@glacier.fnational.com>

On Tue, Jan 02, 2001 at 06:19:14PM -0500, Tim Peters wrote:
> Neil, my platform doesn't even *have* a "make":  are you saying the test
> passes for you when you run regrtest.py?

Yes.  Isn't checking in code without running regrtest a capital
offence? :)

> Lang Ref 2.1.3 (Comparisons) says you can compare them, and
> guarantees they won't compare equal, but doesn't define it beyond that.

Okay, I'll use == rather than cmp().  When I was working on the coercion
patch I found cmp() useful.  I guess it shouldn't be in the standard
test suite, especially since Jython may implement things differently.

[Neil]
> or, do you think that the results from cmp() should be clamped
> between -1 and 1?

[Tim]
> Not that either <wink>; cmp() isn't documented that way.
> 
> They're "bizarre" simply because they're not what Python returns!

They do on my box:

    Python 2.0 (#19, Nov 21 2000, 18:13:04) 
    [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> cmp(1, None)
    -78

I guess MS uses a different strcmp than GNU.  Do you mind trying the
attached C code?  I get "-78" as output.  I should have thought a little
more before checking in the patch.  -78 is quite obviously a
machine/library dependent thing.

[Tim again]
> One of us is missing something obvious here <wink>.

I don't know about that.  The implementation of coercion and comparison
is not simple.  I've been studying it for some time now and I obviously
still don't know what the hell is going on.

AFAICT, the problem is that instances without a comparison method can
compare larger or smaller than numbers depending on where in memory the
objects are stored.

  Neil


#include <stdio.h>
#include <string.h>

int main()
{
    printf("%d\n", strcmp("", "None"));
    return 0;
}



From tim.one at home.com  Wed Jan  3 01:30:26 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 19:30:26 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102085148.A30986@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>

[Neil]
> They do on my box:
>
>     Python 2.0 (#19, Nov 21 2000, 18:13:04)
>     [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> cmp(1, None)
>     -78

Well, who cares about your silly box <wink>?  Messier than I thought!  Yes,
Windows strcmp is always in {-1, 0, 1}.  Rather than run tests, here's the
tail end of MS's strcmp.c:

        if ( ret < 0 )
                ret = -1 ;
        else if ( ret > 0 )
                ret = 1 ;

        return( ret );

Wasted cycles and stupid formatting <wink>.
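
The two behaviors can be modeled in a few lines.  This is only a sketch, written for a present-day Python 3, under the assumption that the strcmp on Neil's box returns the difference of the first mismatching bytes (which is at least consistent with -78 == -ord("N")); both function names are made up for illustration:

```python
def byte_difference_strcmp(a, b):
    """Return the difference of the first mismatching bytes (assumed behavior)."""
    a_bytes = a.encode("ascii") + b"\0"   # C strings end with a NUL byte
    b_bytes = b.encode("ascii") + b"\0"
    for x, y in zip(a_bytes, b_bytes):
        if x != y:
            return x - y
    return 0

def clamped_strcmp(a, b):
    """Clamp the result to -1, 0 or 1, like the tail of MS's strcmp.c."""
    ret = byte_difference_strcmp(a, b)
    return (ret > 0) - (ret < 0)

# Numbers are compared as if their type name were "".
for type_name in ("list", "tuple", "None"):
    print(type_name,
          byte_difference_strcmp("", type_name),
          clamped_strcmp("", type_name))
```

Under that assumption the unclamped results are -108, -116 and -78, the values in the expected-output file, while the clamped results are all -1.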

> ...
> AFAICT, the problem is that instances without a comparison method can
> compare larger or smaller than numbers depending on where in memory
> the objects are stored.

If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
cause that, or was it always this way?  I was able to provoke this badness:

>>> j < c < i
1
>>> j < i
0
>>>

i.e. it violates transitivity, and that's never supposed to happen in the
absence of user-supplied __cmp__.  Here c is an instance of "class C: pass",
and i and j are ints.

>>> type(i), type(j), type(c)
(<type 'int'>, <type 'int'>, <type 'instance'>)
>>> i, j, c
(999999, 1000000, <__main__.C instance at 00791B7C>)
>>> id(i), id(j), id(c)
(7941572, 7744676, 7936892)
>>>

Guido thought he fixed this kind of stuff once (and I believed him <wink>)
by treating all numbers as if they had type name "" (i.e., yes, an empty
string) when compared to non-numbers.  Then the usual "mixed-type
comparisons in the absence of __cmp__ compare via type name string" rule
ensured that numbers would always compare "less than" instances of any other
type.  That's the intent of the tail end:

		else if (vtp->tp_as_number != NULL)
			vname = "";
		else if (wtp->tp_as_number != NULL)
			wname = "";
		/* Numerical types compare smaller than all other types */
		return strcmp(vname, wname);

of PyObject_Compare.  So, in the example above, we *should* have

    i < c == 1
    j < c == 1
    j < c < i == 0

Unfortunately, we actually have

    i < c == 0

in that example.  We're apparently not getting to the "number hack" code
because c is an instance, and I'll confess up front that my eyes always
glazed over long before I got to PyInstance_HalfBinOp <0.half wink>.
Whatever, there's at least one bug somewhere in that path!   We should have
n < i == 1 for any numeric type n and any non-numeric type i (in the absence
of user-defined __cmp__).
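
A brute-force transitivity check is easy to sketch.  The code below targets a present-day Python 3; the string "c" is a placeholder for the instance, and the table holds the results observed in the session above, plus the assumption that c < j is false because j < c is true:

```python
from itertools import permutations

def transitivity_violations(items, less):
    """Return every (a, b, c) with a < b and b < c but not a < c."""
    return [(a, b, c) for a, b, c in permutations(items, 3)
            if less(a, b) and less(b, c) and not less(a, c)]

i, j, c = 999999, 1000000, "c"   # "c" is a stand-in for the instance

# Mixed int/instance results; (c, j) is inferred, the rest were observed.
observed = {(j, c): True, (c, i): True, (i, c): False, (c, j): False}

def observed_less(a, b):
    if isinstance(a, int) and isinstance(b, int):
        return a < b             # ints compare numerically
    return observed[(a, b)]

violations = transitivity_violations([i, j, c], observed_less)
```

Any non-empty result means the relation cannot be a consistent ordering; here the list includes (j, c, i).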





From skip at mojam.com  Wed Jan  3 02:27:03 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 19:27:03 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A525D63.17ABCC87@ActiveState.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
	<20010102160605.A5211@kronos.cnri.reston.va.us>
	<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
	<14930.23498.53540.401218@beluga.mojam.com>
	<3A525D63.17ABCC87@ActiveState.com>
Message-ID: <14930.32743.525564.69044@beluga.mojam.com>

    Paul> I suggested this at one point but it was pointed out that there is
    Paul> probably a lot of code that works with the resulting list *as a
    Paul> list*

How about this idea?  What if readlines() was allowed to return a lazy
evaluator if a sizehint > 0 was given?  I only saw one example outside of
test cases in the current CVS tree where readlines(sizehint) was used
(Tools/idle/GrepDialog.py), and it used it as expected:

    while 1:
      block = f.readlines(sizehint)
      if not block:
        break
      for line in block:
        more stuff

My suspicion is that most uses of sizehint will be like this.  It hasn't
been around all that long in Python-years (since 1.5a2), so there's probably
not tons of code to break (I agree the semantics would change), and the
majority of code that uses it probably looks like the above, which is almost
safe (if it returned "" instead of an empty evaluator when nothing was left
to read it would be safe).  The advantage would be that the above could
become the more obvious

    for line in f.readlines(sizehint):
      more stuff

and the change to file reading code that is "too slow" becomes much simpler.
(Of course, xreadlines() has that advantage as well.)
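
For comparison, the block-reading loop can be wrapped once so callers get the one-line-at-a-time form without changing what readlines() returns.  This is a sketch for a present-day Python 3; lazy_lines is a hypothetical name and io.StringIO stands in for a real file:

```python
import io

def lazy_lines(f, sizehint=8192):
    """Yield lines one at a time, reading them in blocks of about sizehint."""
    while True:
        block = f.readlines(sizehint)
        if not block:
            break
        for line in block:
            yield line

collected = list(lazy_lines(io.StringIO("a\nb\nc\n"), sizehint=2))
```

The generator never holds more than one block of lines at a time, which is the property the lazy proposals are after.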

I scanned my own code quickly.  I found about 10 uses with sizehint and 300
without.

I presume we are talking about 2.1 here.  In any case, it seems to me that
in Py3k readlines should be lazy.

Skip

P.S.  Why did FileInput class never grow a readlines method?



From nas at arctrix.com  Tue Jan  2 20:38:53 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:38:53 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 07:30:26PM -0500
References: <20010102085148.A30986@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>
Message-ID: <20010102113853.A31341@glacier.fnational.com>

On Tue, Jan 02, 2001 at 07:30:26PM -0500, Tim Peters wrote:
> > AFAICT, the problem is that instances without a comparison method can
> > compare larger or smaller than numbers depending on where in memory
> > the objects are stored.
> 
> If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
> cause that, or was it always this way?

To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
I'm ready to check in my coercion overhaul patch, assuming no
vetoes from the list.  It should fix this bug (and introduce a
whole slew of new ones :).

Guido suggested that I remove the "number types compare smaller
than other types" behavior.  What's your take on that?  The
current patch on SF always uses the type names.  It should be
easy to implement the old behavior though.

  Neil



From nas at arctrix.com  Tue Jan  2 20:48:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:48:09 -0800
Subject: [Python-Dev] Applying the PEP 208 (coercion overhaul) patch
Message-ID: <20010102114809.B31341@glacier.fnational.com>

I'm almost ready to apply SF patch #102652.  Guido has given the
okay assuming there are no objections from the rest of
python-dev.  The patch is large and modifies some complicated
parts of the interpreter.  I expect there will be some bugs.  If
you would like me to wait, speak now.

Guido has sent me some comments on the patch today which I plan
to review and address tonight.  I will probably apply the patch
tomorrow evening.

  Neil



From tim.one at home.com  Wed Jan  3 04:05:59 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 22:05:59 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102113853.A31341@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHNIGAA.tim.one@home.com>

[Neil Schemenauer, on a violation of transitivity j < c < i but not j < i]

> To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
> is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
> I'm ready to check in my coercion overhaul patch, assuming no
> vetoes from the list.  It should fix this bug (and introduce a
> whole slew of new ones :).

Sounds good to me!

> Guido suggested that I remove the "number types compare smaller
> than other types" behavior.  What's your take on that?  The
> current patch on SF always uses the type names.  It should be
> easy to implement the old behavior though.

It doesn't matter that they're specifically smaller, it matters that they
can't violate transitivity.  "numbers compare smaller" was introduced
deliberately (by Guido) because, e.g., before that we had

    99 < [99] < 99L

despite that 99 == 99L, because

   "int" < "list" < "long int"

Even stranger, we had

    100 < [99] < 0L < 100

and

    100 < [] < -101L < -100


Making numbers compare smaller than other types is one way to ensure stuff
like that can't happen; I can't think of a simpler way (although making them
compare larger than other types would be equally simple, as would making
them compare as if their type name were "Neil" <wink>).
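
The effect of the rule on the first chain can be sketched with type names alone.  The code targets a present-day Python 3; the set of numeric type names is an illustrative subset and the function names are made up:

```python
NUMERIC_TYPE_NAMES = {"int", "long int", "float"}   # illustrative subset

def effective_name(type_name, numbers_first):
    """Numbers compare as if their type name were "" when the rule is on."""
    if numbers_first and type_name in NUMERIC_TYPE_NAMES:
        return ""
    return type_name

def mixed_type_less(vname, wname, numbers_first=True):
    """Order two objects of different kinds by (effective) type name."""
    return (effective_name(vname, numbers_first)
            < effective_name(wname, numbers_first))

# Without the rule: "int" < "list" < "long int", so 99 < [99] < 99L.
old_chain = (mixed_type_less("int", "list", numbers_first=False)
             and mixed_type_less("list", "long int", numbers_first=False))

# With the rule a list can no longer sit between two numbers.
new_chain = (mixed_type_less("int", "list")
             and mixed_type_less("list", "long int"))
```

With the rule on, every number sorts below every non-number, so comparisons among the numbers themselves are the only thing that decides their relative order.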




From paulp at ActiveState.com  Wed Jan  3 04:34:59 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 19:34:59 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
		<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
		<20010102160605.A5211@kronos.cnri.reston.va.us>
		<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
		<14930.23498.53540.401218@beluga.mojam.com>
		<3A525D63.17ABCC87@ActiveState.com> <14930.32743.525564.69044@beluga.mojam.com>
Message-ID: <3A529DE3.D93C3916@ActiveState.com>

Skip Montanaro wrote:
> 
>...
> 
> I presume we are talking about 2.1 here.  In any case, it seems to me that
> in Py3k readlines should be lazy.

I agree, but I'm ambivalent about your suggestion for polymorphic return
values from readlines(). Yet another option is a "lazy=1" option.

 Paul Prescod



From tim.one at home.com  Wed Jan  3 05:33:29 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 23:33:29 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>

[Guido, writes a timing program]

[Jeff, if you weren't copied on all this stuff, you can play catch-up
 by reading the archives, at
    http://mail.python.org/pipermail/python-dev/
]

> ...
> I am including the timer program below my signature.  The test input
> was the current access_log of dinsdale.python.org, which has about 119
> Mbytes and 1M lines (as counted by the test program).

For a contrast, I cobbled together a large test file out of various chunks
of C source, .py source, HTML source, and email archives.  I was shooting
for the same size you used (~119Mb), but ended up with more than 3x as many
lines.

> I measure about a factor of 2 between readlines with a sizehint (of 1
> MB) and fileinput;

Factor of 7 here (Jeff, NeilS eventually figured out that Guido was using a
CVS version of Python that has AndrewK's glibc getline patch, a zippier
line-input routine than Python 2.0 has; but it only applies to platforms
using glibc).

> ...
> Output (the first time is realtime seconds, the second CPU seconds):
>
> total 119808333 chars and 1009350 lines
> count_chars_lines     7.944  7.890
> readlines_sizehint    5.375  5.320
> using_fileinput      15.861 15.740
> while_readline        8.648  8.570
>
> This was on a 600 MHz Pentium-III Linux box (RH 6.2).

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

866 MHz P3 Win98SE, current CVS Python.  I have no handy explanation for why
clock() and time() differ on my box (Win98 has no notions of "user time" or
"CPU time" distinct from clock time).

> Note that count_chars_lines and readlines_sizehint use the same
> algorithm -- the difference is that readlines_sizehint uses 'pass' as
> the inner loop body, while count_chars_lines adds two counters.
>
> Given that very light per-line processing (counting lines and
> characters) already increases the time considerably, I'm not sure I
> buy the arguments that the I/O overhead is always considerable.

I disagree that this is "very light processing", although I agree it's hard
to think of lighter processing <wink>:  it's a few Python statements per
line, which I'd say is pretty *typical* processing.  Read a line, run a
string find or regexp search on it, test the result, sometimes fiddle the
line accordingly and sometimes not.  File-crunching apps generally aren't
rocket science!  For example, I changed count_chars_lines to tally the
number of lines containing the string "Guido" instead, and the runtime went
up by just 0.8 seconds (BTW, it found 13808 of them <wink>):  if you're
thinking in C terms, millions of failing searches for "Guido" may seem like
more work, but the number of Python stmts executed usually counts more than
what the stmts do at the C level.
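
The shape of such a measurement can be sketched as follows.  This is not the timer program from the thread; it is a small stand-in for a present-day Python 3, with made-up function names, an in-memory io.StringIO instead of a 119 MB file, and time.process_time() as the CPU clock:

```python
import io
import time

def count_with_readline(f, needle):
    """One readline() call per line."""
    total = hits = 0
    while True:
        line = f.readline()
        if not line:
            break
        total += 1
        if needle in line:
            hits += 1
    return total, hits

def count_with_sizehint(f, needle, sizehint=1 << 20):
    """Read blocks of lines with readlines(sizehint), then loop over each block."""
    total = hits = 0
    while True:
        block = f.readlines(sizehint)
        if not block:
            break
        for line in block:
            total += 1
            if needle in line:
                hits += 1
    return total, hits

def timed(func, data, needle):
    """Return (result, wall-clock seconds, CPU seconds) for one run."""
    f = io.StringIO(data)
    wall_start, cpu_start = time.time(), time.process_time()
    result = func(f, needle)
    return result, time.time() - wall_start, time.process_time() - cpu_start

sample = "spam and eggs\nGuido was here\n" * 1000
for func in (count_with_readline, count_with_sizehint):
    result, wall, cpu = timed(func, sample, "Guido")
    print("%-20s %s %.3f %.3f" % (func.__name__, result, wall, cpu))
```

Both loops run the same few Python statements per line, so any difference in the timings comes from how the lines are fetched.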

> ...
> Now what to do?  I still don't like xreadlines very much, but I do
> see that it can save some time.  But my test doesn't confirm Neel's
> times as posted by Tim:
>
>> Slowest: for line in fileinput.input('foo'):     # Time 100
>>        : while 1: line = file.readline()         # Time 75
>>        : for line in LinesOf(open('foo')):       # Time 25
>> Fastest: for line in file.readlines():           # Time 10
>>          while 1: lines = file.readlines(hint)   # Time 10
>>          for line in xreadlines(file):           # Time 10
>
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

I don't know what Neel used for an input file, or which platform he used
either.  And this is bound to vary a lot across platforms.  As above, I saw
a factor of 7 between fastest and slowest and a factor of 3 between readline
and readlines_sizehint.

BTW, on my platform the Perl script (using a recent ActiveState Windows
Perl)

open(FILE, "ga.txt");
while (<FILE>) {
    1;
}

ran in about 6 seconds (I never figured how to get Perl to compute usable
timings itself)-- substantially faster than even readlines_sizehint! --and
changing the body to

$nc = $nl = 0;
while (<FILE>) {
    ++$nl;
    $nc += length;
}
print "$nc $nl\n";

boosted that to about 8 seconds.  So Perl has gotten zippier too over the
years.




From tim.one at home.com  Wed Jan  3 10:32:55 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 04:32:55 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com>

[Guido & Tim, wonder about faking getline-like functionality for Windows]

The attached is kinda baffling.  The std tests pass with it, and it changes my test
timings from:

count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

to:

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

Big win?  You bet.  But ...

The baffling parts:

1. That Perl still takes only 6 seconds in line-at-a-time mode.

2. I originally wrote a getline workalike, instead of building directly into a PyString
buffer.  That made my test run *slower*, and I'm talking factor of 2, not a yawn.  To
judge from my usually silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
mallocs required may have triggered the horrid Win9x malloc-thrashing problem I wrote
about while I was still at Dragon.  Consider that another vote for Vlad's PyMalloc --
we've got no handle on x-platform dynamic memory behavior now.  Python's destiny is to
replace both the platform OS and libc anyway <0.9 wink>.

The scary parts:

+ As the "XXX" comments indicate, this is full of little insecurities.

+ Another one I just thought of:  if the user's last operations on the fp were two or more
consecutive ungetc calls, all bets are off.  But then MS doesn't define what happens then
either.

+ This is much less ambitious than I recall Perl's code being:  it doesn't try to guess
anything about the file, and effectively captures only what would happen if you could
unroll the guts of a getc-in-a-loop and optimize the snot out of it.  The good news is
that this means it's much easier to maintain (it touches only two of the MS FILE* fields,
and in ways that are pretty obviously correct).  The bad news is that this seems also
pretty clearly all there *is* to be gotten out of breaking into the FILE* abstraction for
the particular test case I'm using; and increasing TUNEME doesn't save any time at all:
the sucker is flying at full speed already.

+ It (line-at-a-time) drops to a little under 13 seconds if I comment out the thread
macros.

+ I haven't looked at Perl's implementation in a year, and they must have dreamt up
another trick since then.  That's a "scary part" indeed to anyone who has ever looked at
Perl's implementation.

retreating-into-a-fetal-position-ly y'rs  - tim


Anyone wants to play, the sandbox is fileobject.c.  Do two things:

1. Insert this new chunk somewhere above get_line:

#ifdef MS_WIN32
static PyObject*
win32_getline(FILE *fp)
{
	/* XXX ignores thread safety -- but so does MS's getc macro! */
	PyObject* v;
	char* pBuf;	/* next free slot in v's buffer */
	/* MS's internals are declared in terms of ints, but it's a sure bet
	 * that won't last forever -- use size_t now & live w/ the casting;
	 * ditto for Python's routines
	 */
	size_t total_buf_size = 100;
	size_t free_buf_size = total_buf_size;
#define TUNEME 1000	/* how much to boost the string buffer when exhausted */

	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
	if (v == NULL)
		return NULL;
	pBuf = BUF(v);
	Py_BEGIN_ALLOW_THREADS
	for (;;) {
		int ch;		/* int, not char:  getc() returns an int so that EOF stays distinct from byte 0xff */
		size_t ms_cnt;	/* FILE->_cnt shadow */
		char* ms_ptr;	/* FILE->_ptr shadow */
		size_t max_to_copy, i;
		/* stdio buffer empty or in unknown state; rather
		 * than try to simulate every quirk of MS's internals,
		 * let the MS macros deal with it.
		 */
		/* XXX we also wind up here when we simply run out of string
		 * XXX buffer space, but I'm not sure I care:  making this a
		 * XXX double-nested loop doesn't seem worth it
		 */
		ch = getc(fp);
		if (ch == EOF)
			break;
		/* make sure we've got some breathing room */
		if (free_buf_size < 100) {
			size_t currentoffset = pBuf - BUF(v);
			total_buf_size += TUNEME;  /* XXX check for overflow */
			Py_BLOCK_THREADS
			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
				return NULL;
			Py_UNBLOCK_THREADS
			pBuf = BUF(v) + currentoffset;
			free_buf_size = TUNEME;
		}
		/* ch wasn't EOF, so store it */
		*pBuf++ = ch;
		--free_buf_size;
		if (ch == '\n') {
			break;
		}
		ms_cnt = (size_t)fp->_cnt;
		if (!ms_cnt) {
			/* XXX this is a slow way to read one character at
			 * XXX a time if, e.g., the stream is unbuffered
			 */
			continue;
		}
		/* payback!  now we don't have to check for buffer overflows or
		 * EOF inside the loop, nor does the macro _filbuf() branch force
		 *  _ptr and _cnt in and out of memory on each iteration
		 */
		ms_ptr = fp->_ptr;
		assert(ms_cnt > 0);
		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;
		do {
			/* XXX unclear to me why MS's getc macro does "& 0xff" */
			*pBuf++ = ch = *ms_ptr++ & 0xff;
		} while (--i && ch != '\n');
		/* update the shadows & counters */
		fp->_ptr = ms_ptr;
		free_buf_size -= max_to_copy - i;
		fp->_cnt = ms_cnt - (max_to_copy - i);
		if (ch == '\n')
			break;
	}
	Py_END_ALLOW_THREADS
	_PyString_Resize(&v, pBuf - BUF(v));
	return v;
}
#endif

2. Within get_line, add this before the #endif (this is the getline #if block):

#elif defined(MS_WIN32)
	if (n == 0) {
		return win32_getline(fp);
	}
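
The core idea (scan whatever is already sitting in the buffer for a newline in one go, and only refill when the buffer runs dry) can be sketched outside of C.  The class below is an illustration for a present-day Python 3, not a translation of the patch: the names are made up, it works on any object with a read(n) method, and it ignores thread safety entirely:

```python
import io

class BufferedLineReader:
    """Read lines by scanning a buffer in bulk instead of byte by byte."""

    def __init__(self, raw, bufsize=4096):
        self._raw = raw          # any object with read(n) returning bytes
        self._buf = b""          # current buffer contents
        self._pos = 0            # index of the next unread byte in _buf
        self._bufsize = bufsize

    def readline(self):
        pieces = []
        while True:
            if self._pos >= len(self._buf):
                # Buffer exhausted: refill it, or stop at end of file.
                self._buf = self._raw.read(self._bufsize)
                self._pos = 0
                if not self._buf:
                    break
            end = self._buf.find(b"\n", self._pos)
            if end >= 0:
                # Newline found: take everything up to and including it.
                pieces.append(self._buf[self._pos:end + 1])
                self._pos = end + 1
                break
            # No newline yet: take the rest of the buffer and refill.
            pieces.append(self._buf[self._pos:])
            self._pos = len(self._buf)
        return b"".join(pieces)

reader = BufferedLineReader(io.BytesIO(b"ab\ncd\n\nef"), bufsize=3)
lines = [reader.readline() for _ in range(5)]
```

The bufsize of 3 is deliberately tiny so that lines straddle buffer refills; an empty bytes object signals end of file, as with readline() on a real file.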




From ping at lfw.org  Wed Jan  3 12:40:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 3 Jan 2001 05:40:47 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <14840.19556.127151.457533@anthem.concentric.net>
Message-ID: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>

Uh... hi.  <sheepish look>

I know i've all but dropped out of existence for a long time, what with
my simultaneous first stints as a grad student, a teaching assistant, and
a house cook (!) and all, but i didn't want to let this work go to waste.

Now that the holidays are here i can *finally* try to get some work done!

So, i've updated inspect.py in response to Barry's comments, and below is
my reply to this old thread.  I also wrote some regression tests.

I tried to submit inspect.py to SourceForge, but i got:

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is
    too long.  Maximum length is 16382

Does anyone know what's going on with that?


Anyway, the latest module and regression tests are available at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

for your perusal.




On Thu, 26 Oct 2000 barry at wooz.org wrote:
> Some thoughts after an initial scan of inspect.py:
> 
> - The doc strings for the is*() functions aren't accurate.
>   E.g. ismodule() says that it asks whether "the object is a module
>   with the __file__ special attribute", but that isn't really what it
>   tests!  Guido points out that builtin modules don't currently have
>   __file__ and besides, you're really testing that the type of the
>   object is ModuleType.

Perhaps a different wording would be better, but i should at least
clarify the intention: i wrote them that way because it seemed that
the current objects export an unofficial "interface" by means of the
special attributes they provide.  The purpose of the "is*()" functions
is to determine whether an object meets one of these interfaces.

A complete interface would provide (1) a type-checker, (2) a constructor,
and (3) the methods.  As for (2), we don't normally allow construction of
these things (except for wizards using the newmodule).  As for (3), i
suppose that one could further encapsulate these interfaces by providing
spelled-out methods like "def getcode(f): return f.func_code", but it
didn't seem worth the trouble.  So that left just (1), and i had the
other parts in mind while trying to describe (1).

The type-checkers aren't of much use unless they accurately reflect
the availability of the special attributes.  Do you see what i'm trying
to do?  Maybe you can suggest a better way of doing it... anyway, i've
tried to compromise in the docstrings as submitted.

> - Don't make the predicate in getmembers() default to "lambda x: 1"
>   Instead make the default None, and skip the predicate test if it is
>   None.

Okay, fine.
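
A minimal sketch of that shape, for a present-day Python 3 (members and Sample are hypothetical names, and this is a simplification rather than the code in inspect.py):

```python
def members(obj, predicate=None):
    """Return sorted (name, value) pairs; a predicate of None accepts everything."""
    results = []
    for name in dir(obj):
        value = getattr(obj, name)
        # Skip the predicate call entirely when no predicate was given.
        if predicate is None or predicate(value):
            results.append((name, value))
    results.sort(key=lambda pair: pair[0])
    return results

class Sample:
    answer = 42

    def method(self):
        return self.answer

all_names = [name for name, _ in members(Sample)]
callable_names = [name for name, _ in members(Sample, callable)]
```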

> - getdoc()'s docstring should describe the margin munging it does.

Okay, done.

> - findsource() seems off-by one, e.g.
> 
>    >>> x = inspect.findsource(inspect.findsource)
>    >>> x[1]
>    138
> 
>    but the function really starts on line 139.

138 was the intended result here.  Indeed the function starts
on line 139 if you start counting from 1.  The reason it returns
138 is that it's the index you would use for the array of lines
(thus x[0][x[1]] or file.readlines()[138] is the first line of
the function).
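
The two conventions differ only by one, as this small sketch shows.  It targets a present-day Python 3 and uses a made-up list of lines and a hypothetical find_def helper rather than inspect.py itself:

```python
source_lines = ["import os\n", "\n", "def f():\n", "    return 1\n"]

def find_def(lines, name):
    """Return the zero-based index of the line that starts 'def name('."""
    for index, line in enumerate(lines):
        if line.startswith("def " + name + "("):
            return index
    raise ValueError("no definition of %r" % name)

index = find_def(source_lines, "f")      # usable directly as a list index
first_line = source_lines[index]
line_number = index + 1                  # what an editor would display
```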

Which way makes more sense?  Should it be changed?

> - I notice that currentframe() still uses the try/except trick to get
>   the frame object.  It's much more efficient to provide a C
>   trampoline for getting that information.

Sure, if there's a faster way, that's fine.  It just wasn't
something i expected to be used really often, and i wanted to
write the module in pure Python so it could be easily maintained.

I added a line to clobber the pure-Python currentframe() with
sys._getframe() if it exists.
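
That arrangement might look like the sketch below, written for a present-day Python 3.  The names pure_currentframe and caller_name exist only so the two variants can be compared; they are not taken from inspect.py:

```python
import sys

def currentframe():
    """Pure-Python fallback: return the frame of the caller."""
    try:
        raise Exception
    except Exception:
        # tb_frame is this function's frame; f_back is the caller's.
        return sys.exc_info()[2].tb_frame.f_back

pure_currentframe = currentframe   # kept only for the comparison below

# Prefer the C-level accessor when the interpreter provides one.
if hasattr(sys, "_getframe"):
    currentframe = sys._getframe

def caller_name(get_frame):
    """Report the name of the function that called get_frame()."""
    return get_frame().f_code.co_name

pure_result = caller_name(pure_currentframe)
fast_result = caller_name(currentframe)
```

Both variants hand back the frame of the function that called them, so callers do not need to know which one is installed.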

> - If this were included in the library, we might want to 2.0-ify it.

It currently doesn't rely on any 2.0 features, and it would be
kind of nice to have it still work with 1.5 (especially if it is
part of a drop-in documentation tool, as it is now, since it goes
with htmldoc).


-- ?!ng

"Computers are useless.  They can only give you answers."
    -- Pablo Picasso





From guido at python.org  Wed Jan  3 13:06:33 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:06:33 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>

Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
widespread that is -- do Linux developers pay attention to this
standard at all?  According to the webpage it's (c) 1997.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 03 Jan 2001 10:58:44 +0200
From:    Erno Kuusela <erno at iki.fi>
To:      guido at python.org
Subject: getc_unlocked note

hello,

i was reading the python-dev archives and saw that someone had noticed
my getline/getc_unlocked post from the newsgroup. a correction to the
python-dev thread: getc_unlocked and friends are in fact standard (not c99
though since c99 doesn't specify threads); they are part of the single
unix specification.

link:
http://www.opennc.org/onlinepubs/007908799/xsh/getc_unlocked.html

   -- erno

------- End of Forwarded Message




From guido at python.org  Wed Jan  3 13:37:11 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:37:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 04:32:55 EST."
             <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com> 
Message-ID: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>

> 1. That Perl still takes only 6 seconds in line-at-a-time mode.

Are you sure Perl still uses stdio at all?

If so, does it open the file in binary or in text mode?  Based on the
APIs in MS's libc, I presume that the crlf->lf translation is not done
by stdio proper but by the Unix I/O emulation just underneath it
(open() has an O_BINARY option flag, so read() probably does the
translation).  That comes down to copying most bytes an extra time.

(To test this hypothesis, you could try to open the test file with
mode "rb" and see if it makes a difference.)

> 2. I originally wrote a getline workalike, instead of building
> directly into a PyString buffer.  That made my test run *slower*,
> and I'm talking factor of 2, not a yawn.  To judge from my usually
> silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
> mallocs required may have triggered the horrid Win9x
> malloc-thrashing problem I wrote about while I was still at Dragon.
> Consider that another vote for Vlad's PyMalloc -- we've got no
> handle on x-platform dynamic memory behavior now.  Python's destiny
> is to replace both the platform OS and libc anyway <0.9 wink>.
>
> The scary parts:
>
> + As the "XXX" comments indicate, this is full of little
> insecurities.

My biggest worry: thread-safety.  There must be a way to lock the file
(you indicated that fgets() uses it).

> + Another one I just thought of: if the user's last operations on
> the fp were two or more consecutive ungetc calls, all bets are off.
> But then MS doesn't define what happens then either.

Python doesn't have an interface to ungetc(), and I believe the stdio
standard says you can only call ungetc() once consecutively.  Assuming
other C code linked with Python obeys this rule (a pretty safe
assumption), we should be fine.  And if the assumption is violated, I
presume it's really that C code's fault -- plus, code that only
uses getc() would be screwed just as badly.

> + This is much less ambitious than I recall Perl's code being: it
> doesn't try to guess anything about the file, and effectively
> captures only what would happen if you could unroll the guts of a
> getc-in-a-loop and optimize the snot out of it.  The good news is
> that this means it's much easier to maintain (it touches only two of
> the MS FILE* fields, and in ways that are pretty obviously correct).
> The bad news is that this seems also pretty clearly all there *is*
> to be gotten out of breaking into the FILE* abstraction for the
> particular test case I'm using; and increasing TUNEME doesn't save
> any time at all: the sucker is flying at full speed already.

You probably don't have many lines longer than 1000 characters.

> + It (line-at-a-time) drops to a little under 13 seconds if I
> comment out the thread macros.

If you mean the Py_BLOCK_THREADS around the resize, that can be safely
dropped.  (If/when we introduce Vladimir's malloc, we'll have to
decide whether it is threadsafe by itself or whether it requires the
global interpreter lock.  I vote to make it threadsafe by itself.)

> + I haven't looked at Perl's implementation in a year, and they must
> have dreamt up another trick since then.  That's a "scary part"
> indeed to anyone who has ever looked at Perl's implementation.
>
> retreating-into-a-fetal-position-ly y'rs - tim
> 
> 
> Anyone wants to play, the sandbox is fileobject.c.  Do two things:
> insert this new chunk somewhere above get_line:
> 
> #ifdef MS_WIN32
> static PyObject*
> win32_getline(FILE *fp)
> {
> 	/* XXX ignores thread safety -- but so does MS's getc macro! */
> 	PyObject* v;
> 	char* pBuf;	/* next free slot in v's buffer */
> 	/* MS's internals are declared in terms of ints, but it's a sure bet
> 	 * that won't last forever -- use size_t now & live w/ the casting;
> 	 * ditto for Python's routines
> 	 */
> 	size_t total_buf_size = 100;
> 	size_t free_buf_size = total_buf_size;
> #define TUNEME 1000	/* how much to boost the string buffer when exhausted */
> 
> 	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
> 	if (v == NULL)
> 		return NULL;
> 	pBuf = BUF(v);
> 	Py_BEGIN_ALLOW_THREADS
> 	for (;;) {
> 		char ch;
> 		size_t ms_cnt;	/* FILE->_cnt shadow */
> 		char* ms_ptr;	/* FILE->_ptr shadow */
> 		size_t max_to_copy, i;
> 		/* stdio buffer empty or in unknown state; rather
> 		 * than try to simulate every quirk of MS's internals,
> 		 * let the MS macros deal with it.
> 		 */
> 		/* XXX we also wind up here when we simply run out of string
> 		 * XXX buffer space, but I'm not sure I care:  making this a
> 		 * XXX double-nested loop doesn't seem worth it
> 		 */
> 		ch = getc(fp);
> 		if (ch == EOF)
> 			break;
> 		/* make sure we've got some breathing room */
> 		if (free_buf_size < 100) {
> 			size_t currentoffset = pBuf - BUF(v);
> 			total_buf_size += TUNEME;  /* XXX check for overflow */
> 			Py_BLOCK_THREADS
> 			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
> 				return NULL;
> 			Py_UNBLOCK_THREADS
> 			pBuf = BUF(v) + currentoffset;
> 			free_buf_size = TUNEME;
> 		}
> 		/* ch wasn't EOF, so store it */
> 		*pBuf++ = ch;
> 		--free_buf_size;
> 		if (ch == '\n') {
> 			break;
> 		}
> 		ms_cnt = (size_t)fp->_cnt;
> 		if (!ms_cnt) {
> 			/* XXX this is a slow way to read one character at
> 			 * XXX a time if, e.g., the stream is unbuffered
> 			 */
> 			continue;
> 		}
> 		/* payback!  now we don't have to check for buffer overflows or
> 		 * EOF inside the loop, nor does the macro _filbuf() branch force
> 		 *  _ptr and _cnt in and out of memory on each iteration
> 		 */
> 		ms_ptr = fp->_ptr;
> 		assert(ms_cnt > 0);
> 		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;

Doesn't it make more sense to delay the resize until this point?  I
don't know how much the character copying accounts for, but I could
imagine a strategy based on memchr() and memcpy() that first searches
for a \n, and if found, allocates to the right size before copying.
Typically, the buffer contains many lines, so this could be optimized
into requiring a single exactly-sized malloc() call in the common case
(where the buffer doesn't wrap).  But possibly scanning the buffer for
\n and then copying the bytes separately, even with memchr() and
memcpy(), slows things down too much for this to be faster.

> 		do {
> 			/* XXX unclear to me why MS's getc macro does "& 0xff" */
> 			*pBuf++ = ch = *ms_ptr++ & 0xff;

I know why.  getchar() returns an int in the range [-1, 255].  If
chars are signed the &0xff is needed else you would get a return in
the range [-128, 127] and -1 would be ambiguous (EOF==-1).  Not sure
if they *are* signed on any MS platform -- if they aren't, whoever
coded this wasn't thinking -- on the other hand the compiler probably
optimizes it out.  But here since you're copying to another character,
it's pointless.

> 		} while (--i && ch != '\n');
> 		/* update the shadows & counters */
> 		fp->_ptr = ms_ptr;
> 		free_buf_size -= max_to_copy - i;
> 		fp->_cnt = ms_cnt - (max_to_copy - i);
> 		if (ch == '\n')
> 			break;
> 	}
> 	Py_END_ALLOW_THREADS
> 	_PyString_Resize(&v, pBuf - BUF(v));
> 	return v;
> }
> #endif
> 
> 2. Within get_line, add this before the #endif (this is the getline #if block):
> 
> #elif defined(MS_WIN32)
> 	if (n == 0) {
> 		return win32_getline(fp);
> 	}

Note that get_line() with negative n could be implemented as
get_line(0) with some post processing.  This should be done completely
separately, in PyFile_GetLine.  The negative n case is only used by
raw_input() -- it means strip the \n and raise EOFError for EOF, and I
expect that this is rarely if ever used in a speed-conscious
situation.
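
A minimal sketch of the post-processing described above, under stated
assumptions: the line has already been read by the n == 0 path, and the
helper name strip_for_raw_input and its -1 return convention are
hypothetical, not taken from fileobject.c.  It only illustrates "strip
the \n, and treat an empty result as EOF".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: post-process a line obtained from the n == 0 path.
 * Strips one trailing '\n' in place.  Returns the new length, or -1 when
 * the line is empty, which is how EOF shows up; a caller would turn that
 * into EOFError. */
static long
strip_for_raw_input(char *line, size_t len)
{
    assert(line != NULL);
    if (len == 0)
        return -1;                  /* nothing was read: EOF */
    if (line[len - 1] == '\n')
        line[--len] = '\0';         /* drop the newline */
    return (long)len;
}
```

Keeping this outside the hot loop means the fast path never has to test
the sign of n.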

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan  3 15:56:31 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 09:56:31 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 07:06:33 EST."
             <200101031206.HAA19182@cj20424-a.reston1.va.home.com> 
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com> 
Message-ID: <200101031456.JAA19990@cj20424-a.reston1.va.home.com>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?  According to the webpage it's (c) 1997.

Erno Kuusela gave me some more info about this; glibc supports it.

I did a quick test which suggests that it is a lot faster than regular
getc() -- on a small test file it's actually faster than GNU
getline(), even with the proper flockfile() / funlockfile() calls.
(The test file was 6Mb -- 10 copies of /etc/termcap, which has short
lines -- avg 43 chars.)

This together with Tim's Win32-specific hacks might be the best we
can do for get_line().  However, raw xreadlines is still almost twice
as fast, so it's still under consideration.

Maybe MS supports a similar unlocked getc macro, and a separate
primitive to lock/unlock a file?  That would allow more unified code.

(Quick research shows that it exists, but only in internal form.  We
could probably call _lock_file() and _unlock_file(), and define our
own getc_lk(), protected by the proper set of macros.  This could all
be presented by config.h as flockfile(), funlockfile(), and
getc_unlocked() macros.)
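
A rough sketch of the lock-once-per-line pattern discussed here, assuming
a platform that provides the POSIX flockfile(), funlockfile() and
getc_unlocked() calls.  The helper name read_line_unlocked and its
fixed-size buffer interface are made up for illustration; this is not the
code in the SourceForge patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L     /* ask for flockfile() and friends */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: read one line (including the '\n', if any) from fp
 * into buf, taking the stream lock once for the whole line rather than
 * once per character.  Returns the number of bytes stored; 0 means EOF or
 * an error before any byte was read.  buf is always NUL-terminated, and
 * lines longer than size - 1 bytes are returned in pieces. */
static size_t
read_line_unlocked(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    assert(fp != NULL && buf != NULL);
    if (size == 0)
        return 0;
    flockfile(fp);                  /* one lock per line */
    while (n + 1 < size && (c = getc_unlocked(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;
    }
    funlockfile(fp);
    buf[n] = '\0';
    return n;
}
```

On a platform without these calls, the same loop could fall back to plain
getc() with the lock and unlock steps compiled out.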

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 16:27:09 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:27:09 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:06:33AM -0500
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>
Message-ID: <20010103102709.A19451@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 07:06:33AM -0500, Guido van Rossum wrote:
>Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
>widespread that is -- do Linux developers pay attention to this
>standard at all?  According to the webpage it's (c) 1997.

It seems to be in glibc 2.1, but I don't know how much it would help,
and the added complexity of having to lock the file separately worries
me, perhaps due to a superstitious fear of angering the Thread Gods.

--amk



From akuchlin at mems-exchange.org  Wed Jan  3 16:44:57 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:44:57 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Wed, Jan 03, 2001 at 04:35:10PM +0100
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>
Message-ID: <20010103104457.A19493@kronos.cnri.reston.va.us>

[Cc'ing to python-dev].  

On Wed, Jan 03, 2001 at 04:35:10PM +0100, Thomas Heller wrote:
>You didn't expect this script run under windows?
>(It does not run)

It shouldn't matter, I think, since the makesetup stuff doesn't run on
Windows either; presumably the compiled-in modules are specified by an
MSVC project file, or something similar.  Can anyone confirm that I
don't care if setup.py works on Windows?  (Well, I *know* for a fact I
don't care; but should I? :) )

--amk




From guido at python.org  Wed Jan  3 16:49:43 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 10:49:43 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: Your message of "Wed, 03 Jan 2001 10:44:57 EST."
             <20010103104457.A19493@kronos.cnri.reston.va.us> 
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>  
            <20010103104457.A19493@kronos.cnri.reston.va.us> 
Message-ID: <200101031549.KAA20188@cj20424-a.reston1.va.home.com>

> It shouldn't matter, I think, since the makesetup stuff doesn't run on
> Windows either; presumably the compiled-in modules are specified by an
> MSVC project file, or something similar.  Can anyone confirm that I
> don't care if setup.py works on Windows?  (Well, I *know* for a fact I
> don't care; but should I? :) )

Personally, I don't think it's worth making setup.py work for
Windows.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 21:04:07 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 15:04:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Wed, Jan 03, 2001 at 08:47:30AM -0800
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>
Message-ID: <20010103150407.D20301@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 08:47:30AM -0800, GvR wrote:
>Summary: speed up readline() using getc_unlocked()

So what does the performance of this version look like?

--amk



From guido at python.org  Wed Jan  3 21:25:53 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:25:53 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:04:07 EST."
             <20010103150407.D20301@kronos.cnri.reston.va.us> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>  
            <20010103150407.D20301@kronos.cnri.reston.va.us> 
Message-ID: <200101032025.PAA27457@cj20424-a.reston1.va.home.com>

> >Summary: speed up readline() using getc_unlocked()
> 
> So what does the performance of this version look like?

Very slightly faster than the GNU getline() version.  Without GNU
getline, the old code was about 3.5 times slower.

Here are the current times on a 6 Mb file (fileinput.py has my
sourceforge speedup patch too):

$ ./python ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.943  0.930
readlines_sizehint    0.544  0.540
using_fileinput       2.089  2.090
while_readline        0.956  0.960

For comparison, here's what Python 1.5.2 does with the same test
(which should be pretty close to what the released Python 2.0 does; I
don't have a copy of that handy).

$ python1.5 ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.836  0.820
readlines_sizehint    0.523  0.520
using_fileinput       5.739  5.740
while_readline        3.670  3.670

I don't know why count_chars_lines slowed down proportionally more than
readlines_sizehint.  (The += operator didn't make a difference either
way.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan  3 21:45:38 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:45:38 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:25:53 EST."
             <200101032025.PAA27457@cj20424-a.reston1.va.home.com> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us>  
            <200101032025.PAA27457@cj20424-a.reston1.va.home.com> 
Message-ID: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>

I should add that the patches are on SourceForge:

fileinput.py:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103081&group_id=5470

fileobject.c:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103082&group_id=5470

I'm ready to check these in, but I'm waiting 24 hours in case there's
something I've missed.  (I haven't actually tested these on any other
platform besides Linux.)

Jeff Epler's xreadlines patch is here:
http://sourceforge.net/patch/?func=detailpatch&patch_id=102915&group_id=5470

Note that Jeff's patch includes a patch to fileinput.py that does the
same thing as mine but using his xreadlines module instead of directly
using readlines(sizehint) as does mine.  I like my approach better,
mostly because it reduces dependencies.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 22:25:30 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 16:25:30 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 03:45:38PM -0500
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>
Message-ID: <20010103162530.A20433@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 03:45:38PM -0500, Guido van Rossum wrote:
>I'm ready to check these in, but I'm waiting 24 hours in case there's
>something I've missed.  (I haven't actually tested these on any other
>platform besides Linux.)

On Solaris 2.6, the configure script doesn't detect that
getc_unlocked() & friends are supported; details available from the
patch.  After editing config.h manually to enable them, the results are:

Before getc_unlocked patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.892  0.730
readlines_sizehint    0.329  0.300
using_fileinput       4.612  4.470
while_readline        2.739  2.670

After patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.698  0.680
readlines_sizehint    0.273  0.270
using_fileinput       2.707  2.700
while_readline        0.778  0.780

With a patched version of fileinput.py:
using_fileinput       1.675  1.680

--amk



From guido at python.org  Wed Jan  3 22:36:07 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 16:36:07 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 16:25:30 EST."
             <20010103162530.A20433@kronos.cnri.reston.va.us> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>  
            <20010103162530.A20433@kronos.cnri.reston.va.us> 
Message-ID: <200101032136.QAA07752@cj20424-a.reston1.va.home.com>

> On Solaris 2.6, the configure script doesn't detect that
> getc_unlocked() & friends are supported; details available from the
> patch.

(Fixed now, see the new patch.)

> After editing config.h manually to enable them, the results are:
> 
> Before getc_unlocked patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.892  0.730
> readlines_sizehint    0.329  0.300
> using_fileinput       4.612  4.470
> while_readline        2.739  2.670
> 
> After patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.698  0.680
> readlines_sizehint    0.273  0.270
> using_fileinput       2.707  2.700
> while_readline        0.778  0.780
> 
> With a patched version of fileinput.py:
> using_fileinput       1.675  1.680

Thanks!  The bottom line seems to be that your basic readline loop is
still 3x as slow as the fastest way -- so there's still a lot to say
for xreadlines...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Jan  3 22:42:48 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jan 2001 22:42:48 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib codecs.py,1.13,1.14
References: <E14DvT9-00079N-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A539CD8.367361B8@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv26608/Lib
> 
> Modified Files:
>         codecs.py
> Log Message:
> ...
> 
> This patch closes the bugs #116285 and #119960.

I was too fast... the subject line of #119960 was misleading.
It is still open.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Thu Jan  4 00:13:15 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 18:13:15 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

Pretty sure, but there are so many layers of macros the code is
undecipherable, and I can't step thru macros in the debugger either (that's
assuming I wanted to devote N hours to building Perl from source too --
which I don't).  Perl also makes heavy use of macroizing std library names,
so e.g. when I see "fopen" (which I do!), that doesn't mean I'm getting the
fopen I'm thinking of.  But the MSVC config files define all sorts of macros
to get at the MS stdio _cnt and _ptr (and most other) FILE* fields, and the
version of fopen in the Win32 stuff appears to defer to the platform fopen
(after doing Perlish stuff, like if someone passed "/dev/null" as the file
name, Perl changes it to "NUL").

This is what it's like:  the first line of Perl's win32_fopen is this:

    dTHXo;

That's conditionally defined in perl.h, either as

#define dTHXo			dTHXoa(PERL_GET_THX)

or, if pTHXo is not defined, as

#  define dTHXo		dTHX

dTHX in turn is #defined in 4 different places across 3 different files in 2
different directories.  I'll skip those.  OTOH, dTHXoa is easy!  It's only
defined once:

#define dTHXoa(a)		pTHXo = a

Ah, *that* clears it up <wink>.  Etc.  20 years ago I may have thought this
was fun.  I thought debugging large systems of m4 macros was fun then, and
I'm not sure this is either better or worse than that -- well, it's worse,
because I understood m4's implementation.


> If so, does it open the file in binary or in text mode?

Sorry, but I really don't know and it's a pit to pursue.  If it's not native
text mode, they do a good job of faking it (e.g., Ctrl-Z acts like an EOF
when reading a text file from Perl on Windows -- not something even Larry
would be likely to do on his own <wink>).

> Based on the APIs in MS's libc, I presume that the crlf->lf
> translation is not done by stdio proper but by the Unix I/O
> emulation just underneath it (open() has an O_BINARY option
> flag, so read() probably does the translation).

Yes; and late in the last release cycle, import.c's open_exclusive had a
Windows bug related to this (fdopen() used "wb", but the earlier open()
didn't use O_BINARY, and fdopen *acted* like it had used "w").  Also, the MS
setmode() function works on file handles, not streams.

> That comes down to copying most bytes an extra time.

Understood.  But the CRLF are stored physically on disk, so unless the disk
controller is converting them, *someone's* software (whether MS's or Perl's)
is doing it.  By the time Perl is doing its fast line-input stuff, and doing
what sure looks like a straight copy out of an IO buffer, it's clear from
the code that CRLF has already been translated to LF.

> (To test this hypothesis, you could try to open the test file
> with mode "rb" and see if it makes a difference.)

In Python, that saved about 10% (but got the wrong answers <wink>).  In
Perl, about 15-20%.  But I don't think that tells us who's doing the
translation.  Assuming that the translation takes about the same total time
for each, it makes sense that the percentage would be higher for Perl (since
its total runtime is lower:  same-sized slice of a smaller pie).

> My biggest worry: thread-safety.  There must be a way to lock
> the file (you indicated that fgets() uses it).

Yes, via the unadvertised _lock_str and _unlock_str macros defined in MS
mtdll.h, which is not on the include path:

/*
 * This is an internal C runtime header file. It is used when building
 * the C runtimes only. It is not to be used as a public header file.
 */

The routines and macros it calls are also unadvertised.  After an hour of
thrashing I wasn't able to successfully link any code trying to call these
routines.  Doesn't mean it's impossible, does mean they're internal to MS
libc and aren't meant to be called by anything else.  That's why it's called
"cheating" <wink>.  Perl appears to ignore the whole issue (but Perl's
thread story is muddy at best).

[... ungetc ...]

Not worried here either.

> ...
> You probably don't have many lines longer than 1000 characters.

None, in fact.

>> + It (line-at-a-time) drops to a little under 13 seconds if I
>> comment out the thread macros.

> If you mean the Py_BLOCK_THREADS around the resize, that can be safely
> dropped.

I meant *all* thread-related macros -- was just trying to get a feel for how
much that fiddling cost (it's an expense Perl doesn't seem to have -- yet).
Was measurable but not substantial.  WRT the resize, there's now a "fast
path" that avoids it.

> (If/when we introduce Vladimir's malloc, we'll have to decide whether
> it is threadsafe by itself or whether it requires the global
> interpreter lock.  I vote to make it threadsafe by itself.)

As feared, this thread is going to consume my life <0.5 wink>.

> ...
> Doesn't it make more sense to delay the resize until this point?  I
> don't know how much the character copying accounts for, but I could
> imagine a strategy based on memchr() and memcpy() that first searches
> for a \n, and if found, allocates to the right size before copying.
> Typically, the buffer contains many lines, so this could be optimized
> into requiring a single exactly-sized malloc() call in the common case
> (where the buffer doesn't wrap).  But possibly scanning the buffer for
> \n and then copying the bytes separately, even with memchr() and
> memcpy(), slows things down too much for this to be faster.

Turns out that Perl does very much what I was doing; the Perl code is
actually more burdensome, because its routine is trying to deal not only
with \n-termination, but also arbitrary-string termination (Perl's Awk-like
input record separator), and "paragraph mode", and fixed-size reads, and
some other stuff I can't figure out from the macro names.  In all cases with
a terminator, though, it's doing the same business of both copying and
testing in a very tight inner loop.  It doesn't appear to make any serious
attempts to avoid resizing the buffer.  But, Perl has its own malloc
routines, and I'm guessing they're highly tuned for this stuff.

Since we're stuck with the MS malloc -- and Win9x's in particular seems
lame -- adding this near the start of my stuff did yield a nice speedup:

	if (fp->_cnt > 0 &&
	    (pBuf = (char *)memchr(fp->_ptr, '\n', fp->_cnt)) != NULL) {
	    	/* it's all in the buffer so don't bother releasing the
	    	 * global lock
	    	 */
		total_buf_size = pBuf - fp->_ptr + 1;
		v = PyString_FromStringAndSize(fp->_ptr,
			                       (int)total_buf_size);
		if (v != NULL) {
			pBuf = BUF(v) + total_buf_size;
			fp->_cnt -= total_buf_size;
			fp->_ptr += total_buf_size;
		}
		goto done;
	}

So that builds the result string directly from the stdio buffer when it can.
Times dropped from (before this particular small hack)

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

to

count_chars_lines    14.780 14.784
readlines_sizehint    9.550  9.514
using_fileinput      43.560 43.584
while_readline       10.600 10.578

Since I have no long lines in this test data, and the stdio buffer typically
contains thousands of chars, most calls should be satisfied by the fast
path.  Compared to the previous code, the fast path (1) avoids global lock
fiddling (but that didn't account for much in a distinct test); (2) crawls
over the buffer twice instead of once; and, (3) avoids one (shrinking!)
realloc.  So crawling over the buffer an extra time costs nothing compared
to the cost of a resize; and that's likely just more evidence that
malloc/realloc suck on this platform.
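
The same two-pass idea can be sketched portably, without touching FILE
internals.  This is only an illustration under stated assumptions: struct
inbuf is a hypothetical stand-in for the stdio buffer (its ptr and cnt
fields play the roles of _ptr and _cnt), and fast_line is a made-up name
that returns a malloc'ed C string rather than a PyString.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the stdio buffer: ptr is the next unread
 * byte, cnt the number of unread bytes. */
struct inbuf {
    const char *ptr;
    size_t cnt;
};

/* If a complete line is already buffered, return an exactly-sized,
 * NUL-terminated copy of it (including the '\n') and consume it from b.
 * Return NULL if no '\n' is buffered, in which case a caller would fall
 * back to the slow path, or if malloc fails. */
static char *
fast_line(struct inbuf *b, size_t *len_out)
{
    const char *nl;
    size_t len;
    char *line;

    assert(b != NULL && len_out != NULL);
    if (b->cnt == 0)
        return NULL;
    nl = memchr(b->ptr, '\n', b->cnt);  /* pass 1: find the terminator */
    if (nl == NULL)
        return NULL;
    len = (size_t)(nl - b->ptr) + 1;
    line = malloc(len + 1);             /* single exactly-sized malloc */
    if (line == NULL)
        return NULL;
    memcpy(line, b->ptr, len);          /* pass 2: copy the bytes */
    line[len] = '\0';
    b->ptr += len;
    b->cnt -= len;
    *len_out = len;
    return line;
}
```

The trade is the one measured above: the buffer is walked twice, but the
allocate-then-shrink resize disappears.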

CAUTION:  no file locking is going on now (because I haven't found a way to
do it).  My previous claim that the MS getc macro did no locking was wrong,
as I discovered by stepping thru the generated machine code.  stdio.h
#defines getc without locking, but in _MT mode it later gets #undef'ed and
turned into a function call.

>> /* XXX unclear to me why MS's getc macro does "& 0xff" */
>>			*pBuf++ = ch = *ms_ptr++ & 0xff;

> I know why.  getchar() returns an int in the range [-1, 255].  If
> chars are signed the &0xff is needed else you would get a return in
> the range [-128, 127] and -1 would be ambiguous (EOF==-1).

Bingo -- MS chars are signed.
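
A small sketch of the point being made, assuming a two's-complement
platform; the function names are hypothetical.  With a signed char, the
byte 0xff sign-extends to a negative int that can collide with EOF, while
masking with 0xff keeps the result in [0, 255].

```c
#include <assert.h>
#include <stdio.h>

/* What a getc-style macro must return: a value in [0, 255], so it can
 * never be mistaken for EOF. */
static int
byte_as_int(const char *p)
{
    return *p & 0xff;
}

/* What comes back without the mask where char is signed: bytes with the
 * high bit set become negative ints. */
static int
byte_sign_extended(const char *p)
{
    return (signed char)*p;
}
```

When the byte is stored straight back into another char, as in the inner
copy loop, the mask changes nothing, which is why it could be dropped
there.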

> ...
> But here since you're copying to another character, it's pointless.

Yup!  Gone.

> ....
> Note that get_line() with negative n could be implemented as
> get_line(0) with some post processing.

Andrew's glibc getline code appears to have wanted to do that, but looks to
me like it's unreachable (unless I'm hallucinating, the "n < 0" test after
return from glibc getline can't succeed, because the enclosing block is
guarded by an "n==0" test).

> This should be done completely separately, in PyFile_GetLine.

I assume you have an editor <wink>.

> The negative n case is only used by raw_input() -- it means strip
> the \n and raise EOFError for EOF, and I expect that this is rarely
> if ever used in a speed-conscious situation.

I've never seen raw_input used except when stdin and stdout were connected
to a tty.  When I tried raw_input from a DOS box under the debugger, it
never called get_line.  Something trickier is going on there; I suspect it's
actually calling fgets (eventually) instead in that case.

more-mysteries-than-i-really-need-ly y'rs  - tim




From jeremy at alum.mit.edu  Thu Jan  4 01:06:58 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 3 Jan 2001 19:06:58 -0500 (EST)
Subject: [Python-Dev] Mailman problems?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
Message-ID: <14931.48802.273143.209933@localhost.localdomain>

Tim & Barry,

It looks like there is some problem with Mailman that is garbling
messages to python-dev.  It may only affect lines that begin with a
tab; not sure.

Your most recent message came through with the following line

>    dTHXo;

(This was not the only example.)

I think this was supposed to be a line of C code, but whatever
meaningful contents it had were rendered as gobbledygook.

Jeremy


    
    



From loewis at informatik.hu-berlin.de  Thu Jan  4 01:13:16 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 4 Jan 2001 01:13:16 +0100 (MET)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?

Ulrich Drepper, who is in charge of glibc, is always interested in
following Single Unix to the letter; getc_unlocked has been supported
at least since glibc 2.0.

http://www.sun.com/smcc/solaris-migration/docs/courses/threadsHTML/adv.html

claims that getc_unlocked is already in POSIX.1c; Solaris apparently
has supported it at least since Solaris 2.4.

Irix has had it since 6.5, Tru64 at least since 4.0d (probably much
longer); HPUX since 11.0, AIX since at least 4.3.

Of the BSDs, only OpenBSD appears to support it; it knows that it is
in ANSI 1003.1 since 1996-07-12.

SCO OpenServer doesn't support it.

Regards,
Martin



From fredrik at effbot.org  Thu Jan  4 01:20:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 01:20:41 +0100
Subject: [Python-Dev] Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com><LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com> <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <011901c075e4$2ce96360$e46940d5@hagrid>

> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
> 
> >    dTHXo;
> 
> (This was not the only example.)
> 
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

also looks like Mailman removed all smileys from
Jeremy's post ;-)

</F>




From thomas at xs4all.net  Thu Jan  4 01:27:54 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 01:27:54 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Thu, Jan 04, 2001 at 01:13:16AM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>
Message-ID: <20010104012753.D2467@xs4all.nl>

On Thu, Jan 04, 2001 at 01:13:16AM +0100, Martin von Loewis wrote:

> Of the BSDs, only OpenBSD appears to support it; it knows that it is
> in ANSI 1003.1 since 1996-07-12.

BSDI supports getc_unlocked() at least since BSDI 3.1. I don't have any
older boxes to check, but the manpage for getc and all its friends carries
the timestamp 'June 4, 1993', which implies it could have been available a
lot longer. (Note that BSD was once known to *define* the standard ;-)

I concur that FreeBSD does not currently support getc_unlocked, but since
BSDI and FreeBSD are merging, I suspect it will, soonish.

In other words: use it! :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at wooz.org  Thu Jan  4 03:59:01 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 3 Jan 2001 21:59:01 -0500
Subject: [Python-Dev] Re: Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
	<14931.48802.273143.209933@localhost.localdomain>
Message-ID: <14931.59125.391596.730296@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> It looks like there is some problem with Mailman that is
    JH> garbling messages to python-dev.  It may only affect lines
    JH> that begin with a tab; not sure.

    JH> Your most recent message came through with the following line

    >> dTHXo;

    JH> (This was not the only example.)

    JH> I think this was supposed to be a line of C code, but whatever
    JH> meaningful contents it had were rendered as gobbledygook.

Oh shoot, my bad.  I dropped in an experimental Perl filter module in
the delivery pipeline.  It's been so long since I hacked Perl, I think
I meant to write $%_-> when I really wrote %$_->

-Barry




From tim.one at home.com  Thu Jan  4 05:26:51 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 23:26:51 -0500
Subject: [Python-Dev] RE: Mailman problems?
In-Reply-To: <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>

[Jeremy]
> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
>
>>    dTHXo;
>
> (This was not the only example.)
>
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

I have no idea where that "o" came from!  It was supposed to be "o".  Barry,
fix it!

BTW, the second line of Perl implementation functions is usually a lot less
mysterious than the first.  If anyone wants the joy of reverse-engineering
Perl's supernaturally fast input, it's function Perl_sv_gets in file sv.c.
sv.c?  Yes!  The destination of a one-line input is a Scalar Value, hence,
sc.  I expect there's similar method behind all of this stuff, but I never
stumbled into the key.  To get you started, here's the first line of
Perl_sv_gets:

    dTHR;

The line you're looking for is 119 lines down from that:

	    if ((*bp++ = *ptr++) == rslast)  /* really   |  dust */

the-comment-makes-more-sense-in-context<wink>-ly y'rs  - tim




From thomas at xs4all.net  Thu Jan  4 07:51:17 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 07:51:17 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040037.TAA08699@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:37:22PM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>
Message-ID: <20010104075116.J402@xs4all.nl>

On Wed, Jan 03, 2001 at 07:37:22PM -0500, Guido van Rossum wrote:
> > In other words: use it! :)
> 
> Mind doing a few platform tests on the (new version of the) patch?

Well, only a bit :) It's annoying that BSDI doesn't come with autoconf, but
I managed to use all my early-morning wit (it's 6:30AM <wink>) to work
around it. I've tested it on BSDI 4.1 and FreeBSD 4.2-RELEASE.

> I already know that it works on Red Hat Linux 6.2 (my box) and Solaris
> 2.6 (Andrew's box).  I would be delighted to know that it works on at
> least one other platform that has getc_unlocked() and one platform
> that doesn't have it!

Sorry, I have to disappoint you. FreeBSD does have getc_unlocked, they
just didn't document it. Hurrah for autoconf ;P Anyway, it worked like a
charm on BSDI:

(Python 2.0)
total 1794310 chars and 37660 lines
count_chars_lines     0.310  0.300
readlines_sizehint    0.150  0.150
using_fileinput       2.013  2.017
while_readline        1.006  1.000

(CVS Python + getc_unlocked)
daemon2:~/python/python/dist/src > ./python test.py termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.354  0.350
readlines_sizehint    0.182  0.183
using_fileinput       1.594  1.583
while_readline        0.363  0.367
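
The test.py behind these timings is not shown in the thread. As a rough
sketch only, the four test names suggest read strategies along the
following lines. The function bodies, the `hint` default and the `timed`
helper are assumptions written for a current Python 3, not the original
script, and the thread does not say what the two timing columns are, so
this sketch simply reports CPU time and wall-clock time.

```python
import fileinput
import time


def count_chars_lines(path):
    # Read the whole file in one call, then count.
    # Assumes the file ends with a newline, so "\n" count == line count.
    with open(path) as f:
        text = f.read()
    return len(text), text.count("\n")


def readlines_sizehint(path, hint=100_000):
    # Read batches of lines, each batch roughly `hint` bytes.
    chars = lines = 0
    with open(path) as f:
        while True:
            batch = f.readlines(hint)
            if not batch:
                break
            for line in batch:
                chars += len(line)
                lines += 1
    return chars, lines


def using_fileinput(path):
    # Iterate through the fileinput module.
    chars = lines = 0
    with fileinput.input(files=[path]) as stream:
        for line in stream:
            chars += len(line)
            lines += 1
    return chars, lines


def while_readline(path):
    # The classic "while 1: readline" loop, one line per call.
    chars = lines = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            chars += len(line)
            lines += 1
    return chars, lines


def timed(func, path):
    # Hypothetical helper: returns (result, cpu_seconds, wall_seconds).
    cpu0, wall0 = time.process_time(), time.perf_counter()
    result = func(path)
    return result, time.process_time() - cpu0, time.perf_counter() - wall0
```

All four functions should agree on the character and line totals for the
same file; only the time they take differs.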

But something weird is going on on FreeBSD:

(Standard CVS Python)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.265  0.266
readlines_sizehint    0.148  0.148
using_fileinput       0.943  0.938
while_readline        0.214  0.219

(CVS+getc_unlocked)
> ./python-getc-unlocked  ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.266  0.266
readlines_sizehint    0.151  0.141
using_fileinput       1.066  1.078
while_readline        0.283  0.281

This was sufficiently unexpected that I looked a bit further. The FreeBSD
Python was compiled without editing Modules/Setup, so it was statically
linked, no readline etc, but *with* threads (which are on by default, and
functional on both FreeBSD and BSDI 4.1.) Here's the timings after I enabled
just '*shared*':

(CVS + *shared*)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.276  0.273
readlines_sizehint    0.150  0.156
using_fileinput       0.902  0.898
while_readline        0.206  0.203

(This was not a fluke, I repeated it several times, getting hardly any
variation.) Enabling readline and cursesmodule had no additional effect.
Adding *shared* to the getc_unlocked tree saw roughly the same improvement,
but was still slower than without getc_unlocked.

(CVS + *shared* + getc_unlocked)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.272  0.273
readlines_sizehint    0.149  0.148
using_fileinput       1.031  1.031
while_readline        0.267  0.266

Increasing the size of the testfile didn't change anything, other than the
absolute numbers. I browsed stdio.h, where both getc() and getc_unlocked()
are defined as macros. getc_unlocked is defined as:

#define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))
#define getc_unlocked(fp)       __sgetc(fp)

and getc either as

#define getc(fp)        getc_unlocked(fp)
(without threads) or

static __inline int                     \
__getc_locked(FILE *_fp)                \
{                                       \
        extern int __isthreaded;        \
        int _ret;                       \
        if (__isthreaded)               \
                _FLOCKFILE(_fp);        \
        _ret = getc_unlocked(_fp);      \
        if (__isthreaded)               \
                funlockfile(_fp);       \
        return (_ret);                  \
}
#define getc(fp)        __getc_locked(fp)

_FLOCKFILE(x) is defined as flockfile(x), so that isn't the difference. The
speed difference has to be in the quick-and-easy test for whether the
locking is even necessary. Starting a thread on 'time.sleep(900)' in test.py
shows these numbers:

(standard CVS python)
> ./python-shared-std ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.433  0.445
readlines_sizehint    0.204  0.188
using_fileinput       1.595  1.594
while_readline        0.456  0.453

(getc_unlocked)
> ./python-getc-unlocked-shared ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.441  0.453
readlines_sizehint    0.206  0.195
using_fileinput       1.677  1.688
while_readline        0.509  0.508

So... using getc_unlocked manually for performance reasons isn't a cardinal
sin on FreeBSD only if you are really using threads :-)
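
For anyone repeating the threaded measurement: a minimal sketch of
starting an idle second thread before the timings run, assuming the
modern `threading` module (the original test.py most likely used the old
`thread` module, and its exact code is not shown here). The helper name
and the `daemon=True` flag are additions for this sketch, the latter so
the interpreter can exit without waiting out the sleep.

```python
import threading
import time


def start_idle_thread(seconds=900):
    # A thread that does nothing but sleep. Its only purpose is to make
    # the process multi-threaded, so the C library takes its locking path.
    worker = threading.Thread(target=time.sleep, args=(seconds,), daemon=True)
    worker.start()
    return worker


idle = start_idle_thread()
# ... run the read benchmarks here while `idle` is still alive ...
```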

Lets-outsmart-the-OS-scheduler-next!-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Thu Jan  4 08:57:26 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 08:57:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test/output test_coercion,1.2,1.3
In-Reply-To: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>; from nascheme@users.sourceforge.net on Wed, Jan 03, 2001 at 05:36:27PM -0800
References: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010104085726.E2467@xs4all.nl>

On Wed, Jan 03, 2001 at 05:36:27PM -0800, Neil Schemenauer wrote:
> Update of /cvsroot/python/python/dist/src/Lib/test/output
> In directory usw-pr-cvs1:/tmp/cvs-serv21710/Lib/test/output
> 
> Modified Files:
> 	test_coercion 
> Log Message:
> Sequence repeat works now for in-place multiply with an integer type
> as the left operand.  I don't know if this is a feature or a bug.

> ! 2 *= [1] => [1, 1]

It's a feature.

x = 2 * [1]

works, so

x = 2
x *= [1]

does, too. Obviously, '2 *= [1]' shouldn't, but I'm assuming you don't
actually execute that (it should give a SyntaxError)
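
A quick way to check all three cases, sketched for a current Python 3
interpreter (`compile` is used only so the invalid statement can be
tried without aborting the script; the `rejected` name is illustrative):

```python
# Sequence repetition works with the integer on either side.
assert 2 * [1] == [1, 1]

# Augmented assignment falls back to the same operation, so a name
# bound to an integer can be multiplied in place by a list.
x = 2
x *= [1]
assert x == [1, 1]

# A literal is not a valid assignment target, so this statement is
# refused by the compiler before anything runs.
try:
    compile("2 *= [1]", "<example>", "exec")
except SyntaxError:
    rejected = True
else:
    rejected = False
```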

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Thu Jan  4 10:32:55 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 10:32:55 +0100
Subject: [Python-Dev] RE: Mailman problems?
References: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>
Message-ID: <00a701c07631$531983b0$e46940d5@hagrid>

tim wrote:
> I have no idea where that "o" came from!  It was supposed to be "o".
> Barry, fix it!

no need.  from the perlguts man page:

    "You can ignore [pad]THX[xo] when browsing the Perl
    headers/sources."

in-my-dictionary-perl's-an-american-physicist-ly yrs /F




From mal at lemburg.com  Thu Jan  4 11:02:35 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 11:02:35 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A544A3B.32B86792@lemburg.com>

Neil Schemenauer wrote:
> 
> Update of /cvsroot/python/python/dist/src/Include
> In directory usw-pr-cvs1:/tmp/cvs-serv21006/Include
> 
> Modified Files:
>         classobject.h
> Log Message:
> Remove PyInstance_*BinOp functions.
> 
> Index: classobject.h
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Include/classobject.h,v
> retrieving revision 2.33
> retrieving revision 2.34
> diff -C2 -r2.33 -r2.34
> *** classobject.h       2000/09/01 23:29:26     2.33
> --- classobject.h       2001/01/04 01:30:34     2.34
> ***************
> *** 60,71 ****
>   extern DL_IMPORT(int) PyClass_IsSubclass(PyObject *, PyObject *);
> 
> - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> -                                                 char *, char *,
> -                                                 PyObject * (*)(PyObject *,
> -                                                                PyObject *));
> -
> - extern DL_IMPORT(int)
> - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> -                       PyObject * (*)(PyObject *, PyObject *), int);

Wouldn't it be safer to provide emulation APIs for these ? There
might be code out there using these APIs.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Thu Jan  4 15:06:53 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:06:53 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include classobject.h,2.33,2.34
In-Reply-To: Your message of "Thu, 04 Jan 2001 11:02:35 +0100."
             <3A544A3B.32B86792@lemburg.com> 
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>  
            <3A544A3B.32B86792@lemburg.com> 
Message-ID: <200101041406.JAA11926@cj20424-a.reston1.va.home.com>

> > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > -                                                 char *, char *,
> > -                                                 PyObject * (*)(PyObject *,
> > -                                                                PyObject *));
> > -
> > - extern DL_IMPORT(int)
> > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > -                       PyObject * (*)(PyObject *, PyObject *), int);
> 
> Wouldn't it be safer to provide emulation APIs for these ? There
> might be code out there using these APIs.

No.  These were never intended to be part of the API (and it was a
mistake that they used DL_IMPORT()).  They had to be extern because
they were defined in one file and used in another.  I'm glad they're
gone.  They are so obscure that I'd be *very* surprised if anybody was
using them, and even more if they even *wanted* emulation under the
new scheme -- I'd expect them to eagerly convert their code to using
new-style numbers right away.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan  4 15:16:39 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:16:39 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 07:51:17 +0100."
             <20010104075116.J402@xs4all.nl> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>  
            <20010104075116.J402@xs4all.nl> 
Message-ID: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>

[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

Thomas, I really don't understand it.  The getc() source code you
showed calls getc_unlocked().  So how can it be faster?  The answer
must be somewhere else...  Cache line conflicts, the rewriting of the
loop that I did, a compiler bug, the inlining, who knows.  Can you
compare the generated assembly code?  On other platforms,
getc_unlocked() typically speeds the readline() test case up by a
significant factor (as in your BSDI numbers, where it's almost 3x
faster).

Could it be that you're mistaken and that somehow getc_unlocked() is
*not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
is so different that the optimizer might have done something different
to it.  (Check config.h.  When all else fails, I put an #error in the
#ifdef branch that I expect not to be taken.)

Could it be that somehow getc_unlocked() is later defined to be the
same as getc(), so choosing it just adds the overhead of calling
f[un]lockfile() for each line?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan  4 15:59:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 15:59:05 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 09:16:39AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>
Message-ID: <20010104155904.L402@xs4all.nl>

On Thu, Jan 04, 2001 at 09:16:39AM -0500, Guido van Rossum wrote:
> [Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

> Thomas, I really don't understand it.  The getc() source code you
> showed calls getc_unlocked().  So how can it be faster?  The answer
> must be somewhere else...  Cache line conflicts, the rewriting of the
> loop that I did, a compiler bug, the inlining, who knows.  Can you
> compare the generated assembly code?  On other platforms,
> getc_unlocked() typically speeds the readline() test case up by a
> significant factor (as in your BSDI numbers, where it's almost 3x
> faster).

Nono, reread my message, and your code. getc() isn't faster than
getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
etc.) Significantly so when there is only one thread running (which is still
the common case, for most systems, and FreeBSD's libc has easy inside
knowledge about) and marginally so when there is at least one other thread.
The small advantage in the multi-threaded case can be explained by the
rest of the changes. 
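
To make the locking argument concrete, here is a toy model that only
counts lock operations. It is not the C code and says nothing about real
timings: `ToyStream` and the helper functions are hypothetical names,
and the `threaded` flag stands in for FreeBSD's `__isthreaded` check in
the stdio.h excerpt quoted earlier in the thread.

```python
import threading


class ToyStream:
    """Toy stand-in for a C FILE: some text, a read position, a lock."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.lock = threading.RLock()
        self.lock_ops = 0  # number of times the lock was taken

    def flockfile(self):
        self.lock.acquire()
        self.lock_ops += 1

    def funlockfile(self):
        self.lock.release()

    def getc_unlocked(self):
        if self.pos >= len(self.text):
            return ""  # stands in for EOF
        ch = self.text[self.pos]
        self.pos += 1
        return ch

    def getc(self, threaded):
        # Same shape as the quoted __getc_locked(): lock only when the
        # process is known to be threaded.
        if threaded:
            self.flockfile()
        ch = self.getc_unlocked()
        if threaded:
            self.funlockfile()
        return ch


def line_with_getc(stream, threaded):
    # One getc() per character; locking is decided inside getc().
    chars = []
    while True:
        ch = stream.getc(threaded)
        if ch == "":
            break
        chars.append(ch)
        if ch == "\n":
            break
    return "".join(chars)


def line_with_lock_per_line(stream):
    # One unconditional lock around the whole line, unlocked reads inside.
    chars = []
    stream.flockfile()
    try:
        while True:
            ch = stream.getc_unlocked()
            if ch == "":
                break
            chars.append(ch)
            if ch == "\n":
                break
    finally:
        stream.funlockfile()
    return "".join(chars)


def count_lock_ops(reader, text):
    # Read `text` line by line with `reader` and report locks taken.
    stream = ToyStream(text)
    while reader(stream):
        pass
    return stream.lock_ops
```

In this model the single-threaded getc() path takes no locks at all,
while the lock-per-line variant always takes one per call, which is the
asymmetry described above for the one-thread case.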

You see, I was comparing a patched tree versus a non-patched tree, not a
getc_unlocked() enabled one versus a disabled one, so I was measuring the
speed difference of the *patch*, not of the use of getc_unlocked() vs
getc(). Here is the speed difference of just the use of getc() vs
getc_unlocked() (same tree, hand-edited config.h) in a non-threaded
environment:

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.149  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.148  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211


As you see, no significant difference. Here is the difference in a threaded
environment (a second thread that does just 'time.sleep(900)'):

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.422
readlines_sizehint    0.200  0.211
using_fileinput       1.604  1.594
while_readline        0.465  0.461

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.430
readlines_sizehint    0.201  0.203
using_fileinput       1.600  1.602
while_readline        0.463  0.461

... where I have to note that the getc-disabled version's 'using_fileinput'
time fluctuates a lot more, mostly upwards, in the threaded environment. (I
see it jump to 1.609, 1.617 cputime, every few runs.) Still not a terribly
significant difference, but a hint that we, too, can use inside knowledge ;)

> Could it be that you're mistaken and that somehow getc_unlocked() is
> *not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
> is so different that the optimizer might have done something different
> to it.  (Check config.h.  When all else fails, I put an #error in the
> #ifdef branch that I expect not to be taken.)

Yah, #error is great for debugging, I use it a lot ;) But I'm sure of this.
FreeBSD's getc() is just craftily optimized. Note that if we can get
get_line using getc_unlocked() to run as fast as get_line using getc() on
FreeBSD, it should also benefit other platforms, because the only speed to
be had is in our own code :) Not that I'm saying it can be improved, just
that it apparently got slower, because of this patch. I can't be much help
doing any performance tuning, though, I've about used up my lunch hour and
I'm working late tonight ;P

Good-thing-my-boss-can't-tell-the-difference-between-Apache-and-Python-src-ly
	y'rs, 
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan  4 16:27:28 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 10:27:28 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 15:59:05 +0100."
             <20010104155904.L402@xs4all.nl> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>  
            <20010104155904.L402@xs4all.nl> 
Message-ID: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>

[Me & Thomas in violent agreement that there's something weird about
the speed of getc_unlocked() vs. getc() on FreeBSD.]

I just realized what's the probable cause.  Read your timing post
again:

# BSDI:
# 
# (Python 2.0)
# while_readline        1.006  1.000
# 
# (CVS Python + getc_unlocked)
# while_readline        0.363  0.367

# FreeBSD:
# 
# (Standard CVS Python)
# while_readline        0.214  0.219
# 
# (CVS+getc_unlocked)
# while_readline        0.283  0.281

Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
getline()!  So on FreeBSD, for this test case, GNU getline() is faster
than getc_unlocked().

So the question is, should I leave the GNU getline() code in?  I'm
inclined against it -- it's not that much faster, and on other
platforms getc_unlocked() is faster.  Given that getc_unlocked() is a
standard (of some sort) and GNU getline() is, well, just that, I'd say
let's stick with getc_unlocked().

(Unfortunately, from a phone conversation I had last night with Tim,
there's not much hope of doing something there -- and that platform
sorely needs it!  The hacks that Tim reported earlier are definitely
not thread-safe.  While it's easy to come up with getc_unlocked() for
Windows, the locking operations used internally there by the /MT code
are not exported from MSVCRT.DLL, and that's crucial.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan  4 16:31:39 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:31:39 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 10:27:28AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <20010104163139.M402@xs4all.nl>

On Thu, Jan 04, 2001 at 10:27:28AM -0500, Guido van Rossum wrote:
> [Me & Thomas in violent agreement that there's something weird about
> the speed of getc_unlocked() vs. getc() on FreeBSD.]

> I just realized what's the probable cause.  Read your timing post
> again:

> Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
> getline()!

Sorry, no go. You need two things to use getline(): getline() itself, and a
GNU libc. FreeBSD has neither. (And autoconf agrees with me.) If you *really
really* want me to, I can compile 2.0-standard on FreeBSD and show you. But
I'd rather not :)

Now go back and read my other mail about why FreeBSD is faster :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan  4 16:43:15 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 10:43:15 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104155904.L402@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 03:59:05PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>
Message-ID: <20010104104315.C23803@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:
>getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
>the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
>etc.) Significantly so when there is only one thread running (which is still

So it looks like the ALLOW_THREADS should be moved out of the for
loop.  This produced no measurable performance difference on Solaris;
I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
unusually slow thread operation?

--amk



From thomas at xs4all.net  Thu Jan  4 16:59:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:59:25 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104104315.C23803@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 04, 2001 at 10:43:15AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us>
Message-ID: <20010104165925.G2467@xs4all.nl>

On Thu, Jan 04, 2001 at 10:43:15AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:

> >getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
> >the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
> >etc.) Significantly so when there is only one thread running (which is still

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

Note that I was just guessing there. I did a quick scan of the function, and
noticed that the ALLOW_THREADS statements had moved into the outer loop. I
didn't even contemplate whether that made a difference, so don't trust that
judgement.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan  4 17:10:29 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 11:10:29 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104165925.G2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 04:59:25PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us> <20010104165925.G2467@xs4all.nl>
Message-ID: <20010104111029.A28510@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 04:59:25PM +0100, Thomas Wouters wrote:
>Note that I was just guessing there. I did a quick scan of the function, and
>noticed that the ALLOW_THREADS statements had moved into the outer loop. I
>didn't even contemplate whether that made a difference, so don't trust that
>judgement.

According to your benchmark, the performance of the threaded version
was the same whether or not getc_unlocked() was used, so it's not
that flockfile() is really slow.  I can't believe the compiler
optimized the old, ungainly loop better than the newer, tighter loop.
That leaves the ALLOW_THREADS as the most reasonable culprit.

--amk




From akuchlin at mems-exchange.org  Thu Jan  4 18:10:11 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 04 Jan 2001 12:10:11 -0500
Subject: [Python-Dev] SGI's Digital Media SDK
Message-ID: <E14EDtz-0007fS-00@kronos.cnri.reston.va.us>

SGI just made a source release of their digital media SDK for IRIX and
Linux at http://oss.sgi.com/projects/dmsdk/ .  According to the FAQ,
this is derived from previous SGI libraries, "including the Video
Library (VL), the Audio Library (AL), Digital Media Image Convertor
(DMIC), Digital Media Audio Convertor (DMAC), and the Compression
Library (CL)."  Interested parties may want to look into this, because
Python still has the al, cd, cl, and sv modules; maybe they'd work
with the new software with a reasonable amount of fixing, and at least
now there's a reasonable chance that non-IRIX platforms will be
supported.

--amk




From guido at python.org  Thu Jan  4 20:07:13 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 14:07:13 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 10:43:15 EST."
             <20010104104315.C23803@kronos.cnri.reston.va.us> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>  
            <20010104104315.C23803@kronos.cnri.reston.va.us> 
Message-ID: <200101041907.OAA12573@cj20424-a.reston1.va.home.com>

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

I kind of doubt that it's Py_ALLOW_THREADS -- it's in the outer loop,
which typically only gets executed once.  It only goes around a second
time when the line is longer than the initial buffer.  We could tweak
the initial buffer size (currently 100, with increments of 1000).
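
As a back-of-the-envelope model of that buffer policy (a sketch, not the
C code: the function name is made up, and the exact boundary condition
in the real get_line may differ by a character or so), the number of
outer-loop passes for a single line can be estimated like this:

```python
def outer_loop_passes(line_length, initial=100, increment=1000):
    """Estimate outer-loop passes needed to read one line.

    line_length counts every character of the line, newline included.
    The buffer starts at `initial` and grows by `increment` each time
    the line turns out not to fit.
    """
    capacity = initial
    passes = 1
    while line_length > capacity:
        capacity += increment
        passes += 1
    return passes
```

With the defaults, a typical 80-character line needs one pass, and a
line has to exceed 1100 characters before a third pass is needed.
Raising `initial` in this model shows how a larger starting buffer would
trade memory for fewer second passes.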

--Guido van Rossum (home page: http://www.python.org/~guido/)




From mal at lemburg.com  Thu Jan  4 20:32:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 20:32:15 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include 
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>  
	            <3A544A3B.32B86792@lemburg.com> <200101041406.JAA11926@cj20424-a.reston1.va.home.com>
Message-ID: <3A54CFBF.CDD2138B@lemburg.com>

Guido van Rossum wrote:
> 
> > > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > > -                                                 char *, char *,
> > > -                                                 PyObject * (*)(PyObject *,
> > > -                                                                PyObject *));
> > > -
> > > - extern DL_IMPORT(int)
> > > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > > -                       PyObject * (*)(PyObject *, PyObject *), int);
> >
> > Wouldn't it be safer to provide emulation APIs for these ? There
> > might be code out there using these APIs.
> 
> No.  These were never intended to be part of the API (and it was a
> mistake that they used DL_IMPORT()).  They had to be extern because
> they were defined in one file and used in another.  I'm glad they're
> gone.  They are so obscure that I'd be *very* surprised if anybody was
> using them, and even more if they even *wanted* emulation under the
> new scheme -- I'd expect them to eagerly convert their code to using
> new-style numbers right away.

I'll see whether I can get mxDateTime working with the new
scheme later this year -- it would be really great to do away
with the coercion hack I was using until now :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Fri Jan  5 07:04:56 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 01:04:56 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENHIGAA.tim.one@home.com>

[Guido van Rossum]
> ...
> (Unfortunately, from a phone conversation I had last night with
> Tim, there's not much hope of doing something there -- and that
> platform [Win32] sorely needs it!  The hacks that Tim reported
> earlier are definitely not thread-safe.  While it's easy to come
> up with getc_unlocked() for Windows, the locking operations used
> internally there by the /MT code are not exported from MSVCRT.DLL,
> and that's crucial.)

The short course is that I still haven't found a workable way to lock
streams on Windows:  they do have a complete set of stream-locking functions
and macros, but there's no way short of deep magic I can find to get at them
("deep magic" == resort to assembler and patch in function addresses).

The only file-locking functions advertised in the C and platform SDK
libraries are trivial variants of Python's msvcrt.locking, but that has to
do with locking specific file byte-position ranges across processes, not
ensuring the integrity of runtime stream structures across threads.

Perl appears to ignore the issue of thread safety here (on Windows and
everywhere else).

Revealing experiment!

1. I threw away my changes and rebuilt from current CVS.

2. I made one change, expanding the getc() call in get_line to what MSVC
*would* expand it to if we weren't building in thread mode:

    if ((c = (--fp->_cnt >= 0 ?
              0xff & *fp->_ptr++ :
              _filbuf(fp))) == EOF) {

That alone reduced the runtime of my "while 1: readline" test case from over
30 seconds to 12.8.  What I did before went beyond that, by also (in effect)
unrolling the loop and optimizing it.  That bought an additional ~2 seconds.

So compared to Perl's 6 seconds, it looks like we're paying (on Win98SE)
approximately:

   17 seconds for compiling with _MT (threadsafe libc)
    6 seconds to do the work <wink>
    5 seconds for "other stuff", best guess mostly a poor
          platform malloc/realloc
    2 seconds for not optimizing the loop
   --
   30 total

Unfortunately, the smoking gun is the only one whose firing pin we can't
file down on this platform.

so-the-good-news-is-that-it's-impossible-for-perl-not-to-be-at-
    least-twice-as-fast<wink>-ly y'rs  - tim




From guido at python.org  Fri Jan  5 16:29:05 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 10:29:05 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
Message-ID: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>

We had our first PythonLabs meeting of the year yesterday, and we went
over the 2.1 release schedule.  The release schedule is posted in PEP
226: http://python.sourceforge.net/peps/pep-0226.html

We found that the schedule previously posted there was a bit too
aggressive, given our goals for this release, so we have adjusted the
dates somewhat.  We have also decided on a date for the first alpha
release (previously unmentioned in the PEP).  So, here are the
relevant dates:

    19-Jan-2001: First 2.1 alpha release
    23-Feb-2001: First 2.1 beta release
    01-Apr-2001: 2.1 final release

We're already in PEP freeze mode -- no more PEPs will be considered
for inclusion in 2.1.  Below is a list of the PEPs that we are
currently considering, with some comments.  But first some general
remarks:

- The alpha release cycle is for testing of tentative features.  Alpha
  releases contain working code that we want to see widely tested;
  however, it's possible that a feature present in an alpha release is
  changed or even retracted in a later release.

- Beta releases represent a feature freeze -- after the first beta
  release, we will resign ourselves to fixing bugs.  Once beta 1 is
  released, no new features will be introduced, and no features will
  be withdrawn.

The alpha cycle is especially important for features (such as nested
scopes) that (may) introduce backwards incompatibilities.  There may
be more than one alpha release depending on feedback on the alpha 1
release.  (But having too many alpha releases is not good -- people
won't bother downloading.)

Thus, we can only introduce a new feature in beta 1 if we're very sure
that it is mature enough to stay without interface changes.  The final
decision on all PEPs under consideration has to be made before the
beta 1 release.

The beta cycle is important to ensure stability of the final release.

Specific PEPs under consideration:

 I    42  pep-0042.txt  Small Feature Requests                 Hylton

	  Actually, most of these won't be fulfilled in 2.1.

 SD  205  pep-0205.txt  Weak References                        Drake

	  Fred is still working on this.  I hope Tim can assist.  But
	  we may have to postpone this.

 S   207  pep-0207.txt  Rich Comparisons                   Lemburg, van Rossum

	  I'm pretty sure that this is a piece of cake now that the
	  coercion patches are checked in.

 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer

	  All checked in.  Great work, Neil!

 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka

	  Moshe, this was accepted ages ago.  Would you mind
	  submitting a patch to SourceForge?  If you don't champion
	  this (and nobody else does), we may have to postpone it
	  still.

 S   222  pep-0222.txt  Web Library Enhancements               Kuchling

	  This is really up to Andrew.  It seems he plans to create
	  new modules, so he won't be introducing incompatibilities in
	  existing APIs.

 S   227  pep-0227.txt  Statically Nested Scopes               Hylton

	  Jeremy is still working on a proper implementation, which he
	  hopes to have ready in time for the first alpha release
	  date.

 S   229  pep-0229.txt  Using Distutils to Build Python        Kuchling

	  I just moved this from pie-in-the-sky to active.  Andrew has
	  a working prototype, it just doesn't work 100% yet, so I'm
	  very hopeful.

 S   230  pep-0230.txt  Warning Framework                      van Rossum

	  All done.

 S   232  pep-0232.txt  Function Attributes                    Warsaw

	  Still waiting for Barry to implement this, but it's pretty
	  straightforward.

 S   233  pep-0233.txt  Python Online Help                     Prescod

	  Paul, what's up with this?  Tim & I recommended to do
	  something simple and working, and then you disappeared from
	  the face of the earth.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Fri Jan  5 16:28:16 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 Jan 2001 10:28:16 -0500 (EST)
Subject: [Python-Dev] new "theme" on SourceForge!
Message-ID: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>

  While "theme-ability" is becoming very popular for desktop software
(think about the latest Gnome and KDE systems for Unix, and some of
the multimedia applications for Windows, and the newest MacOS
desktops), it can be a huge drain on Web sites; too many graphics is a
pain, and too many tables just makes it worse.
  SourceForge had definitely fallen prey to the overly-fancy themes,
and all of us developers paid the price with slow rendering.
  But they've fixed that!
  The SF crew has announced a new "theme" called "Ultra Light" which
is optimized for slow connections.  What that really means is less
embedded graphics and fewer nested tables, so rendering is *much*
faster.
  To try the new theme, go to the "Change My Theme" link near the top
of the left-hand navigation area.  Use the form to select "Ultra
Light"; you can preview the theme first if you want.
  Guido also thinks it's cool that the bug & patch report pages are
printable with this theme.  (Sheesh... managers! ;)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Fri Jan  5 18:46:16 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 12:46:16 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Lib fileinput.py,1.5,1.6
In-Reply-To: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>

[Guido]
> Modified Files:
> 	fileinput.py
> Log Message:
> Speed it up by using readlines(sizehint).  It's still slower than
> other ways of reading input. :-(

On my box, it's now head-to-head with (maybe even a little quicker than) the
while 1: line-at-a-time way:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.450  9.459
using_fileinput      29.880 29.884
while_readline       30.480 30.506

(stock CVS Python under Win98SE)

So that's a huge improvement!
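
For readers who want to reproduce the comparison, the three loops in the
table can be sketched roughly as below.  This is written for present-day
Python and is not the benchmark script itself: the function names simply
mirror the labels in the table, and the sizehint of 100000 is an arbitrary
assumption.

```python
import fileinput

def readlines_sizehint(path, sizehint=100000):
    """Double loop: fetch a batch of whole lines, then walk the batch."""
    chars = lines = 0
    with open(path) as f:
        while True:
            batch = f.readlines(sizehint)  # about sizehint bytes' worth of lines
            if not batch:
                break
            for line in batch:
                chars += len(line)
                lines += 1
    return chars, lines

def using_fileinput(path):
    """Let fileinput drive the iteration over a single named file."""
    chars = lines = 0
    with fileinput.input(files=[path]) as stream:
        for line in stream:
            chars += len(line)
            lines += 1
    return chars, lines

def while_readline(path):
    """The classic one-line-at-a-time loop."""
    chars = lines = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            chars += len(line)
            lines += 1
    return chars, lines
```

All three should report the same totals for a given file; only the elapsed
time differs, and the ratios will depend heavily on platform and version.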

the-two-people-using-fileinput-should-be-delighted<wink>-ly y'rs  - tim




From skip at mojam.com  Fri Jan  5 20:05:14 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 13:05:14 -0600 (CST)
Subject: [Python-Dev] fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
	<LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
Message-ID: <14934.6890.160122.384692@beluga.mojam.com>

    Tim> the-two-people-using-fileinput-should-be-delighted<wink>-ly

What do you think contributes to fileinput's relative disfavor?  This whole
thread on Python's file reading performance was started by the eternal whine
"why is Python so much slower than Perl?" which really means why is

   line = f.readline()
   while line:
      process(line)
      line = f.readline()

so much slower than whatever that thing is in Perl that everybody uses as
the be-all-end-all performance benchmark (something with <> in it).

Given that fileinput is supposed to make the I/O loop in Python more
familiar to those people wandering over from Perl (at least in part), you'd
think that people would naturally gravitate to it.  Would it benefit from
some exposure in the Python tutorial?  Is it fast enough now to warrant the
extra exposure?

just-whining-out-loud-ly y'rs

Skip



From tim.one at home.com  Fri Jan  5 20:11:00 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 14:11:00 -0500
Subject: [Python-Dev] new "theme" on SourceForge!
In-Reply-To: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPJIGAA.tim.one@home.com>

[Fred L. Drake, Jr.]

Who would have guessed that the "L." stands for Light?

> ...
> The SF crew has announced a new "theme" called "Ultra Light" which
> is optimized for slow connections.

Indeed, I think I can cancel my cable modem now and go back to a 28.8 phone
modem.

liking-it!-ly y'rs  - tim




From jeremy at alum.mit.edu  Fri Jan  5 20:14:49 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 14:14:49 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
Message-ID: <14934.7465.360749.199433@localhost.localdomain>

There was a brief discussion of unit testing last millennium, which did
not reach any conclusions.  I'd like to restart the discussion and set
some specific goals.  The action item is a unit testing bake-off, held
next week, to choose a tool.

The primary goal is to choose a unit testing framework for the
regression test suite.  Tests written with this framework would
eventually replace the current regrtest.py framework, based on
comparing test output to expected output.

For the 2.1 release, the goal would be to choose a test framework to
include in the standard distribution and use it to write some or all
of the new tests.  We would need to integrate it in some way with
regrtest.py, so that a single command can be used to run all the
tests.

In the long run, we can migrate existing tests to use the new system.
The new system can help us address some other goals:

    - running an entire test suite to completion instead of stopping
      on the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.
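
As a rough illustration of the conditional-test goal, here is a sketch using
the unittest module shipped with current Python (which grew out of PyUnit).
The skipUnless decorator is a later addition, not something either candidate
offered at the time of this thread, and network_is_up() is a hypothetical
probe invented for the example.

```python
import unittest

def network_is_up():
    """Hypothetical probe; a real one might try to open a socket."""
    return False

class HttplibTests(unittest.TestCase):
    @unittest.skipUnless(network_is_up(), "network is down")
    def test_fetch(self):
        # Only meaningful with a live network, so it is skipped otherwise.
        self.fail("would talk to a real server here")

    def test_request_line(self):
        # Pure logic: runs whether or not the network is available.
        self.assertEqual("GET / HTTP/1.0".split()[0], "GET")

def run_suite():
    """Run every test to completion and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HttplibTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The result object then distinguishes skipped tests from passes and failures,
which is the kind of reporting the current output-comparison scheme cannot
give.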

Does anyone disagree with the goal?

Three tools have been proposed: PyUnit, Quixote unittest, and doctest.

doctest has been championed by Peter Funk, who wants a few new
features, but Tim, its author, isn't pushing it as a tool for writing
stand-alone tests.  I think the best way to use doctest is for module
writers to consider it when writing a new module.  If doctest is used
from the start for a module, we could integrate it with the regression
test.  It seems quite useful for what it is intended for, but is not a
general solution.
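
To make the doctest style concrete, here is a minimal sketch against the
doctest module as it exists in current Python.  The average() function is
made up for the example, and the DocTestParser/DocTestRunner classes belong
to the later API, not necessarily to the version under discussion here.

```python
import doctest

def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)

def check_docstring(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted
```

The examples live next to the code they document, which is why this suits a
module written with doctest in mind better than a retrofitted test suite.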

That leaves PyUnit and Quixote's unittest.  The two tools are fairly
similar, but differ on a number of non-trivial details.  Quixote also
integrates code coverage, which is quite handy.  If we don't adopt its
unittest, we should add code coverage to PyUnit.

Is anyone else interested in the choice between the two?  If so, I
suggest you try writing some tests with each tool and reporting back
with your feedback.  I propose leaving one week for such a bake-off and
making a decision next Friday.

Jeremy



From fredrik at effbot.org  Fri Jan  5 20:55:18 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 5 Jan 2001 20:55:18 +0100
Subject: [Python-Dev] unit testing bake-off
References: <14934.7465.360749.199433@localhost.localdomain>
Message-ID: <004c01c07751$6eed84d0$e46940d5@hagrid>

Jeremy Hylton wrote:
> Is anyone else interested in the choice between the two?

yes.  I suggest adding doctest.py plus one unit test implementation.

> If so, I suggest you try writing some tests with each tool and
> reporting back with your feedback.

we've recently migrated from a 30-minute reimplementation of Kent
Beck's original framework to one of the frameworks you mention.  with
that background, the choice was easy.  let me know when it's time to
vote...

</F>




From guido at python.org  Fri Jan  5 20:55:33 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 14:55:33 -0500
Subject: [Python-Dev] fileinput.py
In-Reply-To: Your message of "Fri, 05 Jan 2001 13:05:14 CST."
             <14934.6890.160122.384692@beluga.mojam.com> 
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net> <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>  
            <14934.6890.160122.384692@beluga.mojam.com> 
Message-ID: <200101051955.OAA20190@cj20424-a.reston1.va.home.com>

> What do you think contributes to fileinput's relative disfavor?

In my view, fileinput is one of those unfortunate features that exist
solely to shut up a particular kind of criticism.  Without fileinput,
Perl zealots would have an easy argument for a "trivial reject" of
even considering Python.  Now, when somebody claims the superiority of
Perl's "loop involving a <> thingie", you can point to fileinput to
prevent them from scoring a point.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan  5 21:01:13 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:01:13 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 20:55:18 +0100."
             <004c01c07751$6eed84d0$e46940d5@hagrid> 
References: <14934.7465.360749.199433@localhost.localdomain>  
            <004c01c07751$6eed84d0$e46940d5@hagrid> 
Message-ID: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>

> yes.  I suggest adding doctest.py plus one unit test implementation.

I second this vote for doctest (in addition to a unittest thing).  I
propose that Tim checks in his latest version of doctest.  It should
go under Lib, not under Lib/test, I think.  (Certainly that's how Tim
has been proposing its use.)

It requires LaTeX docs, but since it's got a great docstring, that
should be easy.

> > If so, I suggest you try writing some tests with each tool and
> > reporting back with your feedback.
> 
> we've recently migrated from a 30-minute reimplementation of Kent
> Beck's original framework to one of the frameworks you mention.  with
> that background, the choice was easy.  let me know when it's time to
> vote...

Which framework are you now using?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan  5 21:14:41 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:14:41 -0500
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>

Please have a look at this SF patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470

This implements control over which names defined in a module are
externally visible: if there's a variable __exports__ in the module,
it is a list of identifiers, and any access from outside the module to
names not in the list is disallowed.  This affects access using the
getattr and setattr protocols (which raise AttributeError for
disallowed names), as well as "from M import v" (which raises
ImportError).
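
The attribute-access half of this can be emulated in pure Python, as a sketch
only.  The code below is not the SF patch: ExportCheckedModule, make_module
and _check_export are hypothetical names, double-underscore names are assumed
to stay visible, and the "from M import v" behaviour is not modelled.

```python
import types

def _check_export(module, name):
    """Raise AttributeError if name is hidden by the module's __exports__."""
    namespace = types.ModuleType.__getattribute__(module, "__dict__")
    exports = namespace.get("__exports__")  # the extra dictionary lookup
    if exports is None or name.startswith("__"):
        return
    if name not in exports:
        raise AttributeError("module %r does not export %r"
                             % (namespace.get("__name__"), name))

class ExportCheckedModule(types.ModuleType):
    """Module type that checks __exports__ on outside attribute access."""

    def __getattribute__(self, name):
        _check_export(self, name)
        return types.ModuleType.__getattribute__(self, name)

    def __setattr__(self, name, value):
        _check_export(self, name)
        types.ModuleType.__setattr__(self, name, value)

def make_module(name, source):
    """Build a checked module by executing source in its namespace."""
    module = ExportCheckedModule(name)
    exec(source, module.__dict__)
    return module
```

Code inside the module still sees every name, because global lookups go
straight to the module dictionary; only M.v style access from outside pays
for the check.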

I like it.  This has been asked for many times.  Does anybody see a
reason why this should *not* be added?

Tim remarked that introducing this will prompt demands for a similar
feature on classes and instances, where it will be hard to implement
without causing a bit of a slowdown.  It causes a slight slowdown (an
extra dictionary lookup for each use of "M.v") even when it is not
used, but for accessing module variables that's acceptable.  I'm not
so sure about instance variable references.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Fri Jan  5 21:19:55 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 15:19:55 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
	<004c01c07751$6eed84d0$e46940d5@hagrid>
	<200101052001.PAA20238@cj20424-a.reston1.va.home.com>
Message-ID: <14934.11371.879059.610988@localhost.localdomain>

If anyone is interested in experimenting with a test suite, here is a
summary of the code coverage for the current regression test suite as
run on my Linux box.  Pick a module with low code coverage and your
experiment can also improve the regression test suite.

Jeremy

 67.42%    798  Modules/arraymodule.c
 74.39%    773  Modules/audioop.c
 81.84%    380  Modules/binascii.c
 62.36%    449  Modules/bsddbmodule.c
 78.29%    152  Modules/cmathmodule.c
 67.89%    246  Modules/_codecsmodule.c
 47.41%   2647  Modules/cPickle.c
 87.50%      8  Modules/cryptmodule.c
 64.34%    272  Modules/cStringIO.c
  0.00%   1351  Modules/_cursesmodule.c
  0.00%    202  Modules/_curses_panel.c
 99.28%    139  Modules/errnomodule.c
 30.71%    127  Modules/fcntlmodule.c
 81.90%    315  Modules/gcmodule.c
  0.00%      4  Modules/getbuildinfo.c
 47.29%    277  Modules/getpath.c
 72.22%     54  Modules/grpmodule.c
 79.95%    419  Modules/imageop.c
  0.00%     11  Modules/../Include/cStringIO.h
 13.25%    234  Modules/linuxaudiodev.c
 14.80%    223  Modules/_localemodule.c
 30.66%    137  Modules/main.c
 73.20%     97  Modules/mathmodule.c
 98.39%    124  Modules/md5c.c
 69.70%     66  Modules/md5module.c
 48.62%    362  Modules/mmapmodule.c
 66.22%     74  Modules/newmodule.c
 84.91%     53  Modules/operator.c
 50.57%   1236  Modules/parsermodule.c
  0.00%    350  Modules/pcremodule.c
 28.88%   1077  Modules/posixmodule.c
 82.05%     39  Modules/pwdmodule.c
 77.96%    431  Modules/pyexpat.c
  0.00%   1876  Modules/pypcre.c
 50.00%      2  Modules/python.c
  0.00%    189  Modules/readline.c
 78.35%    425  Modules/regexmodule.c
 72.93%    931  Modules/regexpr.c
  0.00%     81  Modules/resource.c
 76.98%    443  Modules/rgbimgmodule.c
 82.70%    289  Modules/rotormodule.c
 82.47%    291  Modules/selectmodule.c
 85.10%    208  Modules/shamodule.c
 81.52%    276  Modules/signalmodule.c
 51.18%    678  Modules/socketmodule.c
 78.64%   1105  Modules/_sre.c
 69.67%    689  Modules/stropmodule.c
 80.49%    656  Modules/structmodule.c
  4.88%    123  Modules/termios.c
 60.71%    140  Modules/threadmodule.c
 68.78%    205  Modules/timemodule.c
 76.92%     65  Modules/ucnhash.c
 87.50%     16  Modules/unicodedatabase.c
 65.83%    120  Modules/unicodedata.c
 68.81%    420  Modules/zlibmodule.c
 64.68%   1005  Objects/abstract.c
 18.77%    261  Objects/bufferobject.c
 68.77%   1204  Objects/classobject.c
 27.59%     58  Objects/cobject.c
 59.41%    271  Objects/complexobject.c
 78.32%    678  Objects/dictobject.c
 52.14%    723  Objects/fileobject.c
 80.43%    368  Objects/floatobject.c
 84.86%    185  Objects/frameobject.c
 60.40%    149  Objects/funcobject.c
 78.68%    455  Objects/intobject.c
 77.66%    779  Objects/listobject.c
 81.17%   1142  Objects/longobject.c
 50.68%    148  Objects/methodobject.c
 58.82%    136  Objects/moduleobject.c
 76.50%    549  Objects/object.c
 15.24%    105  Objects/rangeobject.c
 41.03%     78  Objects/sliceobject.c
 76.63%   1797  Objects/stringobject.c
 77.00%    287  Objects/tupleobject.c
 22.22%     18  Objects/typeobject.c
 84.26%    108  Objects/unicodectype.c
 66.61%   2743  Objects/unicodeobject.c
 90.79%     76  Parser/acceler.c
  0.00%     28  Parser/bitset.c
  0.00%     67  Parser/firstsets.c
 18.18%     22  Parser/grammar1.c
  0.00%    139  Parser/grammar.c
  0.00%     30  Parser/intrcheck.c
  0.00%     38  Parser/listnode.c
  0.00%      2  Parser/metagrammar.c
  0.00%     63  Parser/myreadline.c
 90.70%     43  Parser/node.c
 82.26%    124  Parser/parser.c
 79.38%     97  Parser/parsetok.c
  0.00%    366  Parser/pgen.c
  0.00%     85  Parser/pgenmain.c
  0.00%     60  Parser/printgrammar.c
 76.70%    588  Parser/tokenizer.c
 62.31%   1231  Python/bltinmodule.c
 76.55%   2021  Python/ceval.c
 64.78%    230  Python/codecs.c
 73.85%   2367  Python/compile.c
 76.67%     30  Python/dynload_shlib.c
 75.75%    301  Python/errors.c
 65.59%    401  Python/exceptions.c
  0.00%     31  Python/frozenmain.c
 56.83%    776  Python/getargs.c
100.00%      2  Python/getcompiler.c
100.00%      2  Python/getcopyright.c
 80.00%      5  Python/getmtime.c
 15.62%     32  Python/getopt.c
100.00%      2  Python/getplatform.c
100.00%      4  Python/getversion.c
 61.78%   1167  Python/import.c
 66.67%     42  Python/importdl.c
 51.35%    483  Python/marshal.c
 60.58%    274  Python/modsupport.c
 88.73%     71  Python/mystrtoul.c
  0.00%      2  Python/pyfpe.c
 91.15%    113  Python/pystate.c
 37.80%    635  Python/pythonrun.c
  0.00%      5  Python/sigcheck.c
 12.67%    150  Python/structmember.c
 53.87%    323  Python/sysmodule.c
100.00%      5  Python/thread.c
 53.47%    144  Python/thread_pthread.h
 21.74%    138  Python/traceback.c
 58.65%  48417  TOTAL
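
To pick a target from a listing like this, a small helper can rank files by
how much code is left untested.  This is a sketch: it assumes the second
column is a count of lines, SAMPLE copies four rows from the table above, and
the function names are made up for the example.

```python
SAMPLE = """\
 47.41%   2647  Modules/cPickle.c
  0.00%   1351  Modules/_cursesmodule.c
 99.28%    139  Modules/errnomodule.c
  0.00%   1876  Modules/pypcre.c
"""

def parse_coverage(text):
    """Turn 'percent% count path' rows into (path, percent, count) tuples."""
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        if len(fields) != 3 or not fields[0].endswith("%"):
            continue  # ignore blank or malformed lines
        rows.append((fields[2], float(fields[0].rstrip("%")), int(fields[1])))
    return rows

def rank_by_uncovered(rows):
    """Sort rows so the files with the most uncovered lines come first."""
    def uncovered(row):
        _path, percent, count = row
        return count * (100.0 - percent) / 100.0
    return sorted(rows, key=uncovered, reverse=True)
```

Ranking by uncovered lines rather than by percentage keeps tiny files with
0% coverage from crowding out large, half-tested ones.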



From tim.one at home.com  Fri Jan  5 21:46:10 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 15:46:10 -0500
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <14934.6890.160122.384692@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>

[Skip Montanaro]
> What do you think contributes to fileinput's relative disfavor?

Only half jokingly, because I never use it <wink>, and I don't think Fredrik
or Alex Martelli do either.  That means it rarely gets mentioned by the
c.l.py reply bots.  Plus it's not *used* anywhere in the Python
distribution, so nobody stumbles into it that way either.  Plus the docs
require more than one line to explain what it does, and get bogged down
describing the Awk-like (Perl took this from Awk) convolutions before the
simplest (one explicitly named file) case.  It *is* regularly mentioned in
the eternal "while 1:" debate, but that's it.

> This whole thread on Python's file reading performance was started
> by the eternal whine "why is Python so much slower than Perl?"

No, it started with Guido's objections to Jeff's xreadlines patch.  I
dragged Perl into it -- because, like it or not, that was the right thing to
do <wink>.

> which really means why is
>
>    line = f.readline()
>    while line:
>       process(line)
>       line = f.readline()
>
> so much slower than whatever that thing is in Perl that everybody
> uses as the be-all-end-all performance benchmark (something with
> <> in it).

"<FILE>" is simply Perl's way of spelling Python's FILE.readline() (and
FILE.readlines(), when <FILE> appears in an array context; and FILE.read()
when Perl's Awkish "record separator" is disabled; and ...).  "<>" without
an explicit filehandle does all the inherited-from-Awk magic with argv, else
that stuff doesn't come into play.   "<>" (without a filehandle) seems
rarely used in Perl practice, though, *except* in support of

your_shell_prompt> some_perl_script < some_file

That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
bet *most* Perl programmers don't even know "<>" is more general than that.

> Given that fileinput is supposed to make the I/O loop in Python more
> familiar to those people wandering over from Perl (at least in part),
> you'd think that people would naturally gravitate to it.

I guess you didn't actually read the timing results <wink>.  Really, it's
been an outrageously slow way to do input.  That's better now, and I'm much
more likely now than I used to be to use

    for line in fileinput.input('file'):

instead of

    f = open('file')
    while 1:
        line = f.readline()
        if not line:
            break

The relative attraction of the former is obvious if it's reasonably quick.
I don't really have any use for the Awk complications (note that I'm running
on Windows, though, and the shells here don't expand wildcards -- the Awk
gimmicks are much more useful on Unix systems).

> Would it benefit from some exposure in the Python tutorial?

Heh -- that's a tough one.  The *simplest* case is the only one deserving of
promotion.  But in that case, Jeff's xreadlines is about as convenient and
much quicker.  I bet we'll all be afraid to change the tutorial to mention
either <0.9 wink>.

> Is it fast enough now to warrant the extra exposure?

Don't know.  It's the same speed as "while 1:" on *my* box now, but still 3x
slower than the double-loop method.

> just-whining-out-loud-ly y'rs

so-do-*you*-want-to-use-it-now?-ly y'rs  - tim




From thomas at xs4all.net  Fri Jan  5 22:19:42 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 5 Jan 2001 22:19:42 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>; from tim.one@home.com on Fri, Jan 05, 2001 at 03:46:10PM -0500
References: <14934.6890.160122.384692@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>
Message-ID: <20010105221942.J2467@xs4all.nl>

On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:

> "<>" (without a filehandle) seems
> rarely used in Perl practice, though, *except* in support of
> 
> your_shell_prompt> some_perl_script < some_file
> 
> That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> bet *most* Perl programmers don't even know "<>" is more general than that.

Well, I can't say anything about *most* Perl programmers, but all Perl
programmers I know (including me) know damned well what <> does, and use it
frequently. And in all the ways: no arguments meaning <STDIN>, a list of
files meaning open those files one at a time, using - to include stdin in
that list, accessing the filename and linenumber, etc. None of them can be
called newbies, though.

But then, I like using Python's fileinput, too, so maybe I'm just weird :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From ping at lfw.org  Fri Jan  5 23:01:53 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 5 Jan 2001 16:01:53 -0600 (CST)
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <20010105221942.J2467@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>

On Fri, 5 Jan 2001, Thomas Wouters wrote:

> On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:
> > That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> > bet *most* Perl programmers don't even know "<>" is more general than that.
> 
> Well, I can't say anything about *most* Perl programmers, but all Perl
> programmers I know (including me) know damned well what <> does, and use it
> frequently. And in all the ways: no arguments meaning <STDIN>, a list of
> files meaning open those files one at a time, using - to include stdin in
> that list, accessing the filename and linenumber, etc.

I was just about to chime in and say the same thing.  I don't even
program in Perl any more, and I still remember all the ways that <> works.

For text-processing scripts, it's unbeatable.  It does pretty much
exactly everything you want, and the idiom

    while (<>) {
        ...
    }

is simple, quickly learned, frequently used, and instantly recognizable.

    import sys
    if len(sys.argv) > 1:
        file = open(sys.argv[1])
    else:
        file = sys.stdin
    while 1:
        line = file.readline()
        if not line:
            break
        ...

is much more complex, harder to explain, harder to learn, and runs slower.

I have two separate suggestions:

    1.  Include 'sys' in builtins.  It's silly to have to 'import sys'
        just to be able to see sys.argv and sys.stdin.

    2.  Put fileinput.input() in sys.

With both, the while (<>) idiom becomes:

    for line in sys.input():
        ...


-- ?!ng

"This code is better than any code that doesn't work has any right to be."
    -- Roger Gregory, on Xanadu




From skip at mojam.com  Fri Jan  5 23:19:36 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 16:19:36 -0600 (CST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.11371.879059.610988@localhost.localdomain>
References: <14934.7465.360749.199433@localhost.localdomain>
	<004c01c07751$6eed84d0$e46940d5@hagrid>
	<200101052001.PAA20238@cj20424-a.reston1.va.home.com>
	<14934.11371.879059.610988@localhost.localdomain>
Message-ID: <14934.18552.749081.871226@beluga.mojam.com>

    Jeremy> If anyone is interested in experimenting with a test suite, here
    Jeremy> is a summary of the code coverage for the current regression
    Jeremy> test suite as run on my Linux box.

Speaking of which, I am still running my nightly code coverage thing (still
with warts) whose results are available at

    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Does anyone care?  Should I turn it off?

Skip



From thomas at xs4all.net  Sat Jan  6 00:18:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 00:18:58 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>; from ping@lfw.org on Fri, Jan 05, 2001 at 04:01:53PM -0600
References: <20010105221942.J2467@xs4all.nl> <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>
Message-ID: <20010106001858.B402@xs4all.nl>

On Fri, Jan 05, 2001 at 04:01:53PM -0600, Ka-Ping Yee wrote:

>     while (<>) {
>         ...
>     }

> is simple, quickly learned, frequently used, and instantly recognizable.

>     import sys
>     if len(sys.argv) > 1:
>         file = open(sys.argv[1])
>     else:
>         file = sys.stdin
>     while 1:
>         line = file.readline()
>         if not line:
>             break
>         ...

... Except that it can take more than one filename, and will do the one
after another, and that it takes "-" as a filename for stdin. Doing it in a
script is not dead simple, unless you open up all files at once (which can
be harmful, and Perl, for one, doesn't do) or you do most of the work
fileinput does. That is why I use fileinput (and while-diamond) -- I might
not need it now, but when I do need it, it already works :)
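
A minimal sketch of the fileinput behaviour described above, assuming a modern Python in which FileInput can be used as a context manager. The helper name cat_lines and the temporary file names are made up for illustration.

```python
import fileinput
import os
import tempfile


def cat_lines(paths):
    """Collect (file name, line number in file, line) for each input line."""
    seen = []
    # fileinput opens the files one after another, never all at once.
    # A path of "-" would mean standard input.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            name = os.path.basename(stream.filename())
            seen.append((name, stream.filelineno(), line))
    return seen


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "a.txt")
    second = os.path.join(tmp, "b.txt")
    with open(first, "w") as f:
        f.write("one\ntwo\n")
    with open(second, "w") as f:
        f.write("three\n")
    result = cat_lines([first, second])

for entry in result:
    print(entry)
```

Passing an explicit list keeps the sketch independent of sys.argv; a script would normally call fileinput.input() with no arguments and let it read the command line.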

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From moshez at zadka.site.co.il  Sat Jan  6 12:00:33 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat,  6 Jan 2001 13:00:33 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>

On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido at python.org> wrote:

> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Ummmmm.....why do we want this? What's wrong with the current
suggestion of using "_"? __exports__ feels somehow wrong to
me. None of the rest of Python has any access control, and
I really like that. A big -1 from me, for what it's worth.

> I like it.

I'm surprised. Why do you like that?

>  This has been asked for many times.  

So has adding curly-braces as control structure, with all due respect.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From billtut at microsoft.com  Sat Jan  6 04:43:06 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Fri, 5 Jan 2001 19:43:06 -0800 
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB8637E@red-msg-07.redmond.corp.microsoft.com>

I think I'm with Moshe on this one: what's wrong with just using underscores
(__) to play the hiding game?

Here's my silly language suggestion for this week:

with self:
  .bar = foo
  bar.blah = .fubar
  .bar = .bar + 1
  # etc....

Bill



From skip at mojam.com  Sat Jan  6 05:15:12 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 22:15:12 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.39888.908416.983794@beluga.mojam.com>

    > On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido at python.org> wrote:
    > Please have a look at this SF patch:
    > 
    > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
    > 
    > This implements control over which names defined in a module are
    > externally visible: if there's a variable __exports__ in the module,
    > it is a list of identifiers, and any access from outside the module to
    > names not in the list is disallowed.  This affects access using the
    > getattr and setattr protocols (which raise AttributeError for
    > disallowed names), as well as "from M import v" (which raises
    > ImportError).

I have to agree with Moshe.  If __exports__ is implemented for modules we'll
have multiple, different access control mechanisms for different things,
some of which thoughtful programmers would be able to get around, some of
which they wouldn't.  Here are the ways I'm aware of to control attribute
visibility (there may be others - I don't usually delve too deeply into this
stuff):

  * preface module globals with "_": This just prevents those globals from
    being added to the current namespace when a programmer executes "from
    module import *".  Programmers can work around this by attribute access
    through the module object or by explicitly importing it: "from module
    import _foo" works, yes?

  * preface class or instance attributes with "__":  This just mangles the
    name by prefacing the visible name with _<classname>.  The programmer
    can still access it by knowing the simple name mangling rule.

In both cases the programmer can still get at the attribute value when
necessary.
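
Both mechanisms can be checked with a short sketch. The module name demo_mod and the class C below are invented for the example, and the module is built in memory only to keep the sketch self-contained.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_mod").
mod = types.ModuleType("demo_mod")
mod.public = 1
mod._private = 2
sys.modules["demo_mod"] = mod

# "from module import *" skips names that start with an underscore...
ns = {}
exec("from demo_mod import *", ns)
star_names = sorted(name for name in ns if not name.startswith("__"))

# ...but an explicit import of the underscore name still works.
exec("from demo_mod import _private", ns)


class C:
    def __init__(self):
        self.__secret = 42  # stored on the instance as _C__secret


c = C()
print(star_names, ns["_private"], c._C__secret)
```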

If you were to add some sort of access control to module globals, I would
have thought it would have been along the same lines as the existing
mechanisms in place to "hide" class/instance attributes.  Would it be
possible (or desirable) to add the name mangling restriction to module
globals as an alternative to this more restrictive implementation?  What
about the chances that class/instance attribute hiding will get more
restrictive in the future?  Finally, are the motivations for wanting to
restrict access to module globals and class/instance attributes that much
different from one another that they call for fundamentally different
mechanisms?

Skip



From barry at digicool.com  Sat Jan  6 06:15:20 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 00:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.43496.322436.612746@anthem.wooz.org>

I'm -0 on this, largely for the reasons already brought up: if modules
grow __exports__ then there will be pressure to add it to classes, and
modules already have a limited version of access control through
leading underscore names.

I might be more positive on the addition if __exports__ were added to
classes, because at least there'd be a consistently stronger fence
added to name access rules that prevented even consenting adults from
fiddling with the naughty bits.

-Barry




From nas at arctrix.com  Sat Jan  6 00:20:58 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 5 Jan 2001 15:20:58 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14934.43496.322436.612746@anthem.wooz.org>; from barry@digicool.com on Sat, Jan 06, 2001 at 12:15:20AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <14934.43496.322436.612746@anthem.wooz.org>
Message-ID: <20010105152058.A6016@glacier.fnational.com>

On Sat, Jan 06, 2001 at 12:15:20AM -0500, Barry A. Warsaw wrote:
> I might be more positive on the addition if __exports__ were added to
> classes, because at least there'd be a consistently stronger fence
> added to name access rules that prevented even consenting adults from
> fiddling with the naughty bits.

I think you, Skip and Moshe are missing a big advantage of having
the __exports__ mechanism.  It should allow some attribute access
inside of modules to become faster (like LOAD_FAST for locals).
I think that optimization could be implemented without too much
difficulty.  I've never channeled Guido before so I could be off
the mark.  If the only advantage is encapsulation then I'm -0.
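
For readers unfamiliar with the comparison, the standard dis module shows which opcodes a function uses for a local and for a module global. This illustrates the existing opcodes only. It says nothing about how __exports__ would be implemented, and exact opcode names differ between interpreter versions.

```python
import dis

counter = 0  # module global: found by name in the module dictionary


def add_to_counter(x):
    # x is a local (indexed slot); counter is a global (dictionary lookup)
    return x + counter


opnames = [ins.opname for ins in dis.get_instructions(add_to_counter)]
local_ops = [op for op in opnames if op.startswith("LOAD_FAST")]
global_ops = [op for op in opnames if op.startswith("LOAD_GLOBAL")]
print(local_ops, global_ops)
```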

  Neil



From barry at digicool.com  Sat Jan  6 08:09:31 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 02:09:31 -0500
Subject: [Python-Dev] PEP 232 update and patch
Message-ID: <14934.50347.851118.581484@anthem.wooz.org>


I've updated PEP 232, function attributes, and uploaded a patch to SF.
I couldn't coax cvs diff into including the new files
Lib/test/test_funcattrs.py and Lib/test/output/test_funcattrs so I'll
attach them below.

PEP 232:
   http://python.sourceforge.net/peps/pep-0232.html

SF patch #103123:
   http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

Enjoy,
-Barry

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_funcattrs.py
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010106/bf1d0513/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_funcattrs
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010106/bf1d0513/attachment.asc>

From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 11:06:49 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 11:06:49 +0100
Subject: [Python-Dev] PEP 208 comment
Message-ID: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>

I just studied PEP 208 for the first time. Overall, it seems all
natural and nice, but there is one aspect I'd like to see changed:
the naming of the type flag.

Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
program should be called "new". The flag will still be there five
years from now, but it won't be new anymore. Also, while the flag
indicates that style of the numbers is new, it does not say what it
does. So I propose to rename it; if nobody finds a better name, I
propose to call it Py_TPFLAGS_UNCOERCED.

Regards,
Martin



From thomas at xs4all.net  Sat Jan  6 13:52:19 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 13:52:19 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 11:06:49AM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <20010106135219.L2467@xs4all.nl>

On Sat, Jan 06, 2001 at 11:06:49AM +0100, Martin v. Loewis wrote:

> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Wrong name. The TPFLAGs only indicate whether a struct is large enough to
contain a particular member, not whether that member is going to contain or
do anything. 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate
to me.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 14:36:39 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 14:36:39 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <20010106135219.L2467@xs4all.nl> (message from Thomas Wouters on
	Sat, 6 Jan 2001 13:52:19 +0100)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl>
Message-ID: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>

> Wrong name. The TPFLAGs only indicate whether a struct is large enough to
> contain a particular member, not whether that member is going to contain or
> do anything. 

That may have been the original intention; *this* specific flag is not
of that kind. Please look at abstract.c:binary_op1, which has

	if (v->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(v)) {
		slot = NB_BINOP(v->ob_type->tp_as_number, op_slot);
		if (*slot) {
			x = (*slot)(v, w);
			if (x != Py_NotImplemented) {
				return x;
			}
			Py_DECREF(x); /* can't do it */
		}
		if (v->ob_type == w->ob_type) {
			goto binop_error;
		}
	}

Here, no additional member was added: there always was tp_as_number,
and that also supported all possible op_slot values. What is new here
is that the slot may be called even if v and w have different types;
that was not allowed before the PEP 208 changes. Yet it tests for
NEW_STYLE_NUMBER(v), which is

PyType_HasFeature((o)->ob_type, Py_TPFLAGS_NEWSTYLENUMBER)

So the presence of this flag is indeed a promise that a specific
member will do something that it normally wouldn't do.
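
The same protocol is visible from Python code: an operator method may be handed an operand of a different type and can decline by returning NotImplemented. The sketch below is a Python-level analogy, not the C code in abstract.c, and the class Meters is invented for the example.

```python
class Meters:
    """Toy numeric type that inspects the other operand itself."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Decline: the interpreter then tries the other operand's method
        # and raises TypeError if nobody can handle the pair.
        return NotImplemented

    __radd__ = __add__  # addition is symmetric for this toy type


mixed_left = Meters(1) + 2
mixed_right = 2 + Meters(1)
print(mixed_left.value, mixed_right.value)
```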

> 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate to
> me.

Well, all numbers still have coercion - it just may not be used if the
flag is present. It's not a matter of having or not having something
(well, only the "new style" numbers may have nb_cmp, but calling it
Py_TPFLAGS_HAS_NB_CMP would be beside the point, IMO).

Anyway, I don't want to defend my version too much - I just want to
request that the current name is changed to *something* more
descriptive.

Regards,
Martin



From skip at mojam.com  Sat Jan  6 15:40:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:40:30 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010105152058.A6016@glacier.fnational.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<14934.43496.322436.612746@anthem.wooz.org>
	<20010105152058.A6016@glacier.fnational.com>
Message-ID: <14935.11870.360839.235102@beluga.mojam.com>

    Neil> I think you, Skip and Moshe are missing a big advantage of having
    Neil> the __exports__ mechanism.  It should allow some attribute access
    Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
    Neil> think that optimization could be implemented without too much
    Neil> difficulty.

True enough, that hadn't occurred to me.  Knowing that now, I still don't
think consistency of the interface should suffer as a result of
under-the-covers performance gains.

Skip




From skip at mojam.com  Sat Jan  6 15:42:25 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:42:25 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
In-Reply-To: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
Message-ID: <14935.11985.972526.108391@beluga.mojam.com>

Oooo...  I tried to check out Barry's function attribute patch at

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

and got

    Fatal error: Call to a member function on a non-object in
    /usr/local/htdocs/alexandria/www/patch/index.php on line 55

in response.  Any idea whazzup?

Skip



From akuchlin at cnri.reston.va.us  Sat Jan  6 15:47:59 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Sat, 6 Jan 2001 09:47:59 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.18552.749081.871226@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 05, 2001 at 04:19:36PM -0600
References: <14934.7465.360749.199433@localhost.localdomain> <004c01c07751$6eed84d0$e46940d5@hagrid> <200101052001.PAA20238@cj20424-a.reston1.va.home.com> <14934.11371.879059.610988@localhost.localdomain> <14934.18552.749081.871226@beluga.mojam.com>
Message-ID: <20010106094759.A13723@newcnri.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 04:19:36PM -0600, Skip Montanaro wrote:
>Speaking of which, I am still running my nightly code coverage thing (still
>with warts) whose results are available at
>    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Add a link to it from the Python development pages on SourceForge; I
suspect much of the problem is that people don't remember the URL for
it, and don't want to dig through the archives to find it.

--amk




From mal at lemburg.com  Sat Jan  6 16:15:27 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 06 Jan 2001 16:15:27 +0100
Subject: [Python-Dev] PEP 208 comment
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <3A57368F.FC01F78@lemburg.com>

"Martin v. Loewis" wrote:
> 
> I just studied PEP 208 for the first time. Overall, it seems all
> natural and nice, but there is one aspect I'd like to see changed:
> the naming of the type flag.
> 
> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Given that the design could well be applied to other slots as
well, I think you've got a point there. The idea behind the
flag was to signal that slots will no longer make object type
assumptions which they could previously. Right now, only numeric
types support this feature. In the future I could imagine
strings and other types involving coercion would also want
to use the feature.

Given this design idea, how about calling the flag
Py_TPFLAGS_CHECKTYPES ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Sat Jan  6 16:35:20 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 09:35:20 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
Message-ID: <14935.15160.130742.390323@beluga.mojam.com>

You know, I thought of something (which was probably already obvious to the
rest of you) while perusing Barry's patch.  Attaching function attributes to
unbound methods could really function like C++ static data members.  You'd
have to write accessor functions to make setting the attributes look clean,
but that wouldn't be all bad.  Precisely because you couldn't modify them
through the bound method, there'd be no chance you could make the mistake of
modifying them that way and having them transmogrify into instance
attributes.

Here's a quick example:

    class C:
        def __init__(self):
            self.just_resting()
        __init__.howmany = 0

        def __del__(self):
            self.hes_dead()

        def hes_dead(self):
            C.__init__.howmany -= 1

        def just_resting(self):
            C.__init__.howmany += 1

        def howmany(self):
            return C.__init__.howmany

    def howmany():
        return C.__init__.howmany

    c = C()
    print c.howmany()
    d = C()
    print d.howmany()
    del c
    print d.howmany()

After applying Barry's patch, if I execute this script from the command line
it displays

    1
    2
    1

as one would expect, but then catches an attribute error during cleanup:

    Exception exceptions.AttributeError: "'None' object has no attribute
    '__init__'" in <method C.__del__ of C instance at 0x80ffc14> ignored

If I add "del d" to the end of the script the exception disappears.  I
suspect there is a cleanup order problem of some sort.  It seems like C is
getting reclaimed before d (not possible), or that d's __class__ attribute
is set to None before its __del__ method is called.  Is this a known problem
or something introduced by Barry's patch?

Skip




From barry at digicool.com  Sat Jan  6 17:09:47 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 11:09:47 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
	<14935.11985.972526.108391@beluga.mojam.com>
Message-ID: <14935.17227.634808.132783@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> and got

    |     Fatal error: Call to a member function on a non-object in
    |     /usr/local/htdocs/alexandria/www/patch/index.php on line 55

    SM> in response.  Any idea whazzup?

I got a similar error on SF when I tried to find my patch on the
patches page.  I still think the patch manager just gives you no way
to see all the patches when there's more than what fits on one page.
The error dropped a cookie in my lap that logged me out too.

After I logged in again, it all seemed to work.

-Barry




From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 16:20:51 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 16:20:51 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <3A57368F.FC01F78@lemburg.com> (mal@lemburg.com)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <3A57368F.FC01F78@lemburg.com>
Message-ID: <200101061520.f06FKpu03218@mira.informatik.hu-berlin.de>

> Given this design idea, how about calling the flag
> Py_TPFLAGS_CHECKTYPES ?!

Sounds good to me.

Martin



From thomas at xs4all.net  Sat Jan  6 17:47:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 17:47:24 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 02:36:39PM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl> <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>
Message-ID: <20010106174724.M2467@xs4all.nl>

On Sat, Jan 06, 2001 at 02:36:39PM +0100, Martin v. Loewis wrote:

> That may have been the original intention; *this* specific flag is not
> of that kind. Please look at abstract.c:binary_op1, which has

You're right, I stand corrected, I retract my proposal :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Sat Jan  6 23:05:23 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:05:23 -0500
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: Your message of "Sat, 06 Jan 2001 09:35:20 CST."
             <14935.15160.130742.390323@beluga.mojam.com> 
References: <14935.15160.130742.390323@beluga.mojam.com> 
Message-ID: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>

> You know, I thought of something (which was probably already obvious to the
> rest of you) while perusing Barry's patch.  Attaching function attributes to
> unbound methods could really function like C++ static data members.  You'd
> have to write accessor functions to make setting the attributes look clean,
> but that wouldn't be all bad.  Precisely because you couldn't modify them
> through the bound method, there'd be no chance you could make the mistake of
> modifying them that way and having them transmogrify into instance
> attributes.
> 
> Here's a quick example:
> 
>     class C:
>         def __init__(self):
>             self.just_resting()
>         __init__.howmany = 0
> 
>         def __del__(self):
>             self.hes_dead()
> 
>         def hes_dead(self):
>             C.__init__.howmany -= 1
> 
>         def just_resting(self):
>             C.__init__.howmany += 1
> 
>         def howmany(self):
>             return C.__init__.howmany
> 
>     def howmany():
>         return C.__init__.howmany
> 
>     c = C()
>     print c.howmany()
>     d = C()
>     print d.howmany()
>     del c
>     print d.howmany()

Skip, I don't find this better than the existing solution, which uses
C._howmany instead of C.__init__.howmany.

True, you can access it as self._howmany and if you assign to
self._howmany you'd transform it into an instance attribute -- but
that falls in the "then don't do that" category.
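
A minimal sketch of the class-attribute idiom and of the pitfall mentioned above. The class body is a cut-down version of Skip's example, with __del__ left out so the result does not depend on when objects are reclaimed.

```python
class C:
    _howmany = 0  # shared by all instances, stored on the class

    def __init__(self):
        C._howmany += 1  # always update through the class

    def howmany(self):
        return C._howmany


a = C()
b = C()

# The "then don't do that" case: assigning through an instance creates
# a new instance attribute that shadows the class attribute on b only.
b._howmany = 99

print(C._howmany, a.howmany(), b._howmany)
```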

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat Jan  6 23:14:44 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:14:44 -0500
Subject: [Python-Dev] Rehabilitating fgets
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>

[Guido]
> ...
> Unfortunately we can't use fgets(), even if it were faster than
> getline(), because it doesn't tell how many characters it read.

Let's think about that a little harder, because it appears to be our only
hope on Windows (the MS fgets isn't optimized like the Perl inner loop, but
it does lock/unlock the stream only at routine entry/exit, and uses a hidden
non-locking (== much faster) variant of getc in the guts -- we've seen that
the "locking" part of MS getc accounts for 17 of 30 seconds in my test
case).

> On files containing null bytes, readline() is supposed to treat
> these like any other character;

fgets does too (at least it does on Windows, and I believe that's std
behavior).  The problem is that it also makes up a null byte on its own.

> If your input is "abc\0def\nxyz\n", the first readline() call
> should return "abc\0def\n".

Yes.

> But with fgets(), you're left to look in the returned buffer for
> a null byte,

Also yes.  But suppose I search "from the right", and ensure the buffer is
free of null bytes before the fgets.  For your input file above, fgets
overwrites the initial 9 bytes of the buffer (assuming the buffer is at
least 9 bytes long ...) with

    "abc\0def\n\0"

and there's no problem if I search from the right.

> and there's no way (in general) to distinguish this result from
> an input file that only consisted of the three characters "abc".

As above, I'm not convinced of that.  The input file "abc" would overwrite
the first four bytes of the buffer with

    "abc\0"

and leave the tail end alone (well, the MS fgets leaves the tail alone,
although I'm not sure ANSI C guarantees that).

Of course I've *read* any number of Unix(tm) FAQs that also claim it's
impossible, but I never believed them either <wink>.

This extra buffer fiddling is surely an expense I don't want to pay, but the
timing evidence on Windows so far says that I can probably search and/or
copy the whole buffer 100 times and still be faster than enduring the
threadsafe getc.

Am I missing something obvious?





From guido at python.org  Sat Jan  6 23:33:00 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:33:00 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: Your message of "Sat, 06 Jan 2001 17:14:44 EST."
             <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com> 
References: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com> 
Message-ID: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>

[Tim suggests to use fgets(), preparing the buffer with non-null
bytes, and searching for a null byte from the right.]

If this is really sufficiently fast, I'd say, go for it.  Looks
bullet-proof as long as the source code to MSVCRT doesn't change. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat Jan  6 23:34:42 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:34:42 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>

[Tim, pondering]
> ... But suppose I search "from the right", and ensure the buffer is
> free of null bytes before the fgets.

Even better, suppose I ensure the buffer is free of both null bytes and
newlines before the fgets; then if I search from the *left* for a newline
and find one, it must be that fgets found a line and it ends right there,
and this should usually obtain.  There's no need to search from the right
unless I don't find a newline ...
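
The search order described in this thread can be simulated in Python. This is a sketch of the idea only, not the code that went into fileobject.c. The names fake_fgets and read_line are invented, and fake_fgets assumes, as the MS fgets is reported to do, that bytes after the terminating null are left untouched.

```python
import io

FILLER = 0x01  # any byte that is neither a null nor a newline


def fake_fgets(buf, stream):
    """Mimic C fgets(): store at most len(buf)-1 bytes, stop after a
    newline, append a null byte, and leave the rest of buf untouched.
    Returns False on immediate EOF (where C would return NULL)."""
    n = 0
    limit = len(buf) - 1
    while n < limit:
        ch = stream.read(1)
        if not ch:
            break
        buf[n] = ch[0]
        n += 1
        if ch == b"\n":
            break
    if n == 0:
        return False
    buf[n] = 0
    return True


def read_line(stream, bufsize=16):
    """Return one line, embedded null bytes included, or b'' at EOF."""
    pieces = []
    while True:
        buf = bytearray([FILLER]) * bufsize  # no nulls, no newlines
        if not fake_fgets(buf, stream):
            break  # EOF before any byte was read
        newline = buf.find(b"\n")  # search from the left first
        if newline >= 0:
            pieces.append(bytes(buf[:newline + 1]))
            break
        end = buf.rfind(b"\0")  # rightmost null is the terminator
        pieces.append(bytes(buf[:end]))
        if end < bufsize - 1:
            break  # short read without a newline: hit EOF
        # Otherwise the buffer filled up mid-line; keep reading.
    return b"".join(pieces)


stream = io.BytesIO(b"abc\0def\nxyz\n")
first_line = read_line(stream)
second_line = read_line(stream)
print(first_line, second_line)
```

The left-to-right newline search handles the common case; the right-to-left null search is only needed for a final line with no newline or a line longer than the buffer.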





From skip at mojam.com  Sun Jan  7 02:15:08 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 19:15:08 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>
References: <14935.15160.130742.390323@beluga.mojam.com>
	<200101062205.RAA23603@cj20424-a.reston1.va.home.com>
Message-ID: <14935.49948.574427.668588@beluga.mojam.com>

    Skip> Attaching function attributes to unbound methods could really
    Skip> function like C++ static data members....

    Guido> Skip, I don't find this better than the existing solution, which
    Guido> uses C._howmany instead of C.__init__.howmany.

It was more a "hey, I never thought of it quite that way" than a "hey, I
think this would be a great new idiom".  In fact, I believe the more
important part of my note was the bit about the attribute error on exit.

I'm sure function attributes will attract their fair share of abuse. ;-)

Skip





From tim_one at email.msn.com  Sun Jan  7 04:16:31 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:16:31 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
Message-ID: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>

I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_builtin fails because raw_input() isn't stripping a trailing newline.
I've got my own code in this area that *may* be to blame, but I don't see
how it could be.  I note that fileobject.c's new function get_line_raw has
the comment

/* Internal routine to get a line for raw_input():
   strip trailing '\n', raise EOFError if EOF reached immediately
*/

but the code doesn't look for a trailing newline (let alone strip one).





From tim_one at email.msn.com  Sun Jan  7 04:33:02 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>

> [Tim suggests to use fgets(), preparing the buffer with non-null
> bytes, and searching for a null byte from the right.]

[Guido]
> If this is really sufficiently fast, I'd say, go for it.  Looks
> bullet-proof as long as the source code to MSVCRT doesn't change. :-)

Surprise?  Despite all the memsets, memchrs (looking for a newline), and
one-at-a-time backward searches (looking for a null byte), it's a huge win
on Windows:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.550  9.578
using_fileinput      28.790 28.781
while_readline       13.120 13.134

The last one was 30.5 seconds before the fgets hackery.

I'll check it in tomorrow after sleeping on it (there's a large pile of
messy endcases (not only does fgets() invent a null byte, it can't tell you
whether it stopped reading due to EOF, so maybe the last line in the file
ends with 10000 null bytes + no newline + exactly lines up with a buffer
boundary -- etc); test_builtin is failing in a closely related area but
nobody would have checked in code that failed a std test <wink>; and it's
been a frustrating day all around).

i-want-my-cable-modem-back-now-ly y'rs  - tim





From esr at thyrsus.com  Sun Jan  7 05:01:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 6 Jan 2001 23:01:25 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>; from tim_one@email.msn.com on Sat, Jan 06, 2001 at 10:33:02PM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>
Message-ID: <20010106230125.A29058@thyrsus.com>

Tim Peters <tim_one at email.msn.com>:
> > [Tim suggests to use fgets(), preparing the buffer with non-null
> > bytes, and searching for a null byte from the right.]

No, I haven't forgotten about the curses autoconfig stuff.  But...

This mess reminds me.  For some work I'm doing right now, it would be
very useful if there were a way to query the end-of-file status of a
file descriptor without actually doing a read.

I don't see this ability anywhere in the 2.0 API.  Questions:

1. Am I missing something obvious?

2. If the answer to 1 is that I am not, in fact, being a dumbass, what
   is the right way to support this?  The obvious alternatives are an 
   eof member (analogous to the existing `closed' member) or an eof()
   method.  I favor the latter.

3. If we agree on a design, I'm willing to implement this at least for
   Unix.  Should be a small project.
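
For seekable files the question can be answered without consuming data by comparing the current position with the end of the file. This is a sketch under that assumption, written for a modern Python. The helper at_eof is hypothetical and is not the eof() method proposed above. It does not work on pipes, sockets or terminals.

```python
import io


def at_eof(f):
    """Return True if a seekable binary stream has no bytes left to read."""
    pos = f.tell()
    end = f.seek(0, io.SEEK_END)  # seek() returns the new position
    f.seek(pos)                   # restore the caller's position
    return pos >= end


stream = io.BytesIO(b"ab")
before = at_eof(stream)
stream.read()
after = at_eof(stream)
print(before, after)
```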
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The direct use of physical force is so poor a solution to the problem of
limited resources that it is commonly employed only by small children and
great nations.
	-- David Friedman



From skip at mojam.com  Sun Jan  7 05:05:22 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 22:05:22 -0600 (CST)
Subject: [Python-Dev] readline module seems crippled - am I missing something?
Message-ID: <14935.60162.726131.593211@beluga.mojam.com>

For a more-or-less throwaway script I'm working on I need a little input
function similar to Emacs's read-from-minibuffer, which accepts both a
prompt and an initial string for the input buffer.  Seems like I ought to be
able to whip something up using readline, but it's not happening.  GNU
readline's docs aren't the greatest, but I thought this simple script would
work:

    import readline
    readline.insert_text("default")
    x = raw_input("?")
    print x

I expected to see an editable "default" displayed after the prompt and have
x default to "default" if I just hit the return key.  I see nothing
displayed after the question mark, and x is the empty string if I just hit
return.  

This does print "default":

    readline.insert_text("default")
    x = readline.get_line_buffer()
    print x

so I know that insert_text and get_line_buffer seem to be working as
intended.  Looking at call_readline in Modules/readline.c I see nothing that
would disrupt the line buffer before the call to readline().

Am I missing something totally obvious about how GNU readline works or the
conditions under which readline is used (only at the interactive prompt?) or
is some required bit of GNU readline not exposed through Python's readline
module?
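
A possible workaround, offered as a sketch rather than a confirmed
diagnosis: readline() appears to start every call with an empty line
buffer, so text inserted before raw_input() runs is discarded.  The sketch
below assumes a readline module that provides set_startup_hook() (later
Python versions do) and is written for Python 3, where raw_input() is
spelled input().  The name input_with_default is hypothetical.

```python
import readline


def input_with_default(prompt, default):
    """Prompt with an editable default (hypothetical helper name)."""
    # The hook runs after readline has reset its buffer for this call,
    # so the inserted text survives and shows up after the prompt.
    readline.set_startup_hook(lambda: readline.insert_text(default))
    try:
        return input(prompt)
    finally:
        # Remove the hook so later prompts start empty again.
        readline.set_startup_hook(None)
```

At an interactive terminal, input_with_default("? ", "default") should
show an editable "default" after the prompt; the hook has no effect when
stdin is not a terminal.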

Skip



From tim.one at home.com  Sun Jan  7 11:09:02 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 7 Jan 2001 05:09:02 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010106230125.A29058@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> For some work I'm doing right now, it would be very useful if
> there were a way to query the end-of-file status of a file
> descriptor without actually doing a read.
>
> I don't see this ability anywhere in the 2.0 API.

When someone says "API", I think "C API".  In that case you can use
feof(stream) directly, or whatever the heck your platform supports for
handles (_eof(handle) on Windows, which I know is an OS you're secretly
longing to master <wink>).

I don't believe there's a way to find out from Python short of trying to
read, though.  Well, I suppose you could try to compare f.tell() to the
size, if you knew that f.tell() and "the size" made sense for f ...
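
The tell()-versus-size comparison can be sketched as follows, assuming a
plain disk file opened in binary mode (the only case where both numbers
clearly make sense).  The helper name at_eof is hypothetical, and the code
is written for Python 3.

```python
import os


def at_eof(f):
    """Guess whether a seekable binary file object is at end-of-file.

    Only meaningful for plain disk files; the answer can go stale if
    another process or thread appends to the file afterwards.
    """
    size = os.fstat(f.fileno()).st_size
    return f.tell() >= size
```

This does no read, but it inherits the muddiness discussed in this
message: a later read may still return data if the file grows.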

> 1. Am I missing something obvious?

I don't know!  I never asked Guido about this, and given that he's not on
vacation now I'm not allowed to channel him.  I would hazard a guess,
though, that he thinks "you do or don't get something back when you read" is
clearer than "you may or may not get something back when you read,
regardless of which answer I give you in response to .eof() -- depending".
The latter is particularly muddy in a threaded environment, even for plain
old disk files.

> 2. If the answer to 1 is that I am not, in fact, being a dumbass,
>    what is the right way to support this?  The obvious alternatives
>    are an eof member (analogous to the existing `closed' member), or
>    an eof() method.  I favor the latter.
>
> 3. If we agree on a design, I'm willing to implement this at least
>    for Unix.  Should be a small project.

I agree an .eof() method would be better than a data member.  Note that
whenever Python internals hit stream EOF today, they call clearerr(), so
simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
make sure that feof() would never be useful <0.8 wink>.

one-of-life's-little-mysteries-ly y'rs  - tim




From gstein at lyra.org  Sun Jan  7 11:46:54 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 02:46:54 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.96,2.97
In-Reply-To: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Jan 05, 2001 at 06:43:07AM -0800
References: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010107024654.W17220@lyra.org>

On Fri, Jan 05, 2001 at 06:43:07AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv3183
> 
> Modified Files:
> 	fileobject.c 
> Log Message:
> Restructured get_line() for clarity and speed.
> 
> - The raw_input() functionality is moved to a separate function.
> 
> - Drop GNU getline() in favor of getc_unlocked(), which exists on more
>   platforms (and is even a tad faster on my system).

The "configure" tests for getline() can be punted if we won't use it any
more...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Sun Jan  7 13:27:57 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 04:27:57 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 03:14:41PM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010107042757.X17220@lyra.org>

It feels wrong. Whatever happened to the "we're all adults here" mantra?

Besides people asking for it, what is a good reason *for* it to be added?

Cheers,
-g

On Fri, Jan 05, 2001 at 03:14:41PM -0500, Guido van Rossum wrote:
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).
> 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev
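
For readers who have not looked at the patch, the behaviour described in
the quoted proposal can be approximated in pure Python.  This is only an
illustration of the access rule, not the SF patch itself; ExportsProxy is
a hypothetical name, and the code is written for Python 3.

```python
import types


class ExportsProxy:
    """Hypothetical wrapper enforcing a module's __exports__ list."""

    def __init__(self, module):
        # Bypass our own __setattr__ while storing the wrapped module.
        object.__setattr__(self, "_module", module)

    def _check(self, name):
        # This is the extra lookup paid on every outside access.
        exports = getattr(self._module, "__exports__", None)
        if exports is not None and name not in exports:
            raise AttributeError("%r is not exported" % name)

    def __getattr__(self, name):
        self._check(name)
        return getattr(self._module, name)

    def __setattr__(self, name, value):
        self._check(name)
        setattr(self._module, name, value)


demo = types.ModuleType("demo")
demo.__exports__ = ["public"]
demo.public = 1
demo.private = 2
guarded = ExportsProxy(demo)   # guarded.private raises AttributeError
```

A module without __exports__ passes through unchanged, which matches the
"even when it is not used" cost mentioned in the quoted message.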

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Sun Jan  7 17:52:11 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 11:52:11 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sat, 06 Jan 2001 23:01:25 EST."
             <20010106230125.A29058@thyrsus.com> 
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>  
            <20010106230125.A29058@thyrsus.com> 
Message-ID: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>

> This mess reminds me.  For some work I'm doing right now, it would be
> very useful if there were a way to query the end-of-file status of a
> file descriptor without actually doing a read.

I hope you really mean file object (== wrapper around stdio FILE
object).  A file descriptor (small little integer in Unix) doesn't
have a way to find this out.

Even for file objects, it is typically only known that there's an EOF
condition after a lowest-level read operation returned 0 bytes.  So in
effect you must still do a read in order to determine EOF status.

I just ran a small test program, and fread() appears to set the eof
status when it returns a short count.  Normally, Python's read() uses
fread() so this might be useful.  However after a readline(), you
can't know the eof status (unless the last line of the file doesn't
end in a newline).

> I don't see this ability anywhere in the 2.0 API.  Questions:
> 
> 1. Am I missing something obvious?
> 
> 2. If the answer to 1 is that I am not, in fact, being a dumbass, what
>    is the right way to support this?  The obvious alternatives are an
>    eof member (analogous to the existing `closed' member), or an eof()
>    method.  I favor the latter.
> 
> 3. If we agree on a design, I'm willing to implement this at least for
>    Unix.  Should be a small project.

Before adding an eof() method, can you explain what your program is
trying to do?  Is it reading from a pipe or socket?  Then select() or
poll() might be useful.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan  7 19:30:32 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:30:32 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 05:09:02AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>
Message-ID: <20010107133032.F4586@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I agree an .eof() method would be better than a data member.  Note that
> whenever Python internals hit stream EOF today, they call clearerr(), so
> simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> make sure that feof() would never be useful <0.8 wink>.

That's inconvenient, but only means the internal Python state flag
that feof() would inspect would have to be checked after each read.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"...The Bill of Rights is a literal and absolute document. The First
Amendment doesn't say you have a right to speak out unless the
government has a 'compelling interest' in censoring the Internet. The
Second Amendment doesn't say you have the right to keep and bear arms
until some madman plants a bomb. The Fourth Amendment doesn't say you
have the right to be secure from search and seizure unless some FBI
agent thinks you fit the profile of a terrorist. The government has no
right to interfere with any of these freedoms under any circumstances."
	-- Harry Browne, 1996 USA presidential candidate, Libertarian Party



From esr at thyrsus.com  Sun Jan  7 19:45:41 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:45:41 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 11:52:11AM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com> <20010106230125.A29058@thyrsus.com> <200101071652.LAA31411@cj20424-a.reston1.va.home.com>
Message-ID: <20010107134541.G4586@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > This mess reminds me.  For some work I'm doing right now, it would be
> > very useful if there were a way to query the end-of-file status of a
> > file descriptor without actually doing a read.
> 
> I hope you really mean file object (== wrapper around stdio FILE
> object).  A file descriptor (small little integer in Unix) doesn't
> have a way to find this out.

You're right, my bad.
 
> Even for file objects, it is typically only known that there's an EOF
> condition after a lowest-level read operation returned 0 bytes.  So in
> effect you must still do a read in order to determine EOF status.
> 
> I just ran a small test program, and fread() appears to set the eof
> status when it returns a short count.  Normally, Python's read() uses
> fread() so this might be useful.  However after a readline(), you
> can't know the eof status (unless the last line of the file doesn't
> end in a newline).

I considered trying a zero-length read() in Python, but this strikes me 
as inelegant even if it would work.

> Before adding an eof() method, can you explain what your program is
> trying to do?  Is it reading from a pipe or socket?  Then select() or
> poll() might be useful.

Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
it's a situation where a markup file can contain sections in two different
languages.  The design requires the first interpreter to exit on seeing
either EOF or a marker that says "switching to second language".  For
reasons too complicated to explain, it would be best if the parser for
the first language didn't simply call the second parser.

The logic I wanted to write amounts to:

while 1:
    line = fp.readline()
    if not line or line == "history":
        break
    interpret_in_language_1(line)

if not fp.feof():
    while 1:
        line = fp.readline()
        if not line:
            break
        interpret_in_language_2(line)

I just tested the zero-length-read method.  That worked.  I guess I'll
use it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy



From martin at loewis.home.cs.tu-berlin.de  Sun Jan  7 19:45:15 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 7 Jan 2001 19:45:15 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>

Authors of extension packages often find the need to auto-import some
of their modules. This is often needed for registration, e.g. a codec
author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
may need to register a search function with codecs.register. This is
currently only possible by writing into sitecustomize.py, which must
be done by the system administrator manually.

To enhance the service of site.py, I've written the patch

http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470

which treats lines in PTH files which start with "import" as
statements and executes them, instead of appending these lines to
sys.path.
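
The rule can be sketched in a few lines.  This is an illustration only,
not the patch: the real site.py does more (for example, it resolves path
entries relative to the site directory), and process_pth_lines is a
hypothetical name.  Written for Python 3.

```python
def process_pth_lines(lines, path, namespace=None):
    """Handle .pth lines: run 'import' lines, append the rest to path."""
    if namespace is None:
        namespace = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue               # skip blank lines and comments
        if line.startswith("import"):
            # Executed as a statement, e.g. an import whose side effect
            # is to register a codec search function.
            exec(line, namespace)
        else:
            path.append(line)      # ordinary entry: extend the path
    return namespace
```

Because the line is executed as-is, whoever can write a PTH file can run
code at startup, which is part of what a review of the idea would weigh.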

The patch is relatively small, but since it is an extension: Do I need
to write a PEP for it?

Regards,
Martin



From tismer at tismer.com  Sun Jan  7 19:05:21 2001
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 07 Jan 2001 20:05:21 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
		<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
		<14934.43496.322436.612746@anthem.wooz.org>
		<20010105152058.A6016@glacier.fnational.com> <14935.11870.360839.235102@beluga.mojam.com>
Message-ID: <3A58AFE1.3AB619BD@tismer.com>


Skip Montanaro wrote:
> 
>     Neil> I think you, Skip and Moshe are missing a big advantage of having
>     Neil> the __exports__ mechanism.  It should allow some attribute access
>     Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
>     Neil> think that optimization could be implemented without too much
>     Neil> difficulty.
> 
> True enough, that hadn't occurred to me.  Knowing that now, I still don't
> think consistency of the interface should suffer as a result of
> under-the-covers performance gains.

Ok, vice versa:
Given that we can support access control via __exports__
for modules, classes and instances as well, *and* if we
can think up a scheme that allows a LOAD_FAST like speedup
for all of these cases at the same time,
then I would say +1, otherwise -0, half-hearted solution.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Sun Jan  7 22:13:01 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:13:01 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 13:30:32 EST."
             <20010107133032.F4586@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>  
            <20010107133032.F4586@thyrsus.com> 
Message-ID: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>

> Tim Peters <tim.one at home.com>:
> > I agree an .eof() method would be better than a data member.  Note that
> > whenever Python internals hit stream EOF today, they call clearerr(), so
> > simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> > make sure that feof() would never be useful <0.8 wink>.
> 
[ESR]
> That's inconvenient, but only means the internal Python state flag
> that feof() would inspect would have to be checked after each read.

This was done because some platforms set feof() when there's still a
possibility to read more (e.g. after an interactive user typed ^D),
while others don't.  It's inconvenient to get an endless stream of
EOFs from stdin when a user typed ^D to one particular prompt, so I
decided to clear the EOF status.

[ESR in a later message]
> I considered trying a zero-length read() in Python, but this strikes me 
> as inelegant even if it would work.

I doubt that a zero-length read conveys any information.  It should
return "" whether or not there is more to read!  Plus, look at the
implementation of readline() (file_readline() in
Objects/fileobject.c): it shortcuts the n == 0 case and returns an
empty string without touching the file.

[me]
> > Before adding an eof() method, can you explain what your program is
> > trying to do?  Is it reading from a pipe or socket?  Then select() or
> > poll() might be useful.

[ESR again]
> Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
> it's a situation where a markup file can contain sections in two different
> languages.  The design requires the first interpreter to exit on seeing
> either EOF or a marker that says "switching to second language".  For
> reasons too complicated to explain, it would be best if the parser for
> the first language didn't simply call the second parser.
> 
> The logic I wanted to write amounts to:
> 
> while 1:
>     line = fp.readline()
>     if not line or line == "history":
>         break
>     interpret_in_language_1(line)
> 
> if not fp.feof():
>     while 1:
>         line = fp.readline()
>         if not line:
>             break
>         interpret_in_language_2(line)
> 
> I just tested the zero-length-read method.  That worked.  I guess I'll
> use it.

Bizarre (given what I know about zero-length read).  But in the above
code, you can replace "if not fp.feof()" with "if line".  In other
words, you just have to carry the state over within your program.

So, I see no reason why the logic in your program couldn't take care
of this, which in general is a better way to solve a problem than
changing the language.

Also note that in Python it's no sin to attempt to read a line even
when the file is already at EOF -- you will simply get an empty line
again.
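
Carrying the state over can be sketched like this.  The function name
split_sections and the list-appending stand-ins for the two interpreters
are hypothetical, and the marker is assumed to sit on a line of its own
(readline() keeps the trailing newline, so the comparison includes it).
Written for Python 3.

```python
import io


def split_sections(fp, marker="history\n"):
    """Return (first_language_lines, second_language_lines)."""
    first, second = [], []
    line = fp.readline()
    while line and line != marker:
        first.append(line)       # stand-in for the first interpreter
        line = fp.readline()
    # Here 'line' is "" at EOF and the marker otherwise, so it already
    # holds the answer that fp.feof() would have given.
    if line:
        for line in fp:
            second.append(line)  # stand-in for the second interpreter
    return first, second


sample = io.StringIO("a\nhistory\nb\nc\n")
sections = split_sections(sample)
```

With the sample above, the first list holds the line before the marker
and the second holds the two lines after it.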

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sun Jan  7 22:29:46 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 7 Jan 2001 22:29:46 +0100
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>              <20010107133032.F4586@thyrsus.com>  <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <035901c078f0$f6180f70$e46940d5@hagrid>

Guido van Rossum wrote:
> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.

and if that's too hard, just hide the state in
a class:

class FileWrapper:

    def __init__(self, file):
        self.__file = file
        self.__line = None

    def __more(self):
        # try reading another line
        if not self.__line:
            self.__line = self.__file.readline()

    def eof(self):
        self.__more()
        return not self.__line

    def readline(self):
        self.__more()
        line = self.__line
        self.__line = None
        return line

file = open("myfile.txt")

file = FileWrapper(file)

while not file.eof():
    print repr(file.readline())

</F>




From guido at python.org  Sun Jan  7 22:32:26 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:32:26 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
In-Reply-To: Your message of "Sat, 06 Jan 2001 22:16:31 EST."
             <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com> 
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com> 
Message-ID: <200101072132.QAA32627@cj20424-a.reston1.va.home.com>

> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.
> 
> test_builtin fails because raw_input() isn't stripping a trailing newline.
> I've got my own code in this area that *may* be to blame, but I don't see
> how it could be.  I note that fileobject.c's new function get_line_raw has
> the comment
> 
> /* Internal routine to get a line for raw_input():
>    strip trailing '\n', raise EOFError if EOF reached immediately
> */
> 
> but the code doesn't look for a trailing newline (let alone strip one).

My bad.  Try the latest CVS now.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan  7 23:15:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 17:15:27 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 04:13:01PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <20010107171527.A5093@thyrsus.com>

Guido van Rossum <guido at python.org>:
> [ESR in a later message]
> > I considered trying a zero-length read() in Python, but this strikes me 
> > as inelegant even if it would work.
> 
> I doubt that a zero-length read conveys any information.  It should
> return "" whether or not there is more to read!

Duh.  Of course it would.  

You know, I've always been half-consciously dissatisfied with Python's
use of "" as an EOF marker, and now I know why.  It's precisely
because there's no way to distinguish these cases.  I think a zero-length
read ought to return "" and a read on EOF ought to return None.

> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.
> 
> So, I see no reason why the logic in your program couldn't take care
> of this, which in general is a better way to solve a problem than
> changing the language.

OK, two objections, one practical and one (more important) esthetic:

Practical: I guess I oversimplified the code for expository purposes.
What's actually going on is that I have two parser classes both based
on shlex -- they do character-at-a-time input and don't actually
*have* accessible line buffers.

Esthetic: Yes, I can have the first parser set a flag, or return some
EOF token.  But this seems deeply wrong to me, because EOFness is not
a property of the parser but of the underlying stream object.  It
seems to me that my program ought to be able to ask the stream object
whether it's at EOF rather than carrying its own flag for that state.

In Python as it is, there's no clean way to do this.  I'd have to do a
nonzero-length read to test it (I failed to check the right alternate
case before when I tried zero-length).  That's really broken.  What if
neither the underlying stream nor the parser supports pushback?
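
For character-at-a-time readers, one workaround is a thin wrapper that
supplies the missing pushback itself.  This is a sketch under the
assumption that every reader of the stream goes through the wrapper;
PeekableStream is a hypothetical name, and the code is written for
Python 3.  It is the same idea as the FileWrapper class posted earlier
in this thread, applied to single characters.

```python
import io


class PeekableStream:
    """Hypothetical wrapper giving one character of lookahead."""

    def __init__(self, stream):
        self._stream = stream
        self._pending = None     # character read ahead, not yet consumed

    def read_char(self):
        if self._pending is not None:
            ch, self._pending = self._pending, None
            return ch
        return self._stream.read(1)

    def eof(self):
        # Still a real (nonzero-length) read underneath; the wrapper just
        # keeps the character so the next reader does not lose it.
        if self._pending is None:
            self._pending = self._stream.read(1)
        return not self._pending


stream = PeekableStream(io.StringIO("ab"))
```

It does not answer the esthetic objection, since the EOF state now lives
in the wrapper rather than in the file object itself.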

Do you see now why I think this is a more general issue?

Now, another and more general way to handle this would be to make an
equivalent of the old FIONREAD ioctl part of Python's standard set of
file object methods -- a way to ask "how many bytes are ready to be
read in this stream?"

Trivial to make it work for plain files, of course.  Harder to make it
work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
st_size field from fstat() (corrected for the current seek address
if you're reading a plain file) would be a good start.
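
A rough sketch of such a query, assuming a Unix system where the fcntl
and termios modules are available and termios exposes FIONREAD.  The
function name bytes_ready is hypothetical; the code is written for
Python 3 and expects a binary file object.

```python
import fcntl
import os
import stat
import struct
import termios


def bytes_ready(f):
    """Estimate how many bytes can be read from f right now (Unix)."""
    fd = f.fileno()
    info = os.fstat(fd)
    if stat.S_ISREG(info.st_mode):
        # Plain file: size corrected for the current seek position.
        return max(info.st_size - f.tell(), 0)
    # Pipes, fifos, sockets, terminals: ask the kernel via FIONREAD.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]
```

Note that a result of 0 on a pipe or socket means only "nothing buffered
yet", not EOF, so this does not replace an EOF test for those streams.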
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.



From tismer at tismer.com  Sun Jan  7 23:37:55 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 00:37:55 +0200
Subject: [Python-Dev] ANN: Stackless Python 2.0
Message-ID: <3A58EFC3.5A722FF0@tismer.com>

Dear community,

I'm happy to announce that

		Stackless Python 2.0

is finally ready and available for download.

Stackless Python for Python 1.5.2+ also got some minor
enhancements. Both versions are available as Win32
installer files here:

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc15-win32.exe

Speed: Stackless Python for Python 2.0 is again a bit faster
than the original. This time even better: About 9-10 percent.
I have to say that optimization was much harder this time.
My speed patches are now done by a Python script, which will
make maintenance and diff reading much easier in the future.

There is now also a bit of example code available, like
the uthread9.py Microthreads module from Will Ware, Just van Rossum,
and Mike Fletcher.

Source code and an update to the website will become available in
the next few days.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Mon Jan  8 01:26:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 01:26:00 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin 
 test_charmapcodec test_pow
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>
Message-ID: <3A590918.E90031AA@lemburg.com>

Tim Peters wrote:
> 
> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_charmapcodec is my fault... I should run the tests in a
clean room environment before checkin: my PYTHONPATH picked up
some other file which it was not supposed to do.

I'll fix it next week.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan  8 05:13:26 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 7 Jan 2001 23:13:26 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>

The "Win32" readline() hack is now checked in, but there's really nothing
Win32-specific about it anymore.  It makes one mild assumption about what
the C std doesn't clearly address but may have intended:  that in case of a
non-NULL return, fgets doesn't overwrite any of the buffer positions beyond
the terminating null byte (the std is clear that it doesn't overwrite
anything at all in case of a NULL-because-EOF return, but I can't say
whether they're pointing that out as a consequence, or pointing that out as
an exception).
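
The idea can be simulated in Python.  This is not the code in
fileobject.c: fake_fgets is a hypothetical stand-in for C's fgets() that
builds in the assumption above (nothing past the terminating null is
overwritten), and read_chunk shows why the rightmost null gives the true
length even when the data contains null bytes.  Written for Python 3.

```python
def fake_fgets(buf, data, pos):
    """Simulate fgets(): copy one line of data[pos:] into buf.

    Writes at most len(buf) - 1 bytes plus a terminating null and leaves
    the rest of buf untouched.  Returns the new position, or None at EOF
    (in which case buf is not modified at all).
    """
    if pos >= len(data):
        return None
    end = min(pos + len(buf) - 1, len(data))
    newline = data.find(b"\n", pos, end)
    if newline != -1:
        end = newline + 1
    n = end - pos
    buf[:n] = data[pos:end]
    buf[n] = 0
    return end


def read_chunk(data, pos, bufsize=16):
    """Return (bytes_read, new_pos) using the rightmost-null trick."""
    buf = bytearray(b"\x01" * bufsize)   # prefill with non-null bytes
    new_pos = fake_fgets(buf, data, pos)
    if new_pos is None:
        return b"", pos
    # Everything right of the terminator is still prefill, so the
    # rightmost null is the one fgets wrote, not one from the data.
    length = buf.rindex(b"\x00")
    return bytes(buf[:length]), new_pos


chunk, position = read_chunk(b"a\x00b\nrest", 0)
```

A strlen()-style scan from the left would stop at the embedded null and
report a length of 1 for the first line of this sample.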

I'm curious about how it performs (relative to the getc_unlocked hack) on
other platforms.  If you'd like to try that, just recompile fileobject.c
with

    USE_MS_GETLINE_HACK

#define'd.  It should *work* on any platform with fgets() meeting the
assumption.  The new test_bufio.py std test gives it a pretty good
correctness workout, if you're worried about that.




From esr at snark.thyrsus.com  Mon Jan  8 05:16:53 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 23:16:53 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
Message-ID: <200101080416.f084GrM10912@snark.thyrsus.com>

Setting things up so curses is autoconfigured into the default build
if your system has it in the expected places turned out to be dead
easy.  Some clever person (the BDFL himself?) wrote the build process
so that there is *already* a Setup.config.in that gets configure
expansions done on it, with the generated Setup.config used when
makesetup does its magic.

As a bonus, I've also added autoconfiguration for readline.  A small
detail, but one which I suspect many people building their own Pythons
frequently trip over.

The technique generalizes easily.  The archetype for a facility for
autoconfiguring libfoo with a Python extension foo.c if it's present
has just two steps:

Add this to Modules/Setup.config.in:

@USE_FOO_MODULE@foo foo.c -lfoo

Add this to configure.in:

# This is used to generate Setup.config
AC_SUBST(USE_FOO_MODULE)
AC_CHECK_LIB(foo, random_foo_function, 
	[USE_FOO_MODULE=""],
	[USE_FOO_MODULE="#"])

(Apologies for the lack of description with the patch.  I tripped over
a SourceForge interface bug.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The possession of arms by the people is the ultimate warrant
that government governs only with the consent of the governed.
        -- Jeff Snyder



From tim.one at home.com  Mon Jan  8 06:34:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 00:34:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <3A590918.E90031AA@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>

An update:  test_builtin works again (thanks, Guido!), and test_charmapcodec
will "next week" (thanks, MAL!).

Still unknown (to me):  is the test_pow failure unique to Windows?  One
response from a Unix(tm) geek would settle that.




From nas at arctrix.com  Sun Jan  7 23:59:49 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 14:59:49 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 12:34:20AM -0500
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>
Message-ID: <20010107145949.A14166@glacier.fnational.com>

On Mon, Jan 08, 2001 at 12:34:20AM -0500, Tim Peters wrote:
> Still unknown (to me):  is the test_pow failure unique to Windows?  One
> response from a Unix(tm) geek would settle that.

It works fine for me on Linux.  I thought I tested on Windows
before checking in the coerce patch.  I'll try again.

  Neil



From nas at arctrix.com  Mon Jan  8 00:29:14 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:29:14 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107145949.A14166@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 02:59:49PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com>
Message-ID: <20010107152914.A14228@glacier.fnational.com>

On Sun, Jan 07, 2001 at 02:59:49PM -0800, Neil Schemenauer wrote:
> It works fine for me on Linux.  I thought I tested on Windows
> before checking in the coerce patch.  I'll try again.

Weird. rt.bat does not run the test_pow script.  If I run
"regrtest test_pow" then the test fails.  It could be a problem
with line endings (I copied the source for a Unix CVS checkout).

Anyhow, I found the bug.  I don't know how test_pow was passing
under Linux.  Time to reboot again.

  Neil



From tim.one at home.com  Mon Jan  8 07:39:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 01:39:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>

[NeilS]
> Weird. rt.bat does not run the test_pow script.

Works for me, else I never would have noticed <wink>.  Also works for me in
single-test mode:

C:\Code\python\dist\src\PCbuild>rt test_pow

C:\Code\python\dist\src\PCbuild>python ../lib/test/regrtest.py test_pow
test_pow
The actual stdout doesn't match the expected stdout.
This much did match (between asterisk lines):
**********************************************************************
test_pow
Testing integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing long integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing floating point mode...
    Testing 3-argument pow() function...
The number in both columns should match.
3 3
-5 -5
-1 -1
5 5
-3 -3
-7 -7

3L 3L
-5L -5L
-1L -1L
5L 5L
-3L -3L
-7L -7L

3.0 3.0
-5.0 -5.0
-1.0 -1.0
-7.0 -7.0

**********************************************************************
Then ...
We expected (repr): ''
But instead we got: 'Float mismatch:'
test test_pow failed -- Writing: 'Float mismatch:', expected: ''
1 test failed: test_pow

C:\Code\python\dist\src\PCbuild>

That may point to the problem, too:  the canned output file is truncated?
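
A rough sketch of the kind of line-by-line comparison that produces a report
like the one above. This is not regrtest's actual code; compare_output() and
the sample data are made up for illustration. It shows why "expected: ''"
appears whenever the real output runs past the end of the canned file, whether
the canned file is too short or the test printed something extra.

```python
def compare_output(expected_lines, actual_lines):
    """Return (matched, expected, got) for the first differing line.

    expected and got are both None when the outputs agree completely.
    """
    matched = []
    for index, got in enumerate(actual_lines):
        # Past the end of the canned output, nothing more is expected.
        expected = expected_lines[index] if index < len(expected_lines) else ""
        if got != expected:
            return matched, expected, got
        matched.append(got)
    if len(expected_lines) > len(actual_lines):
        # The test stopped printing before the canned output ran out.
        return matched, expected_lines[len(actual_lines)], ""
    return matched, None, None

canned = ["test_pow", "3.0 3.0", "-7.0 -7.0"]   # hypothetical canned output
actual = canned + ["Float mismatch:"]           # the run printed one extra line

matched, expected, got = compare_output(canned, actual)
print("We expected (repr):", repr(expected))
print("But instead we got:", repr(got))
```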

> If I run "regrtest test_pow" then the test fails.  It could be a
> problem with line endings (I copied the source for a Unix CVS
> checkout).

Don't understand; e.g., "copied" what, from where to where?  I'm not sure I
gave you write access to my box, and hacking into Windows machines is uncool
because it's not challenging <wink>.

> Anyhow, I found the bug.  I don't know how test_pow was passing
> under Linux.  Time to reboot again.

Cool!  BTW, Windows solves the "don't reboot enough" problem for you via
automation, sometimes on an hourly basis.

Thanks for sharing the brain cells, Neil!




From thomas at xs4all.net  Mon Jan  8 07:44:11 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 07:44:11 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101080416.f084GrM10912@snark.thyrsus.com>; from esr@snark.thyrsus.com on Sun, Jan 07, 2001 at 11:16:53PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com>
Message-ID: <20010108074411.N2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> Setting things up so curses is autoconfigured into the default build
> if your system has it in the expected places turned out to be dead
> easy.  Some clever person (the BDFL himself?) wrote the build process
> so that there is *already* a Setup.config.in that gets configure
> expansions done on it, with the generated Setup.config used when
> makesetup does its magic.

Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
auto-detect bsddb. However, I still think it should be a separate
'configure', in the Modules directory. Especially now that Andrew is
practically checking in the distutils setup ;) The main configure can make
an educated guess whether Python and distutils are available, and call
configure with some passed-through options if not. It does depend on what
the distutils setup does, though, and I'll shamefully admit that I haven't
looked at that ;P

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Mon Jan  8 00:51:16 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:51:16 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 01:39:20AM -0500
References: <20010107152914.A14228@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>
Message-ID: <20010107155116.A14312@glacier.fnational.com>

On Mon, Jan 08, 2001 at 01:39:20AM -0500, Tim Peters wrote:
> [NeilS]
> > If I run "regrtest test_pow" then the test fails.  It could be a
> > problem with line endings (I copied the source for a Unix CVS
> > checkout).
> 
> Don't understand; e.g., "copied" what, from where to where?

I should have been clearer.  I mean the problem with rt.bat not
running test_pow.  I copied the CVS source from my Linux ext2
filesystem to a VFAT filesystem.  I was too lazy to fix the line
endings.

  Neil



From nas at arctrix.com  Mon Jan  8 00:52:38 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:52:38 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 03:29:14PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com> <20010107152914.A14228@glacier.fnational.com>
Message-ID: <20010107155238.A14291@glacier.fnational.com>

On Sun, Jan 07, 2001 at 03:29:14PM -0800, Neil Schemenauer wrote:
> I don't know how test_pow was passing under Linux.

Under Linux with the buggy float_pow:

    >>> pow(10.0, 0, 10)
    nan
    >>> pow(10.0, 0, 10) == 1
    1
    >>> pow(10.0, 0, 10) == 0
    1

Under Windows NAN obviously behaves differently.
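
For comparison, a small sketch assuming a current CPython 3 interpreter rather
than the 2.1 alpha under discussion: three-argument pow() is defined for
integers only, so the float call in the session above is rejected outright
instead of yielding a NaN.

```python
# Integer three-argument pow() is modular exponentiation.
int_result = pow(10, 0, 10)
print(int_result)        # 1
print(pow(3, 4, 5))      # 1, since 81 % 5 == 1

# With a float argument and a modulus, current CPython raises TypeError.
float_mod_rejected = False
try:
    pow(10.0, 0, 10)
except TypeError as exc:
    float_mod_rejected = True
    print("TypeError:", exc)
```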

  floating-point-is-fun-ly y'rs Neil



From esr at thyrsus.com  Mon Jan  8 07:49:45 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 01:49:45 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108074411.N2467@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 07:44:11AM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>
Message-ID: <20010108014945.A19516@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> > Setting things up so curses is autoconfigured into the default build
> > if your system has it in the expected places turned out to be dead
> > easy.  Some clever person (the BDFL himself?) wrote the build process
> > so that there is *already* a Setup.config.in that gets configure
> > expansions done on it, with the generated Setup.config used when
> > makesetup does its magic.
> 
> Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
> auto-detect bsddb. However, I still think it should be a separate
> 'configure', in the Modules directory.

You may be right.  Still, this patch solves the immediate problem in a
reasonably clean way, and I urge that it should go in.  We can do a
more complete reorganization of the build process later.  (I'll help with
that; I'm pretty expert with autoconf and friends.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"As to the species of exercise, I advise the gun. While this gives
[only] moderate exercise to the body, it gives boldness, enterprise,
and independence to the mind.  Games played with the ball and others
of that nature, are too violent for the body and stamp no character on
the mind. Let your gun, therefore, be the constant companion to your
walks."
        -- Thomas Jefferson, writing to his teenaged nephew.



From tim.one at home.com  Mon Jan  8 08:05:46 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:05:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>

Well, I like __exports__ (but not some details of the patch, for which see
my SF comments).  Guido is aware of the optimization possibilities, but
that's not what's driving it.  I don't know why he likes it; I like it
because the only normal use for a module is to do module.attr, or "from
module import attr", and dir(module) very often exposes stuff today that the
module author had no intention of exporting.  For example, if I do

    import os
    dir(os)

under CVS Python today, on my box I see that os exports "i".  It's bound to
_exit.  That's baffling, and is purely an accident of how module os.py
initialization works when you're running on Windows.

Couple that with that I've hardly ever seen (or bothered to write) a module
docstring spelling out everything a module *intends* to export, and an
__exports__ line near the top (when present) would also automagically give a
solid answer to that question.

modules aren't classes or instances, and in normal practice modules
accumulate all sorts of accidental attrs (due to careless (== normal)
imports, and module init code).  It doesn't make any *sense* that os exports
"sys" either, or that random exports "cos", or that cgi exports "string", or
... this inelegance is ubiquitous.
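
The accidental-export effect can be sketched with __all__, the list that
"from module import *" consults. It is used here only as a stand-in for the
proposed __exports__, whose semantics in the patch may differ; in particular
__all__ does not change what dir(module) shows. The module names
demo_accidental and demo_explicit are made up for the illustration.

```python
import sys
import types

def make_module(name, source):
    """Build a throwaway module from source text and register it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module

def star_import(name):
    """Return the public names that 'from name import *' would bind."""
    namespace = {}
    exec(f"from {name} import *", namespace)
    return sorted(key for key in namespace if not key.startswith("__"))

BODY = "import sys\n\ndef spam():\n    return 'spam'\n"
make_module("demo_accidental", BODY)
make_module("demo_explicit", "__all__ = ['spam']\n" + BODY)

print(star_import("demo_accidental"))  # ['spam', 'sys']
print(star_import("demo_explicit"))    # ['spam']
```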

In a world with an __exports__ that gets used, though, I do wonder whether
people will or won't export their test() functions.  I really like that they
do now.

or-maybe-it's-just-that-i-like-modules-that-*have*-a-
    test-function<wink>-ly y'rs  - tim





From gstein at lyra.org  Mon Jan  8 08:25:32 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 23:25:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:05:46AM -0500
References: <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010107232532.V17220@lyra.org>

On Mon, Jan 08, 2001 at 02:05:46AM -0500, Tim Peters wrote:
>...
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

Simple question: so what?

"Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Mon Jan  8 08:29:39 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:29:39 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107155238.A14291@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>

[Neil Schemenauer]
> Under Linux with the buggy float_pow:
>
>     >>> pow(10.0, 0, 10)
>     nan
>     >>> pow(10.0, 0, 10) == 1
>     1
>     >>> pow(10.0, 0, 10) == 0
>     1
>
> Under Windows NAN obviously behaves differently.

Comparisons with NaN are a platform-dependent accident, partly because some
C compilers generate nonsense code, partly because Python isn't coded to
cater to NaN's peculiarities either.  The behavior under Windows is
(accidentally) better in these cases today (NaN should never compare equal
to anything -- not even to itself -- and, curiously, MSVC's codegen mistakes
cancel out Python's mistakes in this case!).
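
A short sketch of the behavior described as correct here, assuming a current
CPython build on a platform with IEEE 754 doubles (not the 2001 builds in this
thread): a NaN compares unequal to everything, itself included, and
math.isnan() is the reliable test.

```python
import math

nan = float("nan")

print(nan == nan)              # False
print(nan != nan)              # True
print(nan == 1.0)              # False
print(nan < 1.0, nan > 1.0)    # False False
print(math.isnan(nan))         # True
```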

Thank you for fixing the bug.  Only test_charmapcodec is failing for me now,
and MAL knows the cause and cure.

nothing-can-stop-the-alpha-now-ly y'rs  - tim




From thomas at xs4all.net  Mon Jan  8 08:42:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 08:42:30 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:29:39AM -0500
References: <20010107155238.A14291@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>
Message-ID: <20010108084230.O2467@xs4all.nl>

On Mon, Jan 08, 2001 at 02:29:39AM -0500, Tim Peters wrote:

> (NaN should never compare equal to anything -- not even to itself

You know that's impossible, in Python, right ? (Due to the shortcut taken by
'==', based on object identity.) Is that going to be 'fixed', too ? :)
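
A sketch of where an identity shortcut shows up, assuming a current CPython
interpreter (the comparison code of 2001 differed): float '==' itself follows
IEEE 754, but container operations check "is" before "==", so the same NaN
object is found in a list that holds it.

```python
nan = float("nan")
other = float("nan")   # a second, distinct NaN object

print(nan == nan)       # False: float comparison follows IEEE 754
print(nan is nan)       # True: same object
print(nan in [nan])     # True: containment checks identity before equality
print([nan] == [nan])   # True: list comparison takes the same shortcut
print(other in [nan])   # False: different object, and NaN != NaN
```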

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From ping at lfw.org  Mon Jan  8 08:51:11 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 7 Jan 2001 23:51:11 -0800 (PST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>

Hi again.

Sorry to bother you if you're busy -- i haven't seen any responses
about inspect.py for a few days and wanted to know what your
reactions were.  The module and test suite are still at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

The only change since my announcement last Wednesday is that
getframe() has been renamed to getframeinfo().

Thanks,


-- ?!ng

"Old code doesn't die -- it just smells that way."
    -- Bill Frantz




From tim.one at home.com  Mon Jan  8 09:17:57 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:57 -0500
Subject: NaN nonsense (was RE: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow)
In-Reply-To: <20010108084230.O2467@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEPIHAA.tim.one@home.com>

>> (NaN should never compare equal to anything -- not even to itself

[Thomas Wouters]
> You know that's impossible, in Python, right ? (Due to the
> shortcut taken by '==', based on object identity.)

Surely you jest:  I probably knew that while you were still nursing <wink>.

OTOH, Python on WinTel comes remarkably close (by accident):

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Jan  5 2001, 00:33:19) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> inf = 1e300**2
>>> inf
1.#INF
>>> nan = inf - inf
>>> nan
-1.#IND
>>> nan2 = nan * 1.0
>>> nan2
-1.#IND
>>> nan == nan2
0
>>>

> Is that going to be 'fixed', too ? :)

Not if I can help it.  I'd be in favor of adding an fcmp function that needs
to be called explicitly when you want the full complexity of 754
comparisons.  Count them all up, and there are 32 distinct 754 binary float
comparison operators!  The 754 std says 26 (from memory, may be 2 more or
less) of those have to be supplied, but-- since 754 is not a language
std --says nothing about how they're to be spelled.
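
No fcmp signature is given in this thread, so the following is only one
possible shape for such a function. It classifies a pair of floats into the
four mutually exclusive IEEE 754 relations (less, equal, greater, unordered),
from which the richer comparison predicates can be built.

```python
import math

def fcmp(x, y):
    """Hypothetical helper: name the IEEE 754 relation between x and y."""
    if math.isnan(x) or math.isnan(y):
        return "unordered"
    if x < y:
        return "less"
    if x > y:
        return "greater"
    return "equal"

nan = float("nan")
print(fcmp(1.0, 2.0))   # less
print(fcmp(2.0, 2.0))   # equal
print(fcmp(3.0, 2.0))   # greater
print(fcmp(nan, nan))   # unordered
```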

OTOH, C99 resolutely tries to map that into C, and 754 True Believers will
use that as a club.

On the third hand, as Tom MacDonald posted here earlier (he was X3J11
chair), he's not sure anyone will ever implement C99 in whole.  The
complexities of full 754 support are a large part of why he worries about
that.

too-much-too-late-ly y'rs  - tim




From tim.one at home.com  Mon Jan  8 09:17:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:59 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>

[Greg Stein]
> Simple question: so what?
>
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Couldn't care less about the module author.  It's the module user who has to
sort this stuff out.  "Don't use 'import *'" is good advice but not followed
either, and after I do

from MyPackage import sys  # intentionally exports its own sys
from GregSnort import *    # accidentally exports some other sys

madness ensues.  Like I said, it's inelegant at best.

Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
a break.




From gstein at lyra.org  Mon Jan  8 09:26:03 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 00:26:03 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:17:59AM -0500
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
Message-ID: <20010108002603.X17220@lyra.org>

On Mon, Jan 08, 2001 at 03:17:59AM -0500, Tim Peters wrote:
> [Greg Stein]
> > Simple question: so what?
> >
> > "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*
> 
> Couldn't care less about the module author.  It's the module user who has to
> sort this stuff out.  "Don't use 'import *'" is good advice but not followed
> either, and after I do
> 
> from MyPackage import sys  # intentionally exports its own sys
> from GregSnort import *    # accidentally exports some other sys
> 
> madness ensues.  Like I said, it's inelegant at best.
> 
> Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
> module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
> a break.

hehe... adding __exports__ to your module is fine. Adding more crud to
Python, in opposition to the "we're all adults" motto, doesn't seem Right.

Somebody wants to use "from foo import *" on a module not designed for it?
Too bad for them. If you're suggesting __exports__ is to patch over problems
caused by "from foo import *", then I think you're barking up the wrong tree
:-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Mon Jan  8 17:50:57 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon,  8 Jan 2001 18:50:57 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>

[Tim Peters]
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

[Greg Stein]
> Simple question: so what?
> 
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Let me "me too" here:
Put another way, what Greg said is just a rephrase of "don't use from
foo import * unless foo's docos say it's OK". Add to that the simple
access control of a leading underscore, and I don't see any place
which needs it.

Something better to do would be to use 
import foo as _foo

in some standard library modules, and minimize using from foo import bar
in them. Since everyone knows that leading underscore means "implementation
detail - ignore at your convenience, use at your peril", this would keep
the "we're all adults" philosophy of Python, with all the advantages
*I* see in __exports__.
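
A sketch of the underscore-alias approach, assuming current CPython import
semantics. The module name demo_underscore is made up. Without an __all__,
"from module import *" skips names that begin with an underscore, so the
aliased import stays private while a plain import still leaks.

```python
import sys
import types

SOURCE = (
    "import os as _os\n"     # aliased: hidden from import *
    "import sys\n"           # not aliased: leaks as a public name
    "\n"
    "def spam():\n"
    "    return _os.sep\n"
)

module = types.ModuleType("demo_underscore")
exec(SOURCE, module.__dict__)
sys.modules["demo_underscore"] = module

namespace = {}
exec("from demo_underscore import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)   # ['spam', 'sys']
```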

One more point against __exports__, which I hoped I would not have to
make (but when I'm up against the timbot *and* Guido, I need to pull
out the heavy artillery): it would *totally* stop any hope in the
future of module level __getattr__ (or at least complicate the semantics).
I think Alex M. is thinking of a PEP, but he's taking his time, since
no PEPs can be considered until 2.1 is out.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Jan  8 09:49:58 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:49:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108002603.X17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFBIHAA.tim.one@home.com>

[Greg Stein]
> hehe... adding __exports__ to your module is fine. Adding more
> crud to Python, in opposition to the "we're all adults" motto,
> doesn't seem Right.

My idea of what's Right is copied from my boss <wink>.

> Somebody wants to use "from foo import *" on a module not designed
> for it?  Too bad for them.

How is someone supposed to know whether a module "was designed" for import*?
Even Tkinter (which just about everyone does "import *" on) also exports
sys, and everything from the "types" module, by accident too.

> If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the
> wrong tree
> :-)

Indeed.  But I'm suggesting that the problems that *can* arise from
"import*" illustrate the fundamental silliness of exporting things by
accident.  It's come up much more often for me when I'm looking over
someone's shoulder, teaching them how to use dir() in an interactive shell
to answer their own damn questions <0.5 wink>.  It's usually the case that
dir(M) shows them something that isn't documented, and over time I am *not*
pleased that "oh, I guess the 'string' in there is just crap" is how they
learn to view it.

I can live without __exports__; but I'd prefer not to, because I would
always use it if it were there.

if-i'd-both-use-it-and-heartily-recommend-it-it's-hard-to-
    oppose-it-ly y'rs  - tim




From m.favas at per.dem.csiro.au  Mon Jan  8 12:48:40 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Mon, 08 Jan 2001 19:48:40 +0800
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
Message-ID: <3A59A918.E0D02E0D@per.dem.csiro.au>

I last successfully downloaded from CVS, compiled, linked and tested on
Dec. 22 last year. For the last week or so, the current CVS
_cursesmodule.c gives a bunch of compiler warning messages of the form:

cc: Warning: ./_cursesmodule.c, line 619: In this statement, "derwin(...)" of type "int", is being converted to "pointer to struct _win_st". (cvtdiftypes)
  win = derwin(self->win,nlines,ncols,begin_y,begin_x);
--^
cc: Warning: ./_cursesmodule.c, line 1259: In this statement, "subpad(...)" of type "int", is being converted to "pointer to struct _win_st". (cvtdiftypes)
    win = subpad(self->win, nlines, ncols, begin_y, begin_x);
----^
cc: Warning: ./_cursesmodule.c, line 1488: In this statement, "termname(...)" of type "int", is being converted to "pointer to const char". (cvtdiftypes)
NoArgReturnStringFunction(termname)
^
(more elided)

and

cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg1" is fetched but not initialized.  And there may be other such fetches of this variable that have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg2" is fetched but not initialized.  And there may be other such fetches of this variable that have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
(more elided)

and at link time, fails with:

ld:
Unresolved:
getbegyx
getmaxyx
getparyx


I've held off bothering anyone about this, but it begins to look as
though no-one else has noticed... My platform? Tru64 Unix, V4.0F (aka
OSF1). The recent pow() bug hit this platform, too. Happy to do any
testing...



-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From guido at python.org  Mon Jan  8 15:27:50 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:27:50 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 01:49:45 EST."
             <20010108014945.A19516@thyrsus.com> 
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>  
            <20010108014945.A19516@thyrsus.com> 
Message-ID: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>

> You may be right.  Still, this patch solves the immediate problem in a
> reasonably clean way, and I urge that it should go in.  We can do a
> more complete reorganization of the build process later.  (I'll help with
> that; I'm pretty expert with autoconf and friends.)

I expect Andrew's code to go in before 2.1 is released.  So I don't
see a reason why we should hurry and check in a stop-gap measure.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 15:33:09 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:33:09 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 00:26:03 PST."
             <20010108002603.X17220@lyra.org> 
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
            <20010108002603.X17220@lyra.org> 
Message-ID: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>

> hehe... adding __exports__ to your module is fine. Adding more crud to
> Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> 
> Somebody wants to use "from foo import *" on a module not designed for it?
> Too bad for them. If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the wrong tree
> :-)

You haven't been answering many newbie questions lately, have you? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 16:06:28 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:06:28 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 17:15:27 EST."
             <20010107171527.A5093@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>  
            <20010107171527.A5093@thyrsus.com> 
Message-ID: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>

> > So, I see no reason why the logic in your program couldn't take care
> > of this, which in general is a preferred way to solve a problem than
> > to change the language.
> 
> OK, two objections, one practical and one (more important) esthetic:
> 
> Practical: I guess I oversimplified the code for expository purposes.
> What's actually going on is that I have two parser classes both based
> on shlex -- they do character-at-a-time input and don't actually
> *have* accessible line buffers.

And what's wrong with always starting the second parser?  If the
stream was at EOF it will simply process zero lines.  Or does your
parser have a problem with empty input?

> Esthetic: Yes, I can have the first parser set a flag, or return some
> EOF token.  But this seems deeply wrong to me, because EOFness is not
> a property of the parser but of the underlying stream object.  It
> seems to me that my program ought to be able to ask the stream object
> whether it's at EOF rather than carrying its own flag for that state.

Eric, before we go further, can you give an exact definition of
EOFness to me?

> In Python as it is, there's no clean way to do this.  I'd have to do a
> nonzero-length read to test it (I failed to check the right alternate
> case before when I tried zero-length).  That's really broken.  What if
> neither the underlying stream nor the parser supports pushback?
> 
> Do you see now why I think this is a more general issue?

No.  What's wrong with just setting the parser loose on the input and
letting it deal with EOF?  In your example, apparently a line
containing the word "history" signals that the rest of the file must
be parsed by the second parser.  What if "history" is the last line of
the file?  The eof() test can't tell you *that*!
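
A minimal sketch of that approach, with two made-up parser functions standing
in for the shlex-based classes (whose real interfaces are not shown in this
thread). The second parser is always started; on an exhausted stream it simply
processes zero lines, so no eof() query is needed.

```python
import io

def parse_header(stream):
    """Consume lines up to and including a line reading 'history'."""
    lines = []
    for line in iter(stream.readline, ""):
        if line.strip() == "history":
            break
        lines.append(line.rstrip("\n"))
    return lines

def parse_history(stream):
    """Consume whatever is left; an exhausted stream yields no lines."""
    return [line.rstrip("\n") for line in iter(stream.readline, "")]

results = []
for text in ("a\nb\nhistory\nc\nd\n", "a\nb\nhistory\n", "a\nb\n"):
    stream = io.StringIO(text)
    results.append((parse_header(stream), parse_history(stream)))

for header, history in results:
    print(header, history)
```

The second input is the case raised above: "history" is the last line, and the
second parser returns an empty list without any special handling.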

> Now, another and more general way to handle this would be to make an
> equivalent of the old FIONREAD ioctl part of Python's standard set of
> file object methods -- a way to ask "how many bytes are ready to be
> read in this stream?"

There's no portable way to do that.

> Trivial to make it work for plain files, of course.  Harder to make it   
> work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
> results of the fstat.size field (corrected for the current seek address
> if you're reading a plain file) would be a good start.

This seems totally the wrong level to solve your problem.
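
For reference, a sketch of the plain-file case from the quoted suggestion,
using only os.fstat() and tell(). The bytes_remaining() helper is hypothetical,
and the result is meaningful only for regular files opened in binary mode;
pipes, sockets and terminals have no useful size, which is the portability
problem noted above.

```python
import os
import tempfile

def bytes_remaining(binary_file):
    """Bytes between the current position and the end of a regular file."""
    size = os.fstat(binary_file.fileno()).st_size
    return size - binary_file.tell()

with tempfile.TemporaryFile() as handle:
    handle.write(b"0123456789")
    handle.flush()            # make sure fstat sees all ten bytes
    handle.seek(0)
    handle.read(4)
    remaining = bytes_remaining(handle)

print(remaining)   # 6
```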

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Tue Jan  9 00:13:21 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 01:13:21 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
            <20010108002603.X17220@lyra.org>
Message-ID: <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido at python.org> wrote:
> > hehe... adding __exports__ to your module is fine. Adding more crud to
> > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > 
> > Somebody wants to use "from foo import *" on a module not designed for it?
> > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > caused by "from foo import *", then I think you're barking up the wrong tree
> > :-)
> 
> You haven't been answering many newbie questions lately, have you? :-)

Well, I have. 
And frankly, I think having "from foo import *" issue a warning at 2.1
is a *much* better solution.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan  8 16:15:20 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Tue, 09 Jan 2001 01:13:21 +0200."
             <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> 
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>  
            <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> 
Message-ID: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>

[Greg]
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > > 
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)

[Guido]
> > You haven't been answering many newbie questions lately, have you? :-)

[Moshe]
> Well, I have. 
> And frankly, I think having "from foo import *" issue a warning at 2.1
> is a *much* better solution.

(1) For what problem?

(2) Under exactly what circumstances do you want from foo import *
    to issue a warning?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan  8 16:26:21 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:26:21 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>
Message-ID: <3A59DC1D.29DE500B@lemburg.com>

"Martin v. Loewis" wrote:
> 
> Authors of extension packages often find the need to auto-import some
> of their modules. This is often needed for registration, e.g. a codec
> author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
> may need to register a search function with codecs.register. This is
> currently only possible by writing into sitecustomize.py, which must
> be done by the system administrator manually.
> 
> To enhance the service of site.py, I've written the patch
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470
> 
> which treats lines in PTH files which start with "import" as
> statements and executes them, instead of appending these lines to
> sys.path.
> 
> The patch is relatively small, but since it is an extension: Do I need
> to write a PEP for it?

Just curious: wouldn't this introduce a /tmp-style problem to
Python ?

The scenario is quite simple: a Python script runs under root.
The script could pick up a lingering .pth file (e.g. from /tmp
or one of its subdirs -- distutils does this !) and then executes
arbitrary code as *root*.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at interet.com  Mon Jan  8 16:43:05 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 10:43:05 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <3A59E009.96922CA5@interet.com>

There are a number of problems which frequently recur on c.l.p
that can serve as a source of Python improvement ideas.
On December 30, 2000 gerson.kurz at t-online.de (Gerson Kurz) writes:

   If I embed Python in a Win32 console application (using
   Demo\embed.c), everything works fine. If I take the very same piece of
   code and put it in a Win32 Windows application (not MFC, just a plain
   WinMain()) I see no output (and more importantly so, no errors),
   because the application does not have a stdout/stderr set up.

This is well known.  Windows developers must replace sys.stdout and
sys.stderr with alternative mechanisms.  Unfortunately this solution
does not completely work because errors can occur before sys.stdout
is replaced.  I propose patching pythonw.exe (WinMain.c) and adding
a new module to fix this so it Just Works.  The patch is completely
Windows specific.  I am not sure if this constitutes a PEP, but would
like everyone's feedback anyway.

Design Requirements

1) "pythonw.exe myfile.py" will give the usual error message if
   myfile.py does not exist.

2) "pythonw.exe myfile.py" will give the usual traceback for a
   syntax error in myfile.py.

3) python.exe will provide a useful C-language stdout/stderr so
   the user does not have to replace sys.stdout/err herself.

4) None of the above will interfere with the user's replacement
   of sys.stdout/err for her own purposes.

Description of Patch

A new module winstdoutmodule.c (138 lines) is included in Windows
builds. It contains a C entry point PyWin_StdoutReplace() which
creates a valid C stdout/err, and code to display output
in a popup dialog box.  There is a Python entry point
winstdout.print() to display output, but it is only used
for special purposes, and the typical user will never import
winstdout.

The file WinMain.c calls PyWin_StdoutReplace() before it
calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This
is meant to display startup error messages.  Normally,
any available output is displayed when the system is idle.

Technical Details

Some experimentation (as opposed to documentation) shows that
Win32 programs have a valid FILE * stdout, but fileno(stdout)
gives INVALID_HANDLE_VALUE; the FILE * has an invalid OS file
object.  It is tempting to hack the FILE structure directly.
But it is more prudent to use the only documented way to
replace stdout, namely the standard call "freopen()" (also
available on Unix).  The design uses this call to open a
temporary file to append stdout and stderr output.  To
display output, the file is checked when the system is
idle, and MessageBox() is called with the file contents if any.
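
As a rough Python-level analogue of the mechanism described above (not
the C patch itself), the sketch below sends sys.stdout and sys.stderr to
a temporary file and later collects whatever was written.  The class
OutputCapture and its method names are hypothetical; the patch itself is
described as using freopen() in C and displaying the text with
MessageBox().  The code is written for a current Python 3.

```python
import sys
import tempfile


class OutputCapture:
    """Collect stdout/stderr text in a temporary file (illustrative only)."""

    def __init__(self):
        # One shared file, opened for both writing and reading.
        self._file = tempfile.TemporaryFile(mode="w+")
        self._saved = (sys.stdout, sys.stderr)
        self._read_pos = 0

    def install(self):
        """Point both standard streams at the temporary file."""
        sys.stdout = sys.stderr = self._file

    def restore(self):
        """Put back the streams that were active when the object was made."""
        sys.stdout, sys.stderr = self._saved

    def drain(self):
        """Return text written since the last drain (empty string if none)."""
        self._file.flush()
        self._file.seek(self._read_pos)
        text = self._file.read()
        self._read_pos = self._file.tell()
        self._file.seek(0, 2)  # back to the end so later writes append
        return text


capture = OutputCapture()
capture.install()
try:
    print("hello")
    sys.stderr.write("oops\n")
finally:
    capture.restore()
first = capture.drain()   # what a GUI would show in a dialog when idle
second = capture.drain()  # nothing new has been written since
```

A GUI program would call drain() from its idle handler and pop up a
dialog only when the returned string is non-empty.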

Status

After a few false starts, I now have working code.

Is this a good idea?  If so, is the implementation optimal
(comments from MarkH especially welcome)?

JimA



From mal at lemburg.com  Mon Jan  8 16:52:32 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:52:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
	            <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59E240.7F77790E@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido at python.org> wrote:
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > >
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)
> >
> > You haven't been answering many newbie questions lately, have you? :-)
> 
> Well, I have.
> And frankly, I think having "from foo import *" issue a warning at 2.1 is
> a *much* better solution.

Why raise a warning ? "from xyz import *" is still very useful in
interactive sessions and also has some merits when it comes to
importing all subpackages of a package (well, at least those listed
in __all__).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at digicool.com  Mon Jan  8 16:54:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 8 Jan 2001 10:54:10 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <14937.58018.792925.31985@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> it would *totally* stop any hope in the future of module level
    MZ> __getattr__ (or at least complicate the semantics).  I think
    MZ> Alex M. is thinking of a PEP, but he's taking his time, since
    MZ> no PEPs can be considered until 2.1 is out.

Given the current discussion, I'm now -1 on __exports__ unless a PEP
is written.  I think enough issues and interactions have been brought
up that a PEP is warranted first.

-Barry




From moshez at zadka.site.co.il  Tue Jan  9 01:03:00 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 02:03:00 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>  
            <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 10:15:20 -0500, Guido van Rossum <guido at python.org> wrote:

> (1) For what problem?

Users seeing things they didn't expect in their modules.

> (2) Under exactly what circumstances do you want from foo import *
>     to issue a warning?

All.
If you want to be less extreme, don't warn if the module defines
a __from_star_ok__
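
A minimal sketch of that rule, written as an ordinary function rather
than as a change to the import machinery.  The helper star_import is
hypothetical, and __from_star_ok__ is only the name floated in this
message, not an existing Python feature.

```python
import types
import warnings


def star_import(module, namespace):
    """Emulate 'from module import *', warning unless the module opted in."""
    if not getattr(module, "__from_star_ok__", False):
        warnings.warn(
            "from %s import * used on a module that does not define "
            "__from_star_ok__" % module.__name__,
            stacklevel=2,
        )
    # Same default as the real statement: skip names with a leading underscore.
    for name, value in vars(module).items():
        if not name.startswith("_"):
            namespace[name] = value


# A throwaway module object stands in for "foo".
foo = types.ModuleType("foo")
foo.public = 1
foo._private = 2

target = {}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    star_import(foo, target)
warned_without_flag = len(caught)

foo.__from_star_ok__ = True
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    star_import(foo, {})
warned_with_flag = len(caught)
```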

But in any case, I'm done with this thread. We probably won't
manage to convince each other.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan  8 17:04:58 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 11:04:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 10:54:10 EST."
             <14937.58018.792925.31985@anthem.wooz.org> 
References: <20010107232532.V17220@lyra.org> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>  
            <14937.58018.792925.31985@anthem.wooz.org> 
Message-ID: <200101081604.LAA04464@cj20424-a.reston1.va.home.com>

> Given the current discussion, I'm now -1 on __exports__ unless a PEP
> is written.  I think enough issues and interactions have been brought
> up that a PEP is warranted first.

I have to agree.  I am no longer championing this patch.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Mon Jan  8 17:27:17 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 8 Jan 2001 10:27:17 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
	<Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
Message-ID: <14937.60005.951163.80255@beluga.mojam.com>

    Ping> Sorry to bother you if you're busy -- i haven't seen any responses
    Ping> about inspect.py for a few days and wanted to know what your
    Ping> reactions were.

Fiddling code bits is not the sort of stuff I do very often, but every time
I do I wind up having to reacquaint myself with all sorts of object details
that slip out of my brain shortly after the latest need is gone.  Having a
module that hides the details seems like a good idea to me.

+1.  I vote it go into 2.1 assuming a bit for the library reference can be
written in time.

Skip



From akuchlin at mems-exchange.org  Mon Jan  8 17:31:09 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:31:09 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108113109.C7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
>I expect Andrew's code to go in before 2.1 is released.  So I don't
>see a reason why we should hurry and check in a stop-gap measure.

But it might not; the final version might be unacceptable or run into
some intractable problem.  Assuming the patch is correct (I haven't
looked at it), why not check it in?  The work has already been done to
write it, after all.

--amk




From akuchlin at mems-exchange.org  Mon Jan  8 17:41:10 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:41:10 -0500
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
In-Reply-To: <3A59A918.E0D02E0D@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Mon, Jan 08, 2001 at 07:48:40PM +0800
References: <3A59A918.E0D02E0D@per.dem.csiro.au>
Message-ID: <20010108114110.D7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 07:48:40PM +0800, Mark Favas wrote:
>I last successfully downloaded from CVS, compiled, linked and tested on
>Dec. 22 last year. For the last week or so, the current CVS
>_cursesmodule.c gives a bunch of compiler warning messages of the form:

Hmm... on Dec. 22 there was a sizable change to export a C API from
the module; since then there's only been one minor change.  Perhaps
the last version you compiled successfully was from before I checked
in those changes.  In any case, I'll look into it as soon as my Compaq
test drive account is usable and I have access to a Tru64 4.0
machine again.  Thanks for the report!

Once the PEP 229 changes go in, many more modules will be tried on
many more platforms.  It might be worth considering setting up a
Tinderbox for Python, or at least doing a systematic test on several
platforms before releases.

--amk




From paulp at ActiveState.com  Mon Jan  8 17:46:47 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:46:47 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <3A59EEF7.BB4118BD@ActiveState.com>

Tim Peters wrote:
> 
> ... It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

I agree strongly. I think that Python people are careless about what
their module dictionaries look like. My two main annoyances are modules
that export other modules randomly and modules that export huge wacks of
constants.

> Indeed.  But I'm suggesting that the problems that *can* arise from
> "import*" illustrate the fundamental silliness of exporting things by
> accident.  It's come up much more often for me when I'm looking over
> someone's shoulder, teaching them how to use dir() in an interactive shell
> to answer their own damn questions <0.5 wink>.  It's usually the case that
> dir(M) shows them something that isn't documented, and over time I am *not*
> pleased that "oh, I guess the 'string' in there is just crap" is how they
> learn to view it.

Screw dir()! Let's talk about important stuff: Komodo. And Idle. And
WingIDE. And PythonWorks and PythonWin. :)

How are class browsers and "intellisense prompters" supposed to know
that it "makes sense" to prompt the user with os.path but not
CGIHTTPServer.os.path?

Overall, I think Tim is right. We are all adults here and part of being
adults is keeping your privates private and your nose clean.

 Paul Prescod



From paulp at ActiveState.com  Mon Jan  8 17:47:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:47:39 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59EF2B.792801E5@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> Let me "me too" here:
> Put another way, what Greg said is just a rephrase of "don't use from
> foo import * unless foo's docos say it's OK". 

That's not the issue. It's not about keeping people out of your module.
In fact I would propose that mod.__dict__ should be as loose as ever.

It's a user interface issue. If we encourage people to learn about
modules in interactive environments like the prompt using dir(), class
browsers and IDEs then we need to create modules that are friendly for
those users. I think that the current situation is pretty bad that way.
Why does CGIHTTPServer export BaseHTTPServer? And why is
CGIHTTPServer.CGIHTTPServer a class but CGIHTTPServer.BaseHTTPServer is
a module?

We go to great lengths to make the syntax newbie friendly. I think that
we should make similar efforts in a cleanly reflective class library.

> Add to that the simple
> access control of a leading underscore, and I don't see any place
> which needs it.
> 
> Something better to do would be to use
> import foo as _foo

It's pretty clear that nobody does this now and nobody is going to start
doing it in the near future. It's too invasive and it makes the code too
ugly. Why obfuscate thousands of lines of code when a simple feature can
mitigate that?

>...
> One more point against __exports__, which I hoped I would not have to
> make (but when I'm up against the timbot *and* Guido, I need to pull
> out the heavy artillery): it would *totally* stop any hope in the
> future of module level __getattr__ (or at least complicate the semantics).
> I think Alex M. is thinking of a PEP, but he's taking his time, since
> no PEPs can be considered until 2.1 is out.

__exports__ would merely be considered an implementation detail of the
"default __getattr__". Custom __getattr__'s could decide whether to
respect it or not. It doesn't complicate anything much.

 Paul Prescod



From nas at arctrix.com  Mon Jan  8 10:54:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 01:54:55 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>; from jim@interet.com on Mon, Jan 08, 2001 at 10:43:05AM -0500
References: <3A59E009.96922CA5@interet.com>
Message-ID: <20010108015455.A15138@glacier.fnational.com>

On Mon, Jan 08, 2001 at 10:43:05AM -0500, James C. Ahlstrom wrote:
> Is this a good idea?  If so, is the implementation optimal
> (comments from MarkH especially welcome)?

The general idea sounds good to me.  Having tracebacks go nowhere
when running pythonw is un-Python-like.  I don't know enough
about MFC, etc. to comment on the specifics of your patch.

  Neil



From akuchlin at mems-exchange.org  Mon Jan  8 17:49:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:49:13 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EEF7.BB4118BD@ActiveState.com>; from paulp@ActiveState.com on Mon, Jan 08, 2001 at 08:46:47AM -0800
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com>
Message-ID: <20010108114913.E7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 08:46:47AM -0800, Paul Prescod wrote:
>How are class browsers and "intellisense prompters" supposed to know
>that it "makes sense" to prompt the user with os.path but not
>CGIHTTPServer.os.path?

Could we then simply adopt __exports__ as a convention for such
browsers, but with no changes to core Python to support it?  Browsers
would then follow the algorithm "Use __exports__ if present, dir() if
not."  

--amk



From paulp at ActiveState.com  Mon Jan  8 17:51:26 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:51:26 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A59F00E.53A0A32A@ActiveState.com>

Tim Peters wrote:
> 
> ....
> 
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

If you can create a sample program that demonstrates the unsafety I'll
anonymously submit it as a bug on our internal system and ensure that
the next version of Perl is as slow as Python. :)

Seriously: If someone comes at me with
Perl-IO-is-way-faster-than-Python-IO, I'd like to know what concretely
they've given up in order to achieve that performance. And even just
for my own interest I'd like to understand the cost/benefit of
stream thread safety. For instance would it make sense to just write
a thread-safe wrapper for streams used from multiple threads?
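
One possible shape for such a wrapper, as a sketch only: a lock is taken
around each read or write, so the cost of thread safety is paid only by
streams that are actually shared.  The class name LockedStream is
hypothetical, and the code is written for a current Python 3.

```python
import io
import threading


class LockedStream:
    """Serialize write() and readline() calls on an underlying stream."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def write(self, data):
        with self._lock:
            return self._stream.write(data)

    def readline(self):
        with self._lock:
            return self._stream.readline()

    def __getattr__(self, name):
        # Anything not wrapped above is passed through without locking.
        return getattr(self._stream, name)


buffer = io.StringIO()
shared = LockedStream(buffer)


def worker(tag):
    for i in range(100):
        shared.write("%s %d\n" % (tag, i))


threads = [threading.Thread(target=worker, args=(tag,)) for tag in "abcd"]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
line_count = len(buffer.getvalue().splitlines())
```

Streams used from a single thread would skip the wrapper and pay no
locking overhead at all.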

 Paul Prescod



From paulp at ActiveState.com  Mon Jan  8 18:01:49 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 09:01:49 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <3A59F27D.C27B8CD0@ActiveState.com>

Andrew Kuchling wrote:
> 
> ...
> 
> Could we then simply adopt __exports__ as a convention for such
> browsers, but with no changes to core Python to support it?  Browsers
> would then follow the algorithm "Use __exports__ if present, dir() if
> not."

dir() is one of the "interactive tools" I'd like to work better in the
presence of __exports__. On the other hand, dir() works pretty poorly
for object instances today so maybe we need something new anyhow. 
Perhaps attrs()? 

If there were an "attrs()" and it basically returned __exports__ if it
existed and dir() if it didn't, then I would buy it. Graphical apps
would just build on attrs().
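
The fallback rule can be sketched in a few lines.  Both attrs() and
__exports__ are proposals from this thread, not existing Python
features, so everything below is illustrative.

```python
import types


def attrs(obj):
    """Return obj.__exports__ if it exists, otherwise fall back to dir(obj)."""
    exports = getattr(obj, "__exports__", None)
    if exports is not None:
        return sorted(exports)
    return dir(obj)


# A throwaway module that imports a helper but only means to export 'serve'.
demo = types.ModuleType("demo")
demo.os = object()      # stands in for an accidentally visible import
demo.serve = lambda: None

before = attrs(demo)    # no __exports__ yet: same as dir(demo)
demo.__exports__ = ["serve"]
after = attrs(demo)     # only the declared name
```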

 Paul



From MarkH at ActiveState.com  Mon Jan  8 18:04:31 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 09:04:31 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>

> Is this a good idea?  If so, is the implementation optimal

I'm really on the fence here.  Note however that your solution does not solve
the original problem.  Eg, your example is:

> On December 30, 2000 gerson.kurz at t-online.de (Gerson Kurz) writes:
>
>    If I embed Python in a Win32 console application (using
>    Demo\embed.c), everything works fine. If I take the very same piece

But your solution involves:

> The file WinMain.c calls PyWin_StdoutReplace() before it
> calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This

Note that the original problem was _embedding_ Python - thus, you need to
patch _their_ WinMain to make it work for them - something you can't do.

Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I
am not convinced they would - it is almost certain they will still need to
redirect output to somewhere useful, so why bother redirecting it
temporarily just to redirect it for real immediately after?

Finally, I am slightly concerned about the possibility of "hanging" certain
programs. For example, I believe that DCOM will often invoke a COM server in
a different "desktop" than the user (this is also true for Services, but
Python services don't use pythonw.exe).  Thus, a Python program may end up
hanging with a dialog box, but in the context where no user is able to see
it.  However, this could be addressed by adding a command-line option to
prevent this new behaviour kicking in.

I would prefer to see a decent API for extracting error and traceback
information from Python.  On the other hand, I _do_ see the problem for
"newbies" trying to use pythonw.exe.
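
At the Python level, the traceback module already lets a program turn
the current exception into text and hand it to whatever display it
likes.  The sketch below shows that route; run_and_report and the
report callback are hypothetical names, and this is not the C-level
API wished for above.

```python
import traceback


def run_and_report(func, report):
    """Call func(); on failure pass the formatted traceback to report()."""
    try:
        func()
    except Exception:
        report(traceback.format_exc())
        return False
    return True


messages = []  # a GUI would show a dialog instead of appending to a list
ok = run_and_report(lambda: 1 / 0, messages.append)
```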

So - I guess I am saying that I don't see this as optimal, and it doesn't
solve the original problem you pointed at - but in the interests of making
pythonw.exe seem "less broken" for newbies, I could live with this as long
as I could prevent it when necessary.

Another option would be to use the Win32 Console APIs, and simply attempt to
create a console for the error message.  Eg, maybe PyErr_Print() could be
changed to check for the existence of a console, and if not found, create
it.  However, the problem with this approach is that the error message will
often be printed just as the process is terminating - meaning you will see a
new console with the error message for about 0.025 of a second before it
vanishes due to process termination.  Any sort of "press any key to
terminate" option then leaves us in the same position - if no user can see
the message, the process appears hung.

Mark.




From andreas at andreas-jung.com  Mon Jan  8 18:06:16 2001
From: andreas at andreas-jung.com (Andreas Jung)
Date: Mon, 8 Jan 2001 18:06:16 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A58EFC3.5A722FF0@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 12:37:55AM +0200
References: <3A58EFC3.5A722FF0@tismer.com>
Message-ID: <20010108180616.A18993@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> Dear community,
> 
> I'm happy to announce that
> 
> 		Stackless Python 2.0
> 
> is finally ready and available for download.
> 
> Stackless Python for Python 1.5.2+ also got some minor
> enhancements. Both versions are available as Win32
> installer files here:

Are there patches available against the standard Python 2.0 
source code tree ?

Andreas 



From tismer at tismer.com  Mon Jan  8 17:15:55 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 18:15:55 +0200
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de>
Message-ID: <3A59E7BB.6908B7E2@tismer.com>


Andreas Jung wrote:
> 
> On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> > Dear community,
> >
> > I'm happy to announce that
> >
> >               Stackless Python 2.0
> >
> > is finally ready and available for download.
> >
> > Stackless Python for Python 1.5.2+ also got some minor
> > enhancements. Both versions are available as Win32
> > installer files here:
> 
> Are there patches available against the standard Python 2.0
> source code tree ?

I had no time yet to put the source trees on the web.
Should happen in one or two days.
Then I will probably not provide patches, hoping that
some other Unix people will catch up and provide that
part. This worked the same for the 1.5.2 version.

The 2.0 port consists of 10 or so files, which can be used
as direct replacements for the same files in the 2.0 distro.
I think on Unix this is the right way to go.
For me it is simpler to have my own little tree, since I'm
working with Windows, and I just have to modify my VC++
project file.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at zadka.site.co.il  Tue Jan  9 02:30:09 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 03:30:09 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
References: <3A59F27D.C27B8CD0@ActiveState.com>, <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <20010109013009.37D6DA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:01:49 -0800, Paul Prescod <paulp at ActiveState.com> wrote:

> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 
> Perhaps attrs()? 
> 
> If there were an "attrs()" and it basically returned __exports__ if it
> existed and dir() if it didn't, then I would buy it. Graphical apps
> would just build on attrs().

Even better, __exports__ could be what was imported in 
from foo import *.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From andreas at andreas-jung.com  Mon Jan  8 18:25:36 2001
From: andreas at andreas-jung.com (Andreas Jung)
Date: Mon, 8 Jan 2001 18:25:36 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A59E7BB.6908B7E2@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 06:15:55PM +0200
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de> <3A59E7BB.6908B7E2@tismer.com>
Message-ID: <20010108182536.A20361@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 06:15:55PM +0200, Christian Tismer wrote:
> 
> The 2.0 port consists of 10 or so files, which can be used
> as direct replacements for the same files in the 2.0 distro.
> I think on Unix this is the right way to go.
> For me it is simpler to have my own little tree, since I'm
> working with Windows, and I just have to modify my VC++
> project file.

I would prefer a tar.gz archive that contains just the modified files.
With this approach it is easily possible to extract the archive inside
the Python source tree.

Andreas



From loewis at informatik.hu-berlin.de  Mon Jan  8 18:51:28 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 18:51:28 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>

> Just curious: wouldn't this introduce a /tmp-style problem to
> Python ?

I tried, but I could not produce such a problem.

> The scenario is quite simple: a Python script runs under root.
> The script could pick up a lingering .pth file (e.g. from /tmp
> or one of its subdirs -- distutils does this !) and then executes
> arbitrary code as *root*.

No, Python looks only in a few places for pth files:
{<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}

so it won't pick up pth files in /tmp.
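
For readers following along, the behaviour of the patch under
discussion can be sketched like this: each line of a pth file is either
executed (if it starts with "import") or treated as a directory to
append to the path.  The function process_pth is a hypothetical
stand-in, not the code from the patch; skipping blank lines and
comments is an assumption, and the demo works on a private list rather
than the real sys.path.

```python
import os
import tempfile


def process_pth(pth_path, path_list, namespace):
    """Process one .pth file the way the patch is described in this thread."""
    sitedir = os.path.dirname(pth_path)
    with open(pth_path) as fh:
        for line in fh:
            line = line.rstrip()
            if not line or line.startswith("#"):
                continue  # assumption: blank lines and comments are skipped
            if line.startswith(("import ", "import\t")):
                exec(line, namespace)  # executed as a statement
                continue
            candidate = os.path.join(sitedir, line)
            # Only existing directories are added, and only once.
            if os.path.isdir(candidate) and candidate not in path_list:
                path_list.append(candidate)


# Demo in a throwaway directory, leaving the real sys.path untouched.
with tempfile.TemporaryDirectory() as sitedir:
    os.mkdir(os.path.join(sitedir, "extras"))
    pth = os.path.join(sitedir, "demo.pth")
    with open(pth, "w") as fh:
        fh.write("extras\nmissing_dir\nimport json\n")
    demo_path, demo_ns = [], {}
    process_pth(pth, demo_path, demo_ns)
    found_extras = demo_path == [os.path.join(sitedir, "extras")]
```

The security question then reduces to who can write a pth file into one
of the directories that are searched.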

Regards,
Martin



From esr at thyrsus.com  Mon Jan  8 19:01:37 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 13:01:37 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 10:06:28AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>
Message-ID: <20010108130137.E22834@thyrsus.com>

Guido van Rossum <guido at python.org>:
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

A file is at EOF when attempts to read more data from it will fail
returning no data.

> What's wrong with just setting the parser loose on the input and
> letting it deal with EOF?

Nothing wrong in theory, but it's a problem in practice.  I don't want
to import the second parser unless it's actually needed, because it's much
larger than the first one.

>                                In your example, apparently a line
> containing the word "history" signals that the rest of the file must
> be parsed by the second parser.  What if "history" is the last line of
> the file?  The eof() test can't tell you *that*!

Right.  That case never happens.  I mean it *really* never happens :-).

What we're talking about is a game system.  The first parser recognizes
a spec language for describing games of a particular class (variants of
Diplomacy, if that's meaningful to you).  The system keeps logfiles which
consist of a section in the game description language, optionally
followed by the token "history" and an order log.

The parser for the order log language is a *lot* larger than the one
for the description language.  This is why I said I don't want the
first parser to just call the second.  I want to test for EOF to
know whether I have to import the second parser at all!

Here's the beginning of my problem: the first parser can't export a line
buffer, because it doesn't *have* a line buffer.  It's a subclass of
shlex and does single-character reads.

There are two ways I can cope with this.  One is to do a (nonzero)
length read after the first parser exits; the other is to have the
first parser set a state flag controlling whether the second parser
loads.

This is where it bites that I can't test for EOF with a read(0). The
second shlex parser only has token-level pushback!  If I do a
nonzero-length read and I get data, I'm screwed.  On the other hand
(as I said before) setting a lexer state flag seems wrong, because
EOFness is a property of the underlying stream rather than the parser.
I'd be duplicating state that exists in the stdio stream structure
anyway; it ought to be accessible.

> > Now, another and more general way to handle this would be to make an
> > equivalent of the old FIONREAD ioctl part of Python's standard set of
> > file object methods -- a way to ask "how many bytes are ready to be
> > read in this stream?  
> 
> There's no portable way to do that.

Actually, fstat(2) is portable enough to support a very useful
approximation of FIONREAD.  I know, because I tried it.

Last night I coded up a "waiting" method for file objects that calls
fstat(2) on the associated file descriptor.  For a plain file, it
then subtracts the result of ftell() from the fstat size field and
returns that -- for other files, it simply returns the size field.
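
A minimal sketch of such a method, written as a free function in modern
Python rather than as the C patch described here.  The name waiting() comes
from the text above; everything else is an assumption made for illustration.
Only the plain-file branch is exercised; what the size field means for other
file types depends on the platform.

```python
import os
import stat
import tempfile


def waiting(f):
    """Approximate how many bytes remain to be read from file object f."""
    st = os.fstat(f.fileno())
    if stat.S_ISREG(st.st_mode):
        # Plain file: total size minus the current read position.
        return st.st_size - f.tell()
    # Other file types: report the raw size field; whether it is
    # meaningful depends on the platform.
    return st.st_size


with tempfile.TemporaryFile() as f:      # binary mode by default
    f.write(b"abcdef")
    f.flush()                            # make sure fstat sees all six bytes
    f.seek(0)
    before = waiting(f)                  # whole file still unread
    f.read(4)
    after = waiting(f)                   # two bytes left
    f.read()
    at_end = waiting(f)                  # nothing left
```

The file is opened in binary mode on purpose: in text mode the value of
f.tell() is not guaranteed to be a byte offset.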

I then tested this on plain files, FIFOs, and sockets under Linux. It
turns out fstat(2) gives useful information in all three cases (a
count of characters waiting in the buffer in the latter two).  I expected
this; it should be true under all current Unixes.

fstat(2) does not give useful size-field results for Linux block
devices.  I didn't test the character (terminal) devices.  (I
documented my results in Python's Doc/lib/stat.tex, in a patch I have
already submitted to SourceForge.)

I would be quite surprised if the plain-file case didn't work on Mac
and Windows.  I would be a little surprised if the socket case failed,
because all three probably inherited fstat(2) from the ancestral BSD
TCP/IP stack.

Just having the plain-file case work would, IMHO, be justification
enough for this method.  If it turns out to be portable across Mac and
Windows sockets as well, *huge* win.  Could this be tested by someone
with access to Windows and Mac systems?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

An armed society is a polite society.  Manners are good when one 
may have to back up his acts with his life.
        -- Robert A. Heinlein, "Beyond This Horizon", 1942




From mal at lemburg.com  Mon Jan  8 19:10:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:10:50 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <3A5A02AA.675A35D1@lemburg.com>

Martin von Loewis wrote:
> 
> > Just curious: wouldn't this introduce a /tmp-style problem to
> > Python ?
> 
> I tried, but I could not produce such a problem.
> 
> > The scenario is quite simple: a Python script runs under root.
> > The script could pick up a lingering .pth file (e.g. from /tmp
> > or one of its subdirs -- distutils does this !) and then executes
> > arbitrary code as *root*.
> 
> No, Python looks only in a few places for pth file:
> {<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}
> 
> so it won't pick up pth files in /tmp.

Hmm, but what if the Python script picks up a site.py which is
different from the standard one distributed with Python ?

The code adding (and with the patch: executing) the .pth files
is defined in site.py and it is rather easy to override this
file by adding a modified site.py file to the current working dir...
a potential security hole in its own right, I guess :(

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Jan  8 19:30:34 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:30:34 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Mon, 08 Jan 2001 13:01:37 EST."
             <20010108130137.E22834@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>  
            <20010108130137.E22834@thyrsus.com> 
Message-ID: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>

Eric, take a hint.  You're not going to get your eof() method no
matter what arguments you bring up.  But I'll explain it to you again
anyway... :-)

> Guido van Rossum <guido at python.org>:
> > Eric, before we go further, can you give an exact definition of
> > EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

I was afraid you would say this.  That's not a condition that's easy
to calculate without doing I/O, *and* that's not the condition that
you are interested in for your problem.  According to your definition,
f.eof() should be true in this example:

    f = open("/etc/passwd")
    f.seek(0, 2)                 # Seek to end of file
    print f.eof()                # What will this print???
    print `f.readline()`         # Will print ''

But getting the right result here requires a lot of knowledge about
how the file is implemented!  While you've explained how this can be
implemented on Unix, it can't be implemented with just the tools that
stdio gives us.  Going beyond stdio in order to implement a feature is
a grave decision.  After all, Python is portable to many
less-than-mainstream operating systems (VxWorks, OS/9, VMS...).  Now,
if this was just a speed hack (like xreadlines) I could accept having
some platform-dependent code, if at least there was a portable way to
do it that was just a bit slower.  But here you can't convince me that
this can be done in a portable way, and I don't want to force porters
to figure out how to do this for their platform before their port can
work.  I also don't want to make f.eof() a non-portable feature: *if*
it is provided, it's too important for that.

Note that stdio's feof() doesn't have this definition!  It is set when
the last *read* (or getc(), etc.) stumbled upon an EOF condition.
That's also of limited value; it's mostly defined so you can
distinguish between errors and EOF when you get a short read.  The
stdio feof() flag would be false in the above example.

> > What's wrong with just setting the parser loose on the input and
> > letting it deal with EOF?
> 
> Nothing wrong in theory, but it's a problem in practice.  I don't want
> to import the second parser unless it's actually needed, because it's much
> larger than the first one.

So be practical and let the first parser set a global flag that tells
you whether it's necessary to load the second one.

> >                                In your example, apparently a line
> > containing the word "history" signals that the rest of the file must
> > be parsed by the second parser.  What if "history" is the last line of
> > the file?  The eof() test can't tell you *that*!
> 
> Right.  That case never happens.  I mean it *really* never happens :-).
> 
> What we're talking about is a game system.  The first parser recognizes
> a spec language for describing games of a particular class (variants of
> Diplomacy, if that's meaningful to you).  The system keeps logfiles which
> consist of a section in the game description language, optionally
> followed by the token "history" and an order log.
> 
> The parser for the order log language is a *lot* larger than the one
> for the description language.  This is why I said I don't want the
> first parser to just call the second.  I want to test for EOF to
> know whether I have to import the second parser at all!
> 
> Here's the beginning of my problem: the first parser can't export a line
> buffer, because it doesn't *have* a line buffer.  It's a subclass of
> shlex and does single-character reads.
> 
> There are two ways I can cope with this.  One is to do a (nonzero)
> length read after the first parser exits; the other is to have the
> first parser set a state flag controlling whether the second parser
> loads.

Do the latter.  Nothing wrong with it that I can see.
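
A rough sketch of the flag approach in modern Python, assuming a shlex
subclass along the lines Eric describes.  The class name DescriptionLexer,
the saw_history attribute, and parse_log() are hypothetical names invented
for this illustration; the real parsers are not shown anywhere in this
thread.

```python
import io
import shlex


class DescriptionLexer(shlex.shlex):
    """Stand-in for the first parser: reads tokens up to "history"."""

    def __init__(self, stream):
        super().__init__(stream)
        self.saw_history = False         # the state flag

    def read_description(self):
        tokens = []
        while True:
            tok = self.get_token()
            if tok == self.eof:          # shlex signals end of input this way
                break
            if tok == "history":
                self.saw_history = True
                break
            tokens.append(tok)
        return tokens


def parse_log(stream):
    lexer = DescriptionLexer(stream)
    description = lexer.read_description()
    orders = None
    if lexer.saw_history:
        # The large second parser would be imported here, and only here.
        # This sketch just hands back the unparsed remainder.
        orders = stream.read()
    return description, orders


with_log = parse_log(io.StringIO("players 7\nhistory\nA Lon-Par\n"))
without_log = parse_log(io.StringIO("players 7\n"))
```

No probe for EOF is needed: the flag records the one fact the caller cares
about, namely whether the "history" token was seen.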

> This is where it bites that I can't test for EOF with a read(0).

And can you tell me a system where you *can* test for EOF with a
read(0)?  I've never heard of such a thing.  The Unix read() system
call has the same properties as Python's f.read().  I'm pretty sure
that fread() with a zero count also doesn't give you the information
you're after.

> The
> second shlex parser only has token-level pushback!  If I do a
> nonzero-length read and I get data, I'm screwed.  On the other hand
> (as I said before) setting a lexer state flag seems wrong, because
> EOFness is a property of the underlying stream rather than the parser.
> I'd be duplicating state that exists in the stdio stream structure
> anyway; it ought to be accessible.

Bullshit.  The EOFness that you're after (according to your own
definition) is not the same as the EOFness of the stdio stream.  The
EOFness in the stdio stream could help you, but Python resets it -- so
that making it available wouldn't be as easy as you claim.  Anyway,
you seem to have a sufficiently vague idea of what "EOFness" means
that I don't think providing access to whatever low-level EOFness
condition might exist would do you much good.

> > > Now, another and more general way to handle this would be to make an
> > > equivalent of the old FIONREAD ioctl part of Python's standard set of
> > > file object methods -- a way to ask "how many bytes are ready to be
> > > read in this stream?  
> > 
> > There's no portable way to do that.
> 
> Actually, fstat(2) is portable enough to support a very useful
> approximation of FIONREAD.  I know, because I tried it.
> 
> Last night I coded up a "waiting" method for file objects that calls
> fstat(2) on the associated file descriptor.  For a plain file, it
> then subtracts the result of ftell() from the fstat size field and
> returns that -- for other files, it simply returns the size field.
> 
> I then tested this on plain files, FIFOs, and sockets under Linux. It
> turns out fstat(2) gives useful information in all three cases (a
> count of characters waiting in the buffer in the latter two).  I expected
> this; it should be true under all current Unixes.
> 
> fstat(2) does not give useful size-field results for Linux block
> devices.  I didn't test the character (terminal) devices.  (I
> documented my results in Python's Doc/lib/stat.tex, in a patch I have
> already submitted to SourceForge.)
> 
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.  I would be a little surprised if the socket case failed,
> because all three probably inherited fstat(2) from the ancestral BSD
> TCP/IP stack.
> 
> Just having the plain-file case work would, IMHO, be justification
> enough for this method.  If it turns out to be portable across Mac and
> Windows sockets as well, *huge* win.  Could this be tested by someone
> with access to Windows and Mac systems?

I don't see the huge win.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 19:33:26 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:33:26 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:10:50 +0100."
             <3A5A02AA.675A35D1@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>  
            <3A5A02AA.675A35D1@lemburg.com> 
Message-ID: <200101081833.NAA05325@cj20424-a.reston1.va.home.com>

Discussions based on Python running as root and picking up untrusted
code from $PYTHONPATH are pointless.  Of course this is a security
hole.  If root runs *any* Python script in a way that could pick up
even a single untrusted module, there's a security hole.  site.py or
*.pth files are just a special case of this, so I don't see why this
is used as an example.
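
To make the general case concrete, here is a small sketch in modern Python
of how any writable directory on sys.path lets its contents run as part of
the importing program.  The module name pathdemo_helper and the temporary
directory standing in for an untrusted location are assumptions chosen for
the demonstration.

```python
import importlib
import os
import sys
import tempfile

MODNAME = "pathdemo_helper"              # hypothetical module name

with tempfile.TemporaryDirectory() as untrusted_dir:
    # Someone drops a module into a directory that is on the search path.
    with open(os.path.join(untrusted_dir, MODNAME + ".py"), "w") as fh:
        fh.write("ORIGIN = 'untrusted directory'\n")

    sys.path.insert(0, untrusted_dir)
    try:
        importlib.invalidate_caches()
        mod = importlib.import_module(MODNAME)   # module body runs here
        origin = mod.ORIGIN
    finally:
        # Undo the changes so the demonstration leaves no trace.
        sys.path.remove(untrusted_dir)
        sys.modules.pop(MODNAME, None)
```

Nothing in this depends on site.py or .pth files; the import statement alone
is enough.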

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan  8 19:48:40 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 13:48:40 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGFIHAA.tim.one@home.com>

[Moshe]
> Something better to do would be to use
> import foo as _foo

[Paul]
> It's pretty clear that nobody does this now and nobody is going
> to start doing it in the near future. It's too invasive and it
> makes the code too ugly.

Actually, this function is one of my std utilities:

def _pvt_import(globs, modname, *items):
    """globs, modname, *items -> import into globs with leading "_".

    If *items is empty, set globs["_" + modname] to module modname.
    If *items is not empty, import each item similarly but don't
    import the module into globs.
    Leave names that already begin with an underscore as-is.

    # import math as _math
    >>> _pvt_import(globals(), "math")
    >>> round(_math.pi, 0)
    3.0

    # import math.sin as _sin and math.floor as _floor
    >>> _pvt_import(globals(), "math", "sin", "floor")
    >>> _floor(3.14)
    3.0
    """

    mod = __import__(modname, globals())
    if items:
        for name in items:
            xname = name
            if xname[0] != "_":
                xname = "_" + xname
            globs[xname] = getattr(mod, name)
    else:
        xname = modname
        if xname[0] != "_":
            xname = "_" + xname
        globs[xname] = mod

Note that it begins with an underscore because it's *meant* to be exported
<0.5 wink>.  That is, the module importing this does

    from utils import _pvt_import

because they don't already have _pvt_import to automate adding the
underscore, and without the underscore almost everyone would accidentally
export "pvt_import" in turn.  IOW,

    import M
    from N import M

not only import M, by default they usually export it too, but the latter is
rarely *intended*.  So, over the years, I've gone thru several phases of
naming objects I *intend* to export with a leading underscore.  That's the
only way to prevent later imports from exporting by accident.  I don't
believe I've distributed any code using _pvt_import, though, because it
fights against the language and expectations.  Metaprogramming against the
grain should be a private sin <0.9 wink>.
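
The accidental-export effect can be seen in a short sketch (modern Python).
The module name shapes and its contents are made up for illustration; the
filter mirrors what "from shapes import *" copies when the module defines no
__all__.

```python
import types

source = (
    "import math\n"
    "import os as _os\n"
    "def area(r):\n"
    "    return math.pi * r * r\n"
)

mod = types.ModuleType("shapes")         # hypothetical module
exec(source, mod.__dict__)

# Without __all__, a star-import copies every name that does not start
# with an underscore, imported modules included.
public = sorted(n for n in vars(mod) if not n.startswith("_"))
```

Here public comes out as ['area', 'math']: the plain "import math" is
re-exported, while the underscore-prefixed _os is not.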

_metaprogramming-ly y'rs  - tim




From mal at lemburg.com  Mon Jan  8 19:40:37 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:40:37 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>  
	            <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A09A5.D0DC33A1@lemburg.com>

Guido van Rossum wrote:
> 
> Discussions based on Python running as root and picking up untrusted
> code from $PYTHONPATH are pointless.  Of course this is a security
> hole.  If root runs *any* Python script in a way that could pick up
> even a single untrusted module, there's a security hole.  site.py or
> *.pth files are just a special case of this, so I don't see why this
> is used as an example.

Agreed; see my reply to Martin.

Still, wouldn't it be wise to add some logic to Python to prevent
importing untrusted modules, e.g. by making sys.path read-only and
disabling the import hook usage using a command line ? 

This would at least prevent the most obvious attacks. I wonder how
RedHat works around these problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at interet.com  Mon Jan  8 20:16:45 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 14:16:45 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>
Message-ID: <3A5A121D.FDD8C2C1@interet.com>

Mark Hammond wrote:

> Note that the original problem was _embedding_ Python - thus, you need to
> patch _their_ WinMain to make it work for them - something you can't do.

Correct, if they don't use pythonw.exe, but use a different
main program, the new stdout will not be installed.  But then
they must have their own main.c, and they can add the C call.
 
> Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I

Yes, the symbol PyWin_StdoutReplace() is public, and they
can call it.

> am not convinced they would - it is almost certain they will still need to
> redirect output to somewhere useful, so why bother redirecting it
> temporarily just to redirect it for real immediately after?

Redirecting it temporarily is valuable, because if the sys.stdout
replacement occurs in (for example) myprog.py, then "pythonw.exe myprog.py"
will fail to produce any error messages for a syntax error in myprog.py.

Also, I was hoping further sys.stdout redirection would be unnecessary.
 
> Finally, I am slightly concerned about the possibility of "hanging" certain
> programs. For example, I believe that DCOM will often invoke a COM server in
> a different "desktop" than the user (this is also true for Services, but
> Python services don't use pythonw.exe).  Thus, a Python program may end up
> hanging with a dialog box, but in the context where no user is able to see
> it.  However, this could be addressed by adding a command-line option to
> prevent this new behaviour kicking in.

Limiting the code to pythonw.exe instead of trying to install
it in python20.dll was supposed to prevent damage to the use
of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
I am assuming there is a screen.  The dialog box is started with
MessageBox() and a window handle of GetForegroundWindow().  So
there doesn't need to be an application window.  I have tested it
with GUI programs, and it also works when run from a console.

Having said that, you may be right that there is some way to
hang on a dialog box which can not be seen.  It depends on what
MessageBox() and GetForegroundWindow() actually do.  If it seems
that this patch has merit, I would be grateful if you would review
the code to look for issues of this type.
 
> I would prefer to see a decent API for extracting error and traceback
> information from Python.  On the other hand, I _do_ see the problem for
> "newbies" trying to use pythonw.exe.

There could be an API added to the winstdout module such as
  msg = winstdout.GetMessageText()
which would return saved text, control its display etc.
But then the problem remains of actually displaying the messages
especially in the context of tracebacks and errors.  And it is
probably easier to redirect sys.stdout so it does what you want
rather than use the API.

I do not view winstdout as a "newbie" feature, but rather a
generally useful C-language addition to Python.

> So - I guess I am saying that I don't see this as optimal, and it doesn't
> solve the original problem you pointed at - but in the interests of making
> pythonw.exe seem "less broken" for newbies, I could live with this as long
> as I could prevent it when necessary.

I guess I am saying, perhaps incorrectly, that the mechanism provided
will make further redirection of sys.stdout unnecessary 99% of the
time.  Experimentation shows that Python composes tracebacks and
error messages a line or partial line at a time.  That is, you can
not display each call to printf(), but must wait until the system is
idle to be sure that multiple calls to printf() are complete.  So this
forces you to use the idle processing loop, not rocket science but
at least inconvenient.  And the only source of stdout/err is tracebacks,
error messages and the "print" statement.  What would you do with
these in a Windows program except display an "OK" dialog box?
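
The buffering idea can be sketched in pure Python, independent of the C
patch.  The class MessageBuffer and its method get_message_text() are
hypothetical names loosely modelled on the GetMessageText() call suggested
above; a real GUI program would call the method from its idle handler and
show the result in a dialog box.

```python
import contextlib
import traceback


class MessageBuffer:
    """File-like object that collects partial writes until asked for them."""

    def __init__(self):
        self.parts = []

    def write(self, text):
        # Tracebacks arrive a line or partial line at a time.
        self.parts.append(text)
        return len(text)

    def flush(self):
        pass                             # nothing to flush; kept for file-likeness

    def get_message_text(self):
        """Return everything written so far and empty the buffer."""
        text = "".join(self.parts)
        self.parts.clear()
        return text


buf = MessageBuffer()
with contextlib.redirect_stderr(buf):
    try:
        1 / 0
    except ZeroDivisionError:
        traceback.print_exc()            # several separate write() calls

n_writes = len(buf.parts)
message = buf.get_message_text()         # one complete message for one dialog
```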

If someone out there knows of a different example of sys.stdout
redirection in use in the real world, it would be helpful if
they would describe it.  Maybe it could be incorporated.

> Another option would be to use the Win32 Console APIs, and simply attempt to
> create a console for the error message.  Eg, maybe PyErr_Print() could be
> changed to check for the existence of a console, and if not found, create
> it.  However, the problem with this approach is that the error message will
> often be printed just as the process is terminating - meaning you will see a
> new console with the error message for about 0.025 of a second before it
> vanishes due to process termination.  Any sort of "press any key to
> terminate" option then leaves us in the same position - if no user can see
> the message, the process appears hung.

Yes, this is a problem with the console API approach.  Another is that
popping up a black console for output instead of the usual "OK"
dialog box is unnatural, and will force the user to replace sys.stdout.
I was hoping this C stdout will make this unnecessary.

JimA



From esr at thyrsus.com  Mon Jan  8 20:17:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 14:17:50 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 01:30:34PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com> <20010108130137.E22834@thyrsus.com> <200101081830.NAA05301@cj20424-a.reston1.va.home.com>
Message-ID: <20010108141750.C23214@thyrsus.com>

Guido van Rossum <guido at python.org>:
> [Eric]
> > A file is at EOF when attempts to read more data from it will fail
> > returning no data.
> 
> I was afraid you would say this.  That's not a condition that's easy
> to calculate without doing I/O, *and* that's not the condition that
> you are interested in for your problem.  According to your definition,
> f.eof() should be true in this example:
> 
>     f = open("/etc/passwd")
>     f.seek(0, 2)                 # Seek to end of file
>     print f.eof()                # What will this print???
>     print `f.readline()`         # Will print ''

I agree that after f.seek(0, 2) f is in an end-of-file condition.  But
I think it's precisely the definition that would be useful for my
problem.  Contrary to what you say, I think my definition of EOF is
quite sharp -- a sequential read would return no data.

Better to think of what I need as an "is there data waiting?" query.
I should have framed it that way, rather than about EOFness, from the
beginning.

> But getting the right result here requires a lot of knowledge about
> how the file is implemented!  While you've explained how this can be
> implemented on Unix, it can't be implemented with just the tools that
> stdio gives us.

Granted.  However, it looks possible that "is there data waiting"
*can* be portably implemented with the help of fstat(2), which by
precedent is also part of Python's toolkit.

> I also don't want to make f.eof() a non-portable feature: *if*
> it is provided, it's too important for that.

Agreed.

> Note that stdio's feof() doesn't have this definition!  It is set when
> the last *read* (or getc(), etc.) stumbled upon an EOF condition.
> That's also of limited value; it's mostly defined so you can
> distinguish between errors and EOF when you get a short read.  The
> stdio feof() flag would be false in the above example.

OK.  You're right about that.  I should have thought more clearly about
the difference between the state of stdio and the state of the underlying
file or device.  Access to stdio state won't do by itself.

> > This is where it bites that I can't test for EOF with a read(0).
> 
> And can you tell me a system where you *can* test for EOF with a
> read(0)?  I've never heard of such a thing.  The Unix read() system
> call has the same properties as Python's f.read().  I'm pretty sure
> that fread() with a zero count also doesn't give you the information
> you're after.

I'd have to test -- but what Unix read(2) does in this case isn't
really my point.  My real point is that I can't probe for whether
there's data waiting to be read in what seems like the obvious way.  I
expect Python to compensate for the deficiencies of the underlying C,
not reflect them.

> > Just having the plain-file case work would, IMHO, be justification
> > enough for this method.  If it turns out to be portable across Mac and
> > Windows sockets as well, *huge* win.  Could this be tested by someone
> > with access to Windows and Mac systems?
> 
> I don't see the huge win.

Try "polling after a non-blocking open".  A lower-overhead and more 
natural way to do it than with a poller object.  (This is on my mind 
because I used a poller object to query FIFOs just last week.)

The game system I'm working on, BTW, has another point of interest for
this list.  It is a rather large and complex suite of C programs that
makes heavy use of dynamic-memory allocation; I am translating to
Python partly in order to avoid chronic misallocation problems (leaks
and wild pointers) and partly because the thing needed to be rewritten
anyway to eliminate global state so I can embed it in a multithreaded
server.

Side-by-side comparison of the original C and its translation should
be quite an interesting educational experience once it's done.  That
just might be my next year's paper.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is the assumption of this book that a work of art is a gift, not a
commodity.  Or, to state the modern case with more precision, that works of
art exist simultaneously in two "economies," a market economy and a gift
economy.  Only one of these is essential, however: a work of art can survive
without the market, but where there is no gift there is no art.
	-- Lewis Hyde, The Gift: Imagination and the Erotic Life of Property



From guido at python.org  Mon Jan  8 20:36:02 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 14:36:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:40:37 +0100."
             <3A5A09A5.D0DC33A1@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>  
            <3A5A09A5.D0DC33A1@lemburg.com> 
Message-ID: <200101081936.OAA05440@cj20424-a.reston1.va.home.com>

> Still, wouldn't it be wise to add some logic to Python to prevent
> importing untrusted modules, e.g. by making sys.path read-only and
> disabling the import hook usage using a command line ? 
> 
> This would at least prevent the most obvious attacks. I wonder how
> RedHat works around these problems.

I don't understand what kind of attacks you are thinking of.  What
would making sys.path read-only prevent?  You seem to be thinking that
some malicious piece of code could try to subvert you by setting
sys.path.  But what you forget is that if this piece of code cannot be
trusted with sys.path, it should not be trusted to run at all!

--Guido van Rossum (home page: http://www.python.org/~guido/)




From loewis at informatik.hu-berlin.de  Mon Jan  8 20:45:44 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 20:45:44 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com> (mal@lemburg.com)
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com>
Message-ID: <200101081945.UAA12178@pandora.informatik.hu-berlin.de>

> The code adding (and with the patch: executing) the .pth files
> is defined in site.py and it is rather easy to override this
> file by adding a modified site.py file to the current working dir...
> a potential security hole in its own right, I guess :(

Indeed - independent of my patch changing the other site.py :-)

Regards,
Martin



From skip at mojam.com  Mon Jan  8 20:49:22 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 8 Jan 2001 13:49:22 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
References: <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
	<3A59EF2B.792801E5@ActiveState.com>
Message-ID: <14938.6594.44596.509259@beluga.mojam.com>

    Paul> It's not about keeping people out of your module.  In fact I would
    Paul> propose that mod.__dict__ should be as loose as ever.

Okay, how about this as a compromise first step?  Allow programmers to put
__exports__ lists in their modules but don't do anything with them *except*
modify dir() to respect that if it exists?  That would pretty up dir()
output for newbies, almost certainly not break anything, improve the
internal documentation of the modules that use __exports__, and still allow
us to move in a more restrictive direction at a later time if we so choose.
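
As a rough illustration of that first step, here is a helper in modern
Python that behaves the way the modified dir() might.  __exports__ is only
the attribute proposed in this thread, not an existing language feature, and
exported_dir() is a hypothetical name.

```python
import types


def exported_dir(module):
    """Like dir(module), but honor an __exports__ list when one is present."""
    exports = getattr(module, "__exports__", None)
    if exports is None:
        return dir(module)               # unchanged behavior for old modules
    return sorted(exports)


mod = types.ModuleType("demo")           # hypothetical module
mod.helper = 1
mod.api = 2
plain = exported_dir(mod)                # no __exports__ yet: the usual listing

mod.__exports__ = ["api"]
filtered = exported_dir(mod)             # only the declared names
still_reachable = mod.helper             # the module's __dict__ stays as loose as ever
```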

Skip



From moshez at zadka.site.co.il  Tue Jan  9 05:04:23 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:04:23 +0200 (IST)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com>
References: <3A5A02AA.675A35D1@lemburg.com>, <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <20010109040423.68AA4A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 19:10:50 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> Hmm, but what if the Python script picks up a site.py which is
> different from the standard one distributed with Python ?

Then the site.py can do whatever it wants.
No need to go through PTHs
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Jan  8 20:59:48 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 14:59:48 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010108130137.E22834@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGHIHAA.tim.one@home.com>

Quickie:

[Guido]
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

To be very clear about this, that's not what C's feof() means:  in general,
the end-of-file indicator in std C stream input is set only *after* you've
attempted a read that "didn't work".  For example,

#include <stdio.h>

int
main(void)
{
	FILE* fp = fopen("guts", "wb");
	fputs("abc", fp);
	fclose(fp);
	fp = fopen("guts", "rb");
	for (;;) {
		int c;
		c = getc(fp);
		printf("getc returned %c (%d)\n", c, c);
		printf("At EOF after getc? %d\n", feof(fp));
		if (c == EOF)
			break;
	}
}

Unless your C is broken, feof() will return 0 after getc() returns 'a', and
again after 'b', and again after 'c'.  It's not until getc() returns EOF
that feof() first returns a non-zero result.

Then add these two lines after the "for":

	fseek(fp, 0L, SEEK_END);
	printf("after seeking to the end, feof() says %d\n", feof(fp));

Unless your fseek() is non-std, that clears the end-of-file indicator, and
regardless of to where you seek.  So the std behavior throughout libc is
much like Python's behavior:  there's nothing that can tell you whether
you're at the end of the file, in general, short of trying to read and
failing to get something back.

In your case you seem to *know* that you have a "plain old file", meaning
that its size is well-defined and that ftell() makes sense for it.  You also
seem to know that you don't have to worry about anyone else, e.g., appending
to it (or in any other way changing its size, or changing your stream's file
position), while you're mucking with it.  So why not just do f.tell() and
compare that to the size yourself?  This sounds easy for you to do, but in
this particular case you enjoy the benefits of a world of assumptions that
aren't true in general.
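
A sketch of that tell-versus-size comparison, under exactly the
assumptions listed above (a regular file, opened in binary mode, that
nobody else is resizing).  The function name is invented for the example:

```python
import os
import tempfile

def at_end_of_plain_file(f):
    """Compare the stream position with the file size.

    Only meaningful for a regular file that is not being changed by
    anyone else; pipes, sockets and terminals have no such size.
    """
    return f.tell() >= os.fstat(f.fileno()).st_size

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"abc")

with open(path, "rb") as f:
    at_start = at_end_of_plain_file(f)
    f.read(3)                            # read exactly the three bytes
    at_finish = at_end_of_plain_file(f)  # no failed read was needed
os.remove(path)
print(at_start, at_finish)
```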

> ...
> This is where it bites that I can't test for EOF with a read(0).

You can't in std C using an fread of 0 bytes either -- that has no effect on
the end-of-file indicator.  Add

		if (c == 'c') {
			char buf[100];
			size_t i = fread(buf, 1, 0, fp);
			printf("after fread of 0 bytes, feof() says %d\n",
			       feof(fp));
		}

before the "(c == EOF)" test above to try that on your platform.
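
The Python-level analogue can be tried without touching the disk.  A small
sketch, assuming an in-memory io.BytesIO object as a stand-in for a real
file; a zero-byte read returns the same empty result in the middle of the
data and at the end, so it distinguishes nothing:

```python
import io

f = io.BytesIO(b"abc")
mid_probe = f.read(0)   # zero-byte read with data still pending
f.read()                # consume the rest
end_probe = f.read(0)   # zero-byte read with nothing left
real_read = f.read(1)   # only a real read attempt reveals EOF
print(repr(mid_probe), repr(end_probe), repr(real_read))
```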

> ...
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.

Don't know about Mac.  On Windows everything is grossly complicated because
of line-end translations in text mode.  Like the C std says, the only
*portable* thing you can do with an ftell() result for a text file is feed
it back unaltered to fseek().  It so happens that on Windows, using MS's
libc, if f.readline() returns "abc\n" for the first line of a native text
file, f.tell() returns 5, reflecting the actual byte offset in the file
(including the \r that .readline() doesn't show you).  So you *can* get away
with comparing f.tell() to the file's size on Windows too (using the MS C
compiler; don't know about others).
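
Why text-mode positions cannot be derived by counting characters can be
shown with a short sketch.  It assumes a current Python, whose default
text mode translates \r\n to \n on every platform; that is not the MS libc
behavior described above, only an illustration of the same mismatch
between what you read and what is on disk:

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"abc\r\ndef\r\n")   # a DOS-style text file, 10 bytes

with open(path, "r", newline=None) as f:
    first = f.readline()           # line ending is translated to "\n"
    rest = f.read()

size_on_disk = os.path.getsize(path)
chars_seen = len(first) + len(rest)
os.remove(path)
print(repr(first), chars_seen, size_on_disk)
```

The character count comes up two short of the byte size, one for each
hidden \r.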

the-operational-defn-of-eof-is-the-only-portable-defn-
    there-is-ly y'rs  - tim




From moshez at zadka.site.co.il  Tue Jan  9 05:08:29 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:08:29 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
References: <14938.6594.44596.509259@beluga.mojam.com>, <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
	<3A59EF2B.792801E5@ActiveState.com>
Message-ID: <20010109040829.BDB66A82D@darjeeling.zadka.site.co.il>

[Paul Prescod] 
> It's not about keeping people out of your module.  In fact I would
> propose that mod.__dict__ should be as loose as ever.

[Skip Montanaro]
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?  That would pretty up dir()
> output for newbies, almost certainly not break anything, improve the
> internal documentation of the modules that use __exports__, and still allow
> us to move in a more restrictive direction at a later time if we so choose.

I'm +1 on that personally. 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan  8 21:38:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 21:38:00 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>  
	            <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A2528.C289BE1D@lemburg.com>

Guido van Rossum wrote:
> 
> > Still, wouldn't it be wise to add some logic to Python to prevent
> > importing untrusted modules, e.g. by making sys.path read-only and
> > disabling the import hook usage using a command line ?
> >
> > This would at least prevent the most obvious attacks. I wonder how
> > RedHat works around these problems.
> 
> I don't understand what kind of attacks you are thinking of.  What
> would making sys.path read-only prevent?  You seem to be thinking that
> some malicious piece of code could try to subvert you by setting
> sys.path.  But what you forget is that if this piece of code cannot be
> trusted with sys.path, it should not be trusted to run at all!

I was thinking of an attack where knowledge of common temporary
execution locations is used to trick Python into executing
untrusted code -- the untrusted code would only have to be
copied to the known temporary execution directory and then
gets executed by Python next time the program using the temporary
location is invoked.

But you're right: this is possible whether or not sys.path is
writeable.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan  8 21:45:57 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 21:45:57 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108214557.H402@xs4all.nl>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
> > You may be right.  Still, this patch solves the immediate problem in a
> > reasonably clean way, and I urge that it should go in.  We can do a
> > more complete reorganization of the build process later.  (I'll help with
> > that; I'm pretty expert with autoconf and friends.)

> I expect Andrew's code to go in before 2.1 is released.  So I don't
> see a reason why we should hurry and check in a stop-gap measure.

Oh, we're gonna distribute binaries of Python 2.0/1.5.2-with-distutils for
every known platform that can run configure ? :) I still think there are
more than enough platforms without Python to warrant using autoconf for
configuring modules. The module list and their demands are stable enough to
make maintenance a fair breeze, IMHO.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Mon Jan  8 22:57:58 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:57:58 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108214557.H402@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 09:45:57PM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl>
Message-ID: <20010108165758.B9260@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:45:57PM +0100, Thomas Wouters wrote:
>every known platform that can run configure ? :) I still think there are
>more than enough platforms without Python to warrant using autoconf for
>configuring modules. The module list and their demands are stable enough to
>make maintenance a fair breeze, IMHO.

Umm... the proposed PEP 229 patch would compile a Python binary with
sre, posix, and strop statically linked; this minimal Python is then
used to run the setup.py script.  You shouldn't require a preinstalled
Python, though the current version of the patch doesn't meet this
requirement yet.

--amk




From tim.one at home.com  Mon Jan  8 21:59:40 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 15:59:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A59F00E.53A0A32A@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>

[Tim]
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

[Paul Prescod]
> If you can create a sample program that demonstrates the unsafety
> I'll anonymously submit it as a bug on our internal system

I don't want to spend time on that, as I *assume* it's already well-known
within the Perl thread community.  Besides, the last version of Perl I got
from ActiveState <wink> complains:

     No threads in this perl at temp.pl line 14

if I try to use Perl threads.  That's:

> \perl\bin\perl -v

This is perl, v5.6.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2000, Larry Wall

Binary build 620 provided by ActiveState Tool Corp.
http://www.ActiveState.com
Built 18:31:05 Oct 31 2000

...

If I can repair that by downloading a more recent release, let me know.

> and ensure that the next version of Perl is as slow as Python. :)

I don't want to slow them down!  To the contrary, now I've got a solid
reason for why I keep using Perl for simple high-volume text-crunching jobs
<wink>.

> Seriously: If someone comes at me with Perl-IO-is-way-faster-than-
> Python-IO, I'd like to know what concretely they've given up in order
> to achieve that performance.

My line-at-a-time test case used (rounding to nearest whole integers) 30
seconds in Python and 6 in Perl.  The result of testing many changes to
Python's implementation was that the excess 24 seconds broke down like so:

    17   spent inside internal MS threadsafe getc() lock/unlock
             routines
     5   uncertain, but evidence suggests much of it due to MS
             malloc/realloc (Perl does its own memory mgmt)
     2   for not copying directly out of the platform FILE*
             implementation struct in a highly optimized loop (like
             Perl does)

My last checkin to fileobject.c reclaimed 17 seconds on Win98SE while
remaining threadsafe, via a combination of locking per line instead of per
character, and invoking realloc much less often (only for lines exceeding
200 chars).  (BTW, I'm still curious to know how that compares to the
getc_unlocked hack on a platform other than Windows!)
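
The effect of locking per line instead of per character can be made
visible with a toy model.  This is a sketch only: it is not the
fileobject.c code, the class and function names are invented, and a
threading.Lock stands in for the C runtime's stream lock:

```python
import io
import threading

class CountingLock:
    """A lock wrapper that counts how many times it is acquired."""
    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc_info):
        self._lock.release()
        return False

def readline_lock_per_char(stream, lock):
    """Take the lock once for every character fetched."""
    chars = []
    while True:
        with lock:
            c = stream.read(1)
        if not c:
            break
        chars.append(c)
        if c == "\n":
            break
    return "".join(chars)

def readline_lock_per_line(stream, lock):
    """Take the lock once and fetch the whole line under it."""
    with lock:
        return stream.readline()

data = "x" * 99 + "\n"
per_char_lock = CountingLock()
per_line_lock = CountingLock()
line_a = readline_lock_per_char(io.StringIO(data), per_char_lock)
line_b = readline_lock_per_line(io.StringIO(data), per_line_lock)
print(per_char_lock.acquisitions, per_line_lock.acquisitions)
```

Both functions return the same line; the difference is purely in how
often the lock is taken.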

> And even just for my own interest I'd like to understand the cost/
> benefit of stream thread safety.

If you're not *using* threads, or not using them to muck with the same
stream at the same time, the ratio is infinite.  And that's usually the
case.

> For instance would it make sense to just write a thread-safe
> wrapper for streams used from multiple threads?

Alas, on Windows you can't pick and choose:  you get the threadsafe libc, or
you don't.  So long as anyone may want to use threads for any reason
whatsoever, we must link with threadsafe libraries.  But, as above, on
Windows we're not paying much for that anymore in this case (unless maybe
the threadsafe MS malloc family is also outrageously slower than its
careless counterpart ...).  It does prevent me from pursuing the "optimized
inner loop" business, because MS doesn't expose its locking primitives (so I
can't do in C everything I would need to do to optimize the inner loop while
remaining threadsafe).
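
For comparison, this is roughly what a per-stream wrapper looks like at
the Python level, where you do get to pick and choose.  A sketch under
stated assumptions: LockedStream is a hypothetical class, io.StringIO
stands in for a real file, and none of this addresses the libc-level
constraint on Windows:

```python
import io
import threading

class LockedStream:
    """Serialize readline() calls on one underlying stream."""
    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def readline(self):
        with self._lock:
            return self._stream.readline()

expected = ["line %d\n" % i for i in range(200)]
shared = LockedStream(io.StringIO("".join(expected)))
seen = []
seen_lock = threading.Lock()

def worker():
    # Each thread pulls whole lines until the stream is exhausted.
    while True:
        line = shared.readline()
        if not line:
            break
        with seen_lock:
            seen.append(line)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(seen))
```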

there-are-damn-few-pieces-of-libc-we-wouldn't-be-better-off-
    writing-ourselves-but-then-we'd-have-a-much-harder-time-
    playing-with-others'-code-ly y'rs  - tim




From akuchlin at mems-exchange.org  Mon Jan  8 22:15:34 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:15:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108161534.A2392@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
>200 chars).  (BTW, I'm still curious to know how that compares to the
>getc_unlocked hack on a platform other than Windows!)

On Solaris and Linux, the results seemed to be lost in the noise.
Repeated runs of filetest.py were sometimes faster than without
USE_MS_GETLINE_HACK, so the variation is probably large enough to
swamp any difference between the two.  (Assuming I enabled the getline
hack correctly of course; someone please replicate...)

--amk

Linux: w/o USE_MS_GETLINE_HACK
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.186  0.190
readlines_sizehint    0.108  0.110
using_fileinput       0.447  0.450
while_readline        0.184  0.180

Linux w/ USE_MS_GETLINE_HACK:
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.178  0.180
readlines_sizehint    0.108  0.110
using_fileinput       0.434  0.430
while_readline        0.183  0.190

Solaris w/o USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.640  0.630
readlines_sizehint    0.278  0.280
using_fileinput       1.874  1.820
while_readline        0.839  0.840

Solaris w/ USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.569  0.570
readlines_sizehint    0.275  0.280
using_fileinput       1.902  1.900
while_readline        0.769  0.770



From gstein at lyra.org  Mon Jan  8 22:29:40 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 13:29:40 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:15:34PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com> <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <20010108132940.G4141@lyra.org>

On Mon, Jan 08, 2001 at 04:15:34PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> >200 chars).  (BTW, I'm still curious to know how that compares to the
> >getc_unlocked hack on a platform other than Windows!)
> 
> On Solaris and Linux, the results seemed to be lost in the noise.

Your times are so small... I'd suggest doing a few iterations within
filetest.py so your margin of error isn't so noticeable.
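
One way to do that, as a sketch: run each case several times per round and
keep the best round.  best_of and sample_workload are invented names, and
the workload is only a stand-in for a filetest.py case; it uses the
current time.perf_counter() clock:

```python
import time

def best_of(func, rounds=5, loops=20):
    """Return the fastest average per-call time over several rounds."""
    best = None
    for _round in range(rounds):
        start = time.perf_counter()
        for _loop in range(loops):
            func()
        per_call = (time.perf_counter() - start) / loops
        if best is None or per_call < best:
            best = per_call
    return best

def sample_workload():
    # Stand-in for one of the filetest.py cases.
    return sum(len(line) for line in ["x" * 50 + "\n"] * 1000)

timing = best_of(sample_workload)
print("%.6f seconds per call" % timing)
```

Taking the minimum over rounds tends to filter out interference from
other processes, which is what swamps single short runs.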

Cheers,
-g

>...
> Linux: w/o USE_MS_GETLINE_HACK
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.186  0.190
> readlines_sizehint    0.108  0.110
> using_fileinput       0.447  0.450
> while_readline        0.184  0.180
> 
> Linux w/ USE_MS_GETLINE_HACK:
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.178  0.180
> readlines_sizehint    0.108  0.110
> using_fileinput       0.434  0.430
> while_readline        0.183  0.190
> 
> Solaris w/o USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.640  0.630
> readlines_sizehint    0.278  0.280
> using_fileinput       1.874  1.820
> while_readline        0.839  0.840
> 
> Solaris w/ USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.569  0.570
> readlines_sizehint    0.275  0.280
> using_fileinput       1.902  1.900
> while_readline        0.769  0.770

-- 
Greg Stein, http://www.lyra.org/



From thomas at xs4all.net  Mon Jan  8 22:59:17 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 22:59:17 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108165758.B9260@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:57:58PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl> <20010108165758.B9260@kronos.cnri.reston.va.us>
Message-ID: <20010108225916.P2467@xs4all.nl>

On Mon, Jan 08, 2001 at 04:57:58PM -0500, Andrew Kuchling wrote:

> Umm... the proposed PEP 229 patch would compile a Python binary with
> sre, posix, and strop statically linked; this minimal Python is then
> used to run the setup.py script.  You shouldn't require a preinstalled
> Python, though the current version of the patch doesn't meet this
> requirement yet.

Apologies. I should've bothered to read the PEP first, but I haven't found
the time yet :P I retract all my comments on the subject until I do.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Mon Jan  8 23:08:50 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 23:08:50 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Tue, Jan 09, 2001 at 02:03:00AM +0200
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> <200101081515.KAA03474@cj20424-a.reston1.va.home.com> <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
Message-ID: <20010108230850.Q2467@xs4all.nl>

On Tue, Jan 09, 2001 at 02:03:00AM +0200, Moshe Zadka wrote:

> > (2) Under exactly what circumstances do you want from foo import *
> >     issue a warning?

> All.
> If you want to be less extreme, don't warn if the module defines
> a __from_star_ok__

We already have a perfectly acceptable way of turning off warnings in
particular circumstances. I'm +1 on warning against using 'from spam import
*' by the way, though it would be even better (+2!) if there was a 'import *
considered harmful' page/chapter in the documentation somewhere, so we could
point to it.
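
A sketch of what turning such a warning off could look like, assuming the
mechanism meant here is the filter list of the warnings module.  The
warning text and the noisy_import_star function are made up; no such
warning exists today:

```python
import warnings

def noisy_import_star():
    # Stand-in for the proposed warning on "from spam import *".
    warnings.warn("from spam import * is discouraged", UserWarning)

# Without a filter the warning is reported.
with warnings.catch_warnings(record=True) as caught_default:
    warnings.simplefilter("always")
    noisy_import_star()

# With an "ignore" filter matching the message, it is suppressed.
with warnings.catch_warnings(record=True) as caught_filtered:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message="from spam import")
    noisy_import_star()

print(len(caught_default), len(caught_filtered))
```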

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan  8 23:23:02 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 17:23:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 21:38:00 +0100."
             <3A5A2528.C289BE1D@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>  
            <3A5A2528.C289BE1D@lemburg.com> 
Message-ID: <200101082223.RAA05858@cj20424-a.reston1.va.home.com>

> I was thinking of an attack where knowledge of common temporary
> execution locations is used to trick Python into executing
> untrusted code -- the untrusted code would only have to be
> copied to the known temporary execution directory and then
> gets executed by Python next time the program using the temporary
> location is invoked.

When does Python execute code from a predictable common temporary
location?  When is that likely to be used from a Python script running
as root?

Note that if you use tempfile.TemporaryFile(), you can create a
temporary file that's not subvertible.
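
A minimal usage sketch, written for a current Python.  The file is opened
in binary read/write mode by default and is removed when closed, so there
is no predictable name for anyone else to plant a file under:

```python
import tempfile

with tempfile.TemporaryFile() as tmp:
    tmp.write(b"scratch data")
    tmp.seek(0)               # rewind before reading back
    roundtrip = tmp.read()
print(roundtrip)
```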

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Mon Jan  8 23:35:17 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Jan 2001 17:35:17 -0500 (EST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108230850.Q2467@xs4all.nl>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
	<200101081433.JAA03185@cj20424-a.reston1.va.home.com>
	<20010107232532.V17220@lyra.org>
	<LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
	<20010108002603.X17220@lyra.org>
	<20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
	<20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
	<20010108230850.Q2467@xs4all.nl>
Message-ID: <14938.16549.944123.917467@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > *' by the way, though it would be even better (+2!) if there was a 'import *
 > considered harmful' page/chapter in the documentation somewhere, so we could
 > point to it.

  Care to write it?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From MarkH at ActiveState.com  Tue Jan  9 00:00:01 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 15:00:01 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A5A05DA.86B3EB86@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>

> Limiting the code to pythonw.exe instead of trying to install
> it in python20.dll was supposed to prevent damage to the use
> of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
> I am assuming there is a screen.

Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
console window.  pythonw is used in this case.  COM uses pythonw.exe in just
this way, and when executed by DCOM, it will be executed in a context where
the user can not see any such dialog.

However, I would be happy to ensure the correct command-line is used to
prevent this behaviour in this case.

Indeed, in _every_ case I use pythonw.exe I would disable this - but I
accept that other users have simpler requirements.

> Having said that, you may be right that there is some way to
> hang on a dialog box which can not be seen.  It depends on what
> MessageBox() and GetForegroundWindow() actually do.  If it seems
> that this patch has merit, I would be grateful if you would review
> the code to look for issues of this type.

There will be no issues in the code - it is just that Win2k will execute in
a different "workspace" (I think that is the term).  This is identical to
the problem of a service attempting to display a messagebox - the code is
perfect and works perfectly - just in a context where no one can see it, or
dismiss it.

> > I would prefer to see a decent API for extracting error and traceback
> > information from Python.  On the other hand, I _do_ see the problem for
> > "newbies" trying to use pythonw.exe.
>
> There could be an API added to the winstdout module such as
>   msg = winstdout.GetMessageText()
> which would return saved text, control its display etc.

I was thinking more of a "Py_GetTraceback()", which would return a complete
exception string.

Thus, embedders could write code similar to:

  whatever = Py_BuildValue(...);
  ret = PyObject_CallObject(foo, whatever);
  ...
  if (ret == NULL) {
    /* Py_GetTraceback() is the proposed API, not an existing one */
    char *text = Py_GetTraceback();
    MsgBox(text);
  }

Thus, with only a small amount of work, they have _complete_ control over
the output.  However, I agree this doesn't really solve pythonw.exe's
problems.

> I do not view winstdout as a "newbie" feature, but rather a
> generally useful C-language addition to Python.

Hrm.  I don't believe a commercial app, for example, would find this
suitable - they would roll their own solution.

Hence I see this purely for newbie users.  Advanced users have complete
control now - a simple try/except block around their main code, and you are
pretty good.  A builtin module for displaying a messagebox is as robust as
an experienced user needs to emulate this, IMO.
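
The try/except approach might look like the sketch below.  run_guarded and
broken_main are invented names; traceback.format_exc() is the standard
library call in current Pythons that returns the complete exception text
as one string, which the caller can then show however it likes:

```python
import traceback

def run_guarded(main):
    """Run main(); return (result, None) or (None, traceback_text)."""
    try:
        return main(), None
    except Exception:
        return None, traceback.format_exc()

def broken_main():
    return 1 / 0

result, report = run_guarded(broken_main)
if report is not None:
    # A GUI program would hand this string to its own message box.
    print(report)
```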

> I guess I am saying, perhaps incorrectly, that the mechanism provided
> will make further redirection of sys.stdout unnecessary 99% of the
> time.

Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
said, I see this as a feature so the newbie will not believe pythonw.exe is
broken.  Advanced users can already do similar things themselves.

> Experimentation shows that Python composes tracebacks and
> error messages a line or partial line at a time.  That is, you can
> not display each call to printf(), but must wait until the system is
> idle to be sure that multiple calls to printf() are complete.  So this
> forces you to use the idle processing loop, not rocket science but
> at least inconvenient.

What "idle processing loop"?

> And the only source of stdout/err is tracebacks,
> error messages and the "print" statement.  What would you do with
> these in a Windows program except display an "OK" dialog box?

Log the error to a file, and display a "friendly" dialog - possibly offering
to automatically submit a support request/bug report.
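
A sketch of that pattern using the logging module of current Pythons.  The
function name, the logger name and the wording of the message are all
arbitrary choices for the example; the traceback goes to the log file and
the user only ever sees the short message:

```python
import logging
import os
import tempfile

def run_with_error_log(main, log_path):
    """Run main(); on failure log the traceback and return a short message."""
    logger = logging.getLogger("app.errors")   # arbitrary logger name
    handler = logging.FileHandler(log_path)
    logger.addHandler(handler)
    try:
        main()
        return "OK"
    except Exception:
        # logger.exception() appends the current traceback to the record.
        logger.exception("unhandled error in main")
        return "Something went wrong. Details were saved to " + log_path
    finally:
        logger.removeHandler(handler)
        handler.close()

fd, log_path = tempfile.mkstemp(suffix=".log")
os.close(fd)
message = run_with_error_log(lambda: 1 / 0, log_path)
with open(log_path) as fh:
    logged = fh.read()
os.remove(log_path)
print(message)
```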

The casual user is going to be _very_ scared by a Python traceback.  This is
a sin of a similar magnitude to those crappy applications with unhandled VB
exceptions.

IMO, nothing looks more unprofessional than an app that displays an internal
VB error message.  Python is no different IMO.  For real applications, there
is a good chance that the majority of your users have never heard of Python.

Thus, I don't believe your solution suitable for the real, professional,
commercial user.  However, I agree that your solution does not prevent this
user doing the "right thing"...

But all this does keep me believing this is a "newbie" helper.

>
> If someone out there knows of a different example of sys.stdout
> redirection in use in the real world, it would be helpful if
> they would describe it.  Maybe it could be incorporated.

Sure.  Komodo to a file with a friendly dialog (sometimes ;-).

Pythonwin actually attempts a few things first - eg, not every exception
Pythonwin causes at startup should be logged.

Python services write unhandled errors to the event log.

I don't believe I have worked on 2 projects with the same requirement
here!!!

Mark.




From nas at arctrix.com  Mon Jan  8 17:22:10 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 08:22:10 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108082210.A16149@glacier.fnational.com>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> My line-at-a-time test case used (rounding to nearest whole integers) 30
> seconds in Python and 6 in Perl.  The result of testing many changes to
> Python's implementation was that the excess 24 seconds broke down like so:
> 
>     17   spent inside internal MS threadsafe getc() lock/unlock
>              routines
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)
>      2   for not copying directly out of the platform FILE*
>              implementation struct in a highly optimized loop (like
>              Perl does)

Have you tried pymalloc?  

  Neil



From billtut at microsoft.com  Tue Jan  9 01:38:14 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Mon, 8 Jan 2001 16:38:14 -0800 
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <58C671173DB6174A93E9ED88DCB0883D0A6202@red-msg-07.redmond.corp.microsoft.com>

> From: 	Mark Hammond [mailto:MarkH at ActiveState.com] 

> There will be no issues in the code - it is just that Win2k will execute in
> a different "workspace" (I think that is the term).  This is identical to
> the problem of a service attempting to display a messagebox - the code is
> perfect and works perfectly - just in a context where no one can see it, or
> dismiss it.


The term Mark is looking for here is Windowstation, and it's an NT thing,
not just a Win2k thing. Windowstations have been around for ages.

Bill



From ping at lfw.org  Tue Jan  9 02:51:15 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 17:51:15 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Skip Montanaro wrote:
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?

I'd say: Just have dir() and import * pay attention to __exports__.
Don't mess with getattr or __dict__.
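
The dir() half of that can be prototyped in a few lines.  A sketch only:
__exports__ is the attribute under discussion and has no effect in any
released Python, and exported_dir is an invented helper standing in for a
modified dir():

```python
import types

def exported_dir(module):
    """List a module's names, honoring __exports__ when it is present."""
    exports = getattr(module, "__exports__", None)
    if exports is None:
        return sorted(dir(module))
    return sorted(exports)

demo = types.ModuleType("demo")
demo.spam = 1
demo._helper = 2
demo.__exports__ = ["spam"]

listing = exported_dir(demo)
still_reachable = getattr(demo, "_helper")   # getattr is left alone
print(listing, still_reachable)
```

The listing shrinks to the exported names while attribute access and
__dict__ stay as loose as before.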


-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 




From ping at lfw.org  Tue Jan  9 03:00:08 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 18:00:08 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101081751530.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Paul Prescod wrote:
> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 

I suggest a built-in function "methods()" that works like this:

    from types import InstanceType, MethodType, BuiltinMethodType

    def methods(obj):
        if type(obj) is InstanceType: return methods(obj.__class__)
        results = []
        if hasattr(obj, '__bases__'):
            for base in obj.__bases__:
                results.extend(methods(base))
        results.extend(
            filter(lambda k, o=obj: type(getattr(o, k)) in
                   [MethodType, BuiltinMethodType], dir(obj)))
        return unique(results)

    def unique(seq):
        dict = {}
        for item in seq: dict[item] = 1
        results = dict.keys()
        results.sort()
        return results


    >>> import sys
    >>> 
    >>> methods(sys.stdin)
    ['close', 'fileno', 'flush', 'isatty', 'read', 'readinto', 'readline', 'readlines', 'seek', 'tell', 'truncate', 'write', 'writelines']
    >>>        
    >>> import SocketServer
    >>> 
    >>> methods(SocketServer.ForkingTCPServer)
    ['__init__', 'collect_children', 'fileno', 'finish_request', 'get_request', 'handle_error', 'handle_request', 'process_request', 'serve_forever', 'server_activate', 'server_bind', 'verify_request']
    >>> 
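
For readers on a current Python, where InstanceType no longer exists, a
rough equivalent is sketched below.  It is not the function proposed
above: public_methods is an invented name, it relies on dir() already
walking the base classes, and it skips names starting with an underscore:

```python
def public_methods(obj):
    """List the public callable attribute names of obj's class."""
    cls = obj if isinstance(obj, type) else type(obj)
    return sorted(
        name for name in dir(cls)
        if not name.startswith("_") and callable(getattr(cls, name))
    )

class Base:
    def close(self):
        pass

class Child(Base):
    size = 3            # plain data attribute, not a method

    def read(self):
        pass

names = public_methods(Child())
print(names)
```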



-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 




From gstein at lyra.org  Tue Jan  9 03:20:56 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 18:20:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.102,2.103
In-Reply-To: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 08, 2001 at 06:00:13PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010108182056.C4640@lyra.org>

On Mon, Jan 08, 2001 at 06:00:13PM -0800, Guido van Rossum wrote:
>...
> Modified Files:
> 	fileobject.c 
> Log Message:
> Tsk, tsk, tsk.  Treat FreeBSD the same as the other BSDs when defining
> a fallback for TELL64.  Fixes SF Bug #128119.
>...
> *** fileobject.c	2001/01/08 04:02:07	2.102
> --- fileobject.c	2001/01/09 02:00:11	2.103
> ***************
> *** 59,63 ****
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */
> --- 59,63 ----
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */

All of those #ifdefs could be tossed and it would be more robust (long term)
if an autoconf macro were used to specify when TELL64 should be defined.

[ I've looked thru fileobject.c and am a bit confused: the conditions for
  defining TELL64 do not match the conditions for *using* it. that would
  seem to imply a semantic error somewhere and/or a potential gotcha when
  they get skewed (like I assume what happened to FreeBSD). simplifying with
  an autoconf macro may help to rationalize it. ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Tue Jan  9 05:29:02 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:29:02 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>

[Andrew Kuchling]

I'll chop everything except while_readline (which is most affected by this
stuff):

> Linux: w/o USE_MS_GETLINE_HACK
> while_readline        0.184  0.180
>
> Linux w/ USE_MS_GETLINE_HACK:
> while_readline        0.183  0.190
>
> Solaris w/o USE_MS_GETLINE_HACK:
> while_readline        0.839  0.840
>
> Solaris w/ USE_MS_GETLINE_HACK:
> while_readline        0.769  0.770

So it's probably a wash.  In that case, do we want to maintain two hacks for
this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
Windows" approach probably works everywhere (although its speed relies on
the platform factoring out at least the locking/unlocking in fgets).

Both methods lack a refinement I would like to see, but can't achieve in
"the Windows way":  ensure that consistency is on no worse than a per-line
basis.  Right now, both methods lock/unlock the file only for the extent of
the current buffer size, so that two threads *can* get back different
interleaved pieces of a single long line.  Like so:

import thread

def read(f):
    x = f.readline()
    print "thread saw " + `len(x)` + " chars"
    m.release()

f = open("ga", "w") # a file with one long line
f.write("x" * 100000 + "\n")
f.close()

m = thread.allocate_lock()
for i in range(10):
    print i
    f = open("ga", "r")
    m.acquire()
    thread.start_new_thread(read, (f,))
    x = f.readline()
    print "main saw " + `len(x)` + " chars"
    m.acquire(); m.release()
    f.close()

Here's a typical run on Windows (current CVS Python):

0
main saw 95439 chars
thread saw 4562 chars
1
main saw 97941 chars
thread saw 2060 chars
2
thread saw 43801 chars
main saw 56200 chars
3
thread saw 8011 chars
main saw 91990 chars
4
main saw 46546 chars
thread saw 53455 chars
5
thread saw 53125 chars
main saw 46876 chars
6
main saw 98638 chars
thread saw 1363 chars
7
main saw 72121 chars
thread saw 27880 chars
8
thread saw 70031 chars
main saw 29970 chars
9
thread saw 27555 chars
main saw 72446 chars

So, yes, it's threadsafe now:  between them, the threads always see a grand
total of 100001 characters.  But what friggin' good is that <wink>?  If,
e.g., Guido wants multiple threads to chew over his giant logfile, there's
no guarantee that .readline() ever returns an actual line from the file.

Not that Python 2.0 was any better in this respect ...




From tim.one at home.com  Tue Jan  9 05:48:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:48:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108082210.A16149@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIAIHAA.tim.one@home.com>

[Tim]
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)

[NeilS]
> Have you tried pymalloc?

Not recently, and don't expect to find time for it this week.  IIRC,
Vladimir did get significant speedups-- lo those many years ago! --when he
tried it on Windows, though.  Maybe (or maybe not) that was due to
exploiting the global lock (i.e., exploiting that pymalloc didn't need to do
its own serialization, when called from the Python core).




From tim.one at home.com  Tue Jan  9 05:52:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:52:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIAIHAA.tim.one@home.com>

[Tim]
> ...
> Here's a typical run on Windows (current CVS Python):
>
> 0
> main saw 95439 chars
> thread saw 4562 chars
> 1
> main saw 97941 chars
> thread saw 2060 chars
> 2
> thread saw 43801 chars
> main saw 56200 chars
> 3
> thread saw 8011 chars
> main saw 91990 chars
> 4
> main saw 46546 chars
> thread saw 53455 chars
> 5
> thread saw 53125 chars
> main saw 46876 chars
> 6
> main saw 98638 chars
> thread saw 1363 chars
> 7
> main saw 72121 chars
> thread saw 27880 chars
> 8
> thread saw 70031 chars
> main saw 29970 chars
> 9
> thread saw 27555 chars
> main saw 72446 chars

Oops!  I lied.  That was the released 2.0.  Current CVS is either better or
worse, depending on whether you think "working" by accident more often is a
good thing or leads to false confidence <wink>:

0
main saw 100001 chars
thread saw 0 chars
1
main saw 100001 chars
thread saw 0 chars
2
main saw 100001 chars
thread saw 0 chars
3
main saw 100001 chars
thread saw 0 chars
4
main saw 100001 chars
thread saw 0 chars
5
thread saw 25802 chars
main saw 74199 chars
6
thread saw 802 chars
main saw 99199 chars
7
main saw 100001 chars
thread saw 0 chars
8
main saw 100001 chars
thread saw 0 chars
9
main saw 100001 chars
thread saw 0 chars




From mal at lemburg.com  Tue Jan  9 08:23:42 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jan 2001 08:23:42 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>  
	            <3A5A2528.C289BE1D@lemburg.com> <200101082223.RAA05858@cj20424-a.reston1.va.home.com>
Message-ID: <3A5ABC7E.E953962B@lemburg.com>

Guido van Rossum wrote:
> 
> > I was thinking an attack where knowledge of common temporary
> > execution locations is used to trick Python into executing
> > untrusted code -- the untrusted code would only have to be
> > copied to the known temporary execution directory and then
> > gets executed by Python next time the program using the temporary
> > location is invoked.
> 
> When does Python execute code from a predictable common temporary
> location?  When is that likely to be used from a Python script running
> as root?
> 
> Note that if you use tempfile.TemporaryFile(), you can create a
> temporary file that's not subvertible.

It's not Python itself that's running temporary files. Tools
like distutils, RPM, etc. tend to run Python code in temporary
locations during build stages. That's what I was thinking about.
OTOH, root should know where these tools run their code, so
I guess it's moot to discuss whose fault this really is, e.g.
distutils style distributions should never be unzipped to /tmp
for subsequent installation, but nobody will prevent root
from doing so.
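
A minimal sketch of the safer pattern, assuming modern Python 3: rather
than unpacking into a well-known path under /tmp, let tempfile.mkdtemp()
pick an unpredictable directory that (on POSIX) only the creating user can
access, and run the code from there.  The file name setup_stub.py and its
contents are made-up placeholders, not anything distutils or RPM produces.

```python
import os
import shutil
import stat
import tempfile

# mkdtemp() creates a fresh directory with an unpredictable name;
# on POSIX it is accessible only by the creating user.
build_dir = tempfile.mkdtemp(prefix="build-")
try:
    script = os.path.join(build_dir, "setup_stub.py")  # hypothetical file name
    with open(script, "w") as f:
        f.write("result = 6 * 7\n")

    # Run the code from the private directory instead of a shared,
    # guessable location.
    namespace = {}
    with open(script) as f:
        exec(compile(f.read(), script, "exec"), namespace)

    mode = stat.S_IMODE(os.stat(build_dir).st_mode)
    created = os.path.isdir(build_dir)
    result = namespace["result"]
finally:
    # Always clean up, even if the executed code raised.
    shutil.rmtree(build_dir)
```

This does not stop root from unzipping a distribution into /tmp by hand; it
only shows what a tool can do when it gets to choose the location itself.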

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Tue Jan  9 08:35:09 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 02:35:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIHAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

I've got solid answers now, but I'll paraphrase them anonymously to save the
bother of untangling multi-person email etiquette snarls:

+ Yes, Perl uses platform stdio.  Usually.  Yes on Windows anyway.

+ But Perl "cheats" on Windows (well, everywhere it can ...), as I've
explained in great detail half a dozen times over the years.  No reason to
retract any of that.

+ The cheating is not thread-safe.

+ The last stab at threads accessible from Perl was an experiment that got
dropped.  There are no user-muckable threads in std Perl builds.

+ But there is a notion of threads available at the C level.

+ This latter notion of threads is used to implement Perl's fork() on
Windows, so can be exploited to test Windows Perl thread safety without
writing a Perl extension module in C.

+ This Perl program (very much like the 2-threaded one I just posted for
Python) uses that trick:

-------------------------------------------------------------------
sub counter {
    my $nc = 0;
    while (<FILE>) {
        $nc += length;
    }
    print "num bytes seen = $nc\n";
}

open(FILE, "ga");
binmode FILE;

fork();
&counter();
-------------------------------------------------------------------

Under the covers, that really shares the FILE filehandle on Windows via
threads.  Running it multiple times yields multiple wild results; the number
of bytes seen by parent and child rarely sum to the number of bytes actually
in the input file ("ga").  The most common output for me is that one thread
sees the entire file, while the other sees "a lot" of it (since the Perl
inner loop registerizes its FILE* struct member shadows for as long as
possible, that's actually what I expected).

So the code is exactly as thread-unsafe as it looked.

bosses-demand-answers-but-they-forget-their-questions<wink>-ly
    y'rs  - tim




From guido at python.org  Tue Jan  9 14:41:24 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 09 Jan 2001 08:41:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 08 Jan 2001 23:29:02 EST."
             <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com> 
Message-ID: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>

> So it's probably a wash.  In that case, do we want to maintain two hacks for
> this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
> Windows" approach probably works everywhere (although its speed relies on
> the platform factoring out at least the locking/unlocking in fgets).

I'm much more confident about the getc_unlocked() approach than about
fgets() -- with the latter we need much more faith in the C library
implementers.  (E.g. that fgets() never writes beyond the null bytes
it promises, and that it locks/unlocks only once.)  Also, you're
relying on blindingly fast memchr() and memset() implementations.

> Both methods lack a refinement I would like to see, but can't achieve in
> "the Windows way":  ensure that consistency is on no worse than a per-line
> basis.  [Example omitted]

The only portable way to ensure this that I can see, is to have a
separate mutex in the Python file object.  Since this is hardly a
common thing to do, I think it's better to let the application manage
that lock if they need it.
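
A minimal sketch of such an application-managed lock, assuming modern
Python 3 and the threading module.  LockedReader is a hypothetical name for
illustration, not something in the standard library, and an in-memory
StringIO stands in for the shared file.

```python
import io
import threading


class LockedReader:
    """Hypothetical wrapper: serialize readline() calls with one mutex."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Hold the lock for the whole call so that a single line is
        # never split between two threads.
        with self._lock:
            return self._file.readline()


lines_in = ["x" * 1000 + "\n" for _ in range(200)]
reader = LockedReader(io.StringIO("".join(lines_in)))
seen = []
seen_lock = threading.Lock()


def worker():
    # Each thread pulls whole lines until the shared file is exhausted.
    while True:
        line = reader.readline()
        if not line:
            break
        with seen_lock:
            seen.append(line)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every entry in seen is then a complete line, whichever thread read it; the
cost is one extra lock acquisition per readline() call.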

(Then why are we bothering with flockfile(), you may ask?  Because
otherwise, accidental multithreaded reading from the same file could
cause core dumps.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan  9 16:48:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 9 Jan 2001 10:48:13 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
In-Reply-To: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 10:29:05AM -0500
References: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>
Message-ID: <20010109104813.D6203@kronos.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 10:29:05AM -0500, Guido van Rossum wrote:
> S   222  pep-0222.txt  Web Library Enhancements               Kuchling
>
>	  This is really up to Andrew.  It seems he plans to create
>	  new modules, so he won't be introducing incompatibilities in
>	  existing APIs.

I don't think PEP 222 will be worked on for 2.1; there have only been
a few reactions, and none at all on the python-web-modules mailing
list, so I don't think anyone really cares very much at this point.
Maybe for 2.2, or maybe I'll just write new classes for Quixote.

That leaves PEP 229 as the only PEP I need to work on for 2.1.

--amk



From tim.one at home.com  Tue Jan  9 22:12:42 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 16:12:42 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>

[Guido]
> I'm much more confident about the getc_unlocked() approach than about
> fgets() -- with the latter we need much more faith in the C library
> implementers.  (E.g. that fgets() never writes beyond the null bytes
> it promises, and that it locks/unlocks only once.)  Also, you're
> relying on blindingly fast memchr() and memset() implementations.

Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
bit quicker on Solaris, despite that it's paying an extra layer of function
call per line, to keep it out of get_line proper).  That tells me the
assumptions are indeed mild.  The business about not writing beyond the null
byte is a concern only I would have raised:  the possibility is an
aggressively paranoid reading of the std (I do *lots* of things with libc
I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
things, it's hard to imagine any other vendor exploding ...

Still, I'd rather get rid of ms_getline_hack if I could, because the code is
so much more complicated.

>> Both methods lack a refinement I would like to see, but can't
>> achieve in "the Windows way":  ensure that consistency is on no
>> worse than a per-line basis.  [Example omitted]

> The only portable way to ensure this that I can see, is to have a
> separate mutex in the Python file object.  Since this is hardly a
> common thing to do, I think it's better to let the application manage
> that lock if they need it.

Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
file locked until the line was complete, and I wouldn't be opposed to making
life saner on platforms that allow it.  But there's another problem here:
part of the reason we release Python threads around the fgets is in case
some other thread is trying to write the data we're trying to read, yes?
But since FLOCKFILE is in effect, other threads *trying* to write to the
stream we're reading will get blocked anyway.  Seems to give us potential
for deadlocks.

> (Then why are we bothering with flockfile(), you may ask?

I wouldn't ask that, no <wink>.

> Because otherwise, accidental multithreaded reading from the same
> file could cause core dumps.)

Ugh ... turns out that on my box I can provoke core dumps anyway, with this
program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
anything new):

import thread

def read(f):
    import time
    time.sleep(.01)
    n = 0
    while n < 1000000:
        x = f.readline()
        n += len(x)
        print "r",
    print "read " + `n`
    m.release()

m = thread.allocate_lock()
f = open("ga", "w+")
print "opened"
m.acquire()
thread.start_new_thread(read, (f,))
n = 0
x = "x" * 113 + "\n"
while n < 1000000:
    f.write(x)
    print "w",
    n += len(x)
m.acquire()
print "done"

Typical run:

C:\Python20>\code\python\dist\src\pcbuild\python temp.py
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w r w r
w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w
r r w r w r w r w r w r

and then it dies in msvcrt.dll with a bad pointer.  Also dies under the
debugger (yay!) ... always dies like so:

+ We (Python) call the MS fwrite, from fileobject.c file_write.
+ MS fwrite succeeds with its _lock_str(stream) call.
+ MS fwrite then calls MS _fwrite_lk.
+ MS _fwrite_lk calls memcpy, which blows up for a non-obvious reason.

Looks like the stream's _cnt member has gone mildly negative, which
_fwrite_lk casts to unsigned and so treats like a giant positive count, and
so memcpy eventually runs off the end of the process address space.
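
The arithmetic behind that can be sketched in Python, assuming a 32-bit
unsigned type as on Win32.  The value -3 is only an illustrative "mildly
negative" count, not a number taken from the debugger session.

```python
import ctypes

cnt = -3  # hypothetical small negative buffer count

# Reinterpreting the same bits as an unsigned 32-bit integer:
as_unsigned = ctypes.c_uint32(cnt).value

# The same result via masking to 32 bits:
masked = cnt & 0xFFFFFFFF
```

Both expressions give 4294967293, i.e. just under 4 GB, which is why a
memcpy driven by that count walks off the end of the address space.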

Only thing I can conclude from this is that MS's internal stream-locking
implementation is buggy.  At least on W98SE.  Other flavors of Windows?
Other platforms?

Note that I don't claim the program above is *sensible*, just that it
shouldn't blow up.  Alas, short of indeed adding a separate mutex in Python
file objects-- or writing our own stdio --I don't believe I can fix this.

the-best-thing-to-do-with-threads-is-don't-ly y'rs  - tim




From fdrake at acm.org  Tue Jan  9 23:58:49 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 9 Jan 2001 17:58:49 -0500 (EST)
Subject: [Python-Dev] Updated development documentation
Message-ID: <14939.38825.218757.535010@cj42289-a.reston1.va.home.com>

  I've just updated the development version of the documentation, but
am not sure the automated notice got sent.
  This version contains a wide variety of smaller updates, plus added
documentation on the fpectl and xreadlines modules.


        http://python.sourceforge.net/devel-docs/


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From MarkH at ActiveState.com  Wed Jan 10 01:00:03 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Tue, 9 Jan 2001 16:00:03 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>

> Only thing I can conclude from this is that MS's internal stream-locking
> implementation is buggy.  At least on W98SE.  Other flavors of Windows?
> Other platforms?

Same behaviour on Win2k for me.

Mark.



From tim.one at home.com  Wed Jan 10 01:55:11 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 19:55:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com>

Final report (I've spent way more time on this than I can afford already, so
it's "final" by defn <0.3 wink>).  We started here (on my Win98SE box, using
Guido's test program):

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

Here's where we are today:

total 117615824 chars and 3237568 lines
count_chars_lines    14.670 14.667
readlines_sizehint    9.500  9.506
using_fileinput      28.670 28.708
while_readline       13.680 13.676
for_xreadlines        7.630  7.635

Same box, same input file, same test program except for this addition:

def for_xreadlines(fn):
    f = open(fn, MODE)
    for line in xreadlines.xreadlines(f):
        pass
    f.close()

This last is within 25% of Perl "while (<>)" speed, but-- unlike Perl --is
thread-safe.  Good show!  The other speedups are nothing to snort at either.

The strangest thing left to my eye is why xreadlines enjoys a significant
advantage over the double-loop buffering method (readlines_sizehint) on my
box; reducing the very large (1Mb) buffer in Guido's test program made no
material difference to that.
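
For reference, the two loop shapes being compared can be sketched as
follows, assuming modern Python 3, where iterating over the file object
directly stands in for xreadlines.xreadlines(f).  The 8192-byte size hint
and the in-memory test data are arbitrary choices for the sketch, not the
values from Guido's test program.

```python
import io


def readlines_sizehint(f, sizehint=8192):
    """Double-loop buffering: fetch a batch of lines, then loop over it."""
    count = 0
    while True:
        batch = f.readlines(sizehint)
        if not batch:
            break
        for line in batch:
            count += 1
    return count


def iterate_lines(f):
    """Lazy line-at-a-time iteration over the file object."""
    count = 0
    for line in f:
        count += 1
    return count


data = "".join("line %d\n" % i for i in range(10000))
n_batch = readlines_sizehint(io.StringIO(data))
n_iter = iterate_lines(io.StringIO(data))
```

Both functions visit the same lines; they differ only in how much is
buffered per call, which is what the timings above are measuring.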

nothing's-ever-finished-but-everything-ends-ly y'rs  - tim




From tim.one at home.com  Wed Jan 10 06:46:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 00:46:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFIHAA.tim.one@home.com>

[Tim]
> Only thing I can conclude from this is that MS's internal stream-
> locking implementation is buggy.  At least on W98SE.  Other flavors
> of Windows?  Other platforms?

[Mark Hammond]
> Same behaviour on Win2k for me.

Thanks, Mark!  I opened a bug on SF to record more clues:

http://sourceforge.net/bugs/?func=detailbug&bug_id=128210&group_id=5470

I didn't assign it to anyone because-- best I can tell --there's nothing
realistic we can do about it.  Probably won't happen in practice anyway
<wink>.

there's-a-reason-thread-problems-pop-up-on-windows-first-but-
    ms-isn't-it-ly y'rs  - tim




From billtut at microsoft.com  Wed Jan 10 10:10:51 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Wed, 10 Jan 2001 01:10:51 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com>

With a nice simple C test case from Tim, I've submitted this one to internal
support.
I'll let everybody know what happens when I know more.

Bill




From m.favas at per.dem.csiro.au  Wed Jan 10 12:57:56 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Wed, 10 Jan 2001 19:57:56 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5C4E44.23B593E9@per.dem.csiro.au>

Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
behaviour as Tim's WinBox wrt the new xreadline and the double-loop
readlines (so it's not just something funny with MS (not that there's
not anything funny with MS...)):

total 131426612 chars and 514216 lines
count_chars_lines     5.450  5.066
readlines_sizehint    4.112  4.083
using_fileinput      10.928 10.916
while_readline       11.766 11.733
for_xreadlines        3.569  3.533

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From tismer at tismer.com  Wed Jan 10 12:06:42 2001
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 10 Jan 2001 13:06:42 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>
Message-ID: <3A5C4242.E445C3A1@tismer.com>


Ka-Ping Yee wrote:
> 
> On Mon, 8 Jan 2001, Skip Montanaro wrote:
> > Okay, how about this as a compromise first step?  Allow programmers to put
> > __exports__ lists in their modules but don't do anything with them *except*
> > modify dir() to respect that if it exists?
> 
> I'd say: Just have dir() and import * pay attention to __exports__.
> Don't mess with getattr or __dict__.

quadruple-nodd - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Wed Jan 10 14:21:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 14:21:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C61D8.2E5D098C@lemburg.com>

Guido van Rossum wrote:
> 
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Can't we use the existing attribute __all__ (this is currently
only used for packages) for this kind of thing? As others have already
remarked: I would rather like to see this attribute being used
as basis for 'from M import *' rather than enforce the access
restrictions like the patch suggests.
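
A small sketch of that alternative, assuming a Python version in which
__all__ governs 'from M import *' for ordinary modules (as later releases
do).  The module demo_exports is built in memory purely for illustration.

```python
import sys
import types

# Build a throwaway module with one exported and one unexported name.
mod = types.ModuleType("demo_exports")
exec("__all__ = ['public']\npublic = 1\nhidden = 2\n", mod.__dict__)
sys.modules["demo_exports"] = mod

# 'import *' copies only the names listed in __all__ ...
ns = {}
exec("from demo_exports import *", ns)
star_names = sorted(n for n in ns if not n.startswith("__"))

# ... while explicit attribute access is not restricted at all.
direct = mod.hidden
```

Here star_names contains only 'public', yet mod.hidden is still reachable,
so there is no per-lookup cost and no enforced access restriction.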

Access control mechanisms should be treated in different ways
such as wrapping objects using access-control proxies (see mx.Proxy
for an example of such an implementation) and on-demand only.
I wouldn't want to pay the performance hit for each and every
lookup in all my Python applications just because someone out
there feels that "from M import *" has a meaning in life
apart from being useful in interactive sessions to ease typing ;-)
 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.

Again, I'd rather see these implemented using different
techniques which are under programmer control and made
explicit and visible in the program flow. Proxies are ideal
for these things, since they allow great flexibility while
still providing reasonable security at Python level.
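
A rough sketch of the idea in plain Python.  AccessProxy and Account are
hypothetical names; this is not the mx.Proxy API, only an illustration of
wrapping an object so that access control is explicit and opt-in.

```python
class AccessProxy:
    """Hypothetical access-control proxy; not the mx.Proxy API."""

    def __init__(self, target, allowed):
        # Bypass our own __setattr__ while wiring up the proxy.
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_allowed", frozenset(allowed))

    def __getattr__(self, name):
        # Called only for names not found on the proxy itself.
        if name in self._allowed:
            return getattr(self._target, name)
        raise AttributeError("access to %r denied" % name)

    def __setattr__(self, name, value):
        raise AttributeError("proxy is read-only")


class Account:
    def __init__(self):
        self.balance = 100
        self.pin = "1234"


proxy = AccessProxy(Account(), allowed=["balance"])
```

Reading proxy.balance works, while proxy.pin and any assignment raise
AttributeError.  A pure-Python wrapper like this is only a convention (the
wrapped object is still reachable through proxy._target); real protection
needs the proxy implemented at the C level.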

I have been using the proxy approach for years now and 
so far with great success. What's even better is that
weak references and garbage finalization aids come along with
it for free.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Wed Jan 10 16:12:56 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 10:12:56 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 19:55:11 EST."
             <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com> 
Message-ID: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>

> The strangest thing left to my eye is why xreadlines enjoys a significant
> advantage over the double-loop buffering method (readlines_sizehint) on my
> box; reducing the very large (1Mb) buffer in Guido's test program made no
> material difference to that.

I was baffled at this too (same difference on my box), until I
discovered that the buffer size is specified *twice*: once as a
default in the arg list of readlines_sizehint(), then *again* in the
call to timer() near the bottom of the file.

Take the latter one out and the times are comparable, in fact
readlines_sizehint() is a few percent quicker.
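
The effect can be sketched like this.  The function bodies, the 8K default
and the 1Mb override are assumptions made for the sketch; only the names
readlines_sizehint and timer come from the test program.

```python
def readlines_sizehint(lines, sizehint=8 * 1024):
    """Hypothetical stand-in: return the size hint actually in effect."""
    return sizehint


def timer(func, *args):
    # Forwards whatever the caller passes, so an explicit argument
    # silently replaces the default in func's signature.
    return func(*args)


with_override = timer(readlines_sizehint, [], 1024 * 1024)
with_default = timer(readlines_sizehint, [])
```

Changing the default in the argument list has no effect as long as the
timer() call keeps passing its own value, which is why tuning the default
appeared to make no difference.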

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jim at interet.com  Wed Jan 10 16:19:01 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 10 Jan 2001 10:19:01 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>
Message-ID: <3A5C7D65.780065C6@interet.com>

Mark Hammond wrote:

> Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
> console window.  pythonw is used in this case.  COM uses pythonw.exe in just
> this way, and when executed by DCOM, it will be executed in a context where
> the user can not see any such dialog.
> 
> However, I would be happy to ensure the correct command-line is used to
> prevent this behaviour in this case.
> 
> Indeed, in _every_ case I use pythonw.exe I would disable this - but I
> accept that other users have simpler requirements.

It would be easier to have a pythonw2.exe where this feature is
built in, rather than a command line option.  But see below.
 
> > I do not view winstdout as a "newbie" feature, but rather a
> > generally useful C-language addition to Python.
> 
> Hrm.  I dont believe a commercial app, for example, would find this
> suitable - they would roll their own solution.
...
> > I guess I am saying, perhaps incorrectly, that the mechanism provided
> > will make further redirection of sys.stdout unnecessary 99% of the
> > time.
> 
> Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
...
> > If someone out there knows of a different example of sys.stdout
> > redirection in use in the real world, it would be helpful if
> > they would describe it.  Maybe it could be incorporated.
> 
> Sure.  Komodo to a file with a friendly dialog (sometimes ;-).
...
> I don't believe I have worked on 2 projects with the same requirement
> here!!!

Well, that is the problem.  Is this feature "generally useful"?
I am writing Windows programs in which Python is the "main"
and provides the GUI, so I find this useful.  And I do show
my users tracebacks.  But perhaps this is unique to me.  I
don't see users of wxPython nor tkinter replying "great idea"
so maybe they don't use pythonw.

Absent more support, I don't think this idea has enough
merit to justify a patch.

JimA



From guido at python.org  Wed Jan 10 17:39:34 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:39:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 01:10:51 PST."
             <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com> 
References: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com> 
Message-ID: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>

> With a nice simple C test case from Tim, I've submitted this one to internal
> support.
> I'll let everybody know what happens when I know more.

I bet you it's rejected on the basis of "the docs tell you not to mix
reading and writing on the same stream without intervening seek or
flush."  If I were on the support line I would do that.
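
A minimal sketch of what that rule asks for, written in modern Python 3 on
a file opened in "w+" mode.  Python's own I/O layer handles the switch
itself these days, so the explicit seek() calls are there to show the shape
of the C-level requirement rather than because Python needs them.

```python
import os
import tempfile

line = "x" * 113 + "\n"
with tempfile.TemporaryFile("w+") as f:
    f.write(line)
    f.seek(0)                # reposition before switching to reading
    got = f.readline()
    f.seek(0, os.SEEK_END)   # reposition again before switching to writing
    f.write(line)
    f.seek(0)
    total = len(f.read())
```

The crashing program above has no such repositioning between the writer's
f.write() and the reader thread's f.readline() on the same stream.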

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan 10 17:38:16 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:38:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 16:12:42 EST."
             <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com> 
Message-ID: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>

> [Guido]
> > I'm much more confident about the getc_unlocked() approach than about
> > fgets() -- with the latter we need much more faith in the C library
> > implementers.  (E.g. that fgets() never writes beyond the null bytes
> > it promises, and that it locks/unlocks only once.)  Also, you're
> > relying on blindingly fast memchr() and memset() implementations.

[Tim]
> Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
> bit quicker on Solaris, despite that it's paying an extra layer of function
> call per line, to keep it out of get_line proper).  That tells me the
> assumptions are indeed mild.  The business about not writing beyond the null
> byte is a concern only I would have raised:  the possibility is an
> aggressively paranoid reading of the std (I do *lots* of things with libc
> I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
> things, it's hard to imagine any other vendor exploding ...
> 
> Still, I'd rather get rid of ms_getline_hack if I could, because the code is
> so much more complicated.

Which is another argument to prefer the getc_unlocked() code when it
works -- it's obviously correct. :-)

> >> Both methods lack a refinement I would like to see, but can't
> >> achieve in "the Windows way":  ensure that consistency is on no
> >> worse than a per-line basis.  [Example omitted]
> 
> > The only portable way to ensure this that I can see, is to have a
> > separate mutex in the Python file object.  Since this is hardly a
> > common thing to do, I think it's better to let the application manage
> > that lock if they need it.
> 
> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
> file locked until the line was complete, and I wouldn't be opposed to making
> life saner on platforms that allow it.

Hm...  That would be possible, except for one unfortunate detail:
_PyString_Resize() may call PyErr_BadInternalCall() which touches
thread state.

> But there's another problem here:
> part of the reason we release Python threads around the fgets is in case
> some other thread is trying to write the data we're trying to read, yes?

NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
are locking against at all.  (As you've found out, it doesn't even
work.)  We're only trying to protect against concurrent *reads*.

> But since FLOCKFILE is in effect, other threads *trying* to write to the
> stream we're reading will get blocked anyway.  Seems to give us potential
> for deadlocks.

Only if they are holding other locks at the same time.  I haven't done
a thorough survey of fileobject.c, but I've skimmed it, and I believe it's
religious about releasing the Global Interpreter Lock around I/O
calls.  But, of course, 3rd party C code might not be.

> > (Then why are we bothering with flockfile(), you may ask?
> 
> I wouldn't ask that, no <wink>.
> 
> > Because otherwise, accidental multithreaded reading from the same
> > file could cause core dumps.)
> 
> Ugh ... turns out that on my box I can provoke core dumps anyway, with this
> program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
> anything new):

Yeah.  But this is insane use -- see my comments on SF.  It's only
worth fixing because it could be used to intentionally crash Python --
but there are easier ways...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Wed Jan 10 17:41:47 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 10:41:47 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
Message-ID: <14940.37067.893679.750918@beluga.mojam.com>

I just noticed that the "Environment" options for Python on the SF site are
listed as

     Console (Text Based), Win32 (MS Windows), X11 Applications

Shouldn't something Macintosh-related be in that list as well?

Skip



From guido at python.org  Wed Jan 10 17:53:16 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:53:16 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 14:21:28 +0100."
             <3A5C61D8.2E5D098C@lemburg.com> 
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
            <3A5C61D8.2E5D098C@lemburg.com> 
Message-ID: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > 
> > Please have a look at this SF patch:
> > 
> > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > 
> > This implements control over which names defined in a module are
> > externally visible: if there's a variable __exports__ in the module,
> > it is a list of identifiers, and any access from outside the module to
> > names not in the list is disallowed.  This affects access using the
> > getattr and setattr protocols (which raise AttributeError for
> > disallowed names), as well as "from M import v" (which raises
> > ImportError).

[Marc-Andre]
> Can't we use the existing attribute __all__ (this is currently
> only used for packages) for this kind of thing? As others have already
> remarked: I would rather like to see this attribute being used
> as basis for 'from M import *' rather than enforce the access
> restrictions like the patch suggests.

Yes -- I came up with the same thought.

So here's a plan: somebody please submit a patch that does only one
thing: from...import * looks for __all__ and if it exists, imports
exactly those names.  No changes to dir(), or anything.
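
A minimal sketch of the proposed rule, assuming present-day Python 3
(where modules do honor __all__ for "from M import *") rather than the
2001 patch itself.  The module name demo_all_module and the helper
star_import are hypothetical, made up for illustration only:

```python
import sys
import types


def star_import(module_name):
    """Run 'from <module_name> import *' in a fresh namespace and return it."""
    namespace = {}
    exec("from %s import *" % module_name, namespace)
    # exec() adds __builtins__ to the namespace; drop it so only imports remain.
    namespace.pop("__builtins__", None)
    return namespace


# Build a throwaway module object (hypothetical name) with two attributes,
# only one of which is listed in __all__.
demo = types.ModuleType("demo_all_module")
demo.public_fn = lambda: "public"
demo.helper = lambda: "helper"
demo.__all__ = ["public_fn"]
sys.modules["demo_all_module"] = demo

names = star_import("demo_all_module")
print(sorted(names))
```

Only the name listed in __all__ lands in the importing namespace;
demo.helper is still reachable as an attribute, so this is a naming
convention for "import *" and not access control.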

> Access control mechanisms should be treated in different ways
> such as wrapping objects using access-control proxies (see mx.Proxy
> for an example of such an implementation) and on-demand only.
> I wouldn't want to pay the performance hit for each and every
> lookup in all my Python applications just because someone out
> there feels that "from M import *" has a meaning in life
> apart from being useful in interactive sessions to ease typing ;-)

In the process of looking into Zope internals I've noticed that
proxies are indeed very useful!

I note that the IMPORT opcodes in ceval.c require that the imported
module (as found in sys.modules[name] or returned by __import__()) is
a real module object.  I think this is unnecessary -- at least
IMPORT_FROM should work even if the module is a proxy or some other
thing (I've been known to smuggle class instances into sys.modules :-)
and IMPORT_STAR should work with a non-module at least if it has an
__all__ attribute.
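
For illustration, a sketch of the "class instance in sys.modules"
trick, assuming present-day Python 3, where "from M import v" works
through ordinary attribute lookup.  SettingsProxy and the module name
demo_settings_proxy are hypothetical and not taken from any real code:

```python
import sys


class SettingsProxy:
    """Hypothetical non-module object that answers attribute lookups."""

    def __init__(self, values):
        self._values = dict(values)

    def __getattr__(self, name):
        # Called only when normal lookup fails.  Unknown names must raise
        # AttributeError so the import machinery's probes (__spec__,
        # __path__, ...) behave as they would for a missing attribute.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


# Smuggle the instance into sys.modules under a made-up name.
sys.modules["demo_settings_proxy"] = SettingsProxy({"timeout": 30})

# The import statement finds the instance and uses getattr() on it.
from demo_settings_proxy import timeout

print(timeout)
```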

> > I like it.  This has been asked for many times.  Does anybody see a
> > reason why this should *not* be added?
> > 
> > Tim remarked that introducing this will prompt demands for a similar
> > feature on classes and instances, where it will be hard to implement
> > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > extra dictionary lookup for each use of "M.v") even when it is not
> > used, but for accessing module variables that's acceptable.  I'm not
> > so sure about instance variable references.
> 
> Again, I'd rather see these implemented using different
> techniques which are under programmer control and made
> explicit and visible in the program flow. Proxies are ideal
> for these things, since they allow great flexibility while
> still providing reasonable security at Python level.
> 
> I have been using the proxy approach for years now and 
> so far with great success. What's even better is that
> weak references and garbage finalization aids come along with
> it for free.

Agreed.  Which reminds me -- would you mind reviewing Fred's new
version of PEP 205 (weak refs)?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Jan 10 18:12:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 18:12:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C97F4.945D0C1@lemburg.com>

Guido van Rossum wrote:
> 
> > Guido van Rossum wrote:
> > >
> > > Please have a look at this SF patch:
> > >
> > > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > >
> > > This implements control over which names defined in a module are
> > > externally visible: if there's a variable __exports__ in the module,
> > > it is a list of identifiers, and any access from outside the module to
> > > names not in the list is disallowed.  This affects access using the
> > > getattr and setattr protocols (which raise AttributeError for
> > > disallowed names), as well as "from M import v" (which raises
> > > ImportError).
> 
> [Marc-Andre]
> > Can't we use the existing attribute __all__ (this is currently
> > only used for packages) for this kind of thing? As others have already
> > remarked: I would rather like to see this attribute being used
> > as basis for 'from M import *' rather than enforce the access
> > restrictions like the patch suggests.
> 
> Yes -- I came up with the same thought.

Sorry, I didn't read the whole thread on the topic. Rereading the
above paragraph I guess I should have had some more coffee at the
time of writing ;-)
 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

+1 -- this won't be me though (at least not this week).
 
> > Access control mechanisms should be treated in different ways
> > such as wrapping objects using access-control proxies (see mx.Proxy
> > for an example of such an implementation) and on-demand only.
> > I wouldn't want to pay the performance hit for each and every
> > lookup in all my Python applications just because someone out
> > there feels that "from M import *" has a meaning in life
> > apart from being useful in interactive sessions to ease typing ;-)
> 
> In the process of looking into Zope internals I've noticed that
> proxies are indeed very useful!
> 
> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Cool.  This could make Python instances usable as "modules"
-- with full getattr() hook support !

For IMPORT_STAR I'd suggest first looking for __all__ and
then reverting to __dict__.items() in case this fails. 

BTW, is __dict__ needed by the import mechanism or would
the getattr/setattr slots suffice ? And if yes, must it
be a real Python dictionary ?
 
> > > I like it.  This has been asked for many times.  Does anybody see a
> > > reason why this should *not* be added?
> > >
> > > Tim remarked that introducing this will prompt demands for a similar
> > > feature on classes and instances, where it will be hard to implement
> > > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > > extra dictionary lookup for each use of "M.v") even when it is not
> > > used, but for accessing module variables that's acceptable.  I'm not
> > > so sure about instance variable references.
> >
> > Again, I'd rather see these implemented using different
> > techniques which are under programmer control and made
> > explicit and visible in the program flow. Proxies are ideal
> > for these things, since they allow great flexibility while
> > still providing reasonable security at Python level.
> >
> > I have been using the proxy approach for years now and
> > so far with great success. What's even better is that
> > weak references and garbage finalization aids come along with
> > it for free.
> 
> Agreed.  Which reminds me -- would you mind reviewing Fred's new
> version of PEP 205 (weak refs)?

I'll have a look at it next week. Is that OK ?
 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Wed Jan 10 18:37:58 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 10 Jan 2001 12:37:58 -0500 (EST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.37067.893679.750918@beluga.mojam.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
Message-ID: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > I just noticed that the "Environment" options for Python on the SF site are
 > listed as
 > 
 >      Console (Text Based), Win32 (MS Windows), X11 Applications
 > 
 > Shouldn't something Macintosh-related be in that list as well?

  Are the maintainers of the MacOS port using the SF bug tracker or
something else?  If they're using it, then by all means we should add
it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From thomas at xs4all.net  Wed Jan 10 19:06:06 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:06:06 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Tue, Jan 09, 2001 at 01:46:53PM -0800
References: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010110190606.T2467@xs4all.nl>

On Tue, Jan 09, 2001 at 01:46:53PM -0800, Guido van Rossum wrote:

> static void
> xreadlines_dealloc(PyXReadlinesObject *op) {
> 	Py_XDECREF(op->file);
> 	Py_XDECREF(op->lines);
> 	PyObject_DEL(op);
> }

I'm confuzzled. Is this breach of the style guidelines intentional,
accidental, or just not cared enough about ? The style isn't even consistent
in that single module!

> void
> initxreadlines(void)
> {
> 	PyObject *m;
> 
> 	m = Py_InitModule("xreadlines", xreadlines_methods);
> }


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Wed Jan 10 19:11:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 12:11:52 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
	<14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
Message-ID: <14940.42472.174920.866172@beluga.mojam.com>

    Fred> Are the maintainers of the MacOS port using the SF bug tracker or
    Fred> something else?  If they're using it, then by all means we should
    Fred> add it.

Even if they aren't, I think it would be valuable to list.  There aren't all
that many tools (open source or otherwise) that run on Unix, Windows and Mac
and can be used as either a console app or a GUI.

I assume the reason Fred asks is that the Environment: list is generated
on-the-fly and somehow ties into use of the SF bug tracker.

Skip



From thomas at xs4all.net  Wed Jan 10 19:45:44 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:45:44 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 10, 2001 at 11:53:16AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010110194544.V2467@xs4all.nl>

On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:

> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
certain the expanding of IMPORT would make a lot of people very happy. Alex
Martelli only just discovered the fact you can populate sys.modules
yourself, with non-module objects, and was wondering about its legality and
compatibility.

I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
IMPORT_STAR case (try dict.items(), etc.)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Wed Jan 10 19:49:40 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 13:49:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com>

[Tim]
> The strangest thing left to my eye is why xreadlines enjoys a
> significant advantage over the double-loop buffering method
> (readlines_sizehint) on my box; reducing the very large
> (1Mb) buffer in Guido's test program made no material difference
> to that.

[Guido]
> I was baffled at this too (same difference on my box), until I
> discovered that the buffer size is specified *twice*: once as a
> default in the arg list of readlines_sizehint(), then *again* in
> the call to timer() near the bottom of the file.

Bingo!

> Take the latter one out and the times are comparable, in fact
> readlines_sizehint() is a few percent quicker.

They're indistinguishable then on my box (on one run xreadlines is .1
seconds  (out of around 7.6 total) quicker, on another readlines_sizehint),
*provided* that I specify the same buffer size (8192) that xreadlines uses
internally.  However, if I even double that, readlines_sizehint is uniformly
about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
size to 4096.
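
For readers who haven't seen it, the "double-loop buffering method"
being timed is roughly the following.  This is a sketch in present-day
Python 3, not the actual test program; the name iter_lines_sizehint is
made up, and the 8192 default only mirrors the buffer size mentioned
above:

```python
import io


def iter_lines_sizehint(f, sizehint=8192):
    """Yield lines from f, fetching them in batches via readlines(sizehint)."""
    while True:
        # Outer loop: ask for roughly sizehint characters' worth of lines.
        lines = f.readlines(sizehint)
        if not lines:
            break
        # Inner loop: hand out the buffered lines one at a time.
        for line in lines:
            yield line


sample = io.StringIO("alpha\nbeta\ngamma\n")
result = list(iter_lines_sizehint(sample, sizehint=4))
print(result)
```

The sizehint only changes how many lines each readlines() call
returns; the sequence of lines produced is the same for any value.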

I'm afraid Mysteries will remain no matter how many person-decades we spend
staring at this <0.5 wink> ...




From guido at python.org  Wed Jan 10 19:50:10 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 13:50:10 -0500
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: Your message of "Wed, 10 Jan 2001 10:41:47 CST."
             <14940.37067.893679.750918@beluga.mojam.com> 
References: <14940.37067.893679.750918@beluga.mojam.com> 
Message-ID: <200101101850.NAA29744@cj20424-a.reston1.va.home.com>

> I just noticed that the "Environment" options for Python on the SF site are
> listed as
> 
>      Console (Text Based), Win32 (MS Windows), X11 Applications
> 
> Shouldn't something Macintosh-related be in that list as well?

Yeah, except for two problems: :-)

(1) This is a selection from a drop-down menu that doesn't have a Mac
    option;

(2) There are only three slots allowed.

So this is the best we can do.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Wed Jan 10 19:53:32 2001
From: gstein at lyra.org (Greg Stein)
Date: Wed, 10 Jan 2001 10:53:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010110194544.V2467@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 10, 2001 at 07:45:44PM +0100
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com> <20010110194544.V2467@xs4all.nl>
Message-ID: <20010110105332.T4640@lyra.org>

On Wed, Jan 10, 2001 at 07:45:44PM +0100, Thomas Wouters wrote:
> On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:
> 
> > I note that the IMPORT opcodes in ceval.c require that the imported
> > module (as found in sys.modules[name] or returned by __import__()) is
> > a real module object.  I think this is unnecessary -- at least
> > IMPORT_FROM should work even if the module is a proxy or some other
> > thing (I've been known to smuggle class instances into sys.modules :-)
> > and IMPORT_STAR should work with a non-module at least if it has an
> > __all__ attribute.
> 
> Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
> certain the expanding of IMPORT would make a lot of people very happy. Alex
> Martelli only just discovered the fact you can populate sys.modules
> yourself, with non-module objects, and was wondering about its legality and
> compatibility.
> 
> I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
> IMPORT_STAR case (try dict.items(), etc.)

+1 ... I'm always up for removing type restrictions. Did that with the
bytecodes in function objects a while back.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From MarkH at ActiveState.com  Wed Jan 10 19:54:34 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 10 Jan 2001 10:54:34 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <20010110190606.T2467@xs4all.nl>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEGKCOAA.MarkH@ActiveState.com>

> I'm confuzzled. Is this breach of the style guidelines intentional,
> accidental, or just not cared enough about ?

I vote the latter!

Who-really-cares ly,

Mark.



From guido at python.org  Wed Jan 10 20:00:24 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:00:24 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 11:31:09 EST."
             <20010108113109.C7563@kronos.cnri.reston.va.us> 
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>  
            <20010108113109.C7563@kronos.cnri.reston.va.us> 
Message-ID: <200101101900.OAA30486@cj20424-a.reston1.va.home.com>

[me]
> >I expect Andrew's code to go in before 2.1 is released.  So I don't
> >see a reason why we should hurry and check in a stop-gap measure.

[Andrew]
> But it might not; the final version might be unacceptable or run into
> some intractable problem.  Assuming the patch is correct (I haven't
> looked at it), why not check it in?  The work has already been done to
> write it, after all.

OK, done.

It was more work than I had hoped for, because Eric apparently
(despite having developer privileges!) doesn't use the CVS tree -- he
sent in a diff relative to the 2.0 release.  I munged it into place,
adding the feature that readline, _curses and bsddb are built as
shared libraries by default.  You'd have to edit Setup.config.in to
change this.  Hope this doesn't break anybody's setup.  (Skip???)

Question for Eric: do you still want developer privileges?  They come
with responsibilities too.  Please check out the @#$%& CVS tree! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan 10 20:03:07 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:03:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 19:49:35 CST."
             <20010101194935.19672@falcon.inetnebr.com> 
References: <20010101194935.19672@falcon.inetnebr.com> 
Message-ID: <200101101903.OAA30522@cj20424-a.reston1.va.home.com>

Hi Jeff,

I'm glad to tell you that I've accepted your xreadlines patches.  It's
all checked into the CVS tree now, except for your patch to
fileinput.py, where I had already checked in a similar change using
readlines(sizehint) directly.

Thanks again for your contribution!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paulp at ActiveState.com  Wed Jan 10 21:08:31 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 10 Jan 2001 12:08:31 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5CC13F.DFB26A0B@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Why? From my point of view, the changes to dir() are much more
important. I seldom tell newbies about import * but I always tell them
how they can browse objects (especially modules) with dir. If dir() is
changed then IDEs and so forth would use that and inherit the right
behavior. If the module exporting behavior gets more sophisticated in a
future version of Python they will continue to inherit the behavior.

Also, dir() could look for an __all__ on all objects including "module
proxies", classes and "plain old instances". In other words we can
extend the convention to other objects "for free".
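
A rough sketch of what such a dir()-like convention could look like,
written as a plain helper and not as a change to dir() itself.  The
name public_names is hypothetical, and the fallback of hiding
underscore-prefixed names is an assumption, not part of the proposal:

```python
import types


def public_names(obj):
    """Return the names obj advertises: __all__ if present, else a filtered dir()."""
    exported = getattr(obj, "__all__", None)
    if exported is not None:
        return sorted(exported)
    # Assumed fallback: everything dir() reports that is not underscore-prefixed.
    return [name for name in dir(obj) if not name.startswith("_")]


# Works on any object, not only modules.
with_all = types.SimpleNamespace(__all__=["b", "a"], a=1, b=2, c=3)
without_all = types.SimpleNamespace(x=1, _y=2)

print(public_names(with_all))
print(public_names(without_all))
```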

 Paul



From tim.one at home.com  Wed Jan 10 21:25:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 15:25:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com>

[Tim]
>> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
>> to keep the file locked until the line was complete, and I
>> wouldn't be opposed to making life saner on platforms that allow it.

[Guido]
> Hm...  That would be possible, except for one unfortunate detail:
> _PyString_Resize() may call PyErr_BadInternalCall() which touches
> thread state.

FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
*exit* path thereafter.  We can block/unblock Python threads as often as
desired between those *file*-locking brackets.  The only thing the repeated
FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
for multiple readers to get partial lines of the file.
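
The locking shape can be sketched in Python as an analogy only: an
ordinary threading.Lock stands in for the platform stream lock, and
none of this is the fileobject.c code.  The names read_line_locked and
worker are made up.  The lock is taken once before the character loop
and released on every exit path, so concurrent readers each get whole
lines:

```python
import io
import threading


def read_line_locked(stream, lock):
    """Read one line a character at a time, holding lock for the whole line."""
    chars = []
    with lock:  # analogous to a single FLOCKFILE before the for(;;)
        while True:
            ch = stream.read(1)
            if not ch:       # end of input
                break
            chars.append(ch)
            if ch == "\n":   # line complete
                break
    # Leaving the with-block releases the lock on every exit path.
    return "".join(chars)


def worker(stream, lock, out):
    """Keep reading lines until the stream is exhausted."""
    while True:
        line = read_line_locked(stream, lock)
        if not line:
            return
        out.append(line)


expected = ["line %d\n" % i for i in range(200)]
stream = io.StringIO("".join(expected))
lock = threading.Lock()
collected = []

threads = [threading.Thread(target=worker, args=(stream, lock, collected))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Which thread gets which line is not deterministic, but no thread ever
sees a partial line.  Taking and releasing the lock around each
individual read(1) instead would allow exactly the interleaving
described above.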

> ...
> NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> are locking against at all.  (As you've found out, it doesn't even
> work.)

On Windows, yes, but that still seems to me to be a bug in MS's code.  If
anyone had reported a core dump on any other platform, I'd be more tractable
<wink> on this point.

> We're only trying to protect against concurrent *reads*.

As above, I believe that we could do a better job of that, then, on
platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
but also against .readline() not delivering an intact line from the file.

>> But since FLOCKFILE is in effect, other threads *trying* to write
>> to the stream we're reading will get blocked anyway.  Seems to give us
>> potential for deadlocks.

> Only if they are holding other locks at the same time.

I'm not being clear, then.  Thread X does f.readline(), on a
HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
the end of the stdio buffer, and does its platform's version of _filbuf.
_filbuf may wait (depending on the nature of the stream) for more input to
show up.  Simultaneously, thread Y attempts to write some data to f.  But
the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
waiting for Y to write data inside platform _filbuf, but Y is waiting for X
to release the platform stream lock inside some platform stream-output
routine (if I'm being clear now, Python locks have nothing to do with this
scenario:  it's the platform stream lock).

I think this is purely the user's fault if it happens.  Just pointing it out
as another insecurity we're probably not able to protect users from.

> ...
> Yeah.  But this is insane use -- see my comments on SF.  It's only
> worth fixing because it could be used to intentionally crash Python --
> but there are easier ways...

If it's unique to MS (as I suspect), I see no reason to even consider trying
to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.





From cgw at fnal.gov  Wed Jan 10 22:57:41 2001
From: cgw at fnal.gov (Charles G Waldman)
Date: Wed, 10 Jan 2001 15:57:41 -0600 (CST)
Subject: [Python-Dev] Interning filenames of imported modules
Message-ID: <14940.56021.646147.770080@buffalo.fnal.gov>

I have a question about the following code in compile.c:jcompile (line 3678)

		filename = PyString_InternFromString(sc.c_filename); 
		name = PyString_InternFromString(sc.c_name);

In the case of a long-running server which constantly imports modules,
this causes the interned string dict to grow without bound.  Is there
a strong reason that the filename needs to be interned?  How about the
module name?

How about some way to enforce a limit on the size of the interned
strings dictionary?
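
As background on what interning buys, here is a small sketch using
sys.intern from present-day Python 3 (an assumption; the code above is
the C-level PyString_InternFromString of the time).  The path is
invented for the example.  Two equal strings built separately map to
one shared object once interned:

```python
import sys

# Build two equal strings at run time so they start life as separate values.
first = "".join(["/tmp/", "mod", ".py"])
second = "".join(["/tmp/", "mod", ".py"])

interned_first = sys.intern(first)
interned_second = sys.intern(second)

# Equal strings intern to the very same object, so later comparisons can be
# done by identity; the cost is an entry in the interned-string table.
print(interned_first is interned_second)
```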




From mwh21 at cam.ac.uk  Wed Jan 10 23:02:49 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Wed, 10 Jan 2001 22:02:49 +0000 (GMT)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <Pine.SOL.4.21.0101102121460.10616-100000@red.csi.cam.ac.uk>

On Wed, 10 Jan 2001, Paul Prescod wrote:

> Guido van Rossum wrote:
> > 
> > ...
> > 
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Why? From my point of view, the changes to dir() are much more
> important. I seldom tell newbies about import * but I always tell them
> how they can browse objects (especially modules) with dir. If dir() is
> changed then IDEs and so forth would use that and inherit the right
> behavior. If the module exporting behavior gets more sophisticated in a
> future version of Python they will continue to inherit the behavior.

Changing dir would also make rlcompleter nicer - it's something of a pain
to use with a module that has, eg, "from TERMIOS import *"-ed.  This might
also make "from ... import *" less of a pariah...

Sounds good to me, IOW.

Cheers,
M.




From tim.one at home.com  Wed Jan 10 23:23:14 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 17:23:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com>

[Guido]
> I bet you it's rejected on the basis of "the docs tell you not to mix
> reading and writing on the same stream without intervening seek or
> flush."  If I were on the support line I would do that.

So would I if I were a typical first-line support idiot <wink>.  But the
*implementers*-- if they ever see it --should be very keen to figure out how
they managed to let the _iobuf get corrupted.  *I'm* not mucking with their
internals, nor doing wild pointer stores, nor anything else sneaky to
subvert their locking protection.  I wasn't even trying to break it.  The
only code reading from or storing into the _iobuf is theirs.  They're
ordinary stdio calls with ordinary arguments, and if *any* sequence of those
can cause internal corruption, they've almost certainly got a problem that
will manifest in other situations too.

Think like an implementer here <0.5 wink>:  they've lost track of how many
characters are in the buffer despite a locking scheme whose purpose is to
prevent that.  If it were my implementation, that would be a top-priority
bug no matter how silly the first program I saw that triggered it.

but-willing-to-let-them-decide-whether-they-care-ly y'rs  - tim




From skip at mojam.com  Wed Jan 10 23:52:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 16:52:55 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<3A5C61D8.2E5D098C@lemburg.com>
	<200101101653.LAA28986@cj20424-a.reston1.va.home.com>
	<3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <14940.59335.723701.574821@beluga.mojam.com>

    Paul> Also, dir() could look for an __all__ on all objects including
    Paul> "module proxies", classes and "plain old instances". In other
    Paul> words we can extend the convention to other objects "for free".

The __exports__/dir() patch I submitted will do this if you remove the
PyModule_Check that guards it.

Skip






From tim.one at home.com  Thu Jan 11 00:06:05 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 18:06:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5C4E44.23B593E9@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>

[Mark Favas]
> Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
> behaviour as Tim's WinBox wrt the new xreadline and the double-loop
> readlines (so it's not just something funny with MS (not that there's
> not anything funny with MS...)):
>
> total 131426612 chars and 514216 lines

You average over 255 chars/line?  Really?  What kind of file are you
reading?  I don't really want to measure the speed of line-at-a-time input
on binary files where "line" doesn't actually make sense <0.6 wink>.
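
The average comes straight from the two totals quoted above; a quick
check of the arithmetic, using nothing but those figures:

```python
# Totals reported by the timing script in the quoted message.
total_chars = 131426612
total_lines = 514216

average = total_chars / total_lines
print("%.1f chars/line" % average)
```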

> count_chars_lines     5.450  5.066
> readlines_sizehint    4.112  4.083
> using_fileinput      10.928 10.916
> while_readline       11.766 11.733
> for_xreadlines        3.569  3.533

Guido pointed out that his readlines_sizehint test forced use of a 1Mb
buffer (in the call, not only the default value).  For whatever reason, that
was significantly slower than using an 8Kb sizehint on my box.

Another oddity is that while_readline is slower than using_fileinput for
you.  From that I take it Python config does *not* #define

     HAVE_GETC_UNLOCKED

on your platform.  If that's true (or esp. if it's not!), would you do me a
favor?  Recompile fileobject.c with

     USE_MS_GETLINE_HACK

#define'd, try the timing test again (while_readline is the most interesting
test for this), and run the test_bufio.py std test to make sure you're
actually getting the right answers.

At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
whenever HAVE_GETC_UNLOCKED isn't available.  I'd be surprised if
ms_getline_hack failed to work correctly on any platform; a bigger unknown
(to me) is whether it will yield a speedup.  So far it yields a large
speedup on Windows, and looks like a speedup equal to getc_unlocked() yields
on Linux and Solaris.  Info on a platform from Mars (like Tru64 Unix <wink>)
would be valuable in deciding whether to boost +0.5.
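
For anyone wanting to repeat this kind of comparison, here is a minimal
sketch of the three reading styles being timed.  It is not the script
used in this thread: the function names merely echo the test names, the
sample file is synthetic, and it assumes a present-day Python 3, where
plain iteration over the file object stands in for xreadlines.

```python
import os
import tempfile
import time

def while_readline(path):
    """Count lines with one readline() call per line."""
    count = 0
    with open(path, "rb") as f:
        while True:
            line = f.readline()
            if not line:
                break
            count += 1
    return count

def readlines_sizehint(path, sizehint=8192):
    """Count lines by pulling batches with readlines(sizehint)."""
    count = 0
    with open(path, "rb") as f:
        while True:
            lines = f.readlines(sizehint)
            if not lines:
                break
            count += len(lines)
    return count

def for_file_iteration(path):
    """Count lines by iterating over the file object."""
    count = 0
    with open(path, "rb") as f:
        for line in f:
            count += 1
    return count

def make_sample(num_lines=20000, width=250):
    """Write a synthetic file of fixed-width lines; sizes are arbitrary."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for _ in range(num_lines):
            f.write(b"x" * width + b"\n")
    return path

path = make_sample()
results = {}
try:
    for func in (while_readline, readlines_sizehint, for_file_iteration):
        start = time.perf_counter()
        results[func.__name__] = func(path)
        elapsed = time.perf_counter() - start
        print("%-20s %8.3f" % (func.__name__, elapsed))
finally:
    os.remove(path)
```

The absolute numbers will depend on the platform and stdio layer, which
is the whole point of collecting data from several boxes.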

don't-want-your-python-to-run-slower-than-possible-if-possible-ly
    y'rs  - tim




From tismer at tismer.com  Wed Jan 10 23:38:57 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 11 Jan 2001 00:38:57 +0200
Subject: [Python-Dev] [Stackless] ANN: Sourcecode for Stackless Python 2.0
Message-ID: <3A5CE481.24A7656@tismer.com>

On Monday, Jan 8th, I spake

"""
Source code and an update to the website will become available in
the next days.
"""

Now, here it is, together with a slightly updated website,
which tries to mention all the people who are helping
or sponsoring me (yes, there are sponsors!).
If somebody feels ignored by me, let me know. I'm good at
making mistakes.

Let me also know if there are problems building the code,
or if there are *no* problems understanding the code.
I don't expect either :-)

There is nearly no support for Unix, but Stackless *should*
build on Unix as it did before without problems.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From nas at arctrix.com  Wed Jan 10 19:15:45 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 10:15:45 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 10, 2001 at 06:06:05PM -0500
References: <3A5C4E44.23B593E9@per.dem.csiro.au> <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>
Message-ID: <20010110101545.A21305@glacier.fnational.com>

On Wed, Jan 10, 2001 at 06:06:05PM -0500, Tim Peters wrote:
> At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
> whenever HAVE_GETC_UNLOCKED isn't available.

Leave it to the timbot to use floating point votes. :)

Compare ms_getline_hack to what Perl does in order to speed up IO.
I think it's worth maintaining that piece of relatively portable
code given the benefit.  If the code has to be maintained then it
might as well be used.  If we find a platform that breaks we can
always disable it before the final release.

  Neil



From m.favas at per.dem.csiro.au  Thu Jan 11 02:28:59 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 09:28:59 +0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A5D0C5B.162F624A@per.dem.csiro.au>

[Tim produces a warped threader that crashes on MS OS's]
>> ...
>> NO, NO NO!  Mixing reads and writes on the same stream wasn't what
>> we are locking against at all.  (As you've found out, it doesn't 
>> even work.)

>On Windows, yes, but that still seems to me to be a bug in MS's code.
>If anyone had reported a core dump on any other platform, I'd be more
>tractable <wink> on this point.

On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
'w's) to the screen (but no crashes). If I reduce the size of the loop
counters from 1000000 to 3000, I get the following output:
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
done

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From m.favas at per.dem.csiro.au  Thu Jan 11 04:40:18 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 11:40:18 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5D2B22.B8028AC@per.dem.csiro.au>

[Tim responded]
>>
>> total 131426612 chars and 514216 lines

>You average over 255 chars/line?  Really?  What kind of file are you
>reading?  I don't really want to measure the speed of line-at-a-time
>input on binary files where "line" doesn't actually make sense <0.6 wink>.

Real-life input, my boy! It's actually a syslog from my mailserver,
consisting mainly of sendmail log messages, and I have a current need to
process these things (MS Exchange, corrupted database, clobbered backup
tapes), so this thread came along at the right time...

>Guido pointed out that his readlines_sizehint test forced use of a 1Mb
>buffer (in the call, not only the default value).  For whatever
>reason, that was significantly slower than using an 8Kb sizehint on my
>box.

Removing the buffer size arg in the call to readlines_sizehint results
in this (using up-to-the-minute CVS):
total 131426612 chars and 514216 lines
count_chars_lines     4.922  4.916
readlines_sizehint    3.881  3.850
using_fileinput      10.371 10.366
while_readline       10.943 10.916
for_xreadlines        2.990  2.967

and with an 8Kb sizehint:
total 131426612 chars and 514216 lines
count_chars_lines     5.241  5.216
readlines_sizehint    2.917  2.900
using_fileinput      10.351 10.333
while_readline       10.990 10.983
for_xreadlines        2.877  2.867


>Another oddity is that while_readline is slower than using_fileinput
>for you.  From that I take it Python config does *not* #define
>
>     HAVE_GETC_UNLOCKED
>
>on your platform.  If that's true 

Nope, HAVE_GETC_UNLOCKED is indeed #define'd

>(or esp. if it's not!), would you do me a
>favor?  Recompile fileobject.c with
>
>     USE_MS_GETLINE_HACK
>
>#define'd, try the timing test again (while_readline is the most
>interesting test for this), and run the test_bufio.py std test to make
>sure you're actually getting the right answers.

Sure:
With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd (although
defining the former makes the latter def irrelevant):
(test_bufio also OK)
total 131426612 chars and 514216 lines
count_chars_lines     5.056  5.050
readlines_sizehint    3.771  3.667
using_fileinput      11.128 11.116
while_readline        8.287  8.233
for_xreadlines        3.090  3.083

With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just for
completeness):
total 131426612 chars and 514216 lines
count_chars_lines     4.916  4.900
readlines_sizehint    3.875  3.867
using_fileinput      14.404 14.383
while_readline       322.728 321.837
for_xreadlines        7.113  7.100

So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
<grin>

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From nas at arctrix.com  Wed Jan 10 22:55:23 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 13:55:23 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 11:40:18AM +0800
References: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <20010110135523.A21894@glacier.fnational.com>

On Thu, Jan 11, 2001 at 11:40:18AM +0800, Mark Favas wrote:
[with getc_unlocked]
> while_readline       10.943 10.916

[without]
> while_readline       322.728 321.837

Holy crap.  Great work team.

  Neil



From tim.one at home.com  Thu Jan 11 06:03:51 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 00:03:51 -0500
Subject: [Python-Dev] Baffled on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>

In version 2.26 of mmapmodule.c, Guido replaced (as part of a contributed
Cygwin patch):

#ifdef MS_WIN32
__declspec(dllexport) void
#endif /* MS_WIN32 */
#ifdef UNIX
extern void
#endif

by:

DL_EXPORT(void)

before initmmap.

1. Windows Python can no longer import mmap:

>>> import mmap
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (initmmap)
>>>

This is because GetProcAddress returns NULL.

2. Everything's fine if I revert Guido's change (although I assume that
breaks Cygwin then).

3. DL_EXPORT(void) expands to "void".

4. The way mmapmodule.c is coded and built after Guido's change appears to
me to be the same as how every other non-builtin module is coded and built
on Windows.  For example, winsound.c, which uses DL_EXPORT(void) before its
initwinsound and where that macro also expands to "void".  But importing
winsound works fine.

Since what I'm seeing makes no consistent sense, I'm at a loss how to fix
it.  But then I'm punch-drunk too <0.7 wink>.

Any Windows geek got a clue?




From tim.one at home.com  Thu Jan 11 07:10:40 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 01:10:40 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>

[Tim, to MarkF]
>> You average over 255 chars/line?  [nag, nag, nag]

[Mark Favas]
> Real-life input, my boy! It's actually a syslog from my
> mailserver, consisting mainly of sendmail log messages, and I
> have a current need to process these things (MS Exchange,
> corrupted database, clobbered backup tapes), so this thread
> came along at the right time...

Hmm.  I tuned ms_getline_hack for Guido's logfiles, which he said don't
often exceed 160 chars/line.  I guess if you're on a 64-bit platform,
though, it must take about twice as many chars per line to record a log msg
<wink>.

> ...
> Removing the buffer size arg in the call to readlines_sizehint results
> in this (using up-to-the-minute CVS):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.922  4.916
> readlines_sizehint    3.881  3.850
> using_fileinput      10.371 10.366
> while_readline       10.943 10.916
> for_xreadlines        2.990  2.967
>
> and with an 8Kb sizehint:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.241  5.216
> readlines_sizehint    2.917  2.900
> using_fileinput      10.351 10.333
> while_readline       10.990 10.983
> for_xreadlines        2.877  2.867

That's sure consistent across platforms, then.  I guess we'll write it off
to "cache effects" (a catch-all explanation for any timing mystery -- go
ahead, just *try* to prove it's wrong <0.5 wink>).

[and Mark has HAVE_GETC_UNLOCKED on his Tru64 Unix box, yet
 using_fileinput is quicker than while_readline]

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd
> (although defining the former makes the latter def irrelevant):
> (test_bufio also OK)
> total 131426612 chars and 514216 lines
> count_chars_lines     5.056  5.050
> readlines_sizehint    3.771  3.667
> using_fileinput      11.128 11.116
> while_readline        8.287  8.233
> for_xreadlines        3.090  3.083

So ms_getline_hack is significantly faster on your box (I'm only looking at
while_readline:  11 using getc_unlocked, 8.3 using ms_getline_hack).  There
are only two reasons I can imagine for that:

1. Your vendor optimizes the inner loop in fgets (as all vendors should, but
few do).

and/or

2. Despite the long average length of your lines, many of them are
nevertheless shorter than 200 chars, and so all the pain ms_getline_hack
endures to avoid a realloc pays off.

Unfortunately, there's not enough info to figure out if either, both, or
none of those are on-target.  It's such a large percentage speedup, though,
that my bet goes primarily to #1 -- unless realloc is really pig slow on
your box.  Which some things *are*:

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just
> for completeness):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.916  4.900
> readlines_sizehint    3.875  3.867
> using_fileinput      14.404 14.383
> while_readline       322.728 321.837
> for_xreadlines        7.113  7.100
>
> So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
> <grin>

Yes, that's the "platform from Mars" evidence I was seeking:  if
ms_getline_hack survives test_bufio on *your* crazy box, it's as close to
provably correct as any algorithm in all of Computer Science <wink>.
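
The ratios behind those remarks, worked out from the while_readline rows
quoted in this message (first timing column only; the variable names are
just labels for the three build configurations):

```python
# while_readline timings quoted above, first column of each table.
neither_defined = 322.728   # both macros #undef'ed
getc_unlocked = 10.943      # HAVE_GETC_UNLOCKED only
getline_hack = 8.287        # USE_MS_GETLINE_HACK #define'd

hack_vs_neither = neither_defined / getline_hack
hack_vs_getc = getc_unlocked / getline_hack
print("ms_getline_hack vs neither:       %.1fx" % hack_vs_neither)
print("ms_getline_hack vs getc_unlocked: %.2fx" % hack_vs_getc)
```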

a-factor-of-39-is-almost-big-enough-to-notice!-ly y'rs  - tim




From m.favas at per.dem.csiro.au  Thu Jan 11 08:26:37 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 15:26:37 +0800
Subject: [Python-Dev] Re: xreadline speed vs readlines_sizehint
References: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>
Message-ID: <3A5D602D.9DC991CB@per.dem.csiro.au>

[Tim speculates on getc_unlocked and his ms_getline_hack]:
> 
> So ms_getline_hack is significantly faster on your box (I'm only
> looking at while_readline:  11 using getc_unlocked, 8.3 using 
> ms_getline_hack).  There are only two reasons I can imagine for that:
> 
> 1. Your vendor optimizes the inner loop in fgets (as all vendors
> should, but few do).

Digital engineering, Compaq management/marketing <0.6 wink>
> 
> and/or
> 
> 2. Despite the long average length of your lines, many of them are
> nevertheless shorter than 200 chars, and so all the pain
> ms_getline_hack endures to avoid a realloc pays off.
> 
> Unfortunately, there's not enough info to figure out if either, both,
> or none of those are on-target.  It's such a large percentage
> speedup, though, that my bet goes primarily to #1 -- unless realloc
> is really pig slow on your box.

The lines range in length from 96 to 747 characters, with 11% @ 233, 17%
@ 252 and 52% @ 254 characters, so #1 looks promising - most lines are
long enough to trigger a realloc. Cranking up INITBUFSIZE in
ms_getline_hack to 260 from 200 improves things again, by another 25%: 
total 131426612 chars and 514216 lines
count_chars_lines     5.081  5.066
readlines_sizehint    3.743  3.717
using_fileinput      11.113 11.100
while_readline        6.100  6.083
for_xreadlines        3.027  3.033

Apart from the name <grin>, I like ms_getline_hack...

tho'-a-factor-of-100-makes-xreadlines-a-welcome-addition!-ly y'rs

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From m.favas at per.dem.csiro.au  Thu Jan 11 10:08:29 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 17:08:29 +0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
Message-ID: <3A5D780D.62D0F473@per.dem.csiro.au>

On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
sysmodule.c produces the following errors:

cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
sysmodule.o sysmodule.c
cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
        PyObject *o, *stdout;
----------------------^
cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
(undeclared)
        if (!PyArg_ParseTuple(args, "O:displayhook", &o))
------------------------------------------------------^
cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
an lvalue, but occurs in a context that requires one. (needlvalue)
        stdout = PySys_GetObject("stdout");
--------^
cc: Warning: sysmodule.c, line 98: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        if (PyFile_WriteObject(o, stdout, 0) != 0)
----------------------------------^
cc: Warning: sysmodule.c, line 100: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        PyFile_SoftSpace(stdout, 1);
-------------------------^

The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
(stdin and stderr also are similarly #define'd).

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From gstein at lyra.org  Thu Jan 11 10:18:44 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:18:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.216,2.217 sysmodule.c,2.80,2.81
In-Reply-To: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>; from moshez@users.sourceforge.net on Wed, Jan 10, 2001 at 09:41:29PM -0800
References: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010111011843.W4640@lyra.org>

On Wed, Jan 10, 2001 at 09:41:29PM -0800, Moshe Zadka wrote:
> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv21213/Python
> 
> Modified Files:
> 	ceval.c sysmodule.c
>...
> --- 1246,1269 ----
>   		case PRINT_EXPR:
>   			v = POP();
> ! 			w = PySys_GetObject("displayhook");
> ! 			if (w == NULL) {
> ! 				PyErr_SetString(PyExc_RuntimeError,
> ! 						"lost sys.displayhook");
> ! 				err = -1;
>   			}
> + 			if (err == 0) {
> + 				x = Py_BuildValue("(O)", v);
> + 				if (x == NULL)
> + 					err = -1;
> + 			}
> + 			if (err == 0) {
> + 				w = PyEval_CallObject(w, x);
> + 				if (w == NULL)
> + 					err = -1;
> + 			}
>   			Py_DECREF(v);
> + 			Py_XDECREF(x);

x was never initialized to NULL. In fact, the loop sets it to Py_None. If
you get an error in the initial "w" setup case, then you could erroneously
decref None.

Further, there is no DECREF for the CallObject result ("w"). But watch out:
you don't want to DECREF the PySys_GetObject result (that is a borrowed
reference).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Thu Jan 11 10:28:16 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:28:16 -0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
In-Reply-To: <3A5D780D.62D0F473@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 05:08:29PM +0800
References: <3A5D780D.62D0F473@per.dem.csiro.au>
Message-ID: <20010111012815.X4640@lyra.org>

You're quite right! I've checked in a change, renaming it to "outf".

Cheers,
-g

On Thu, Jan 11, 2001 at 05:08:29PM +0800, Mark Favas wrote:
> On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
> sysmodule.c produces the following errors:
> 
> cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
> sysmodule.o sysmodule.c
> cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
>         PyObject *o, *stdout;
> ----------------------^
> cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
> (undeclared)
>         if (!PyArg_ParseTuple(args, "O:displayhook", &o))
> ------------------------------------------------------^
> cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
> an lvalue, but occurs in a context that requires one. (needlvalue)
>         stdout = PySys_GetObject("stdout");
> --------^
> cc: Warning: sysmodule.c, line 98: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         if (PyFile_WriteObject(o, stdout, 0) != 0)
> ----------------------------------^
> cc: Warning: sysmodule.c, line 100: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         PyFile_SoftSpace(stdout, 1);
> -------------------------^
> 
> The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
> (stdin and stderr also are similarly #define'd).
> 
> -- 
> Mark Favas  -   m.favas at per.dem.csiro.au
> CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA

-- 
Greg Stein, http://www.lyra.org/



From skip at mojam.com  Thu Jan 11 15:13:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 08:13:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
References: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14941.49059.26189.733094@beluga.mojam.com>

    Moshe> * Did not DECREF result from displayhook function
    ...
    Moshe>   				w = PyEval_CallObject(w, x);
    Moshe> + 				Py_XDECREF(w);
    Moshe>   				if (w == NULL)
    ...


While it works, is it really kosher to test w's value after the DECREF?
Just seems like an odd construct to me.  I'm used to seeing the test
immediately after it's been set.

Skip






From guido at python.org  Thu Jan 11 15:44:58 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 09:44:58 -0500
Subject: [Python-Dev] Interning filenames of imported modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:57:41 CST."
             <14940.56021.646147.770080@buffalo.fnal.gov> 
References: <14940.56021.646147.770080@buffalo.fnal.gov> 
Message-ID: <200101111444.JAA14597@cj20424-a.reston1.va.home.com>

> I have a question about the following code in compile.c:jcompile (line 3678)
> 
> 		filename = PyString_InternFromString(sc.c_filename); 
> 		name = PyString_InternFromString(sc.c_name);
> 
> In the case of a long-running server which constantly imports modules,
> this causes the interned string dict to grow without bound.  Is there
> a strong reason that the filename needs to be interned?  How about the
> module name?

It's probably not *necessary* for the filename, but I know why I am
interning it: since a module typically contains a bunch of functions,
and each function has its own code object with a reference to the
filename, I'm trying to save memory (the filename is a C string
pointer in the "sc" structure, so it has to be turned into a Python
string when creating the code object).

The module name is used as an identifier elsewhere so will become
interned anyway.

> How about some way to enforce a limit on the size of the interned
> strings dictionary?

I've never thought of this -- but I suppose that a weak dictionary
could be used.  Fred's working on a PEP for weak references, so
there's a chance that we might use this eventually.

In the mean time, a possibility would be to provide a service function
that goes through the "interned" dictionary and looks for values with
a reference count of 1, and deletes them.  You could then explicitly
call this service function occasionally in your program.  I would let
it return a tuple: (number of values kept, number of values deleted).
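
The weak dictionary idea can be illustrated at the Python level.  This
is a sketch only, not how the interpreter's interned-string table
works: it assumes a Python that ships the weakref module, and the Name
class and intern_name helper are hypothetical names introduced for the
example (plain str objects cannot be weakly referenced, hence the
subclass).

```python
import gc
import weakref

class Name(str):
    """str subclass, so that instances can be weakly referenced."""

# Values are held weakly: an entry vanishes once nothing else uses it.
interned = weakref.WeakValueDictionary()

def intern_name(text):
    """Return the shared Name for text, creating it on first use."""
    existing = interned.get(text)
    if existing is not None:
        return existing
    name = Name(text)
    interned[text] = name
    return name

kept = intern_name("spam.py")
dropped = intern_name("eggs.py")
assert intern_name("spam.py") is kept   # second lookup shares the object

del dropped          # last strong reference to "eggs.py" goes away
gc.collect()         # not needed on CPython, harmless elsewhere
remaining = sorted(interned.keys())
print(remaining)
```

With a table like this no explicit cleanup call is needed; the service
function described above is the manual equivalent for a table that
holds strong references.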

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 16:08:48 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:08:48 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 13:49:40 EST."
             <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com> 
Message-ID: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>

> They're indistinguishable then on my box (on one run xreadlines is .1
> seconds  (out of around 7.6 total) quicker, on another readlines_sizehint),
> *provided* that I specify the same buffer size (8192) that xreadlines uses
> internally.  However, if I even double that, readlines_sizehint is uniformly
> about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
> size to 4096.
> 
> I'm afraid Mysteries will remain no matter how many person-decades we spend
> staring at this <0.5 wink> ...

8192 happens to be the size of the stack-allocated buffer readlines()
uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
SMALLCHUNK in fileobject.c.

Would it make sense to tie the two constants together more to tune
this optimally even when BUFSIZ is different?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Thu Jan 11 16:09:54 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 11 Jan 2001 10:09:54 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
References: <200101080416.f084GrM10912@snark.thyrsus.com>
	<20010108074411.N2467@xs4all.nl>
	<20010108014945.A19516@thyrsus.com>
	<200101081427.JAA03146@cj20424-a.reston1.va.home.com>
	<20010108113109.C7563@kronos.cnri.reston.va.us>
	<200101101900.OAA30486@cj20424-a.reston1.va.home.com>
Message-ID: <14941.52418.18484.898061@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> It was more work than I had hoped for, because Eric
    GvR> apparently (despite having developer privileges!) doesn't use
    GvR> the CVS tree -- he sent in a diff relative to the 2.0
    GvR> release.  I munged it into place, adding the feature that
    GvR> readline, _curses and bsdddb are built as shared libraries by
    GvR> default.  You'd have to edit Setup.config.in to change this.
    GvR> Hope this doesn't break anybody's setup.  (Skip???)

We may need to move dbm module to Setup.config from Setup and build it
shared too.  The problem I ran into when building the pybsddb3 module
was that even though I'd built the standard bsddb shared, I was also
building in dbm statically.  This pulled in a dependency to the old
db.so module (under RH6.1) and core dumped me during the test suite
for pybsddb.  Commenting out dbm did the trick, so building it shared
should work too.

Couple of things: dbm isn't enabled by default I believe so moving it
to Setup.config may not be the right thing after all (would that imply
an autoconf test and auto-enabling if it's detected?)  Also, Andrew's
distutils-based build procedure may obviate the need for this change.

-Barry




From ping at lfw.org  Thu Jan 11 16:14:17 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 07:14:17 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>

On Wed, 10 Jan 2001, Guido van Rossum wrote:
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Please don't use __all__.  At the moment, __all__ is the only way
to easily tell whether a particular module object really represents
a package, and the only way to get the list of submodule names.

If __all__ is overloaded to also represent exportable symbols in
modules, these two pieces of information will be impossible (or
require much ugly hackery) to obtain.


-- ?!ng




From guido at python.org  Thu Jan 11 16:23:26 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:23:26 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:25:24 EST."
             <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com> 
Message-ID: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>

> [Tim]
> >> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
> >> to keep the file locked until the line was complete, and I
> >> wouldn't be opposed to making life saner on platforms that allow it.
> 
> [Guido]
> > Hm...  That would be possible, except for one unfortunate detail:
> > _PyString_Resize() may call PyErr_BadInternalCall() which touches
> > thread state.

[Tim]
> FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
> IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
> *exit* path thereafter.  We can block/unblock Python threads as often as
> desired between those *file*-locking brackets.  The only thing the repeated
> FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
> for multiple readers to get partial lines of the file.

I don't want to call FLOCKFILE while holding the Python lock, as this
means that *if* we're blocked in FLOCKFILE (e.g. we're reading from a
pipe or socket), no other Python thread can run!

> > ...
> > NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> > are locking against at all.  (As you've found out, it doesn't even
> > work.)
> 
> On Windows, yes, but that still seems to me to be a bug in MS's code.  If
> anyone had reported a core dump on any other platform, I'd be more tractable
> <wink> on this point.

Yes, it's a Windows bug.

> > We're only trying to protect against concurrent *reads*.
> 
> As above, I believe that we could do a better job of that, then, on
> platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
> but also against .readline() not delivering an intact line from the file.

See above for a reason why I think that's not safe.  I think that
applications that want to do this can do their own locking.  (They'll
find out soon enough that readline() isn't atomic. :-)
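
A minimal sketch of what "their own locking" could look like at the
application level, written for present-day Python 3.  The LockedReader
class and collect_lines helper are hypothetical names, and io.StringIO
only stands in for a real file here:

```python
import io
import threading


class LockedReader:
    """Wrap a file object so each readline() call is serialized by one lock."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Only one thread at a time may pull a line from the underlying file,
        # so no caller can observe a partially consumed line.
        with self._lock:
            return self._file.readline()


def collect_lines(reader, out):
    """Read lines until EOF, appending each one to the caller's list."""
    while True:
        line = reader.readline()
        if not line:
            break
        out.append(line)


# Stand-in data source; a real program would wrap an open file instead.
source = io.StringIO("".join("line %d\n" % i for i in range(1000)))
reader = LockedReader(source)

results = [[] for _ in range(4)]
threads = [threading.Thread(target=collect_lines, args=(reader, r))
           for r in results]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_lines = sorted(line for r in results for line in r)
```

Each thread ends up with whole lines only; which thread gets which line
is still up to the scheduler.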

> >> But since FLOCKFILE is in effect, other threads *trying* to write
> >> to the stream we're reading will get blocked anyway.  Seems to give us
> >> potential for deadlocks.
> 
> > Only if they are holding other locks at the same time.
> 
> I'm not being clear, then.  Thread X does f.readline(), on a
> HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
> invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
> the end of the stdio buffer, and does its platform's version of _filbuf.
> _filbuf may wait (depending on the nature of the stream) for more input to
> show up.  Simultaneously, thread Y attempts to write some data to f.  But
> the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
> waiting for Y to write data inside platform _filbuf, but Y is waiting for X
> to release the platform stream lock inside some platform stream-output
> routine (if I'm being clear now, Python locks have nothing to do with this
> scenario:  it's the platform stream lock).

I don't think that _filbuf can possibly wait for another thread to
write data to the same stream object.  A single stream object doesn't
act like a pipe, even if it is open for simultaneous reading and
writing.  So if there's no more data in the file, _filbuf will simply
return with an EOF status, not wait for the data that the other thread
would write.

> I think this is purely the user's fault if it happens.  Just pointing it out
> as another insecurity we're probably not able to protect users from.

I don't think this can happen.

> > ...
> > Yeah.  But this is insane use -- see my comments on SF.  It's only
> > worth fixing because it could be used to intentionally crash Python --
> > but there are easier ways...
> 
> If it's unique to MS (as I suspect), I see no reason to even consider trying
> to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.

OK.  It's unique to MS.  So close the bug report with a "won't fix"
resolution.  There's no point in having bug reports remain open that
we know we can't fix.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 16:27:05 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:27:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 17:23:14 EST."
             <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com> 
Message-ID: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>

> Think like an implementer here <0.5 wink>:  they've lost track of how many
> characters are in the buffer despite a locking scheme whose purpose is to
> prevent that.  If it were my implementation, that would be a top-priority
> bug no matter how silly the first program I saw that triggered it.

The locking prevents concurrent threads accessing the stream.

But mixing reads and writes (without intervening fseek etc.) is
illegal use of the stream, and the C standard allows them to be lax
here, even if the program was single-threaded.

In other words: the locking is so good that it serializes the sequence
of reads and writes; but if the sequence of reads and writes is
illegal, they don't guarantee anything.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Thu Jan 11 16:28:23 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:28:23 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 09:28:59 +0800."
             <3A5D0C5B.162F624A@per.dem.csiro.au> 
References: <3A5D0C5B.162F624A@per.dem.csiro.au> 
Message-ID: <200101111528.KAA15021@cj20424-a.reston1.va.home.com>

> On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
> 'w's) to the screen (but no crashes).

Same here on Linux.

> If I reduce the size of the loop
> counters from 1000000 to 3000, I get the following output:
> opened
> w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
> done

I still get an infinite amount of 'r's.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan 11 16:28:21 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:28:21 +0100
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 11:13:26PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com> <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>
Message-ID: <20010111162820.W2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:13:26PM -0500, Tim Peters wrote:

> I'm curious about how it performs (relative to the getc_unlocked hack) on
> other platforms.  If you'd like to try that, just recompile fileobject.c
> with

>     USE_MS_GETLINE_HACK

> #define'd.  It should *work* on any platform with fgets() meeting the
> assumption.  The new test_bufio.py std test gives it a pretty good
> correctness workout, if you're worried about that.

FreeBSD seems to work fine. Speed is practically the same as without
USE_MS_GETLINE_HACK (but with HAVE_GETC_UNLOCKED), though still not quite
the same as before all this hackery :-) Not by much though. For most tests
it's smaller than the margin of error, though the difference is still as
much as 20, 30% for the while_readline test. When using a second thread
somewhere in the test, the difference vanishes further.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Thu Jan 11 16:33:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 16:33:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A5DD248.8EE0DF63@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> >
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.

But __all__ has to be user-defined, so I don't buy that argument.
Note that the only true way to recognize a package is by looking
for an attribute "__path__" since Python adds this for packages
only.
 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Again, __all__ is not automatically generated, so trusting it
doesn't get you very far. To be able to find subpackages you will
always have to apply some hackery (based on __path__) in order
to be sure. It would be better to add a helper function to
packages to query this kind of information -- the package usually
knows best where to look and what to look for.

Note that __all__ was explicitly invented to be used by
from package import * so I think it is the right choice here.
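
A small sketch of that behaviour as it works in present-day Python 3,
where "from module import *" honors __all__.  The module name
demo_exports and its attributes are made up for illustration; the module
is built in memory rather than loaded from disk:

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name and contents).
demo = types.ModuleType("demo_exports")
demo.__all__ = ["public_func"]
demo.public_func = lambda: "exported"
demo.helper = lambda: "not exported"
sys.modules["demo_exports"] = demo

# Run the star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_exports import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
```

Only the name listed in __all__ is copied; helper stays behind even
though it has no leading underscore.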

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Thu Jan 11 16:37:19 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 11 Jan 2001 10:37:19 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <14941.52418.18484.898061@anthem.wooz.org>; from barry@digicool.com on Thu, Jan 11, 2001 at 10:09:54AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108113109.C7563@kronos.cnri.reston.va.us> <200101101900.OAA30486@cj20424-a.reston1.va.home.com> <14941.52418.18484.898061@anthem.wooz.org>
Message-ID: <20010111103719.A7191@thyrsus.com>

GvR> It was more work than I had hoped for, because Eric
GvR> apparently (despite having developer privileges!) doesn't use
GvR> the CVS tree -- he sent in a diff relative to the 2.0
GvR> release.

I'm using the CVS tree now.  I did that patch relative to 2.0 for
boring reasons having to do with the state of my laptop.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a
Gestapo preying upon defenseless citizens.
	-- Senator Edward V. Long



From thomas at xs4all.net  Thu Jan 11 16:48:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:48:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5DD248.8EE0DF63@lemburg.com>; from mal@lemburg.com on Thu, Jan 11, 2001 at 04:33:28PM +0100
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com>
Message-ID: <20010111164831.X2467@xs4all.nl>

On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:

> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> 
> But __all__ has to be user-defined, so I don't buy that argument.
> Note that the only true way to recognize a package is by looking
> for an attribute "__path__" since Python adds this for packages
> only.

Ehm.... What, exactly, prevents usercode from doing

__path__ = "neener, neener"

? In other words, even *that* isn't a true way to recognize a package. You
can see what isn't a package, but not what is.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan 11 16:58:55 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:58:55 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 07:14:17 PST."
             <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>

> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.
> 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Marc-Andre already explained that __all__ is not to be trusted.

If you want a reasonably good test for package-ness, use the presence
of __path__.

For a really good test, check whether __file__ ends in __init__.py[c].
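
A rough sketch of the two tests in present-day Python 3.  The function
names are hypothetical, and json and os are used only as convenient
examples of a package and a plain module:

```python
import json   # a package in the standard library
import os     # a plain module


def looks_like_package(module):
    """Reasonably good test: packages carry a __path__ attribute."""
    return hasattr(module, "__path__")


def file_says_package(module):
    """Stricter test: the module was loaded from an __init__ file."""
    filename = getattr(module, "__file__", None)
    if not filename:
        # Built-in modules, among others, have no usable __file__.
        return False
    base = os.path.basename(filename)
    # Accept __init__.py, __init__.pyc and similar suffixes.
    return os.path.splitext(base)[0] == "__init__"


json_is_package = looks_like_package(json) and file_says_package(json)
os_is_package = looks_like_package(os) or file_says_package(os)
```

The stricter test returns False whenever __file__ is missing, so it can
miss packages that were not loaded from an ordinary __init__ file.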

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Thu Jan 11 17:14:00 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 11:14:00 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
Message-ID: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>

I've put a new version of the setup.py script at
     http://www.mems-exchange.org/software/files/python/setup.py

(I'm at work and can't remember the password to get into
www.amk.ca. :) )

This version improves the detection of Tcl/Tk, handles the
_curses_panel module, and doesn't do a chdir().  Same drill as before:
just grab the script, drop it in the root of your Python source tree
(2.0 or current CVS), run "./python setup.py build", and look at the
modules it compiles.  I can try it on Linux, so I'm most interested in
hearing reports for other Unix versions (*BSD, HP-UX, etc.)

--amk





From ping at lfw.org  Thu Jan 11 17:36:36 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:36:36 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>

I'm pleased to announce a reasonable first pass at a documentation
utility for interactive use.  "pydoc" is usable in three ways:

1.  At the shell prompt, "pydoc <name>" displays documentation
    on <name>, very much like "man".

2.  At the shell prompt, "pydoc -k <keyword>" lists modules whose
    one-line descriptions mention the keyword, like "man -k".

3.  Within Python, "from pydoc import help" provides a "help"
    function to display documentation at the interpreter prompt.

All of them use sys.path in order to guarantee that the documentation
you see matches the modules you get.
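
For readers trying this today: a pydoc module ships in the present-day
standard library, and the sketch below assumes that module rather than
the script announced here, which may differ in detail.  It renders the
same kind of text shown in the examples further down, but into a string
instead of a pager:

```python
import pydoc

# Render documentation for the built-in abs() as plain text
# (no pager, no terminal bold sequences).
text = pydoc.render_doc(abs, renderer=pydoc.plaintext)
first_line = text.splitlines()[0]
```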

To try "pydoc", download:

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py
    http://www.lfw.org/python/textdoc.py
    http://www.lfw.org/python/inspect.py

I would very much appreciate your feedback, especially from testing
on non-Unix platforms.  Thank you!

I've pasted some examples from my shell below (when you actually
run pydoc, the output is piped through "less", "more", or a pager
implemented in Python, depending on what is available).



-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler



skuld[1268]% pydoc -k mail
mailbox - Classes to handle Unix style, MMDF style, and MH style mailboxes.
mailcap - Mailcap file handling.  See RFC 1524.
mimify - Mimification and unmimification of mail messages.
test.test_mailbox - (no description)

skuld[1269]% pydoc -k text
textdoc - Generate text documentation from live Python objects.
collab - Routines for collaboration, especially group editing of text documents.
gettext - Internationalization and localization support.
test.test_gettext - (no description)
curses.textpad - Simple textbox editing widget with Emacs-like keybindings.
distutils.text_file - text_file
ScrolledText - (no description)

skuld[1270]% pydoc -k html
htmldoc - Generate HTML documentation from live Python objects.
htmlentitydefs - HTML character entity references.
htmllib - HTML 2.0 parser.

skuld[1271]% pydoc md5

Python Library Documentation: built-in module md5

NAME
    md5

FILE
    (built-in)

DESCRIPTION
    This module implements the interface to RSA's MD5 message digest
    algorithm (see also Internet RFC 1321). Its use is quite
    straightforward: use the new() to create an md5 object. You can now
    feed this object with arbitrary strings using the update() method, and
    at any point you can ask it for the digest (a strong kind of 128-bit
    checksum, a.k.a. ``fingerprint'') of the concatenation of the strings
    fed to it so far using the digest() method.
    
    Functions:
    
    new([arg]) -- return a new md5 object, initialized with arg if provided
    md5([arg]) -- DEPRECATED, same as new, but for compatibility
    
    Special Objects:
    
    MD5Type -- type object for md5 objects

FUNCTIONS
    md5(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.
    
    new(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.

skuld[1272]% pydoc types

Python Library Documentation: module types

NAME
    types

FILE
    /home/ping/sw/Python-1.5.2/Lib/types.py

DESCRIPTION
    # Define names for all type symbols known in the standard interpreter.
    # Types that are part of optional modules (e.g. array) are not listed.

skuld[1273]% pydoc abs

Python Library Documentation: built-in function abs

abs (no arg info)
    abs(number) -> number
    
    Return the absolute value of the argument.

skuld[1274]% pydoc repr             

Python Library Documentation: built-in function repr

repr (no arg info)
    repr(object) -> string
    
    Return the canonical string representation of the object.
    For most object types, eval(repr(object)) == object.


Python Library Documentation: module repr

NAME
    repr - # Redo the `...` (representation) but with limits on most sizes.

FILE
    /home/ping/sw/Python-1.5.2/Lib/repr.py

CLASSES
    Repr
    
    class Repr
        __init__(self)
        
        repr(self, x)
        
        repr1(self, x, level)
        
        repr_dictionary(self, x, level)
        
        repr_instance(self, x, level)
        
        repr_list(self, x, level)
        
        repr_long_int(self, x, level)
        
        repr_string(self, x, level)
        
        repr_tuple(self, x, level)

FUNCTIONS
    repr(no arg info)

skuld[1275]% pydoc re.MatchObject

Python Library Documentation: class MatchObject in re

class MatchObject
    __init__(self, re, string, pos, endpos, regs)
    
    end(self, g=0)
        Return the end of the substring matched by group g
    
    group(self, *groups)
        Return one or more groups of the match
    
    groupdict(self, default=None)
        Return a dictionary containing all named subgroups of the match
    
    groups(self, default=None)
        Return a tuple containing all subgroups of the match object
    
    span(self, g=0)
        Return (start, end) of the substring matched by group g
    
    start(self, g=0)
        Return the start of the substring matched by group g

skuld[1276]% pydoc xml    

Python Library Documentation: package xml

NAME
    xml - Core XML support for Python.

FILE
    /home/ping/dev/python/dist/src/Lib/xml/__init__.py

DESCRIPTION
    This package contains three sub-packages:
    
    dom -- The W3C Document Object Model.  This supports DOM Level 1 +
           Namespaces.
    
    parsers -- Python wrappers for XML parsers (currently only supports Expat).
    
    sax -- The Simple API for XML, developed by XML-Dev, led by David
           Megginson and ported to Python by Lars Marius Garshol.  This
           supports the SAX 2 API.

VERSION
    1.8

skuld[1277]% pydoc lovelyspam
no Python documentation found for lovelyspam

skuld[1278]% python
Python 1.5.2 (#1, Dec 12 2000, 02:25:44)  [GCC egcs-2.91.66 19990314/Linux (egcs- on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>>          
>>> from pydoc import help
>>> help(int)
Help on built-in function int:

int (no arg info)
    int(x) -> integer
    
    Convert a string or number to an integer, if possible.
    A floating point argument will be truncated towards zero.

>>> help("urlparse.urljoin")
Help on function urljoin in module urlparse:

urljoin(base, url, allow_fragments=1)
    # Join a base URL and a possibly relative URL to form an absolute
    # interpretation of the latter.
>>> import random
>>> help(random.generator)
Help on class generator in module random:

class generator(whrandom.whrandom)
    Random generator class.
    
    __init__(self, a=None)
        Constructor.  Seed from current time or hashable value.
    
    seed(self, a=None)
        Seed the generator from current time or hashable value.
>>> 





From moshez at zadka.site.co.il  Fri Jan 12 01:48:30 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 02:48:30 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5C97F4.945D0C1@lemburg.com>
References: <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>

On Wed, 10 Jan 2001 18:12:20 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> +1 -- this won't be me though (at least not this week).

I'm working on it -- I'll have a patch ready as soon as my slow
modem will manage to finish the "cvs diff".  Guido, I'll
assign it to you, OK?

> Cool.  This could make Python instances usable as "modules"
> -- with full getattr() hook support !

My patch already does that -- if the instance supports __all__.

> For IMPORT_STAR I'd suggest first looking for __all__ and
> then reverting to __dict__.items() in case this fails. 

That's what my patch is doing.

> BTW, is __dict__ needed by the import mechanism or would
> the getattr/setattr slots suffice ? And if yes, must it
> be a real Python dictionary ?

My patch works with getattr (no setattr) as long as there
is an __all__ attribute. 

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From ping at lfw.org  Thu Jan 11 17:42:44 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:42:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.

Sorry, you're right.  I retract my comment about __all__.


-- ?!ng




From skip at mojam.com  Thu Jan 11 17:47:13 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 10:47:13 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010111164831.X2467@xs4all.nl>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
	<3A5DD248.8EE0DF63@lemburg.com>
	<20010111164831.X2467@xs4all.nl>
Message-ID: <14941.58257.304339.437443@beluga.mojam.com>

    Thomas> __path__ = "neener, neener"

I believe correct English usage here is "neener, neener, neener", with a
little extra emphasis on the first syllable of the third "neener"...

does-that-help?-ly y'rs,

Skip



From MarkH at ActiveState.com  Fri Jan 12 17:55:29 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 12 Jan 2001 08:55:29 -0800
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>

> 4. The way mmapmodule.c is coded and built after Guido's change appears to
> me to be the same as how every other non-builtin module is coded and built
> on Windows.  For example, winsound.c, which uses DL_EXPORT(void)
> before its
> initwinsound and where that macro also expands to "void".  But importing
> winsound works fine.

winsound adds "/export:initwinsound" to the link line.  This is an
alternative to __declspec in the sources.

This all gets back to a discussion we had here nearly a year or so ago -
that "DL_EXPORT" isnt capturing our semantics, and that we should probably
create #defines that match the _intent_ of the definition, rather than the
implementation details - ie, replace DL_EXPORT with (say) PY_API_DECL and
PY_MODULEINIT_DECL or some such.

I'm happy to think about this and help implement it if the time is now
right...

> Any Windows geek got a clue?

Isn't that question a paradox? ;-)

Mark.




From skip at mojam.com  Thu Jan 11 18:11:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 11:11:23 -0600 (CST)
Subject: [Python-Dev] dir()/__all__/etc
Message-ID: <14941.59707.632995.224116@beluga.mojam.com>

I know Guido has said he doesn't want to fiddle with dir(), but my sense of
things from the overall discussion of the __exports__ concept tells me that
when used interactively dir() often presents confusing output for new Python
users.

I twiddled CGIHTTPServer to have __all__ and added the following dir()
function to my PYTHONSTARTUP file:

def dir(o,showall=0):
    if not showall and hasattr(o, "__all__"):
        x = list(o.__all__)
        x.sort()
        return x
    from __builtin__ import dir as d
    return d(o)

Compare its output with and without showall set:

  >>> dir(CGIHTTPServer)
  ['CGIHTTPRequestHandler', 'test']
  >>> dir(CGIHTTPServer,1)
  ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
   '__builtins__', '__doc__', '__file__', '__name__', '__version__',
   'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
   'urllib']

I haven't demonstrated any great programming prowess with this little
function, but I rather suspect it may be beyond most brand new users.  If
Guido can't be convinced to allow dir() to change, how about adding a sample
PYTHONSTARTUP file to the distribution that contains little bits like this
and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
it does)?
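
For comparison, a sketch of the same idea for present-day Python 3, where
the __builtin__ module is called builtins.  The name public_dir is
hypothetical and is chosen so that the built-in dir() is not shadowed;
the string module serves only as an example of a module that defines
__all__:

```python
import builtins
import string


def public_dir(obj, showall=False):
    """Like dir(), but prefer obj.__all__ unless showall is true."""
    if not showall and hasattr(obj, "__all__"):
        return sorted(obj.__all__)
    return builtins.dir(obj)


short_listing = public_dir(string)
full_listing = public_dir(string, showall=True)
```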

Skip



From mal at lemburg.com  Thu Jan 11 18:25:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 18:25:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com> <20010111164831.X2467@xs4all.nl>
Message-ID: <3A5DEC80.596F0818@lemburg.com>

Thomas Wouters wrote:
> 
> On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:
> 
> > > Please don't use __all__.  At the moment, __all__ is the only way
> > > to easily tell whether a particular module object really represents
> > > a package, and the only way to get the list of submodule names.
> >
> > But __all__ has to be user-defined, so I don't buy that argument.
> > Note that the only true way to recognize a package is by looking
> > for an attribute "__path__" since Python adds this for packages
> > only.
> 
> Ehm.... What, exactly, prevents usercode from doing
> 
> __path__ = "neener, neener"
> 
> ? In other words, even *that* isn't a true way to recognize a package. You
> can see what isn't a package, but not what is.

Purists.... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Fri Jan 12 03:06:37 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:06:37 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <14941.49059.26189.733094@beluga.mojam.com>
References: <14941.49059.26189.733094@beluga.mojam.com>, <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 08:13:55 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:

> While it works, is it really kosher to test w's value after the DECREF?

Yes. It may not point to anything valid, but it won't be NULL.

> Just seems like an odd construct to me.  I'm used to seeing the test
> immediately after it's been set.

It was more convenient that way. And I'm pretty certain the _DECREF
macros do not change their arguments.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 03:09:13 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:09:13 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 07:14:17 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package

Why not __init__? It has to be there, and is in no other module object.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 03:23:16 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:23:16 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>
References: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>, <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112022316.BE682A82D@darjeeling.zadka.site.co.il>

On Fri, 12 Jan 2001, Moshe Zadka <moshez at zadka.site.co.il> wrote:

> I'm working on it -- I'll have a patch ready as soon as my slow
> modem will manage to finish the "cvs diff".  Guido, I'll
> assign it to you, OK?

OK, it's 103200.
Unfortunately, I couldn't assign it to Guido, since I couldn't
upload it at all (yeah, still those lynx problems). This time
I managed to get one specific person to upload for me, but someone
else will have to assign to Guido.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From nas at arctrix.com  Thu Jan 11 12:42:51 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:42:51 -0800
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 11, 2001 at 11:14:00AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>
Message-ID: <20010111034251.A23512@glacier.fnational.com>

Here is what I get on my Debian Linux machine:

  _codecs.so        cPickle.so    imageop.so        pwd.so       termios.so
  _curses.so        cStringIO.so  linuxaudiodev.so  regex.so     time.so
  _curses_panel.so  cmath.so      math.so           resource.so  timing.so
  _locale.so        crypt.so      md5.so            rgbimg.so    ucnhash.so
  _socket.so        dbm.so        mmap.so           rotor.so     unicodedata.so
  _tkinter.so       errno.so      new.so            select.so    zlib.so
  array.so          fcntl.so      nis.so            sha.so
  audioop.so        fpectl.so     operator.so       signal.so
  binascii.so       gdbm.so       parser.so         strop.so
  bsddb.so          grp.so        pcre.so           syslog.so
  
I think that is every module which can be compiled on my machine.  Great work
Andrew (and the distutil developers).

  Neil



From nas at arctrix.com  Thu Jan 11 12:47:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:47:09 -0800
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>; from skip@mojam.com on Thu, Jan 11, 2001 at 11:11:23AM -0600
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010111034709.C23512@glacier.fnational.com>

I'm -1 on making dir() pay attention to __all__.  I'm +1 on
adding a help() function which pays attention to __all__ and
(optionally?) prints doc strings.
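
A minimal sketch of such a listing, assuming a modern Python 3
interpreter. The name `summarize` is hypothetical and is not an existing
help() implementation; it only shows how __all__ could be honored when
present, with a fallback to non-underscore names.

```python
import inspect


def summarize(module):
    """Return (name, first doc line) pairs for a module's public names."""
    # Prefer __all__ when the module defines it; otherwise fall back
    # to every name that does not start with an underscore.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    rows = []
    for name in sorted(names):
        obj = getattr(module, name, None)
        doc = inspect.getdoc(obj) or ""
        first_line = doc.splitlines()[0] if doc else ""
        rows.append((name, first_line))
    return rows
```

dir() itself is left alone here, so introspective uses still see
every attribute.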

  Neil



From gstein at lyra.org  Thu Jan 11 20:38:50 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 11:38:50 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 10:58:55AM -0500
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <20010111113850.F4640@lyra.org>

On Thu, Jan 11, 2001 at 10:58:55AM -0500, Guido van Rossum wrote:
> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> > 
> > If __all__ is overloaded to also represent exportable symbols in
> > modules, these two pieces of information will be impossible (or
> > require much ugly hackery) to obtain.
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.
> 
> For a really good test, check whether __file__ ends in __init__.py[c].

Even that isn't safe: if the module was pulled from an archive, __file__
might not get set.

Determining whether something is a package is highly dependent upon how it
was brought into the system. It is entirely possible that you *can't* know
something represents a package.

You can get close by searching sys.modules for modules "below" the
given module. But if none have been imported yet, then you're out of luck.
If you're using imputil, then you can look for __ispkg__ in the module.
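
The heuristics in this thread can be combined into one best-effort check.
This is only a sketch, assuming a modern Python 3 interpreter; the helper
name `looks_like_package` is hypothetical, and the imputil-specific
__ispkg__ test is left out.

```python
import os
import sys


def looks_like_package(module):
    """Best-effort guess: True, False, or None when undecidable."""
    # Packages normally carry a __path__ attribute.
    if hasattr(module, "__path__"):
        return True
    # __file__ may be missing (e.g. module pulled from an archive).
    filename = getattr(module, "__file__", None)
    if filename:
        base = os.path.basename(filename)
        if os.path.splitext(base)[0] == "__init__":
            return True
    # Fall back to already-imported modules "below" this one.
    # This can give false positives: "os.path" sits below "os".
    prefix = module.__name__ + "."
    if any(name.startswith(prefix) for name in list(sys.modules)):
        return True
    # With no __file__ there is no evidence either way.
    return False if filename else None
```

Returning None rather than False for the no-__file__ case keeps "not a
package" distinct from "cannot tell".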

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From thomas at xs4all.net  Thu Jan 11 20:50:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 20:50:24 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 12, 2001 at 04:09:13AM +0200
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>
Message-ID: <20010111205024.Z2467@xs4all.nl>

On Fri, Jan 12, 2001 at 04:09:13AM +0200, Moshe Zadka wrote:

> Why not __init__? It has to be there, and is in no other module object.

Wrong association... __init__ would be a method that gets executed. (At
least that's what I'd expect :)

'sides,-everyone-was-in-agreement-on-__all__-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Thu Jan 11 21:25:30 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 11 Jan 2001 12:25:30 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>
Message-ID: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>

> It was more convenient that way. And I'm pretty certain the _DECREF
> macros do not change their arguments.

Pretty certain???  That doesn't inspire confidence <wink>. How certain are
you that this will be true in the future?

I think it bad style indeed - for example, I could see benefit in having
DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
builds.  What if that decision is taken in the future?

I thought rules were pretty clear with reference counting - don't assume
_anything_ about the object unless you hold a reference (or are damn sure
someone else does!)

Mark.




From thomas at xs4all.net  Thu Jan 11 22:41:57 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 22:41:57 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Thu, Jan 11, 2001 at 12:25:30PM -0800
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010111224157.A2467@xs4all.nl>

On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:

> I thought rules were pretty clear with reference counting - don't assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

Moshe isn't breaking that rule. He isn't assuming anything about the object,
just about the value of the pointer to that object. I agree, though, that
it's bad practice to rely on it having the old value, after DECREFing it.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan 11 22:48:46 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:48:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 08:42:44 PST."
             <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>

> Sorry, you're right.  I retract my comment about __all__.

Can you explain *why* you wanted to test for package-ness?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 22:55:24 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:55:24 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 11:14:00 EST."
             <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> 
Message-ID: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>

> I've put a new version of the setup.py script at
>      http://www.mems-exchange.org/software/files/python/setup.py
> 
> (I'm at work and can't remember the password to get into
> www.amk.ca. :) )
> 
> This version improves the detection of Tcl/Tk, handles the
> _curses_panel module, and doesn't do a chdir().  Same drill as before:
> just grab the script, drop it in the root of your Python source tree
> (2.0 or current CVS), run "./python setup.py build", and look at the
> modules it compiles.  I can try it on Linux, so I'm most interested in
> hearing reports for other Unix versions (*BSD, HP-UX, etc.)

Good work -- but I still can't run this inside a platform-specific
subdirectory.  Are you planning on supporting this?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Thu Jan 11 22:20:45 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 11 Jan 2001 22:20:45 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>

> I would very much appreciate your feedback

At first glance, it looks *very* promising. I really look forward
to seeing it in 2.1.

However, robustness probably needs to be improved:

>>> help()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: not enough arguments to help(); expected 1, got 0    

Wasn't there even a proposal that

>>> help

should do something meaningful (by implementing __repr__)?

>>> import string
>>> help(string)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "pydoc.py", line 183, in help
    pager('Help on %s:\n\n' % desc + textdoc.document(thing))
  File "./textdoc.py", line 171, in document
    if inspect.ismodule(object): results = document_module(object)
  File "./textdoc.py", line 87, in document_module
    if (inspect.getmodule(value) or object) is object:
  File "./inspect.py", line 190, in getmodule
    file = getsourcefile(object)
  File "./inspect.py", line 204, in getsourcefile
    filename = getfile(object)
  File "./inspect.py", line 172, in getfile
    raise TypeError, 'arg is a built-in class'
TypeError: arg is a built-in class

Also, the tools could use some command line options:

martin at mira:~/pydoc > ./pydoc.py --help
Traceback (most recent call last):
  File "./pydoc.py", line 190, in ?
    opts[args[i][1:]] = args[i+1]
IndexError: list index out of range

At a minimum, I propose -h, --help, -v, -V.

Regards,
Martin



From fdrake at acm.org  Thu Jan 11 23:11:24 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 17:11:24 -0500 (EST)
Subject: [Python-Dev] [PEP 205] Weak References PEP updated, patch available!
Message-ID: <14942.12172.129547.770776@cj42289-a.reston1.va.home.com>

  I've updated the Weak References PEP a little:

http://python.sourceforge.net/peps/pep-0205.html

  A preliminary version of the implementation and documentation is
available as well:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  Please send feedback on the PEP or implementation to me.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at mems-exchange.org  Thu Jan 11 23:26:33 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 17:26:33 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 04:55:24PM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>
Message-ID: <20010111172633.A26249@kronos.cnri.reston.va.us>

On Thu, Jan 11, 2001 at 04:55:24PM -0500, Guido van Rossum wrote:
>Good work -- but I still can't run this inside a platform-specific
>subdirectory.  Are you planning on supporting this?

I didn't really understand this when you pointed it out, but forgot to
ask for clarification.  What does your directory layout look like?

--amk




From ping at lfw.org  Thu Jan 11 23:26:53 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:26:53 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Martin v. Loewis wrote:
> 
> However, robustness probably needs to be improved:

Agreed.

> Wasn't there even a proposal that
> 
> >>> help
> 
> should do something meaningful (by implementing __repr__)?

There was.  I am planning to incorporate Paul Prescod's mechanism
for doing this; i just didn't have time to throw in that feature
yet, and wanted feedback on the man-like stuff first.
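
A minimal sketch of the __repr__ idea, assuming a modern Python 3
interpreter. The class and the name `help_sketch` are hypothetical and
are not Paul Prescod's mechanism; the point is only that a bare name at
the prompt can show a hint, and a call with no argument need not raise.

```python
import inspect


class Helper:
    """Hypothetical stand-in for an interactive help object."""

    def __repr__(self):
        # Shown when the user types the bare name at the prompt.
        return "Type help(object) for documentation about object."

    def __call__(self, *args):
        # With no argument, return the same hint instead of raising.
        if not args:
            return repr(self)
        return inspect.getdoc(args[0]) or "No documentation found."


help_sketch = Helper()
```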

My next two targets are:
    1.  Generating text from the HTML documentation files
        using Paul Prescod's stuff in onlinehelp.py.

    2.  Running a background HTTP server that produces its
        pages using htmldoc.py.

Both are pieces we already have and only need to integrate; i just
wanted to get at least a working candidate done first.

Did using pydoc like "man" work okay for you?

> >>> import string
> >>> help(string)
> Traceback (most recent call last):
...
> TypeError: arg is a built-in class

Mine doesn't do this for me.  I think i may have left up an older version
of inspect.py by mistake.  Try downloading

    http://www.lfw.org/python/inspect.py

again -- apologies for the hassle.

> Also, the tools could use some command line options:
> 
> martin at mira:~/pydoc > ./pydoc.py --help
> Traceback (most recent call last):
>   File "./pydoc.py", line 190, in ?
>     opts[args[i][1:]] = args[i+1]
> IndexError: list index out of range
> 
> At a minimum, I propose -h, --help, -v, -V.

Okay.  There is usage help already; i just failed to make it sufficiently
robust about deciding when to show it.

    skuld[1010]% pydoc
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler




From ping at lfw.org  Thu Jan 11 23:28:44 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:28:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> > Sorry, you're right.  I retract my comment about __all__.
> 
> Can you explain *why* you wanted to test for package-ness?

Auto-generating documentation.  pydoc.py currently tests for __path__,
and looks for the presence of __init__.py in a subdirectory to mean
that the subdirectory name is a package name.  Is it safe on all platforms
to just list all .py files in the subdirectory to get all submodules?
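
One way to sidestep the question, sketched under the assumption of a
modern Python 3 with the `pkgutil` module (not something pydoc.py did at
the time), is to ask the import machinery rather than listing files by
hand. `list_submodules` is a hypothetical name.

```python
import importlib
import pkgutil


def list_submodules(package_name):
    """Return sorted names of the direct submodules of a package."""
    package = importlib.import_module(package_name)
    # Only packages have __path__; plain modules have no submodules.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    # iter_modules asks the import machinery, so it also sees
    # extension modules and sub-packages, not just *.py files.
    return sorted(info.name for info in pkgutil.iter_modules(search_path))
```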


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler




From tim.one at home.com  Fri Jan 12 00:17:06 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 18:17:06 -0500
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBJIIAA.tim.one@home.com>

[Mark Hammond]
> winsound adds "/export:initwinsound" to the link line.  This is an
> alternative to __declspec in the sources.

Yup/arghghghgh.  It's fixed now.  Thanks!

> This all gets back to a discussion we had here nearly a year
> or so ago -

Yup/arghghghgh.

> that "DL_EXPORT" isn't capturing our semantics, and that we should
> probably create #defines that match the _intent_ of the
> definition, rather than the implementation details - ie, replace
> DL_EXPORT with (say) PY_API_DECL and PY_MODULEINIT_DECL or some
> such.

Yup/noarghghghgh.

> I'm happy to think about this and help implement it if the time
> is now right...

Same here.  Now how can we tell whether the time is right?  I must say, it
hasn't gotten better by leaving it alone for a year.  I think we need a Unix
dweeb to play along, though -- if only to confirm that their compilers are
no help.

>> Any Windows geek got a clue?

> Isn't that question a paradox? ;-)

Well, nobody else will understand this, but *we* know that Windows geeks
need more clues than everyone else put together just to get the box booted
each day (or hour <0.9 wink>).




From michel at digicool.com  Fri Jan 12 02:15:52 2001
From: michel at digicool.com (Michel Pelletier)
Date: Thu, 11 Jan 2001 20:15:52 -0500
Subject: [Python-Dev] New Draft PEP: Python Interfaces
Message-ID: <web-555709@digicool.com>

Hello,

I have roughed out a draft PEP that proposes the extension
of Python to include an interface framework.  It is posted
online here:

http://www.zope.org/Members/michel/InterfacesPEP/PEP.txt

This is my first revision and stab at a PEP.  I'd like to
find out what you think about the PEP and maybe discuss it
some more offline on a different list.

Thanks!

-Michel



From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 02:15:25 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 02:15:25 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
	(message from Ka-Ping Yee on Thu, 11 Jan 2001 14:26:53 -0800 (PST))
References: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>

> Did using pydoc like "man" work okay for you?

Yes, that is very impressive.

> Mine doesn't do this for me.  I think i may have left up an older version
> of inspect.py by mistake.  Try downloading
> 
>     http://www.lfw.org/python/inspect.py
> 
> again -- apologies for the hassle.

No need to apologize. It works fine now.

Thanks,
Martin



From moshez at zadka.site.co.il  Fri Jan 12 10:53:35 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:53:35 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
References: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010112095335.E8A15A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, "Mark Hammond" <MarkH at ActiveState.com> wrote:

> I think it bad style indeed - for example, I could see benefit in having
> DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
> builds.  What if that decision is taken in the future?
> 
> I thought rules were pretty clear with reference counting - don't assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

I'm not assuming anything about the object -- I'm assuming something
about the pointer. And macros should not change their arguments --
DECREF is basically a wrapper around _Py_Dealloc((PyObject *)(op)).

Just like

free(pointer);
if (pointer == NULL) 
	do_something();
is perfectly legal C.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 10:57:32 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:57:32 +0200 (IST)
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010112095732.1F65BA82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 11:11:23 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:
> 
> I know Guido has said he doesn't want to fiddle with dir(), but my sense of
> things from the overall discussion of the __exports__ concept tells me that
> when used interactively dir() often presents confusing output for new Python
> users.
> 
> I twiddled CGIHTTPServer to have __all__ and added the following dir()
> function to my PYTHONSTARTUP file:
> 
> def dir(o,showall=0):
>     if not showall and hasattr(o, "__all__"):
>         x = list(o.__all__)
>         x.sort()
>         return x
>     from __builtin__ import dir as d
>     return d(o)
> 
> Compare its output with and without showall set:
> 
>   >>> dir(CGIHTTPServer)
>   ['CGIHTTPRequestHandler', 'test']
>   >>> dir(CGIHTTPServer,1)
>   ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
>    '__builtins__', '__doc__', '__file__', '__name__', '__version__',
>    'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
>    'urllib']
> 
> I haven't demonstrated any great programming prowess with this little
> function, but I rather suspect it may be beyond most brand new users.  If
> Guido can't be convinced to allow dir() to change, how about adding a sample
> PYTHONSTARTUP file to the distribution that contains little bits like this
> and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
> it does)?

And, while we're at it, the following bit too can be in the PYTHONSTARTUP:

def display(x):
	import __builtin__
	__builtin__._ = None
	if type(x) == type(''):
		print `x`
	else:
		print x
	__builtin__._ = x

import sys
sys.displayhook = display

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Fri Jan 12 03:33:59 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 21:33:59 -0500
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <20010111034709.C23512@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIIAA.tim.one@home.com>

[Neil Schemenauer]
> I'm -1 on making dir() pay attention to __all__.

Me too.  The original __exports__ idea was an ironclad guarantee about which
names were externally visible for *any* purpose.  Then it made sense to
restrict dir() accordingly.  But if __all__ is just "a hint" (to be ignored
or honored at whim, by whoever chooses), the introspective uses of dir()
must be served too.

> I'm +1 on adding a help() function which pays attention to
> __all__ and (optionally?) prints doc strings.

I can't be +1 on anything that vague -- although I'm +1 on each part of it
if done in exactly the way I envision <wink>.




From ping at lfw.org  Fri Jan 12 03:51:54 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 18:51:54 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>

On Fri, 12 Jan 2001, Martin v. Loewis wrote:
> > Did using pydoc like "man" work okay for you?
> 
> Yes, that is very impressive.

Good.  What platform did you try it on?

I have updated the scripts now to provide a very rudimentary HTTP server
feature:

    skuld[1316]% pydoc -p 8080
    starting server on port 8080

This starts a server on port 8080 that generates HTML documentation for
modules on the fly.  The root page (http://localhost:8080/) shows an
index of modules -- it badly needs some cleaning up, but at least it
provides access to all the documentation.
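
A rough sketch of the on-the-fly idea, assuming a modern Python 3 with
`http.server` and `pydoc.render_doc`. It serves plain text rather than
the HTML that htmldoc.py produces, and the names `doc_text`,
`DocHandler` and `serve` are hypothetical.

```python
import pydoc
from http.server import BaseHTTPRequestHandler, HTTPServer


def doc_text(name):
    """Render plain-text documentation for a dotted name."""
    return pydoc.render_doc(name, renderer=pydoc.plaintext)


class DocHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map "/string" to the module or object named "string".
        name = self.path.strip("/")
        try:
            body = doc_text(name).encode("utf-8")
            status = 200
        except (ImportError, pydoc.ErrorDuringImport):
            body = b"no documentation found\n"
            status = 404
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port):
    """Serve documentation on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), DocHandler).serve_forever()
```

Calling serve(8080) would then answer requests such as
http://localhost:8080/string with that module's documentation.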

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py

Also, as you requested:

    skuld[1324]% pydoc -h
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.

    /home/ping/bin/pydoc -p <port>
        Start an HTTP server on the given port on the local machine.
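
The usage text above, together with the requested -h, --help, -v and -V
options, could be expressed with a standard option parser. This is a
sketch only, assuming a modern Python 3 with `argparse`; the meanings
given to -v and -V are assumptions, since the thread does not define
them.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for a pydoc-like tool."""
    parser = argparse.ArgumentParser(
        prog="pydoc", description="Show documentation on something.")
    # -h/--help is added automatically by argparse.
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s sketch")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print extra diagnostics")
    parser.add_argument("-k", dest="keyword", metavar="<keyword>",
                        help="search module short descriptions")
    parser.add_argument("-p", dest="port", metavar="<port>", type=int,
                        help="start an HTTP server on this port")
    parser.add_argument("names", nargs="*", metavar="<name>",
                        help="function, module, package or dotted name")
    return parser
```

With a parser like this, a missing option value produces a usage error
instead of an IndexError.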


More to come.


-- ?!ng




From fdrake at acm.org  Fri Jan 12 04:02:00 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 22:02:00 -0500 (EST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
References: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
	<Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>

Ka-Ping Yee writes:
 > My next two targets are:
 >     1.  Generating text from the HTML documentation files
 >         using Paul Prescod's stuff in onlinehelp.py.

  You mean the ones I publish as the standard documentation?  Relying
on the structure of that HTML is pure folly!  I don't think I can make
any guarantees that the HTML structures won't change as the
processing evolves.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Fri Jan 12 04:49:47 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 22:49:47 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com>

[Guido]
> I don't want to call FLOCKFILE while holding the Python lock, as
> this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> from a pipe or socket), no other Python thread can run!

Ah, good point!  Doesn't appear an essential point, though:  the
HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
release before the (dynamically only) FLOCKFILE and the last thread grab
after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
since that's lacking I'll drop it.

> ...
> I don't think that _filbuf can possibly wait for another thread to
> write data to the same stream object.

OK, I'll buy that.  Dropped too.

> ...
> OK.  It's unique to MS.  So close the bug report with a "won't fix"
> resolution.  There's no point in having bug reports remain open that
> we know we can't fix.

We don't really have a policy about that.  Perhaps you're articulating one
here, though!  I've always left bugs open if they're (a) bugs, and (b) open
<wink>.  For example, I left the Norton Blue-Screen crash bug open (although
I see now you eventually closed that).  Ditto the "Rare hangs in
w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
Just other examples of things we'll almost certainly never fix ourselves (we
have no handle on them, and all evidence says the OS is screwing up).

My view has been that if a user comes to the bug site, it's most helpful for
them if active (== "still happens") crashes and hangs appear among the open
problems.  Now that your view of it is clearer, I'll switch to yours.

too-easy<wink>-ly y'rs  - tim








From tim.one at home.com  Fri Jan 12 05:22:40 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 23:22:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECGIIAA.tim.one@home.com>

[Guido]
> The locking prevents concurrent threads accessing the stream.
>
> But mixing reads and writes (without intervening fseek etc.) is
> illegal use of the stream, and the C standard allows them to be lax
> here, even if the program was single-threaded.
>
> In other words: the locking is so good that it serializes the
> sequence of reads and writes; but if the sequence of reads and
> writes is illegal, they don't guarantee anything.

We're never going to agree on this one, you know.

My definition of "bug" here has nothing to do with the std:  something's "a
bug" if it's not functioning as designed.  That's all.  So if the
implementers would say "oops!  that should not have happened!", then to me
it's "a bug".  It so happens I believe the MS implementers would consider
this to be a bug under that defn.  Multi-threaded libraries have to be
written to a much higher level than the C std guarantees (been there, done
that, and so have you), and this is specifically corruption in a crucial
area vulnerable to races.  They have a timing hole!  That's clear.  If the
MS implementers don't believe that's "a bug", then I'd say they're too
unprofessional to be allowed in the same country as a multithreaded library
<0.1 wink>.

Your definition of "bug" seems to be more "I don't want it in Python's open
bug list, so I'll do what Tim usually does and appeal to the std in a
transparent effort to convince someone that it's not really 'a bug' -- then
maybe I'll get it off of Python's bug list".

I'm sure you'll agree that's a fair summary of both sides <wink>.

it's-a-bug-and-it's-no-longer-on-python's-open-bug-list-ly y'rs
    - tim




From tim.one at home.com  Fri Jan 12 07:54:47 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 12 Jan 2001 01:54:47 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECMIIAA.tim.one@home.com>

[Tim, on for_xreadlines vs readlines_sizehint, after disabling the
 default 1Mb buffer size in the latter]
> They're indistinguishable then on my box (on one run xreadlines
> is .1 seconds  (out of around 7.6 total) quicker, on another
> readlines_sizehint), *provided* that I specify the same buffer
> size (8192) that xreadlines uses internally.  However, if I even
> double that, readlines_sizehint is uniformly about 10% slower.  It's
> also a tiny bit slower if I cut the sizehint buffer size to 4096.

[Guido]
> 8192 happens to be the size of the stack-allocated buffer readlines()
> uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
> SMALLCHUNK in fileobject.c.
>
> Would it make sense to tie the two constants together more to tune
> this optimally even when BUFSIZ is different?

Have to repeat what I first said:

> I'm afraid Mysteries will remain no matter how many
> person-decades we spend staring at this <0.5 wink> ...

I'm repeating that because BUFSIZ is 4096 on WinTel, but SMALLCHUNK (8192)
worked best for me.  Now we're in some complex balancing act among how often
the outer loop needs to refill the readlines_sizehint buffer; how out of
whack the latter is with the platform stdio buffer; whether platform malloc
takes only twice as long to allocate space for 2*N strings as for N; and, if
the readlines buffer is too large, at exactly which point the known Win9x
eventually-quadratic-time behavior of PyList_Append starts to kick in.  I
can't out-think all that.  Indeed, I can't out-think any of it <frown>.

After staring at the code, I expect my "only a tiny bit slower" was an
illusion:  if 0 < sizehint <= SMALLCHUNK, sizehint appears to have no effect
on the operation of file_readline.

BTW, changing fileobject.c's SMALLCHUNK to a copy of BUFSIZ didn't make any
difference on Windows.
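
For reference, the readlines-with-sizehint loop being timed has roughly
this shape. This is a sketch assuming a modern Python 3 file object, not
the benchmark code itself; `iter_lines` is a hypothetical name and 8192
is simply the buffer size mentioned above.

```python
import io


def iter_lines(stream, sizehint=8192):
    """Yield lines one at a time, reading them in sizehint-sized batches."""
    while True:
        # readlines(hint) returns whole lines totalling roughly `hint`
        # characters or bytes, or an empty list at end of file.
        batch = stream.readlines(sizehint)
        if not batch:
            break
        for line in batch:
            yield line


sample = io.StringIO("a\nb\nc\n")
print(list(iter_lines(sample, sizehint=2)))  # prints ['a\n', 'b\n', 'c\n']
```

The outer loop is where the refill cost discussed above shows up: a
larger sizehint means fewer refills but bigger intermediate lists.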




From moshez at zadka.site.co.il  Fri Jan 12 17:03:58 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 18:03:58 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
In-Reply-To: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, Thomas Wouters <twouters at users.sourceforge.net> wrote:

> Noone but me cares, but Guido said to go ahead and fix it if it bothered me.

I think you meant no one. Noone is an archaic spelling of noon.

quid-pro-quo-ly y'rs, Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From fredrik at effbot.org  Fri Jan 12 09:17:11 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 12 Jan 2001 09:17:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net> <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>
Message-ID: <012a01c07c70$11aac700$e46940d5@hagrid>

> > Noone but me cares, but Guido said to go ahead and fix it if it bothered me.
> 
> I think you meant no one. Noone is an archaic spelling of noon.

no, he meant me.  I care.

</F>




From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 09:09:00 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 09:09:00 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
	(message from Ka-Ping Yee on Thu, 11 Jan 2001 18:51:54 -0800 (PST))
References: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120809.f0C890B00802@mira.informatik.hu-berlin.de>

> Good.  What platform did you try it on?

Linux, in a Konsole. I guess that is an environment you'd been using
as well :-)

Martin




From jack at oratrix.nl  Fri Jan 12 10:57:27 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 12 Jan 2001 10:57:27 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of 
 Python)
In-Reply-To: Message by Ka-Ping Yee <ping@lfw.org> ,
	     Thu, 11 Jan 2001 08:36:36 -0800 (PST) , <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org> 
Message-ID: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl>

> I'm pleased to announce a reasonable first pass at a documentation
> utility for interactive use.  "pydoc" is usable in three ways:
[...]
> I would very much appreciate your feedback, especially from testing
> on non-Unix platforms.  Thank you!

Wow, I'm impressed!

To make it run on the mac I had to add tests for the existence of os.system 
only. (So all statements "if os.system(...) > 0:" got to be "if hasattr(os, 
"system") and os.system(...) > 0:").
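
A minimal sketch of the guard described above, assuming a hypothetical
helper named command_failed (pydoc itself may structure this differently):

```python
import os

def command_failed(cmd):
    """Return True only if os.system exists and reports a nonzero status.

    On a platform without os.system the hasattr() test short-circuits,
    so the command is never attempted and the result is False.
    """
    return hasattr(os, "system") and os.system(cmd) > 0
```

Callers that need to tell "not available" apart from "ran fine" would have
to check hasattr(os, "system") separately.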

There are however various other niceties that could be added to make it more 
useful, can this be put into the repository or something?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From gstein at lyra.org  Fri Jan 12 11:31:53 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 12 Jan 2001 02:31:53 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010111224157.A2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 11, 2001 at 10:41:57PM +0100
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com> <20010111224157.A2467@xs4all.nl>
Message-ID: <20010112023153.Q4640@lyra.org>

On Thu, Jan 11, 2001 at 10:41:57PM +0100, Thomas Wouters wrote:
> On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:
> 
> > I thought rules were pretty clear with reference counting - dont assume
> > _anything_ about the object unless you hold a reference (or are damn sure
> > someone else does!)
> 
> Moshe isn't breaking that rule. He isn't assuming anything about the object,
> just about the value of the pointer to that object. I agree, though, that
> it's bad practice to rely on it having the old value, after DECREFing it.

Oh, that is just so much baloney.

If I said Py_DECREF(&ptr), *then* I'd be worried. But if I ever call
Py_DECREF(foo) and it modifies foo, then I'd be quite upset. "functions"
just aren't supposed to do that.

-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Fri Jan 12 14:51:51 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:51:51 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 17:26:33 EST."
             <20010111172633.A26249@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>  
            <20010111172633.A26249@kronos.cnri.reston.va.us> 
Message-ID: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>

> >Good work -- but I still can't run this inside a platform-specific
> >subdirectory.  Are you planning on supporting this?
> 
> I didn't really understand this when you pointed it out, but forgot to
> ask for clarification.  What does your directory layout look like?

Ah.  It's very simple.  I create a directory "linux" as a subdirectory
of the Python source tree (i.e. at the same level as Lib, Objects,
etc.).  Then I chdir into that directory, and I say "../configure".
The configure script creates subdirectories to hold the object files
for me: Grammar, Parser, Objects, Python, Modules, and sticks
Makefiles in them.  The "srcdir" variable in the Makefiles is set to
"..".  Then I say "make" and it builds Python.  The source directories
are used but no files are created or modified there: all files are
created in the "linux" directory.  This lets me have several separate
configurations: the feature used to be intended for sharing a source
tree between multiple platforms, but now I use it to have threaded,
nonthreaded, debugging, and regular builds under a single source tree.

This also works where the build directory is completely outside the
source tree (some people apparently mount the source tree read-only).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan 12 14:54:12 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:54:12 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 14:28:44 PST."
             <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101121354.IAA19700@cj20424-a.reston1.va.home.com>

> > Can you explain *why* you wanted to test for package-ness?
> 
> Auto-generating documentation.  pydoc.py currently tests for __path__,
> and looks for the presence of __init__.py in a subdirectory to mean
> that the subdirectory name is a package name.  Is it safe on all platforms
> to just list all .py files in the subdirectory to get all submodules?

Yes, that should work.  Of course there could also be extension
modules or .pyc-only files there -- you could use imp.get_suffixes()
to find out all modules (even if that means you don't always have the
source code available).
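
A rough sketch of that idea follows. It assumes a present-day Python, where
importlib.machinery.all_suffixes() plays the role that imp.get_suffixes()
plays in the message; list_submodules is a hypothetical helper name, not
something taken from pydoc.

```python
import importlib.machinery
import os

def list_submodules(package_dir):
    """Return sorted module names found directly inside package_dir.

    Source files, bytecode-only files and extension modules all count,
    so a name may be listed even when no source code is available.
    """
    # Try the longest suffix first so that a tagged extension suffix
    # is stripped whole rather than leaving part of it in the name.
    suffixes = sorted(importlib.machinery.all_suffixes(), key=len, reverse=True)
    names = set()
    for entry in os.listdir(package_dir):
        for suffix in suffixes:
            if entry.endswith(suffix):
                name = entry[:-len(suffix)]
                if name and name != "__init__":
                    names.add(name)
                break
    return sorted(names)
```

Subpackages (subdirectories with their own __init__.py) are deliberately
left out of this sketch.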

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Fri Jan 12 15:07:30 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:07:30 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 22:49:47 EST."
             <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com> 
Message-ID: <200101121407.JAA19781@cj20424-a.reston1.va.home.com>

> [Guido]
> > I don't want to call FLOCKFILE while holding the Python lock, as
> > this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> > from a pipe or socket), no other Python thread can run!

[Tim]
> Ah, good point!  Doesn't appear an essential point, though:  the
> HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
> FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
> release before the (dynamically only) FLOCKFILE and the last thread grab
> after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
> since that's lacking I'll drop it.

Yes, but if the line is very long, you'd have to use malloc() -- you
can't use _PyString_Resize() since that can access the thread state.
You're right that I don't want to do this.

> > OK.  It's unique to MS.  So close the bug report with a "won't fix"
> > resolution.  There's no point in having bug reports remain open that
> > we know we can't fix.
> 
> We don't really have a policy about that.  Perhaps you're articulating one
> here, though!  I've always left bugs open if they're (a) bugs, and (b) open
> <wink>.  For example, I left the Norton Blue-Screen crash bug open (although
> I see now you eventually closed that).  Ditto the "Rare hangs in
> w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
> Just other examples of things we'll almost certainly never fix ourselves (we
> have no handle on them, and all evidence says the OS is screwing up).

Yes, as I was thinking about this I realized that that was the policy
I wanted.  So, yes, the w9xpopen popen bug can be closed as WontFix too.

> My view has been that if a user comes to the bug site, it's most helpful for
> them if active (== "still happens") crashes and hangs appear among the open
> problems.  Now that your view of it is clearer, I'll switch to yours.

I find it more important that the bug list gives us developers an
overview of tasks to be tackled.  The problems that won't go away can
be listed in the Python 2.0 MoinMoin web!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan 12 15:27:43 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:27:43 -0500
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:57:27 +0100."
             <20010112095727.C56D13BD8B0@snelboot.oratrix.nl> 
References: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl> 
Message-ID: <200101121427.JAA20034@cj20424-a.reston1.va.home.com>

> There are however various other niceties that could be added to make it more 
> useful, can this be put into the repository or something?

Ping, do you think you could check this in into the nondist tree?
nondist/sandbox/help would seem a good name (next to Paul's
nondist/sandbox/doctools).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Fri Jan 12 17:37:57 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 12 Jan 2001 10:37:57 -0600 (CST)
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
Message-ID: <14943.13029.103771.261362@beluga.mojam.com>

    Guido> Summary: Cygwin Check Import Case Patch
    ...
    Guido> But I believe the solution is that the TERMIOS module should be
    Guido> renamed.

Isn't this a general problem?  As I recall, the convention when generating
Python modules from C header files is to simply convert the base name to
upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:

    # Without filename arguments, acts as a filter.
    # If one or more filenames are given, output is written to corresponding
    # filenames in the local directory, translated to all uppercase, with
    # the extension replaced by ".py".

Perhaps the convention should be instead to append "d" or "data" to the base
name (errno.h -> errnodata.py).
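
The two naming schemes can be sketched as below. The function names are
hypothetical, and only the file-name rule quoted from h2py.py is modelled,
not the rest of the tool.

```python
import os

def h2py_output_name(header_path):
    """Current convention: upper-cased base name with a '.py' extension."""
    stem, _ext = os.path.splitext(os.path.basename(header_path))
    return stem.upper() + ".py"

def suffixed_output_name(header_path, suffix="data"):
    """Proposed convention: keep the case and append a suffix instead."""
    stem, _ext = os.path.splitext(os.path.basename(header_path))
    return stem + suffix + ".py"

def clash_ignoring_case(name_a, name_b):
    """True if two distinct names collide on a case-insensitive filesystem."""
    return name_a != name_b and name_a.lower() == name_b.lower()
```

With these, "TERMIOS" and "termios" clash when case is ignored, while
"termiosdata" and "termios" do not.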

Skip



From guido at python.org  Fri Jan 12 18:47:46 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 12:47:46 -0500
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:37:57 CST."
             <14943.13029.103771.261362@beluga.mojam.com> 
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>  
            <14943.13029.103771.261362@beluga.mojam.com> 
Message-ID: <200101121747.MAA27504@cj20424-a.reston1.va.home.com>

>     Guido> Summary: Cygwin Check Import Case Patch
>     ...
>     Guido> But I believe the solution is that the TERMIOS module should be
>     Guido> renamed.
> 
> Isn't this a general problem?  As I recall, the convention when generating
> Python modules from C header files is to simply convert the base name to
> upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:
> 
>     # Without filename arguments, acts as a filter.
>     # If one or more filenames are given, output is written to corresponding
>     # filenames in the local directory, translated to all uppercase, with
>     # the extension replaced by ".py".
> 
> Perhaps the convention should be instead to append "d" or "data" to the base
> name (errno.h -> errnodata.py).

An even better solution is to get rid of those generated headers and
incorporate the desired symbols directly in the C extension modules.
That's happened for errno and socket, for example; maybe it's time to
do that for termios, too!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Fri Jan 12 19:54:47 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 12 Jan 2001 13:54:47 -0500
Subject: [Python-Dev] Patch 103216 - dbmmodule Setup changes
Message-ID: <14943.21239.382891.661026@anthem.wooz.org>

I've just uploaded patch 103216 to the Python project at SF.  This
does a couple of things.  First, it auto-detects (in configure)
whether dbmmodule can be built, and if so whether the -lndbm library
needs to be specified.  Second, it moves the entry for dbmmodule to
Setup.conf, after the *shared* key so that it'll be built as a dynamic
library by default.

This should fix the problem where compiling in dbmmodule sets up a
dependency to libdb which later hoses pybsddb3.

I'd have just checked it in, but I'd like someone else to just proof
it first.  I've only tested this with the current CVS tree on a fairly
stock RH6.1.

BTW, I didn't include the changes to configure in the patch, because
it's large and made SF's patch manager cough.  Besides it can be
generated from configure.in and config.h.in which are included in the
patch.

Cheers,
-Barry




From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 23:19:57 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 23:19:57 +0100
Subject: [Python-Dev] PEP 205 comments
Message-ID: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de>

Before commenting on the patch itself, I'd like to comment on the
PEP describing it.

I'm missing a discussion as to why weak references don't act as
proxies (or why they do now). A weak proxy would provide the same
attributes as the object which it encapsulates, so it could be used
transparently in place of the original object. I can think of a number
of reasons why it is not done this way (e.g. complete transparency is
impossible to achieve); now that a revision of the patch provides
proxies, the documentation should state which features are forwarded
to the proxy and which aren't (it lists the type() as a difference,
but I doubt that is the only difference - repr is also different).

Next, I wonder whether weakref.new is allowed to return an existing
weak reference to the same object. If that is not acceptable, I'd like
to know why - if it was acceptable, then weakref.new(instance)
(i.e. without callback) could return the same weak reference all the
time. A smart implementation might choose to put the weak reference
with no callback in the start of the list, so creation of additional
weak references to the same object would be inexpensive.

Likewise, I'd like to know the rationale for the clear method. Why is
it desirable to drop the object, yet keep the weak reference? Isn't it
easier for the application to either ignore clearing altogether, or
dropping the reference to the weak reference? So I'd propose to kill
the clear method.

Again on proxies, there is no discussion or documentation of the
ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
AttributeError seem to be just as fine or better.
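
To make the reference/proxy distinction concrete, here is a small sketch.
It uses the names found in today's weakref module (weakref.ref and
weakref.proxy), which is an assumption: the patch under discussion uses
weakref.new and may differ in detail. Node and weakref_demo are
hypothetical names.

```python
import gc
import weakref

class Node:
    """Hypothetical class used only for this illustration."""
    def __init__(self, name):
        self.name = name

def weakref_demo():
    node = Node("spam")
    ref = weakref.ref(node)      # explicit: call ref() to get the object
    proxy = weakref.proxy(node)  # transparent: use it like the object

    observed = {
        "ref_alive": ref() is node,
        "proxy_attr": proxy.name,                 # attribute access is forwarded
        "same_type": type(proxy) is type(node),   # type() is not forwarded
        "same_repr": repr(proxy) == repr(node),   # neither is repr()
    }

    del node      # drop the only strong reference
    gc.collect()  # not needed on CPython, harmless elsewhere

    observed["ref_after"] = ref()  # a dead reference yields None
    try:
        proxy.name
    except ReferenceError:
        observed["proxy_raises"] = True
    else:
        observed["proxy_raises"] = False
    return observed
```

In current CPython, ReferenceError derives directly from Exception rather
than from RuntimeError, so the base class did not stay as described in the
message.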

On to the type type extensions: Should there be a type flag indicating
presence of tp_weaklistoffset? It appears that the type structure had
tp_xxx7 for a long time, so likely all in-use binary modules have
that field set to zero. Is that sufficient?

Thanks for reading all of this message,

Martin



From skip at mojam.com  Sat Jan 13 16:37:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 13 Jan 2001 09:37:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tempfile.py,1.23,1.24
In-Reply-To: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
References: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14944.30291.658931.489979@beluga.mojam.com>

    Tim> On Linux, someone please run that standalone with more files and/or
    Tim> more threads; e.g.,

    Tim>     python lib/test/test_threadedtempfile.py -f 1000 -t 10

    Tim> to run with 10 threads each creating (and deleting) 1000 temp files.

After capitalizing "Lib", it worked fine for me:

    % ./python Lib/test/test_threadedtempfile.py -f 1000 -t 10
    Creating
    Starting
    Reaping
    Done: errors 0 ok 10000
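
For readers without the test file at hand, the kind of stress run described
can be sketched roughly as follows. This is not the code of
test_threadedtempfile.py; hammer_tempfiles and its parameters are
hypothetical stand-ins for the -t and -f options.

```python
import tempfile
import threading

def hammer_tempfiles(n_threads=10, n_files=1000):
    """Create and discard n_files temp files in each of n_threads threads.

    Returns (errors, ok) totals, in the order of the "errors 0 ok 10000" line.
    """
    totals = {"errors": 0, "ok": 0}
    lock = threading.Lock()

    def worker():
        ok = errors = 0
        for _ in range(n_files):
            try:
                # The file goes away as soon as the with-block closes it.
                with tempfile.TemporaryFile() as handle:
                    handle.write(b"x")
                ok += 1
            except OSError:
                errors += 1
        with lock:  # merge per-thread counts under a lock
            totals["ok"] += ok
            totals["errors"] += errors

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return totals["errors"], totals["ok"]
```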

Skip



From dkwolfe at pacbell.net  Sat Jan 13 19:48:21 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 10:48:21 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G740027Q6Q1KL@mta6.snfc21.pbi.net>

Howdy Folks,

I need some help here. I'd like to see Python build out of the box with a 
./configure, make, make test, and make install on Darwin and Mac OS X.  
Having it build out of the box will make it easier to be incorporated 
into both Darwin and the base Mac OS X distribution - although not for 
the initial release of the latter but definitely doable for subsequent 
releases. In order to do this, I need to have it build cleanly on HFS and 
UFS filesystems.

Under an HFS system, I've got a name conflict due to case insensitivity
between the build target and the "Python" directory that forces me to
build with a --with-suffix option on HFS and manually change the name
after install - which is an automatic knockout factor when it comes to 
incorporating it in an automatic build system. Not to mention a problem 
with unix newbies trying to build from source...

Last night, I did some quick investigation to determine the best way to 
fix this problem as documented in PEP-42 in the build section and 
Sourceforge bug 122215 and determined that the easiest and least error 
prone way was to change the directory name Python to PyCore.

It's apparent from the comments that I'm missing something here as the 
reaction has been negative so far - to the point where Guido has rejected 
the patch. Can someone explain what I'm missing that's causing such
strong feelings?

My second question is how do I resolve the name conflict in an approved 
way?  It's been suggested that a build directory be created (/src/build 
?) and that the target be place here. The problem that I had with this 
suggestion is that it would require an additional layer to execute the 
target and I wasn't sure what impact it whould have on running python 
from a new directory... which is the reason I took the more known path. 
:-)

Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
Python be easily built "out of the box" on these machines - rather than come
with a haphazard list of instructions or commands as currently needed
for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
into the base Mac OS X installation...

- Dan Wolfe



From esr at thyrsus.com  Sat Jan 13 21:23:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 15:23:50 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
Message-ID: <20010113152350.A17338@thyrsus.com>

I have a new goodie for the 2.1 standard library, a module called
"simil" that supports computation of similarity indices between
strings such as one might use for recovery-matching of misspellings
against a dictionary.

The three methods supported are stemming, normalized Hamming
similarity, and (the star of the show) Ratcliff-Obershelp gestalt
subpattern matching.  The latter is spookily effective for detecting
not just substitution typos but insertions and deletions.  The module is
a C extension (my first!) for speed and because the Ratcliff-Obershelp
implementation uses pointer arithmetic heavily.

It's documented, tested, and ready to go.  But having written it, I
now have a question: why is soundex marked obsolete?  Is there
something wrong with the algorithm or implementation?  If not, then
it would be natural for simil to absorb the existing soundex 
implementation as a fourth entry point.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Whether the authorities be invaders or merely local tyrants, the
effect of such [gun control] laws is to place the individual at the 
mercy of the state, unable to resist.
        -- Robert Anson Heinlein, 1949

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Americans have the right and advantage of being armed - unlike the citizens
of other countries whose governments are afraid to trust the people with arms.
	-- James Madison, The Federalist Papers



From tim.one at home.com  Sat Jan 13 22:34:10 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 16:34:10 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113152350.A17338@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>

[Eric S. Raymond]
> I have a new goodie for the 2.1 standard library, a module called
> "simil" that supports computation of similarity indices between
> strings such as one might use for recovery-matching of misspellings
> against a dictionary.

My guess is that Guido won't accept it.

> The three methods supported are stemming, normalized Hamming
> similarity, and (the star of the show) Ratcliff-Obershelp gestalt
> subpattern matching.  The latter is spookily effective for detecting
> not just substitution typos but insertions and deletions.  The module is
> a C extension (my first!) for speed and because the Ratcliff-Obershelp
> implementation uses pointer arithmetic heavily.

Never heard of R-O, so tracked down some C code via google.  It appears I
invented the same algorithm at Cray Research in the early 80's for a diff
generator, which later got reincarnated in my ndiff.py (in the
Tools/scripts/ directory).  ndiff generates "human-friendly" diffs between
text files, at both the "file is a sequence of lines" and "line is a
sequence of characters" levels.  I didn't have the hyperbolic marketing
genius to call it "gestalt subpattern matching", though <wink> -- I thought
of it as what Unix diff *would* do if it constrained itself to matching
*contiguous* subsequences, and under the theory people would find that more
natural because contiguity is something the human visual system naturally
latches on to.  ndiff can be spookily natural in practice too.
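
The contiguous-block idea can be tried out with the sketch below. It assumes
a present-day Python in which SequenceMatcher is importable from the difflib
module (at the time of this message it lived inside ndiff.py); the difflib
documentation describes its algorithm as similar to, not identical with,
Ratcliff-Obershelp. contiguous_blocks is a hypothetical helper name.

```python
from difflib import SequenceMatcher

def contiguous_blocks(a, b):
    """Return the matching contiguous substrings of a and b, in order.

    SequenceMatcher finds the longest contiguous match, then recurses on
    the pieces to its left and right.
    """
    matcher = SequenceMatcher(None, a, b)
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks()
            if m.size > 0]  # the final block is a zero-length sentinel

def similarity(a, b):
    """2 * (matched characters) / (total characters), a value in [0.0, 1.0]."""
    return SequenceMatcher(None, a, b).ratio()
```

For "abxcd" against "abcd" the matched blocks are "ab" and "cd", so four
characters match out of nine in total.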

> It's documented, tested, and ready to go.  But having written it, I
> now have a question: why is soundex marked obsolete?  Is there
> something wrong with the algorithm or implementation?

What is the soundex algorithm?  Not joking.  Skip Montanaro and I were
unable to find the algorithm implemented by soundex.c anywhere in the
literature, and I never found *any* two definitions that were the same.
Even Knuth changed his description of Soundex between editions 2 and 3 of
volume 3.  Skip eventually merged my and Fred Drake's Python implementations
of Knuth Vol 3 Ed 3 Soundex (see the Vaults of Parnassus).

> If not, then it would be natural for simil to absorb the existing
> soundex implementation as a fourth entry point.

Well, soundex.c doesn't match any other Soundex on earth, so it's not worth
reproducing in new code.  Guido doesn't want to be in the middle of fighting
over ill-defined algorithms, so booted Soundex entirely.  Another candidate
for inclusion is the NYSIIS algorithm, which is probably in more "serious"
use than Soundex anyway.  Same thing with NYSIIS, though (i.e., what--
exactly --is "the NYSIIS algorithm"?), except that Knuth didn't do us the
favor of making up his own variation that will *become* "the std" via force
of reputation.  Sean True implemented *a* NYSIIS in Python (and again see
the Vaults for a link to that).

So that's why the module is unlikely to make it into the core:

+ There are any number of algorithms people may want to see (I don't know
what "normalized Hamming similarity" means, but if it's not the same as
Levenshtein edit distance then add the latter to the pot too).

+ Each algorithm on its own is likely controversial.

+ Computing string similarity is something few apps need anyway.

Lots of hassle + little demand == not a natural for the core.  ndiff is in
the core only because many people found the *app* useful; its
SequenceMatcher class isn't even advertised.

may-never-understand-how-bigints-got-into-python<wink>-ly
    y'rs  - tim




From fdrake at acm.org  Sat Jan 13 22:45:12 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 13 Jan 2001 16:45:12 -0500 (EST)
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
References: <20010113152350.A17338@thyrsus.com>
	<LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > + Computing string similarity is something few apps need anyway.

  And this is a biggie.

 > Lots of hassle + little demand == not a natural for the core.  ndiff is in

  But it *is* an excellent type of thing to have around -- Eric: just
post it on your Web site and register it with the Vaults.

 > the core only because many people found the *app* useful; its
 > SequenceMatcher class isn't even advertised.

  Did you ever write documentation for it?  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Sat Jan 13 16:17:58 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 13 Jan 2001 07:17:58 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 02:06:07PM -0800
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010113071758.C28643@glacier.fnational.com>

[Guido van Rossum on Demo/embed/loop]
> (Except it still leaks, but that's probably a separate issue.)

Could this be caused by modules adding things to their dict and
then forgetting to decref them?  I know I've been guilty of that.

  Neil



From esr at thyrsus.com  Sat Jan 13 23:15:28 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 17:15:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 04:34:10PM -0500
References: <20010113152350.A17338@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <20010113171528.A17480@thyrsus.com>

OK, now I understand why soundex isn't in the core -- there's no canonical 
version.

Tim Peters <tim.one at home.com>:
> + There are any number of algorithms people may want to see (I don't know
> what "normalized Hamming similarity" means, but if it's not the same as
> Levenshtein edit distance then add the latter to the pot too).

Normalized Hamming similarity: it's an inversion of Hamming distance
-- number of pairwise matches in two strings of the same length,
divided by the common string length.  Gives a measure in [0.0, 1.0].
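
As a sketch of that definition (not the C code from simil, which is not
shown here), with hamming_similarity as a hypothetical name:

```python
def hamming_similarity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    if not a:
        # Assumption: two empty strings are treated as identical.
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)
```

For example, "karolin" and "kathrin" agree in 4 of 7 positions.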

I've looked up "Levenshtein edit distance" and you're right.  I'll add it
as a fourth entry point as soon as I can find C source to crib.  (Would
you happen to have a pointer?)

> + Each algorithm on its own is likely controversial.

Not these.  There *are* canonical versions of all these, and exact
equivalents are all heavily used in commercial OCR software.

> + Computing string similarity is something few apps need anyway.

Tim, this isn't true.  Any time you need to validate user input
against a controlled vocabulary and give feedback on probable right
choices, R/O similarity is *very* useful.  I've had it in my personal
toolkit for a decade and used it heavily for this -- you take your
unknown input, check it against a dictionary and kick "maybe you meant
foo?" to the user for every foo with an R/O similarity above 0.6 or so.

The effects look like black magic.  Users love it.
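
A sketch of this "maybe you meant foo?" pattern follows. It assumes a
present-day Python and uses difflib.get_close_matches as a stand-in for the
R/O scorer in simil, so the scores are not guaranteed to match Eric's
implementation. The suggest name and the example vocabulary are
hypothetical.

```python
from difflib import get_close_matches

def suggest(word, vocabulary, cutoff=0.6, limit=3):
    """Return up to `limit` vocabulary entries scoring at least `cutoff`.

    An empty list means the word is either already valid or too far from
    everything in the controlled vocabulary to guess at.
    """
    if word in vocabulary:
        return []
    return get_close_matches(word, vocabulary, n=limit, cutoff=cutoff)

def feedback(word, vocabulary):
    """Build the user-facing prompt for an unrecognised word."""
    candidates = suggest(word, vocabulary)
    if not candidates:
        return None
    return "maybe you meant %s?" % " or ".join(candidates)
```

The 0.6 threshold here is the one mentioned in the message; it also happens
to be the default cutoff of get_close_matches.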
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"I hold it, that a little rebellion, now and then, is a good thing, and as 
necessary in the political world as storms in the physical."
	-- Thomas Jefferson, Letter to James Madison, January 30, 1787



From guido at python.org  Sat Jan 13 23:25:12 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:25:12 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: Your message of "Sat, 13 Jan 2001 07:17:58 PST."
             <20010113071758.C28643@glacier.fnational.com> 
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>  
            <20010113071758.C28643@glacier.fnational.com> 
Message-ID: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>

> [Guido van Rossum on Demo/embed/loop]
> > (Except it still leaks, but that's probably a separate issue.)
> 
> Could this be caused by modules adding things to their dict and
> then forgetting to decref them?  I know I've been guilty of that.

Do you have a tool that detects leaks?  Barry has one: Insure++.  It's
expensive and we don't have a site license, so I'll ask Barry to
investigate this.

(Barry: go to Demo/embed and do "make looptest".  Then in another
shell window use "top" to watch the "loop" process grow slowly.  I'd
love to find out what's the problem here.  It's not dependent on what
you ask it to loop over; "./loop pass" also grows.  Of course it could
be one of the modules loaded during initialization...)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Sat Jan 13 23:33:34 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:33:34 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: Your message of "Sat, 13 Jan 2001 10:48:21 PST."
             <0G740027Q6Q1KL@mta6.snfc21.pbi.net> 
References: <0G740027Q6Q1KL@mta6.snfc21.pbi.net> 
Message-ID: <200101132233.RAA03229@cj20424-a.reston1.va.home.com>

> Howdy Folks,
> 
> I need some help here. I'd like to see Python build out of the box with a 
> ./configure, make, make test, and make install on Darwin and Mac OS X.  
> Having it build out of the box will make it easier to be incorporated 
> into both Darwin and the base Mac OS X distribution - although not for 
> the initial release of the latter but definitely doable for subsequent 
> releases. In order to do this, I need to have it build cleanly on HFS and 
> UFS filesystems.
> 
> Under an HFS system, I've got a name conflict due to case insensitivity
> between the build target and the "Python" directory that forces me to
> build with a --with-suffix option on HFS and manually change the name
> after install - which is an automatic knockout factor when it comes to 
> incorporating it in an automatic build system. Not to mention a problem 
> with unix newbies trying to build from source...
> 
> Last night, I did some quick investigation to determine the best way to 
> fix this problem as documented in PEP-42 in the build section and 
> Sourceforge bug 122215 and determined that the easiest and least error 
> prone way was to change the directory name Python to PyCore.
> 
> It's apparent from the comments that I'm missing something here as the 
> reaction has been negative so far - to the point where Guido has rejected 
> the patch. Can someone explain what I'm missing that's causing such
> strong feelings?

We use CVS to manage the sources.  CVS makes it very hard to rename a
directory; it doesn't have a command for this, so you have to do the
move directly in the repository, which will then break checkouts for
everyone who has a work directory linked to the CVS repository.  Using
SourceForge makes it a bit harder still: we have to ask the SF
sysadmins to do the move for us.

And if we did the move, it would be much harder to reproduce old
versions of the source tree with a single CVS command.  A way around
that would be to do a copy instead of a move, but that would cause the
directory "PyCore" to pop up in all old versions, too.

I just don't want to go through this hassle in order to make building
easier for one relatively little-used platform.

> My second question is how do I resolve the name conflict in an approved 
> way?  It's been suggested that a build directory be created (/src/build 
> ?) and that the target be placed there. The problem that I had with this
> suggestion is that it would require an additional layer to execute the
> target and I wasn't sure what impact it would have on running python
> from a new directory... which is the reason I took the more known path. 
> :-)

I don't understand what you are proposing here; I can't imagine that
an extra directory level could cause a slowdown.

A suggestion I would be open to: change the executable name during
build (currently a .exe suffix is added), but change it back (removing
the .exe suffix) during the install.  That should be a small change to
the Makefile.

> Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
> July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
> Python be easily built "out of the box" on these machines - rather than come 
> with a haphazard list of instructions or commands as currently needed 
> for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
> into the base Mac OS X installation...

Just get Apple to include Python with their standard distribution and
nobody will *have* to build Python on Mac OSX. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jan 14 00:59:44 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 18:59:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113171528.A17480@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>

[Eric]
> OK, now I understand why soundex isn't in the core -- there's no
> canonical version.

Actually, I think Knuth Vol 3 Ed 3 is canonical *now* -- nobody would dare
to oppose him <0.5 wink>.

> Normalized Hamming similarity: it's an inversion of Hamming distance
> -- number of pairwise matches in two strings of the same length,
> divided by the common string length.  Gives a measure in [0.0, 1.0].
>
> I've looked up "Levenshtein edit distance" and you're right.  I'll add
> it as a fourth entry point as soon as I can find C source to crib.
> (Would you happen to have a pointer?)

If you throw almost everything out of Unix diff, that's what you'll be left
with.  Offhand I don't know of unencumbered, industrial-strength C source; a
problem is that writing a program to compute this is a std homework exercise
(it's a common first "dynamic programming" example), so you can find tons of
bad C source.
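
For reference, the normalized Hamming similarity Eric describes above is
small enough to sketch in a few lines of Python.  This is only an
illustration, not code from his module; the function name and the choice
to treat two empty strings as identical are assumptions.

```python
def hamming_similarity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    if not a:
        # Assumption: two empty strings are considered a perfect match.
        return 1.0
    # Count the pairwise matches, then divide by the common length.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# "karolin" and "kathrin" agree in 4 of their 7 positions.
print(hamming_similarity("karolin", "kathrin"))
```

The result always lies in [0.0, 1.0], with 1.0 meaning the strings are
identical.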

Caution:  many people want small variations of "edit distance", usually via
assigning different weights to insertions, replacements and deletions.  A
less common but still popular variant is to say that a transposition ("xy"
vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
is really a family of algorithms.
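
A minimal sketch of the basic member of that family, with separate
weights for insertions, replacements and deletions.  The function name,
the parameter names and the unit default weights are illustrative
assumptions, not anyone's posted code, and it simply holds the whole
cost matrix in memory.

```python
def edit_distance(a, b, insert_cost=1, replace_cost=1, delete_cost=1):
    """Cheapest cost of turning string a into string b."""
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] is the cheapest way to turn a[:i] into b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete_cost      # delete everything in a[:i]
    for j in range(1, cols):
        cost[0][j] = j * insert_cost      # insert everything in b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + (0 if same else replace_cost),
                cost[i - 1][j] + delete_cost,
                cost[i][j - 1] + insert_cost,
            )
    return cost[-1][-1]


print(edit_distance("kitten", "sitting"))
print(edit_distance("kitten", "sitting", replace_cost=2))
```

Note that with unit weights a transposition such as "xy" vs "yx" costs 2
here, which is exactly the behavior the transposition variants try to
change.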

>> + Each algorithm on its own is likely controversial.

> Not these.  There *are* canonical versions of all these,

See the "edit distance" gloss above.

> and exact equivalents are all heavily used in commercial OCR
> software.

God forbid that core Python may lose the commercial OCR developer market
<wink>.  It's not accepted that for every field F, core Python needs to
supply the algorithms F uses heavily.  Heck, core Python doesn't even ship
with an FFT!  Doesn't bother the folks working in signal processing.

>> + Computing string similarity is something few apps need anyway.

> Tim, this isn't true.  Any time you need to validate user input
> against a controlled vocabulary and give feedback on probable right
> choices,

Which is something few apps need anyway -- in my experience, but more so in
my *primary* role here of trying to channel for you (& Guido) what Guido
will say.  It should be clear that I've got some familiarity with these
schemes, so it should also be clear that Guido is likely to ask me about
them whenever they pop up.  But Guido has hardly ever asked me about them
over the past decade, with the exception of the short-lived Soundex
brouhaha.  From that I guess hardly anyone ever asks *him* about them, and
that's how channeling works:  if this were an area where Guido felt core
Python needed beefier libraries, I'm pretty sure I would have heard about it
by now.

But now Guido can speak for himself.  There's no conceivable argument that
could change what I *predict* he'll say.

> R/O similarity is *very* useful.  I've had it in my personal
> toolkit for a decade and used it heavily for this -- you take your
> unknown input, check it against a dictionary and kick "maybe you meant
> foo?" to the user for every foo with an R/O similarity above 0.6 or so.
>
> The effects look like black magic.  Users love it.

I believe that.  And I'd guess we all have things in our personal toolkits
our users love.  That isn't enough to get into the core, as I expect Guido
will belabor on the next iteration of this <wink>.

doesn't-mean-the-code-isn't-mondo-cool-ly y'rs  - tim




From dkwolfe at pacbell.net  Sun Jan 14 01:19:56 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 16:19:56 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>

>CVS makes it very hard to move a directory...
>which will then break checkouts for everyone...

with the potential to cause development code to be lost

>Using SourceForge...have to ask the SF sysadmins

I understand... we also use CVS and periodically (usually pre alpha) 
reorganize the source... going thru SF sysadmin makes it doubly hard... 
yuck!

However, since you have "released" tarball archives, it seems to me that 
the loss of the diffs and log notes is more troubling than the need to 
create an old version.... at least that's been my experience when 
building software. ;-)

>I just don't want to go through this hassle in order to make building
>easier for one relatively little-used platform.

humph. Ok, I'll accept that for now as we've only sold 100,000 Beta 
copies of Mac OS X... but if we're not over 1 million users by this time 
next year... I'll eat my words. ;-)

>> It's been suggested that a build directory be created (/src/build ?) 
>> and that the target be placed here. 

>I don't understand what you are proposing here; I can't imagine that
>an extra directory level could cause a slowdown.

moshez suggested this in his comment on the patch - moving the target to 
a separate directory. I'm not sure of the implications of doing this 
however, and wondered if it might affect the running of the regression 
suite and the executable before it was installed.

>A suggestion I would be open to: change the executable name during
>build (currently a .exe suffix is added), but change it back (removing
>the .exe suffix) during the install.  That should be a small change to
>the Makefile.

You mean without using the -with-suffix command? That can probably be 
done... but based on my readings, I'd thought you'd reject it as not being 
"clean" and complicating the build process more than it should - not to 
mention renaming the executable behind the builder's back...  Lesser of 
two evils I guess - I'll investigate this however...

>> I'd like to see Python be easily built on "out of the box"...
>> [and] incorporated into the base Mac OS X installation...
>
>Just get Apple to include Python with their standard distribution and
>nobody will *have* to build Python on Mac OSX. :-)

Easier said than done as they already have the other P language 
installed. ;-) But then on the other hand, there are quite a few 
Pythonatics, including me, who use it in daily work at Apple. 

As I mentioned, the road to getting it in Mac OS X begins with getting it 
to build cleanly with the automated build system... so I've got to get 
this problem fixed before I start working on getting it in the build.

- Dan
  (yes, I work for Apple, but this is something that I'm doing on my own!)




From mwh21 at cam.ac.uk  Sun Jan 14 01:41:35 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Jan 2001 00:41:35 +0000
Subject: [Python-Dev] a readline replacement?
In-Reply-To: Michael Hudson's message of "17 Dec 2000 18:18:24 +0000"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <m3snmn3qyo.fsf_-_@atrus.jesus.cam.ac.uk>

Michael Hudson <mwh21 at cam.ac.uk> writes:

> It wouldn't be particularly hard to rewrite editline in Python (we
> have termios & the terminal handling functions in curses - and even
> ioctl if we get really keen).
> 
> I've been hacking on my own Python line reader on and off for a while;
> it's still pretty buggy, but if you're feeling brave you could look at:
> 
> http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

As I secretly planned <wink>, the embarrassment of having code that
full of holes publicly accessible spurred me to writing a much better
version, to be found at:

  http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.2.0.tar.gz

(or, now rsync works there again, in the equivalent place on the
starship...).

If you unpack it and execute

$ python python_reader.py

you should get something that closely mimics the current interpreter
top level.  It supports a wide range of cursor motion commands,
built-in support for multiple line input and history (including
incremental search).  It doesn't do completion, basically because I
haven't got round to it yet, and it will get into severe trouble if
you enter an input that is taller than your terminal (I think this
should be surmountable, but I haven't gotten round to this either).
Another thing that I haven't gotten round to yet is documentation.
After I've tackled these points I'll probably stick it up on
parnassus.

I've been using it as my standard python shell for a week or so, and
quite like it, though the lack of completion is a drag.

It is probably staggeringly unportable, so I'd appreciate finding out
how it breaks on systems other than Linux with terminals other than
xterms...

Have the changes to enable use of editline been checked in yet?  I
worry that the licensing situation around the readline module is grey
at best...

Cheers,
M.

-- 
  That's why the smartest companies use Common Lisp, but lie about it
  so all their competitors think Lisp is slow and C++ is fast.  (This
  rumor has, however, gotten a little out of hand. :)
                                        -- Erik Naggum, comp.lang.lisp




From esr at thyrsus.com  Sun Jan 14 01:58:08 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 19:58:08 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 06:59:44PM -0500
References: <20010113171528.A17480@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>
Message-ID: <20010113195808.B17712@thyrsus.com>

Tim Peters <tim.one at home.com>:
> If you throw almost everything out of Unix diff, that's what you'll be left
> with.  Offhand I don't know of unencumbered, industrial-strength C source; a
> problem is that writing a program to compute this is a std homework exercise
> (it's a common first "dynamic programming" example), so you can find tons of
> bad C source.

I found some formal descriptions of the algorithm and some unencumbered 
Oberon source.  I'm coding up C now.  It's not complicated if you're willing 
to hold the cost matrix in memory, which is reasonable for a string comparator
in a way it wouldn't be for a file diff.
 
> Caution:  many people want small variations of "edit distance", usually via
> assigning different weights to insertions, replacements and deletions.  A
> less common but still popular variant is to say that a transposition ("xy"
> vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
> is really a family of algorithms.

Which about collapse into one if your function has three weight
arguments for insert/replace/delete weights, as mine does.  It don't
get more general than that -- I can see that by looking at the formal
description.  

OK, so I'll give you that I don't weight transpositions separately,
but neither does any other variant I found on the web nor the formal
descriptions.  A fourth optional weight argument someday, maybe :-).
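
Such a fourth weight could look roughly like the sketch below, which
uses the restricted ("optimal string alignment") form of transposition
handling: only adjacent pairs are swapped, and a swapped pair is not
edited again.  It is an illustration in Python rather than the C code
under discussion, and all names and default weights are assumptions.

```python
def edit_distance_with_transpositions(a, b, insert_cost=1, replace_cost=1,
                                      delete_cost=1, transpose_cost=1):
    """Edit distance that also allows swapping two adjacent characters."""
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] is the cheapest way to turn a[:i] into b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete_cost
    for j in range(1, cols):
        cost[0][j] = j * insert_cost
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            best = min(
                cost[i - 1][j - 1] + (0 if same else replace_cost),
                cost[i - 1][j] + delete_cost,
                cost[i][j - 1] + insert_cost,
            )
            # The last two characters of each prefix are each other's
            # mirror image ("xy" vs "yx"): try a single transposition.
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                best = min(best, cost[i - 2][j - 2] + transpose_cost)
            cost[i][j] = best
    return cost[-1][-1]


print(edit_distance_with_transpositions("xy", "yx"))
```

Setting transpose_cost to the sum of the delete and insert weights (or
higher) makes the extra step irrelevant, so the three-weight behavior
falls out as a special case.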

> God forbid that core Python may lose the commercial OCR developer market
> <wink>.  It's not accepted that for every field F, core Python needs to
> supply the algorithms F uses heavily.

That's not my point -- I don't see OCR as a big Python market either.
My point in observing that OCR uses Ratcliff/Obershelp heavily was
simply to show that it's a well-established algorithm, not
`controversial'.

>                      Heck, core Python doesn't even ship
> with an FFT!  Doesn't bother the folks working in signal processing.

It probably won't surprise you that I considered writing an FFT extension
module at one point :-).  

> > Tim, this isn't true.  Any time you need to validate user input
> > against a controlled vocabulary and give feedback on probable right
> > choices,
> 
> Which is something few apps need anyway

I fundamentally disagree.  Few application designers *know* they need
it, but user interfaces would get a hell of a lot better if the
technique were more commonly applied -- and that's why I want it in
the Python library, so doing the right thing in Python will be a
minimum-effort proposition.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What if you were an idiot, and what if you were a member of Congress?
But I repeat myself.
        -- Mark Twain



From tim.one at home.com  Sun Jan 14 04:17:34 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 22:17:34 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHNIIAA.tim.one@home.com>

[Fred]
>   Did you ever write documentation for it?  ;-)

A lot more than you did <wink>.

just-show-me-"write-docs"-in-my-job-description-ly y'rs  - tim




From tim.one at home.com  Sun Jan 14 05:39:59 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 23:39:59 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113195808.B17712@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>

[Eric, on "edit distance"]
> I found some formal descriptions of the algorithm and some
> unencumbered Oberon source.  I'm coding up C now.  It's not
> complicated if you're willing to hold the cost matrix in memory,
> which is reasonable for a string comparator in a way it wouldn't
> be for a file diff.

All agreed, and it should be a straightforward task then.  I'm assuming it
will work with Unicode strings too <wink>.

[on differing weights]
> Which about collapse into one if your function has three weight
> arguments for insert/replace/delete weights, as mine does.  It don't
> get more general than that -- I can see that by looking at the formal
> description.
>
> OK, so I'll give you that I don't weight transpositions separately,
> but neither does any other variant I found on the web nor the formal
> descriptions.  A fourth optional weight argument someday, maybe :-).
> ...
> and that's why I want it in the Python library, so doing the right
> thing in Python will be a minimum-effort proposition.

Guido will depart from you at a different point.  I depart here:  it's not
"the right thing".  It's a bunch of hacks that appeal not because they solve
a problem, but because they're cute algorithms that are pretty easy to
implement and kinda solve part of a problem.   "The right thing"-- which you
can buy --at least involves capturing a large base of knowledge about
phonetics and spelling.  In high school, one of my buddies was Dan
Pryzbylski.  If anyone who knew him (other than me <wink>) were to type his
name into the class reunion guy's web page, they'd probably spell it the way
they remember him pronouncing it:  sha-bill-skey (and that's how he
pronounced "Dan" <wink>).  If that hit on the text string "Pryzbylski",
*then* it would be "the right thing" in a way that makes sense to real
people, not just to implementers.

Working six years in commercial speech recog really hammered that home to
me:  95% solutions are on the margin of unsellable, because an error one try
in 20 is intolerable for real people.  Developers writing for developers get
"whoa! cool!" where my sisters walk away going "what good is that?".  Edit
distance doesn't get within screaming range of 95% in real life.

Even for most developers, it would be better to package up the single best
approach you've got (f(list, word) -> list of possible matches sorted in
confidence order), instead of a module with 6 (or so) functions they don't
understand and a pile of equally mysterious knobs.  Then it may actually get
used!  Developers of the breed who would actually take the time to
understand what you've done are, I suggest, similar to us:  they'd skim the
docs, ignore the code, and write their own variations.  Or, IOW:

> so doing the right thing in Python will be a minimum-effort
> proposition.

Make someone think first, and 95% of developers will just skip over it too.
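
To make the f(list, word) shape concrete, here is a hedged sketch of
such a single entry point.  It assumes the SequenceMatcher class from
the standard-library difflib module of current Python, whose ratio() is
closely related to the Ratcliff/Obershelp measure; the function name,
the case folding, the 0.6 cutoff (borrowed from Eric's figure) and the
limit of three results are all illustrative choices.

```python
from difflib import SequenceMatcher


def possible_matches(vocabulary, word, cutoff=0.6, limit=3):
    """Return up to `limit` vocabulary entries, best guess first."""
    scored = []
    for candidate in vocabulary:
        # Compare case-insensitively; ratio() is a float in [0.0, 1.0].
        score = SequenceMatcher(None, word.lower(), candidate.lower()).ratio()
        if score >= cutoff:
            scored.append((score, candidate))
    # Highest confidence first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for score, candidate in scored[:limit]]


modules = ["bisect", "dis", "UserDict", "random"]
print(possible_matches(modules, "disect"))
```

The caller sees one function and one sorted list; the choice of
similarity measure and its tuning stay hidden behind the defaults.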

BTW, the theoretical literature ignored transposition at first, because it
didn't fit well in the machinery.  IIRC, I first read about it in an issue
of SP&E (Software Practice & Experience), where the authors were forced into
it because the "traditional" edit sequence measure sucked in their practice.
They were much happier after taking transposition into account.  The
theoreticians have more than caught up since, and research is still active;
e.g., 1997's

    PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
    DELETIONS AND GENERALIZED TRANSPOSITIONS
    B. J. Oommen and R. K. S. Loke
    http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

is a good read.  As they say there,

    If one views the elements of the confusion matrices as
    probabilities, this [treating each character independent
    of all others, as "edit distance" does] is equivalent to
    assuming that the transformation probabilities at each
    position in the string are statistically independent and
    possess first-order Markovian characteristics. This model
    is usually assumed for simplicity rather it [sic] having
    any statistical significance.

IOW, because it's easy to analyze, not because it solves a real problem --
and they're complaining about an earlier generalization of edit distance
that makes the weights depend on the individual symbols involved as well as
on the edit/delete/insert distinction (another variation trying to make this
approach genuinely useful in real life).  The Oommen-Loke algorithm appears
much more realistic, taking into account the observed probabilities of
mistyping specific letter pairs (although it still ignores phonetics), and
they report accuracies approaching 98% in correctly identifying mangled
words.

98% (more than twice as good as 95% -- the error rate is actually more
useful to think about, 2% vs 5%) is truly useful for non-geek end users, and
the state of the art here is far beyond what's easy to find and dead easy to
implement.

> ...
> It probably won't surprise you that I considered writing an FFT
> extension module at one point :-).

Nope!  More power to you, Eric.  At least FFTs *are* state of the art,
although *coding* them optimally is likely beyond human ability on modern
machines:

    http://www.fftw.org/

(short course:  they've generally got the fastest FFTs available, and their
code is generated by program, systematically *trying* every trick in the
book, timing it on a given box, and synthesizing a complete strategy out of
the quickest pieces).

sooner-or-later-the-only-code-real-people-will-use-won't-be-written-
    by-people-at-all-ly y'rs  - tim




From tim.one at home.com  Sun Jan 14 06:38:52 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 00:38:52 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>

[Dan Wolfe]
> ...
> As I mentioned, the road to getting it in Mac OS X begins with
> getting it to build cleanly with the automated build system... so
> I've got to get  this problem fixed before I start working on
> getting it in the build.
>
> - Dan
>   (yes, I work for Apple, but this is something that I'm doing
>    on my own!)

Hang in there, Dan!  I did the first Python port to the KSR-1 on my own time
too, despite working for the visionless bastards at the time.  The rest is
history:  the glory, the fame, the riches, the groupies, the adulation of my
peers.  We won't mention the financial scandal and subsequent bankruptcy
lest it discourage you for no good reason <wink>.

BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
little ugly.  Better that than force hundreds of Python-builders to get
divorced from a decade-old directory naming scheme.




From esr at thyrsus.com  Sun Jan 14 08:08:57 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 02:08:57 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 11:39:59PM -0500
References: <20010113195808.B17712@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>
Message-ID: <20010114020857.E19782@thyrsus.com>

Tim Peters <tim.one at home.com>:
> All agreed, and it should be a straightforward task then.  I'm assuming it
> will work with Unicode strings too <wink>.

Thought about that.  Want to get it working for 8 bits first.
 
> Guido will depart from you at a different point.  I depart here:  it's not
> "the right thing".  It's a bunch of hacks that appeal not because they solve
> a problem, but because they're cute algorithms that are pretty easy to
> implement and kinda solve part of a problem.

Again, my experience says differently.  I have actually *used*
Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
I Mean) -- and had it work very well for non-geek users.  That's why I
want other Python programmers to have easy access to the capability.

> Working six years in commercial speech recog really hammered that home to
> me:  95% solutions are on the margin of unsellable, because an error one try
> in 20 is intolerable for real people.  Developers writing for developers get
> "whoa! cool!" where my sisters walk away going "what good is that?".  Edit
> distance doesn't get within screaming range of 95% in real life.

I suspect your speech recognition experience has given you an
unhelpful bias.  For English, what you say is certainly true -- but
that's a gross worst-case application of R/O and Levenshtein that I'm
not interested in pursuing.  Nor do I expect Python hackers to use
my module for that.

Where techniques like Ratcliff-Obershelp really shine (and what I
expect the module to be used for) is with controlled vocabularies such
as command interfaces.  These tend to have better orthogonality than
NL, so antinoise filtering by R/O or Levenshtein distance (a kindred
technique I somehow didn't learn until today -- there are
disadvantages to being an autodidact) can really go to town on them.

(Actually, my gut after thinking about both algorithms hard is that
R/O is still a better technique than Levenshtein for the kind of
application I have in mind.  But I also suspect the difference is
marginal.)

(Other good uses for algorithms in this class include cladistics and
genomic analysis.)

> Even for most developers, it would be better to package up the single best
> approach you've got (f(list, word) -> list of possible matches sorted in
> confidence order), instead of a module with 6 (or so) functions they don't
> understand and a pile of equally mysterious knobs.

That's why good documentation, with motivating usage hints, is important.
I write good documentation, Tim.

>     PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
>     DELETIONS AND GENERALIZED TRANSPOSITIONS
>     B. J. Oommen and R. K. S. Loke
>     http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

Thanks for the pointer; I've downloaded it and will read it.  If the 
description of Oommen's algorithm is good enough, I'll implement it and
add it to the module.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857



From dkwolfe at pacbell.net  Sun Jan 14 08:48:51 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 23:48:51 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>
Message-ID: <0G75009ZD6UYYE@mta5.snfc21.pbi.net>

On Saturday, January 13, 2001, at 09:38 PM, Tim Peters wrote:

> [Dan Wolfe]
>> ...
>> As I mentioned, the road to getting it in Mac OS X begins with
>> getting it to build cleanly with the automated build system... so
>> I've got to get  this problem fixed before I start working on
>> getting it in the build.
>>
>> - Dan
>> (yes, I work for Apple, but this is something that I'm doing
>> on my own!)
>
> Hang in there, Dan!  I did the first Python port to the KSR-1 on my own 
> time
> too, despite working for the visionless bastards at the time.

Well, I won't go that far..... some of them are quite visionaries (I 
can't stop drooling over a Ti portable....).

> The rest is
> history:  the glory, the fame, the riches, the groupies, the adulation 
> of my
> peers.  We won't mention the financial scandal and subsequent bankruptcy
> lest it discourage you for no good reason <wink>.

You left out the part where they turn ya into a timbot... <wink><wink>

> BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
> little ugly.  Better that than force hundreds of Python-builders to get
> divorced from a decade-old directory naming scheme.

Well the mv Python to PyCore was the simplest... but obviously the most 
painful.... The longer ugly fix is working but it's such a hack that I'd 
rather not show it off...I need to fix it so that it allows nice things 
such as allowing the -with-suffix to be used...and then test all the 
edge cases such as clobber, etc so that I don't break anything. :-)

appreciating-your-note-after-attempting-to-understand-makefiles-on-Saturday-night'
ly yours,

- Dan










From tim.one at home.com  Sun Jan 14 11:45:53 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 05:45:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114020857.E19782@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>

[Tim]
>> ...It's a bunch of hacks that appeal not because they solve
>> a problem, but because they're cute algorithms that are pretty
>> easy to implement and kinda solve part of a problem.

[Eric]
> Again, my experience says differently.  I have actually *used*
> Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
> I Mean) -- and had it work very well for non-geek users.  That's why I
> want other Python programmers to have easy access to the capability.
> ...
> Where techniques like Ratcliff-Obershelp really shine (and what I
> expect the module to be used for) is with controlled vocabularies
> such as command interfaces.

Yet the narrower the domain, the less call for a library with multiple
approaches.  If R-O really shone for you, why bother with anything else?
Seriously.  You haven't used some (most?) of these.  The core isn't a place
for research modules either (note that I have no objection whatsoever to
writing any module you like -- the only question here is what belongs in the
core, and any algorithm *nobody* here has experience with in your target
domain is plainly a poor *core* candidate for that reason alone -- we have
to maintain, justify and explain it for years to come).

> I suspect your speech recognition experience has given you an
> unhelpful bias.

Try to think of it as a helpfully different perspective <0.5 wink>.  It's in
favor of measuring error rate by controlled experiments, skeptical of
intuition, and dismissive of anecdotal evidence.  I may well agree you don't
need all that heavy machinery if I had a clear definition of what problem it
is you're trying to solve (I've learned it's not the kinds of problems *I*
had in mind when I first read your description!).

BTW, telephone speech recog requires controlled vocabularies because phone
acoustics are too poor for the customary close-talking microphone approaches
to work well enough.  A std technique there is to build a "confusability
matrix" of the words *in* the vocabulary, to spot trouble before it happens:
if two words are acoustically confusable, it flags them and bounces that
info back to the vocabulary designer.  A similar approach should work well
in your domain:  if you get to define the cmd interface, run all the words
in it pairwise through your similarity measure of choice, and dream up new
words whenever a pair is "too close".  That all but ensures that even a
naive similarity algorithm will perform well (in telephone speech recog, the
unconstrained error rate is up to 70% on cell phones; by constraining the
vocabulary with the aid of confusability measures, we cut that to under 1%).
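
A sketch of that pairwise check for a textual command vocabulary,
assuming SequenceMatcher from the standard-library difflib module of
current Python as the "similarity measure of choice".  The function
name and the 0.75 threshold are arbitrary illustrations; a real
threshold would have to come from measurement.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(vocabulary, threshold=0.75):
    """Return (word, word, score) for every pair that is "too close"."""
    flagged = []
    # Run all the words pairwise through the similarity measure.
    for first, second in combinations(sorted(vocabulary), 2):
        score = SequenceMatcher(None, first, second).ratio()
        if score >= threshold:
            flagged.append((first, second, score))
    # Report the most confusable pairs first.
    flagged.sort(key=lambda item: -item[2])
    return flagged


commands = ["commit", "comment", "status", "update", "remove"]
for first, second, score in confusable_pairs(commands):
    print(f"{first!r} and {second!r} are too close ({score:.2f})")
```

Any pair this reports is a candidate for renaming before the interface
ships, which is the cheap end of the problem to fix.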

> ...
> (Actually, my gut after thinking about both algorithms hard is that
> R/O is still a better technique than Levenshtein for the kind of
> application I have in mind.  But I also suspect the difference is
> marginal.)

So drop Levenshtein -- go with your best shot.  Do note that they both
(usually) consider a single transposition to be as much a mutation as two
replacements (or an insert plus a delete -- "pure" Levenshtein treats those
the same).

What happens when the user doesn't enter an exact match?  Does the kind of
app you have in mind then just present them with a list of choices?  If
that's all (as opposed to, e.g., substituting its best guess for what the
user actually typed and proceeding as if the user had given that from the
start), then the evidence from studies says users are almost as pleased when
the correct choice appears somewhere in the first three choices as when it
appears as *the* top choice.  A well-designed vocabulary can almost
guarantee that happy result (note that most of the current research is aimed
at the much harder job of getting the intended word into the #1 slot on the
choice list).

> (Other good uses for algorithms in this class include cladistics and
> genomic analysis.)

I believe you'll find current work in those fields has moved far beyond
these simplest algorithms too, although they remain inspirational (for
example, see
"Protein Sequence Alignment and Database Scanning" at

    http://barton.ebi.ac.uk/papers/rev93_1/rev93_1.html

Much as in typing, some mutations are more likely than others for *physical*
reasons, so treating all pairs of symbols in the alphabet alike is too gross
a simplification.).

>> Even for most developers, it would be better to package up the
>> single best approach you've got (f(list, word) -> list of possible
>> matches sorted in confidence order), instead of a module with 6
>> (or so) functions they don't understand and a pile of equally
>> mysterious knobs.

> That's why good documentation, with motivating usage hints, is
> important.  I write good documentation, Tim.

You're not going to find offense here even if you look for it, Eric <wink>:
while only a small percentage of developers don't read docs at all, everyone
else spaces out at least in linear proportion to the length of the docs.
Most people will be looking for "a solution", not for "a toolkit".  If the
docs read like a toolkit, it doesn't matter how good they are, the bulk of
the people you're trying to reach will pass on it.  If you really want this
to be *used*, supply one class that does *all* the work, including making
the expert-level choices of which algorithm is used under the covers and how
it's tuned.  That's good advice.

I still expect Guido won't want it in the core before wide use is a
demonstrated fact, though (and no, that's not a chicken-vs-egg thing:  "wide
use" for a thing outside the core is narrower than "wide use" for a thing in
the core).  An exception would likely get made if he tried it and liked it a
lot.  But to get it under his radar, it's again much easier if the usage
docs are no longer than a couple paragraphs.

I'll attach a tiny program that uses ndiff's SequenceMatcher to guess which
of the 147 std 2.0 top-level library modules a user may be thinking of (and
best I can tell, these are the same results case-folding R/O would yield):

Module name? random
Hmm.  My best guesses are random, whrandom, anydbm
(BTW, the first choice was an exact match)
Module name? disect
Hmm.  My best guesses are bisect, dis, UserDict
Module name? password
Hmm.  My best guesses are keyword, getpass, asyncore
Module name? chitchat
Hmm.  My best guesses are whichdb, stat, asynchat
Module name? xml
Hmm.  My best guesses are xmllib, mhlib, xdrlib

[So far so good]

Module name? http
Hmm.  My best guesses are httplib, tty, stat

[I was thinking of httplib, but note that it missed
 SimpleHTTPServer:  a name that long just isn't going to score
 high when the input is that short]

Module name? dictionary
Hmm.  My best guesses are Bastion, ConfigParser, tabnanny

[darn, I *think* I was thinking of UserDict there]

Module name? uuencode
Hmm.  My best guesses are code, codeop, codecs

[Missed uu]

Module name? parse
Hmm.  My best guesses are tzparse, urlparse, pre
Module name? browser
Hmm.  My best guesses are webbrowser, robotparser, user
Module name? brower
Hmm.  My best guesses are webbrowser, repr, reconvert
Module name? Thread
Hmm.  My best guesses are threading, whrandom, sched
Module name? pickle
Hmm.  My best guesses are pickle, profile, tempfile
(BTW, the first choice was an exact match)
Module name? shelf
Hmm.  My best guesses are shelve, shlex, sched
Module name? katmandu
Hmm.  My best guesses are commands, random, anydbm

[I really was thinking of "commands"!]

Module name? temporary
Hmm.  My best guesses are tzparse, tempfile, fpformat

So it gets what I was thinking of into the top 3 very often, and despite
some wildly poor guesses at the correct spelling -- you'd *almost* think it
was doing a keyword search, except the *unintended* choices on the list are
so often insane <wink>.

Something like that may be a nice addition to Paul/Ping's help facility
someday too.

Hard question:  is that "good enough" for what you want?  Checking against
147 things took no perceptible time, because SequenceMatcher is already
optimized for "compare one thing against N", doing preprocessing work on the
"one thing" that greatly speeds the N similarity computations (I suspect
you're not -- yet).  It's been tuned and tested in practice for years; it
works for any sequence type with hashable elements (so Unicode strings are
already covered); it works for long sequences too.  And if R-O is the best
trick we've got, I believe it already does it.  Do we need more?  Of course
*I'm* not convinced we even need *it* in the core, but packaging a
match-1-against-N class is just a few minutes' editing of what follows.

something-to-play-with-anyway-ly y'rs  - tim


NDIFFPATH = "/Python20/Tools/Scripts"
LIBPATH = "/Python20/Lib"

import sys, os

sys.path.append(NDIFFPATH)
from ndiff import SequenceMatcher

modules = {}  # map lowercase module stem to module name
for f in os.listdir(LIBPATH):
    if f.endswith(".py"):
        f = f[:-3]
        modules[f.lower()] = f

def match(fname, numchoices=3):
    lower = fname.lower()
    s = SequenceMatcher()
    s.set_seq2(lower)
    scores = []
    for lowermod, mod in modules.items():
        s.set_seq1(lowermod)
        scores.append((s.ratio(), mod))
    scores.sort()
    scores.reverse()
    return modules.has_key(lower), [x[1] for x in scores[:numchoices]]

while 1:
    name = raw_input("Module name? ")
    is_exact, choices = match(name)
    print "Hmm.  My best guesses are", ", ".join(choices)
    if is_exact:
        print "(BTW, the first choice was an exact match)"
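
For anyone trying this on a current Python, here is a rough Python 3 sketch of the same driver.  It assumes difflib.SequenceMatcher (where ndiff's class later ended up in the standard library) and replaces the os.listdir scan with a short hard-coded MODULES list, which is purely illustrative:

```python
from difflib import SequenceMatcher

# Illustrative stand-in for the os.listdir(LIBPATH) scan in the driver above.
MODULES = ["webbrowser", "robotparser", "user", "pickle", "profile", "UserDict"]

def match(fname, modules=MODULES, numchoices=3):
    """Return (is_exact, best guesses), comparing case-insensitively."""
    by_lower = {m.lower(): m for m in modules}
    lower = fname.lower()
    s = SequenceMatcher()
    s.set_seq2(lower)            # analysed once, reused for every candidate
    scores = []
    for lowermod, mod in by_lower.items():
        s.set_seq1(lowermod)
        scores.append((s.ratio(), mod))
    scores.sort(reverse=True)    # highest ratio first
    return lower in by_lower, [mod for _, mod in scores[:numchoices]]

if __name__ == "__main__":
    is_exact, choices = match("browser")
    print("Hmm.  My best guesses are", ", ".join(choices))
    if is_exact:
        print("(BTW, the first choice was an exact match)")
```

As in the original, the input word goes in via set_seq2 so that the per-candidate work stays cheap.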




From esr at thyrsus.com  Sun Jan 14 13:15:33 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 07:15:33 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 05:45:53AM -0500
References: <20010114020857.E19782@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>
Message-ID: <20010114071533.A5812@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Yet the narrower the domain, the less call for a library with multiple
> approaches.  If R-O really shone for you, why bother with anything else?

Well, I was bothering with Levenshtein because *you* suggested it. :-)

I put in Hamming similarity and stemming because they're O(n) where
R/O is quadratic, and both widely used in situations where a fast sloppy
job is preferable to a good but slow one.  My documentation page is explicit
about the tradeoff.

> Seriously.  You haven't used some (most?) of these. 

I've used stemming and R-O.  Haven't used Hamming or Levenshtein.

>                                   The core isn't a place
> for research modules either (note that I have no objection whatsoever to
> writing any module you like -- the only question here is what belongs in the
> core, and any algorithm *nobody* here has experience with in your target
> domain is plainly a poor *core* candidate for that reason alone -- we have
> to maintain, justify and explain it for years to come).

Fair point.  I read it, in this context, as good advice to drop the Hamming 
entry point and forget about the Levenshtein implementation -- stick to what
I've used and know is useful as opposed to what I think might be useful.

>                                                I may well agree you don't
> need all that heavy machinery if I had a clear definition of what problem it
> is you're trying to solve (I've learned it's not the kinds of problems *I*
> had in mind when I first read your description!).

I think you have it by now, judging by the following...

> What happens when the user doesn't enter an exact match?  Does the kind of
> app you have in mind then just present them with a list of choices? 

Yes.  I've used this technique a lot.  It gives users not just guidance 
but warm fuzzy feelings -- they react as though there's a friendly 
homunculus inside the software looking out for them.  Actually, in my
experience, the less techie they are the more they like this.

> If that's all (as opposed to, e.g., substituting its best guess for what the
> user actually typed and proceeding as if the user had given that from the
> start), then the evidence from studies says users are almost as pleased when
> the correct choice appears somewhere in the first three choices as when it
> appears as *the* top choice.

Interesting.  That does fit what I've seen.

>                    A well-designed vocabulary can almost
> guarantee that happy result (note that most of the current research is aimed
> at the much harder job of getting the intended word into the #1 slot on the
> choice list).

Yes.  One of my other tricks is to design command vocabularies so the
first three characters are close to unique.  This means R/O will almost
always nail the right thing.

> Much as in typing, some mutations are more likely than others for *physical*
> reasons, so treating all pairs of symbols in the alphabet alike is too gross
> a simplification.).

Indeed.  Couple weeks ago I was a speaker at a conference called "After the
Genome 6" at which one of the most interesting papers was given by a lady
mathematician who designs algorithms for DNA sequence matching.  She made
exactly this point.

> > That's why good documentation, with motivating usage hints, is
> > important.  I write good documentation, Tim.
> 
> You're not going to find offense here even if you look for it, Eric <wink>:

No worries, I wasn't looking. :-)

> Most people will be looking for "a solution", not for "a toolkit".  If the
> docs read like a toolkit, it doesn't matter how good they are, the bulk of
> the people you're trying to reach will pass on it.  If you really want this
> to be *used*, supply one class that does *all* the work, including making
> the expert-level choices of which algorithm is used under the covers and how
> it's tuned.  That's good advice.

I don't think that's possible in this case -- the proper domains for
stemming and R-O are too different.  But maybe this is another nudge to drop
the Hamming code.

>       But to get it under his radar, it's again much easier if the usage
> docs are no longer than a couple paragraphs.

How's this?

\section{\module{simil} --
         String similarity metrics}

\declaremodule{standard}{simil}
\moduleauthor{Eric S. Raymond}{esr at thyrsus.com}
\modulesynopsis{String similarity metrics.}

\sectionauthor{Eric S. Raymond}

The \module{simil} module provides similarity functions for
approximate word or string matching.  One important application is for
checking input words against a dictionary to match possible
misspellings with the right terms in a controlled vocabulary.

The entry points provide different tradeoffs ranging from crude and
fast (stemming) to effective but slow (Ratcliff-Obershelp gestalt
subpattern matching).  The latter is one of the standard techniques
used in commercial OCR software.

The \module{simil} module defines the following functions:

\begin{funcdesc}{stem}{}
Returns the length of the longest common prefix of two strings divided
by the length of the longer.  Similarity scores range from 0.0 (no
common prefix) to 1.0 (identity).  Running time is linear in string
length.
\end{funcdesc}

\begin{funcdesc}{hamming}{}
Computes a normalized Hamming similarity between two strings of equal
length -- the number of pairwise matches in the strings, divided by
their common length.  It returns None if the strings are of unequal
length.  Similarity scores range from 0.0 (no positions equal) to 1.0
(identity).  Running time is linear in string length.
\end{funcdesc}

\begin{funcdesc}{ratcliff}{}
Returns a Ratcliff/Obershelp gestalt similarity score based on
co-occurrence of subpatterns.  Similarity scores range from 0.0 (no
common subpatterns) to 1.0 (identity).  Running time is best-case
linear, worst-case quadratic in string length.
\end{funcdesc}
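
The two linear-time entry points are simple enough to sketch in a few lines.  This is a Python 3 reading of the descriptions above, not the simil source; the two-argument signatures and the handling of empty strings are assumptions:

```python
def stem(s1, s2):
    """Longest common prefix length divided by the length of the longer string."""
    longer = max(len(s1), len(s2))
    if longer == 0:
        return 1.0               # assumption: two empty strings count as identical
    prefix = 0
    for c1, c2 in zip(s1, s2):
        if c1 != c2:
            break
        prefix += 1
    return prefix / longer

def hamming(s1, s2):
    """Fraction of positions that agree, or None for strings of unequal length."""
    if len(s1) != len(s2):
        return None
    if not s1:
        return 1.0               # assumption: same convention as stem()
    matches = sum(c1 == c2 for c1, c2 in zip(s1, s2))
    return matches / len(s1)

print(stem("parse", "parser"))
print(hamming("karolin", "kathrin"))
```

Both make a single pass over the strings, which is where the linear running time comes from.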

> Module name? http
> Hmm.  My best guesses are httplib, tty, stat
> 
> [I was thinking of httplib, but note that it missed
>  SimpleHTTPServer:  a name that long just isn't going to score
>  high when the input is that short]

>>> simil.ratcliff("http", "httplib")
0.72727274894714355
>>> simil.ratcliff("http", "tty")
0.57142859697341919
>>> simil.ratcliff("http", "stat")
0.5
>>> simil.ratcliff("http", "simplehttpserver")
0.40000000596046448

So with the 0.6 threshold I normally use R-O does better at eliminating
the false matches but doesn't catch SimpleHTTPServer (case is, I'm
sure you'll agree, an irrelevant detail here).
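
A threshold filter of this kind can be tried without simil.  A minimal sketch, assuming Python 3's difflib.get_close_matches as a stand-in scorer (it is not simil.ratcliff, and the lowercased candidate list is just the four names from this example):

```python
from difflib import get_close_matches

candidates = ["httplib", "tty", "stat", "simplehttpserver"]

# Keep at most three candidates whose similarity ratio is at least 0.6.
kept = get_close_matches("http", candidates, n=3, cutoff=0.6)
print(kept)
```

With the cutoff in place only httplib survives, and raising n does not bring the others back.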
 
> Module name? dictionary
> Hmm.  My best guesses are Bastion, ConfigParser, tabnanny
> 
> [darn, I *think* I was thinking of UserDict there]

>>> simil.ratcliff("dictionary", "bastion")
0.47058823704719543
>>> simil.ratcliff("dictionary", "configparser")
0.45454546809196472
>>> simil.ratcliff("dictionary", "tabnanny")
0.4444444477558136
>>> simil.ratcliff("dictionary", "userdict")
0.4444444477558136

R-O would have booted all of these.  Highest score to configparser.
Interesting -- I'm beginning to think R-O overweights lots of small
subpattern matches relative to a few big ones, something I didn't notice
before because the statistics of my vocabularies masked it.

> Module name? uuencode
> Hmm.  My best guesses are code, codeop, codecs

>>> simil.ratcliff("uuencode", "code")
0.66666668653488159
>>> simil.ratcliff("uuencode", "codeops")
0.53333336114883423
>>> simil.ratcliff("uuencode", "codecs")
0.57142859697341919
>>> simil.ratcliff("uuencode", "uu")
0.40000000596046448

R-O would pick "code" and boot the rest.

> [Missed uu]
> 
> Module name? parse
> Hmm.  My best guesses are tzparse, urlparse, pre

>>> simil.ratcliff("parse", "tzparse")
0.83333331346511841
>>> simil.ratcliff("parse", "urlparse")
0.76923078298568726
>>> simil.ratcliff("parse", "pre")
0.75

Same result.

> Module name? browser
> Hmm.  My best guesses are webbrowser, robotparser, user

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

Big win for R-O.  Picks the right one, boots the wrong two.

> Module name? brower
> Hmm.  My best guesses are webbrowser, repr, reconvert

>>> simil.ratcliff("brower", "webbrowser")
0.75
>>> simil.ratcliff("brower", "repr")
0.60000002384185791
>>> simil.ratcliff("brower", "reconvert")
0.53333336114883423

Small win for R/O -- boots reconvert, and repr squeaks in under the wire.

> Module name? Thread
> Hmm.  My best guesses are threading, whrandom, sched

>>> simil.ratcliff("thread", "threading")
0.80000001192092896
>>> simil.ratcliff("thread", "whrandom")
0.57142859697341919
>>> simil.ratcliff("thread", "sched")
0.54545456171035767

Big win for R-O.

> Module name? pickle
> Hmm.  My best guesses are pickle, profile, tempfile

>>> simil.ratcliff("pickle", "pickle")
1.0
>>> simil.ratcliff("pickle", "profile")
0.61538463830947876
>>> simil.ratcliff("pickle", "tempfile")
0.57142859697341919

R-O wins again.

> (BTW, the first choice was an exact match)
> Module name? shelf
> Hmm.  My best guesses are shelve, shlex, sched

>>> simil.ratcliff("shelf", "shelve")
0.72727274894714355
>>> simil.ratcliff("shelf", "shlex")
0.60000002384185791
>>> simil.ratcliff("shelf", "sched")
0.60000002384185791

Interesting.  Shelve scores highest, both the others squeak in.

> Module name? katmandu
> Hmm.  My best guesses are commands, random, anydbm
>
> [I really was thinking of "commands"!]

>>> simil.ratcliff("commands", "commands")
1.0
>>> simil.ratcliff("commands", "random")
0.4285714328289032
>>> simil.ratcliff("commands", "anydbm")
0.4285714328289032

R-O wins big.
 
> Module name? temporary
> Hmm.  My best guesses are tzparse, tempfile, fpformat

>>> simil.ratcliff("temporary", "tzparse")
0.5
>>> simil.ratcliff("temporary", "tempfile")
0.47058823704719543
>>> simil.ratcliff("temporary", "fpformat")
0.47058823704719543

R-O boots all of these.  

> Hard question:  is that "good enough" for what you want?

Um...notice that R-O filtering, even though it seems to be
underweighting large matches, did a rather better job on your examples!
With an 0.66 threshold it would have done *much* better.

I think you've just made an argument for replacing your SequenceMatcher
with simil.ratcliff.  Mine's even documented. :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Militias, when properly formed, are in fact the people themselves and
include all men capable of bearing arms. [...] To preserve liberty it is
essential that the whole body of the people always possess arms and be
taught alike, especially when young, how to use them.
        -- Senator Richard Henry Lee, 1788, on "militia" in the 2nd Amendment



From ping at lfw.org  Sun Jan 14 13:38:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 04:38:42 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>

Sorry i'm being forgetful -- could someone please refresh my memory:

Was there a good reason for allowing both lowercase and capital 'r'
as a prefix for raw-strings?  I assume that the availability of both
r'' and R'' is what led to having both u'' and U''.  Is there any
good reason for that either?

This just seems to lead to ambiguity and unneeded complexity:
more cases in tokenize.py, more cases in tokenize.c, more work
for IDLE, more annoying when searching for u' in your editor.
(I was about to fix the lack of u'' support in tokenize.py and
that made me think about this.)

What happened to TOOWTDI?

Would you believe we now have 36 different ways of starting a string:

    '      "      '''    """
    r'     r"     r'''   r"""
    u'     u"     u'''   u"""
    ur'    ur"    ur'''  ur"""
    R'     R"     R'''   R"""
    U'     U"     U'''   U"""
    uR'    uR"    uR'''  uR"""
    Ur'    Ur"    Ur'''  Ur"""
    UR'    UR"    UR'''  UR"""

Would it be outrageous to suggest deprecating the last five rows?
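
The count is easy to check mechanically.  A small sketch (Python 3, only building the opener strings as text; it does not ask the tokenizer which of them are still legal):

```python
from itertools import product

quotes = ["'", '"', "'''", '"""']

# No prefix, r/R, u/U, then u followed by r in every case combination.
prefixes = [""] + list("rRuU") + [u + r for u, r in product("uU", "rR")]

openers = [p + q for p, q in product(prefixes, quotes)]
print(len(prefixes), "prefixes x", len(quotes), "quote styles =", len(openers))
```

Nine prefixes times four quote styles gives the 36 rows-by-columns in the table above.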


-- ?!ng

[1] We started with 4.  Perl has (by my count) 381 ways of starting
    a string literal, so we're halfway there, logarithmically speaking.
    Perl has 757 if you count the fancier operators qx, qw, s, and tr.




From mal at lemburg.com  Sun Jan 14 14:33:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:33:29 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <3A61AAA9.F6F1EA9F@lemburg.com>

[Lots of talk about interesting algorithms for "human" pattern matching]

I just want to add my 2 cents to the discussion:

* Eric's package seems very useful for pattern matching, but that
  is a very specific domain -- not main stream

* I would opt to create a neat distutils style package for it
  for people to install at their own liking (I would certainly
  like it :)

* If wrapped up as a separate package, I'd suggest to add all
  known algorithms to the package and also make it Unicode
  aware. There are similar packages for e.g. RNGs on Parnassus.

BTW, are there less English centric "sounds alike" matchers
around ? The NIST soundex algorithm as published on the internet:

    http://physics.nist.gov/cuu/Reference/soundex.html

works fine for English texts, but other languages of course
have different letter coding requirements (or even different
alphabets).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Sun Jan 14 14:53:03 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:53:03 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A61AF3F.EE6DAB88@lemburg.com>

Ka-Ping Yee wrote:
> 
> Sorry i'm being forgetful -- could someone please refresh my memory:
> 
> Was there a good reason for allowing both lowercase and capital 'r'
> as a prefix for raw-strings?  I assume that the availability of both
> r'' and R'' is what led to having both u'' and U''. 

Right.

> Is there any
> good reason for that either?

No idea... I have never used anything other than the lowercase
versions.
 
> This just seems to lead to ambiguity and unneeded complexity:
> more cases in tokenize.py, more cases in tokenize.c, more work
> for IDLE, more annoying when searching for u' in your editor.
> (I was about to fix the lack of u'' support in tokenize.py and
> that made me think about this.)
> 
> What happened to TOOWTDI?
> 
> Would you believe we now have 36 different ways of starting a string:
> 
>     '      "      '''    """
>     r'     r"     r'''   r"""
>     u'     u"     u'''   u"""
>     ur'    ur"    ur'''  ur"""
>     R'     R"     R'''   R"""
>     U'     U"     U'''   U"""
>     uR'    uR"    uR'''  uR"""
>     Ur'    Ur"    Ur'''  Ur"""
>     UR'    UR"    UR'''  UR"""
>
> Would it be outrageous to suggest deprecating the last five rows?

No. + 1 on the idea.
 
> -- ?!ng
> 
> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Sun Jan 14 15:24:08 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 14 Jan 2001 15:24:08 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Sun, Jan 14, 2001 at 04:38:42AM -0800
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010114152408.G1005@xs4all.nl>

On Sun, Jan 14, 2001 at 04:38:42AM -0800, Ka-Ping Yee wrote:

> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.

Don't forget 'qr//', which is quite like a raw string, except that Perl uses
it to 'precompile' regular expressions as a side effect. 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Sun Jan 14 18:08:28 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 12:08:28 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:53:03 +0100."
             <3A61AF3F.EE6DAB88@lemburg.com> 
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>  
            <3A61AF3F.EE6DAB88@lemburg.com> 
Message-ID: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > Sorry i'm being forgetful -- could someone please refresh my memory:
> > 
> > Was there a good reason for allowing both lowercase and capital 'r'
> > as a prefix for raw-strings?  I assume that the availability of both
> > r'' and R'' is what led to having both u'' and U''. 
> 
> Right.
> 
> > Is there any
> > good reason for that either?
> 
> No idea... I have never used anything other than the lowercase
> versions.

It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
well as 0l.  So does Python (and also 0j == 0J).

> > This just seems to lead to ambiguity and unneeded complexity:
> > more cases in tokenize.py, more cases in tokenize.c, more work
> > for IDLE, more annoying when searching for u' in your editor.
> > (I was about to fix the lack of u'' support in tokenize.py and
> > that made me think about this.)
> > 
> > What happened to TOOWTDI?
> > 
> > Would you believe we now have 36 different ways of starting a string:
> > 
> >     '      "      '''    """
> >     r'     r"     r'''   r"""
> >     u'     u"     u'''   u"""
> >     ur'    ur"    ur'''  ur"""
> >     R'     R"     R'''   R"""
> >     U'     U"     U'''   U"""
> >     uR'    uR"    uR'''  uR"""
> >     Ur'    Ur"    Ur'''  Ur"""
> >     UR'    UR"    UR'''  UR"""
> >
> > Would it be outrageous to suggest deprecating the last five rows?
> 
> No. + 1 on the idea.

Why bother?  All that does is outdate a bunch of documentation.  I
don't see the extra effort in various parsers as a big deal.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sun Jan 14 18:53:32 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 18:53:32 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
Message-ID: <010f01c07e52$e9801fc0$e46940d5@hagrid>

The name database portions of SF task 17335 ("add
compressed unicode database") were postponed to
2.1.

My current patch replaces the ~450k large ucnhash
module with a new ~160k large module.  (See earlier
posts for more info on how the new database works).

Should I check it in?

</F>




From skip at mojam.com  Sun Jan 14 18:51:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 14 Jan 2001 11:51:52 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
Message-ID: <14945.59192.400783.403810@beluga.mojam.com>

Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
standard distribution.

Biggest hook for me:

   1. execute "pydoc -p 3200"
   2. visit "http://localhost:3200/"
   3. knock yourself out

Skip



From martin at mira.cs.tu-berlin.de  Sun Jan 14 18:57:57 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 14 Jan 2001 18:57:57 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <200101141757.f0EHvvt01407@mira.informatik.hu-berlin.de>

> > Would it be outrageous to suggest deprecating the last five rows?
> Why bother?  All that does is outdate a bunch of documentation.

He suggested to deprecate it, not to remove it. By the time it is
removed, the documentation still mentioning it should be outdated for
other reasons (e.g. the string module might have disappeared).

In general, the rationale for deprecating things would be that the
simplification will make everybody's life easier in the long run. In
the case of a small change (such as this one), that advantage would be
small. OTOH, the hassle for users that rely on the then-removed
feature will be also small; I see it as quite unlikely that anybody
uses that feature actively (although I do think that people use 0X10
and 100L; the latter is common since 100l is oft confused with 1001).

Regards,
Martin



From tim.one at home.com  Sun Jan 14 20:00:21 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:00:21 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114071533.A5812@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>

Very quick (swamped):

> I think you've just made an argument for replacing your
> SequenceMatcher with simil.ratcliff.

Actually, I'm certain they're the same algorithm now, except the C is
showing through in ratcliff to the floating-point eye <wink>.  For
demonstration, I *always* printed the top three scorers (that's logic in the
little driver I posted, not in SequenceMatcher), without any notion of
cutoff (ndiff does use a cutoff).  Add this line before the return (in the
posted driver) to see the actual scores:

    print scores[:numchoices]

For example:

Module name? browser
[(0.82352941176470584, 'webbrowser'),
 (0.55555555555555558, 'robotparser'),
 (0.54545454545454541, 'user')]
Hmm.  My best guesses are webbrowser, robotparser, user
Module name?

On this example you reported:

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

which strongly suggests you're using C floats instead of Python floats to
compute the final score.  I didn't try every example in your email, but it's
the same story on the three I did try (scores identical modulo
simil.ratcliff dropping about 30 of the low-order result bits -- which is
about the difference between a C double and a C float on most boxes).
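
The effect is easy to reproduce.  A small sketch, assuming Python 3 and the struct module to squeeze a Python float through a 32-bit C float; the 7-matched-characters figure is my reading of the "browser" vs "webbrowser" score, not something taken from either implementation:

```python
import struct

def as_c_float(x):
    """Round a Python float (a C double) to the nearest 32-bit C float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

# 2 * matched / (len("browser") + len("webbrowser")), assuming 7 matched chars
score = 2.0 * 7 / (7 + 10)

print(repr(score))               # full double precision
print(repr(as_c_float(score)))   # what survives a trip through a C float
```

The second value agrees with the first to about 7 significant digits and then drifts, which is the pattern in the simil.ratcliff output quoted above.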

> Mine's even documented. :-).

Which I appreciate!  I dreamt up the SequenceMatcher algorithm going on 20
years ago for a friendly diff generator, and never even considered using it
for other purposes.  But then I may have mentioned that these other purposes
never come up in my apps <wink>.

or-at-least-they-haven't-in-contexts-where-r/o-would-have-been-
    strong-enough-ly y'rs  - tim




From bckfnn at worldonline.dk  Sun Jan 14 20:00:33 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 19:00:33 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3a61f12a.36601630@smtp.worldonline.dk>

On Sun, 14 Jan 2001 18:53:32 +0100, you wrote:

>The name database portions of SF task 17335 ("add
>compressed unicode database") were postponed to
>2.1.
>
>My current patch replaces the ~450k large ucnhash
>module with a new ~160k large module.  (See earlier
>posts for more info on how the new database works).

Do you have a link or an approx date of these earlier posts? I must have
missed it. The patch on sourceforge seems a bit empty:

https://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100899&group_id=5470

As a result I invented my own compression format for the ucnhash for
jython. I managed to achieve ~100k but that probably has different
performance properties.

regards,
finn



From esr at thyrsus.com  Sun Jan 14 20:09:01 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 14:09:01 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:00:21PM -0500
References: <20010114071533.A5812@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>
Message-ID: <20010114140901.A6431@thyrsus.com>

Tim Peters <tim.one at home.com>:
> > I think you've just made an argument for replacing your
> > SequenceMatcher with simil.ratcliff.
> 
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

Take a look:

/*****************************************************************************
 *
 * Ratcliff-Obershelp common-subpattern similarity.
 *
 * This code first appeared in a letter to the editor in Doctor
 * Dobbs's Journal, 11/1988.  The original article on the algorithm,
 * "Pattern Matching by Gestalt" by John Ratcliff, had appeared in the
 * July 1988 issue (#181) but the algorithm was presented in assembly.
 * The main drawback of the Ratcliff-Obershelp algorithm is the cost
 * of the pairwise comparisons.  It is significantly more expensive
 * than stemming, Hamming distance, soundex, and the like.
 *
 * Running time quadratic in the data size, memory usage constant.
 *
 *****************************************************************************/

static int RatcliffObershelp(char *st1, char *end1, char *st2, char *end2)
{
    register char *a1, *a2;
    char *b1, *b2; 
    char *s1 = st1, *s2 = st2;	/* initializations are just to pacify GCC */
    short max, i;

    if (end1 <= st1 || end2 <= st2)
	return(0);
    if (end1 == st1 + 1 && end2 == st2 + 1)
	return(0);
		
    max = 0;
    b1 = end1; b2 = end2;
	
    for (a1 = st1; a1 < b1; a1++)
    {
	for (a2 = st2; a2 < b2; a2++)
	{
	    if (*a1 == *a2)
	    {
		/* determine length of common substring, staying inside both ranges */
		for (i = 1; a1 + i < end1 && a2 + i < end2 && a1[i] == a2[i]; i++)
		    continue;
		if (i > max)
		{
		    max = i; s1 = a1; s2 = a2;
		    b1 = end1 - max; b2 = end2 - max;
		}
	    }
	}
    }
    if (!max)
	return(0);
    max += RatcliffObershelp(s1 + max, end1, s2 + max, end2);	/* rhs */
    max += RatcliffObershelp(st1, s1, st2, s2);			/* lhs */
    return max;
}

static float ratcliff(char *s1, char *s2)
/* compute Ratcliff-Obershelp similarity of two strings */
{
    short l1, l2;

    l1 = strlen(s1);
    l2 = strlen(s2);
	
    /* exact match end-case */
    if (l1 == 1 && l2 == 1 && *s1 == *s2)
	return(1.0);
			
    return 2.0 * RatcliffObershelp(s1, s1 + l1, s2, s2 + l2) / (l1 + l2);
}

static PyObject *
simil_ratcliff(PyObject *self, PyObject *args)
{
    char *str1, *str2;
    
    if(!PyArg_ParseTuple(args, "ss:ratcliff", &str1, &str2))
        return NULL;

    return Py_BuildValue("f", ratcliff(str1, str2));
}
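
For comparison, the same recursion can be sketched in Python 3.  This is a plain restatement of the idea (find the longest common substring, then recurse on what lies to its left and right), not a translation of the C above; the function names and the empty-string convention are assumptions:

```python
def matched_chars(a, b):
    """Characters matched by recursive longest-common-substring splitting."""
    best_len = best_i = best_j = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best_len:                 # keep the first longest match found
                best_len, best_i, best_j = k, i, j
    if best_len == 0:
        return 0
    left = matched_chars(a[:best_i], b[:best_j])
    right = matched_chars(a[best_i + best_len:], b[best_j + best_len:])
    return best_len + left + right

def ratcliff(s1, s2):
    """Similarity in [0.0, 1.0]: twice the matched characters over total length."""
    if not s1 and not s2:
        return 1.0                           # assumption: empty strings are identical
    return 2.0 * matched_chars(s1, s2) / (len(s1) + len(s2))

print(ratcliff("browser", "webbrowser"))
print(ratcliff("dictionary", "userdict"))
```

Slicing keeps each recursive call inside its own piece of the strings, at the cost of copying; the pointer version avoids the copies but has to enforce the same bounds by hand.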
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Taking my gun away because I might shoot someone is like cutting my tongue
out because I might yell `Fire!' in a crowded theater."
        -- Peter Venetoklis



From fredrik at effbot.org  Sun Jan 14 20:31:06 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 20:31:06 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk>
Message-ID: <040e01c07e60$8c74d100$e46940d5@hagrid>

finn wrote:
> As a result I invented my own compression format for the ucnhash for
> jython. I managed to achieve ~100k but that probably has different
> performance properties.

here's the description:

---

From: "Fredrik Lundh" <effbot at telia.com>
Date: Sun, 16 Jul 2000 20:40:46 +0200

/.../

    The unicodenames database consists of two parts: a name
    database which maps character codes to names, and a code
    database, mapping names to codes.

* The Name Database (getname)

    First, the 10538 text strings are split into 42193 words,
    and combined into a 4949-word lexicon (a 29k array).

    Each word is given a unique index number (common words get
    lower numbers), and there's a "lexicon offset" table mapping
    from numbers to words (10k).

    To get back to the original text strings, I use a "phrase
    book".  For each original string, the phrase book stores a
    list of word numbers.  Numbers 0-127 are stored in one byte,
    higher numbers (less common words) use two bytes.  At this
    time, about 65% of the words can be represented by a single
    byte.  The result is a 56k array.
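
A minimal sketch of the one-or-two-byte word-number encoding described
above.  The description only says that numbers 0-127 take one byte and
higher numbers take two; the exact two-byte layout used here (high bit
set on the first byte, 15 bits of payload) is an assumption, and the
function names are hypothetical.

```python
def encode_phrase(word_numbers):
    """Encode a list of lexicon indexes into phrase-book bytes."""
    out = bytearray()
    for n in word_numbers:
        if n < 128:
            out.append(n)                  # common word: one byte
        elif n < 0x8000:
            out.append(0x80 | (n >> 8))    # assumed layout: flag bit + high 7 bits
            out.append(n & 0xFF)           # low 8 bits
        else:
            raise ValueError("word number too large for two bytes")
    return bytes(out)


def decode_phrase(data):
    """Decode phrase-book bytes back into lexicon indexes."""
    numbers = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 128:                        # one-byte word number
            numbers.append(b)
            i += 1
        else:                              # two-byte word number
            numbers.append(((b & 0x7F) << 8) | data[i + 1])
            i += 2
    return numbers
```

With a 4949-word lexicon every index fits in the 15-bit payload, so the
two-byte form never overflows in this sketch.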

    The final data structure is an offset table, which maps code
    points to phrase book offsets.  Instead of using one big
    table, I split each code point into a "page number" and a
    "line number" on that page.

      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]

    Since the unicode space is sparsely populated, it's possible
    to split the code so that lots of pages get no contents.  I
    use a brute force search to find the optimal SHIFT value.

    In the current database, the page table has 1024 entries
    (SHIFT is 6), and there are 199 unique pages in the line
    table.  The total size of the offset table is 26k.
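
The page/line split can be sketched as follows.  This is an
illustration of the lookup formula above, not the generator script
itself: the 16-bit code space, the use of 0 for "no name", and the
names build_tables and lookup are assumptions.

```python
SHIFT = 6
MASK = (1 << SHIFT) - 1
CODE_SPACE = 1 << 16          # assumed: 16-bit code points


def build_tables(offsets):
    """offsets maps code point -> phrase book offset (0 means no name)."""
    page = []                 # one entry per block of 2**SHIFT code points
    line = []                 # the unique blocks, stored back to back
    seen = {}                 # block contents -> index of that block in line
    for start in range(0, CODE_SPACE, 1 << SHIFT):
        block = tuple(offsets.get(start + k, 0) for k in range(1 << SHIFT))
        if block not in seen:
            seen[block] = len(seen)
            line.extend(block)
        page.append(seen[block])
    return page, line


def lookup(page, line, code):
    """Same expression as in the description above."""
    return line[(page[code >> SHIFT] << SHIFT) + (code & MASK)]
```

All empty blocks collapse into a single shared page, which is where the
saving comes from when the code space is sparsely populated.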

* The code database (getcode)

    For the code table, I use a straight-forward hash table to store
    name to code mappings.  It's basically the same implementation
    as in Python's dictionary type, but a different hash algorithm.
    The table lookup loop simply uses the name database to check
    for hits.

    In the current database, the hash table is 32k.
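
A rough sketch of the idea that the hash table stores only codes and
checks hits through the name database.  The hash function, the linear
probing, and all names here are assumptions made for brevity; they are
not the hash algorithm or the probe sequence of the real module, and
the plain dict `names` stands in for getname.

```python
def name_hash(name, size):
    """Toy string hash (an assumption, not the module's algorithm)."""
    h = 0
    for ch in name.upper():
        h = (h * 47 + ord(ch)) & 0xFFFFFFFF
    return h % size


def build_code_table(names, size):
    """names maps code -> upper-case name.  The table holds codes only."""
    if size <= len(names):
        raise ValueError("table must have at least one free slot")
    table = [None] * size
    for code, name in names.items():
        slot = name_hash(name, size)
        while table[slot] is not None:
            slot = (slot + 1) % size       # linear probing, for brevity
        table[slot] = code
    return table


def getcode(table, names, name):
    """Look up a name; a hit is confirmed by asking the name database."""
    size = len(table)
    slot = name_hash(name, size)
    for _ in range(size):
        code = table[slot]
        if code is None:
            raise KeyError(name)
        if names[code] == name.upper():    # stand-in for getname(code)
            return code
        slot = (slot + 1) % size
    raise KeyError(name)
```

Because the names are never stored in the table, its size depends only
on the number of slots and the width of a code.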

/.../

</F>




From tim.one at home.com  Sun Jan 14 20:46:44 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:46:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A61AAA9.F6F1EA9F@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>

[M.-A. Lemburg]
> BTW, are there less English centric "sounds alike" matchers
> around ?

Yes, but if anything there are far too many of them:  like Soundex, they're
just heuristics, and *everybody* who cares adds their own unique twists,
while proper studies are almost non-existent.  Few variants appear to be in
use much beyond their inventor's friends; one notable exception in the
Jewish community is the Daitch-Mokotoff variation, originally tailored to
their unique needs but later generalized; a brief description here:

    http://www.avotaynu.com/soundex.html

The similarly involved NYSIIS algorithm (New York State Identification
Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
field of about two dozen competing algorithms, after measuring their
effectiveness on assorted databases maintained by the state of New York.
Since New York has a large immigrant population, NYSIIS isn't as
Anglocentric as Soundex either.

But state-of-the-art has given up on purely computational algorithms for
these purposes:  proper names are simply too much a mess.  For example, if I
search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
actually use just aren't reducible to pure computation -- it takes a large
knowledge base to capture what people "just know".  You may enjoy visiting
this commercial site (AFAIK, nobody is giving away state-of-the-art for
free):

    http://www.las-inc.com/

> ...
>     http://physics.nist.gov/cuu/Reference/soundex.html
>
> works fine for English texts,

If that were true, the English-speaking researchers would have declared
victory 120 years ago <wink>.  But English pronunciation is *notoriously*
difficult to predict from spelling, partly because English is the Perl of
human languages.

or-maybe-the-borg-assuming-there's-a-difference<wink>-ly y'rs  - tim




From esr at thyrsus.com  Sun Jan 14 21:17:53 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:17:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:46:44PM -0500
References: <3A61AAA9.F6F1EA9F@lemburg.com> <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <20010114151753.A6671@thyrsus.com>

Tim Peters <tim.one at home.com>:
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Actually, according to the Oxford Encyclopedia of Linguistics, this is
an urban myth.  The orthography of English is, in fact, quite
consistent; it looks much more wacked out than it is because the
maddening irregularities are concentrated in the 400 most commonly
used words.

The situation is much like that with French verb forms -- most French
verbs have a very regular inflection pattern, but the twenty or so
exceptions are the most commonly used ones.  In fact it's a general
rule in language evolution that irregularities are preserved in common
forms and not rare ones -- in the rare ones they get forgotten.

American personal names are a problem precisely because they sometimes
do *not* have English orthography.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

  "...quemadmodum gladius neminem occidit, occidentis telum est."
[...a sword never kills anybody; it's a tool in the killer's hand.]
        -- (Lucius Annaeus) Seneca "the Younger" (ca. 4 BC-65 AD),



From tim.one at home.com  Sun Jan 14 21:31:06 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 15:31:06 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114140901.A6431@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>

[Tim]
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

[Eric]
> Take a look:

Yup, same thing, except:

> static float ratcliff(char *s1, char *s2)

accounts for the numeric differences (change "float"->"double" and they'd be
the same; Python has to convert it to a double anyway, lacking any internal
support for C's floats; and the C code is *computing* in double regardless,
cutting it back to a float upon return just because of the "float" decl).
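
The effect can be reproduced from Python with the struct module, which
can round a double to a C float and widen it back.  A small sketch; the
helper name as_c_float and the sample ratio are only illustrations.

```python
import struct


def as_c_float(x):
    """Round a Python float (a C double) to the nearest C float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


ratio = 2.0 * 6 / 14            # a similarity score computed in double
narrowed = as_c_float(ratio)    # what a "float" return type would hand back
print(repr(ratio), repr(narrowed))
```

The two printed values agree only to about seven significant digits,
which is the kind of difference a "float" return type introduces.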

The code in SequenceMatcher doesn't *look* anything like it, though, due to
years of dreaming up faster ways to do this (in its original role as a diff
generator, it routinely had to deal with sequences containing 10s of
thousands of elements, and code very much like the code you posted was just
too slow for that).

One simple trick that can enormously speed the worst cases:  the "find the
longest match starting here" innermost loop is guarded by

> 	    if (*a1 == *a2)

However, it can't possibly find a *bigger* max unless it's also the case
that

    a1[max] == a2[max]

That's usually false in real life, so by adding that test to the guard you
usually get to skip the innermost loop entirely.  Probably more important in
a diff-generator role, though.
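
In Python terms the guarded search might look like the sketch below.
It is a plain port of the brute-force longest-match loop with the extra
test added, written for illustration; the name longest_match is
hypothetical and this is not the code in SequenceMatcher.

```python
def longest_match(a, b):
    """Return (i, j, size) for the first longest common substring of a and b."""
    best_i = best_j = best = 0
    for i in range(len(a)):
        if len(a) - i <= best:
            break                          # nothing longer can start here
        for j in range(len(b)):
            if len(b) - j <= best:
                break                      # nothing longer can start here
            # Guard: a longer match must also agree at offset `best`.
            # When best is 0 the second test repeats the first.
            if a[i] != b[j] or a[i + best] != b[j + best]:
                continue
            k = 1
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best:
                best_i, best_j, best = i, j, k
    return best_i, best_j, best
```

The two early breaks keep the indexes i + best and j + best inside the
strings, so the guard itself never reads out of range.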

SequenceMatcher's prime trick is to preprocess one of the strings, in linear
time building up a hash table mapping each character in the string to a list
of the indices at which it appears.  Then the second-innermost loop is saved
from needing to do any search:  when we get to, e.g., 'x' in the other
string, the precomputed hash table tells us directly where to find all the
x's in the original string.  And in the match-1-against-N case, this hash
table can be computed once & reused N times.  That's a monster win.
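
A simplified sketch of that preprocessing step.  The names build_index
and longest_match_indexed are hypothetical, and the matching loop is
deliberately naive; the real SequenceMatcher code is organized
differently.

```python
from collections import defaultdict


def build_index(b):
    """Map each element of b to the list of positions where it occurs."""
    index = defaultdict(list)
    for j, ch in enumerate(b):
        index[ch].append(j)
    return index


def longest_match_indexed(a, b, index=None):
    """Longest common substring of a and b, using a position index for b."""
    if index is None:
        index = build_index(b)
    best_i = best_j = best = 0
    for i, ch in enumerate(a):
        # Jump straight to the places in b where this element occurs;
        # no scan over b is needed.
        for j in index.get(ch, ()):
            k = 1
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best:
                best_i, best_j, best = i, j, k
    return best_i, best_j, best
```

For the 1-against-N case, build the index once for the fixed string and
pass it in for each of the N candidates.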

However, I never had the patience to code that in C, so I never *did* that
before I reimplemented my stuff in Python.  Now the Python ndiff runs
circles around the old Pascal and C versions.  I'm sure that has nothing to
do with machines having gotten 100x faster in the meantime <wink>.

for-short-1-against-1-matches-yours-will-certainly-be-quicker-ly
    y'rs  - tim




From guido at python.org  Sun Jan 14 21:55:21 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 15:55:21 -0500
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: Your message of "Sun, 14 Jan 2001 11:51:52 CST."
             <14945.59192.400783.403810@beluga.mojam.com> 
References: <14945.59192.400783.403810@beluga.mojam.com> 
Message-ID: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>

> Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
> standard distribution.
> 
> Biggest hook for me:
> 
>    1. execute "pydoc -p 3200"
>    2. visit "http://localhost:3200/"
>    3. knock yourself out

Yes, wow!

Now, if we could somehow get this to show both the docs that Fred
maintains and the stuff that Ping extracts from the source code, that
would be even better!  (I think that Ping's stuff should also run on
the python.org site, by the way.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan 14 21:59:28 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:59:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 03:31:06PM -0500
References: <20010114140901.A6431@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>
Message-ID: <20010114155928.A6793@thyrsus.com>

Tim Peters <tim.one at home.com>:
> [Tim]
> > Actually, I'm certain they're the same algorithm now, except the C is
> > showing through in ratcliff to the floating-point eye <wink>.
> 
> [Eric]
> > Take a look:
> 
> Yup, same thing, except:
> 
> > static float ratcliff(char *s1, char *s2)
> 
> accounts for the numeric differences (change "float"->"double" and they'd be
> the same; Python has to convert it to a double anyway, lacking any internal
> support for C's floats; and the C code is *computing* in double regardless,
> cutting it back to a float upon return just because of the "float" decl).

OK, so the right answer is to make your version visible and documented
in the library.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256



From tim.one at home.com  Sun Jan 14 22:01:19 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:01:19 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJDIIAA.tim.one@home.com>

[?!ng]
> [1] We started with 4.

Na, *we* started with two, just ' and ".  And at the time, I thought that
was arguably one too many already <wink>.  Allowing the modifiers to be
case-insensitive seems to me much more Pythonic than the original sin of
making ' and " mean the same thing.  OTOH, if only " had been allowed at the
start, we'd probably spell raw strings with ' today, and that doesn't really
scream that they're so very different from " strings.

leaving-this-one-be-ly y'rs  - tim




From barry at digicool.com  Sun Jan 14 22:02:07 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sun, 14 Jan 2001 16:02:07 -0500
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
Message-ID: <14946.5071.92879.789400@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> Ping's pydoc is awesome!  Move it out of the sandbox and put
    SM> it in the standard distribution.

    SM> Biggest hook for me:

    |    1. execute "pydoc -p 3200"
    |    2. visit "http://localhost:3200/"
    |    3. knock yourself out

Whoa.  Awesome.




From ping at lfw.org  Sun Jan 14 22:01:45 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:01:45 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Guido van Rossum wrote:
> 
> It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
> well as 0l.  So does Python (and also 0j == 0J).

I just did a little test.  Neither Python, Perl, nor Tcl support
"\X66", only "\x66".  Perl doesn't support 0X1234, only 0x1234.
Tcl's "expr" routine does support 0X1234.  Javascript supports
0X1234, but not "\X66".  I'd bet that no one really relies on or
expects the uppercase forms except L.


-- ?!ng




From ping at lfw.org  Sun Jan 14 22:14:34 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:14:34 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141309320.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Fred L. Drake, Jr. wrote:
> Ka-Ping Yee writes:
>  > My next two targets are:
>  >     1.  Generating text from the HTML documentation files
>  >         using Paul Prescod's stuff in onlinehelp.py.
> 
> You mean the ones I publish as the standard documentation?  Relying
> on the structure of that HTML is pure folly!

Paul's onlinehelp.py is using the HTMLParser and AbstractFormatter
to turn HTML into text.  It also contains paths to specific files,
e.g. help('assert') looks for "ref/assert.html".  Are you okay with
this technique?  Have you tried onlinehelp.py?  I was planning to
do the same to provide help on the language in pydoc.
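
For readers on a current Python, the HTML-to-text step can be
approximated with html.parser from the standard library.  This is a
sketch of the general technique only: it is not Paul's onlinehelp.py,
and the class name, the helper name and the list of block tags are
assumptions.

```python
from html.parser import HTMLParser

# Tags after which a line break is inserted (an arbitrary short list).
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "tr"}


class TextExtractor(HTMLParser):
    """Collect the character data of a document, dropping the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.chunks.append("\n")


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(parser.chunks).split())
```

Anything that depends on the page layout, such as finding the right
section of a file, would still be tied to how the HTML is generated.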


-- ?!ng




From skip at mojam.com  Sun Jan 14 22:26:48 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 14 Jan 2001 15:26:48 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
	<200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <14946.6552.542015.620760@beluga.mojam.com>

    Guido> Now, if we could somehow get this to show both the docs that Fred
    Guido> maintains and the stuff that Ping extracts from the source code,
    Guido> that would be even better!

I had exactly the same thought.  I suspect that if the install target were
modified to install the html-ized sections of the lib reference manual pydoc
could grovel around in sys and find the root of the library reference manual
pretty easily.  If not, it could simply redirect to the relevant section of
http://www.python.org/doc/current/lib/.

Skip




From tim.one at home.com  Sun Jan 14 22:45:48 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:45:48 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJGIIAA.tim.one@home.com>

[?!ng]
> ...
> I'd bet that no one really relies on or expects the uppercase
> forms except L.

And 0X.  I don't think it's in the std library, but I've certainly seen
Python code do stuff like

    magic = 0XFEEDFACE

Plus it's always good for a language to be able to parse the stuff it prints,
and "0X..." is generated by Python's %#X format code.
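
A quick check of that round trip, as it behaves in a current Python 3
interpreter (the variable names are only for illustration):

```python
magic = 0XFEEDFACE                 # uppercase prefix accepted in a literal
text = "%#X" % magic               # format code that emits an uppercase prefix
print(text)
assert int(text, 16) == magic      # base 16 accepts the 0X prefix
assert int(text, 0) == magic       # base 0 infers the base from the prefix
```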

Don't believe I've ever seen the "u" or "r" string modifiers in uppercase,
though, but really don't see the harm in allowing that.




From ping at lfw.org  Sun Jan 14 22:50:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:50:43 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <14946.5071.92879.789400@anthem.wooz.org>
Message-ID: <Pine.LNX.4.10.10101141349040.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Barry A. Warsaw wrote:
> Whoa.  Awesome.

Thanks!

Two things added recently: constants (any numbers, lists, tuples,
strings, or types) in modules are shown; and packages are listed
in the index as they should be.


-- ?!ng




From bckfnn at worldonline.dk  Sun Jan 14 23:20:51 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 22:20:51 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <040e01c07e60$8c74d100$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk> <040e01c07e60$8c74d100$e46940d5@hagrid>
Message-ID: <3a622615.50148579@smtp.worldonline.dk>

[/F]

>here's the description:

Thanks.

>From: "Fredrik Lundh" <effbot at telia.com>
>Date: Sun, 16 Jul 2000 20:40:46 +0200
>
>/.../
>
>    The unicodenames database consists of two parts: a name
>    database which maps character codes to names, and a code
>    database, mapping names to codes.
>
>* The Name Database (getname)
>
>    First, the 10538 text strings are split into 42193 words,
>    and combined into a 4949-word lexicon (a 29k array).

I only added a word to the lexicon if it was used more than once and if
the length was larger than the lexicon index. I ended up with 1385
entries in the lexicon. (a 7k array)

>    Each word is given a unique index number (common words get
>    lower numbers), and there's a "lexicon offset" table mapping
>    from numbers to words (10k).

My lexicon offset table is 3k and I also use 4k on a perfect hash of the
words.

>    To get back to the original text strings, I use a "phrase
>    book".  For each original string, the phrase book stores a
>    list of word numbers.  Numbers 0-127 are stored in one byte,
>    higher numbers (less common words) use two bytes.  At this
>    time, about 65% of the words can be represented by a single
>    byte.  The result is a 56k array.

Because not all words are looked up in the lexicon, I used the values
0-38 for the letters and numbers, 39-250 are used for a one-byte lexicon
index, and 251-255 are combined with the following byte to form a two-byte
index.  This also results in a 57k array.

So far it is only minor variations.

>    The final data structure is an offset table, which maps code
>    points to phrase book offsets.  Instead of using one big
>    table, I split each code point into a "page number" and a
>    "line number" on that page.
>
>      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]
>
>    Since the unicode space is sparsely populated, it's possible
>    to split the code so that lots of pages get no contents.  I
>    use a brute force search to find the optimal SHIFT value.
>
>    In the current database, the page table has 1024 entries
>    (SHIFT is 6), and there are 199 unique pages in the line
>    table.  The total size of the offset table is 26k.
>
>* The code database (getcode)
>
>    For the code table, I use a straight-forward hash table to store
>    name to code mappings.  It's basically the same implementation
>    as in Python's dictionary type, but a different hash algorithm.
>    The table lookup loop simply uses the name database to check
>    for hits.
>
>    In the current database, the hash table is 32k.

I chose to split a unicode name into words even when looking up a
unicode name. Each word is hashed to a lexicon index and a "phrase book
string" is created. The sorted phrase book is then searched with a binary
search among 858 entries that can be addressed directly, followed by a
sequential search among 12 entries. The phrase book search index is 8k
and a table that maps phrase book indexes to codepoints is another 20k.

The searching I do makes jython slower than the direct calculation you
do. I'll take another look at this after jython 2.0 to see if I can
improve performance with your page/line number scheme and a total
hashing of all the unicode names.

regards,
finn



From ping at lfw.org  Sun Jan 14 23:44:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 14:44:47 -0800 (PST)
Subject: [Python-Dev] SourceForge and long patches
Message-ID: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org>

Okay, this is getting really annoying.  SourceForge won't accept
any patches > 16k.  Why not?  Is there a way around this?

    SourceForge: Exiting with Error

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

I'm trying to submit the update to tokenize.py, but it's too long
because i've changed test/output/test_tokenize and that's a big file.


-- ?!ng




From guido at python.org  Sun Jan 14 23:58:03 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 17:58:03 -0500
Subject: [Python-Dev] SourceForge and long patches
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:44:47 PST."
             <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101142258.RAA13606@cj20424-a.reston1.va.home.com>

> Okay, this is getting really annoying.  SourceForge won't accept
> any patches > 16k.  Why not?  Is there a way around this?

I have no idea why; can only assume it's a limitation in the database
package they use.

The standard workaround is to upload a URL pointing to the patch. :-(

>     SourceForge: Exiting with Error
> 
>     ERROR
> 
>     Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan 15 00:35:51 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 00:35:51 +0100
Subject: [Python-Dev] Where's Greg Ward ?
Message-ID: <3A6237D7.673BBB30@lemburg.com>

He seems to be offline and the people on the distutils list have some
patches and other things which would be nice to have in distutils 
for 2.1.

I suppose we could simply check in the patches, but we still want
to get his OK on things before applying patches to the distutils
tree.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 15 00:57:45 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 18:57:45 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>

[MAL]
> He seems to be offline and the people on the distutils list have
> some patches and other things which would be nice to have in
> distutils for 2.1.

Greg's somewhere near the end of the process of moving from Virginia to
Canada; I expect he'll become visible again Real Soon.

> I suppose we could simply check in the patches, but we still want
> to get his OK on things before applying patches to the distutils
> tree.

The distutils SIG could elect a Shadow Dictator in his place; if everyone
agrees to vote for Andrew, you save the effort of counting votes <wink>.




From tismer at tismer.com  Mon Jan 15 02:35:57 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 15 Jan 2001 02:35:57 +0100
Subject: [Python-Dev] Minor Bug-fix release for Stackless Python 2.0
Message-ID: <3A6253FD.E9B30462@tismer.com>

Wolfgang Lipp reported that Microthreads were executing
sequentially with SLP 2.0 .

The bug fix is available on the website.
Please use this new version, or microthreads will not
give you much fun.

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc-src-010115.zip

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tommy at ilm.com  Mon Jan 15 03:18:20 2001
From: tommy at ilm.com (Captain Senorita)
Date: Sun, 14 Jan 2001 18:18:20 -0800 (PST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
	<14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <14946.23981.694472.406438@mace.lucasdigital.com>

Charles G Waldman writes:
| 
|              P=NP (Python is not Perl)

Is it too late to suggest this for the SPAM9 t-shirt? :)



From guido at python.org  Mon Jan 15 03:24:36 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 21:24:36 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: Your message of "Sun, 14 Jan 2001 18:18:20 PST."
             <14946.23981.694472.406438@mace.lucasdigital.com> 
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>  
            <14946.23981.694472.406438@mace.lucasdigital.com> 
Message-ID: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
> | 
> |              P=NP (Python is not Perl)
> 
> Is it too late to suggest this for the SPAM9 t-shirt? :)

By just about a day -- I haven't seen the new design yet, but Just &
Eric were supposed to design it today and hand in the final proofs
tomorrow.  I believe the slogan will be "it fits your brain" (or "it
fits my brain").

But if you print a bunch of P=NP shirts, I'm sure you can sell them
with a profit, both in Long Beach and in San Diego (at the O'Reilly
Open Source conference)...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan 15 07:35:05 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 01:35:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <20010110101545.A21305@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEKGIIAA.tim.one@home.com>

[Timmy]
> At this point I'm +0.5 on the idea of fileobject.c using
> ms_getline_hack whenever HAVE_GETC_UNLOCKED isn't available.

[NeilS, from Wednesday]
> Compare ms_getline_hack to what Perl does in order to speed up IO.

Believe me, I have <wink>.

> I think it's worth maintaining that piece of relatively portable
> code given the benefit.  If the code has to be maintained then it
> might as well be used.  If we find a platform that breaks we can
> always disable it before the final release.

Given that hearty encouragement, and the utterly non-scary results so far, I
just checked in a new scheme:

On a platform with getc_unlocked():
    By default, use getc_unlocked().
    If you want to use fgets() instead, #define USE_FGETS_IN_GETLINE.
        [so motivated people can use fgets() instead if it's faster
         on their platform]
On a platform without getc_unlocked():
    By default, use fgets().
    If you don't want to use fgets(), #define DONT_USE_FGETS_IN_GETLINE.
        [so if we stumble into a platform it fails on between
         releases, the user will have an easy time turning it off
         themself]
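
The selection rules above, restated as a small Python function purely
as a reading aid.  The real switch is done by the C preprocessor in
fileobject.c; the function name, the argument names and the returned
labels are made up for this sketch.

```python
def getline_strategy(have_getc_unlocked,
                     use_fgets_in_getline=False,
                     dont_use_fgets_in_getline=False):
    """Return a label for the line-reading method the rules above select."""
    if have_getc_unlocked:
        # Default is getc_unlocked(); USE_FGETS_IN_GETLINE opts in to fgets().
        return "fgets" if use_fgets_in_getline else "getc_unlocked"
    # Default is fgets(); DONT_USE_FGETS_IN_GETLINE opts out.  The message
    # does not name what is used instead, so the label is a placeholder.
    return "fallback" if dont_use_fgets_in_getline else "fgets"
```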




From gstein at lyra.org  Mon Jan 15 08:18:20 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 14 Jan 2001 23:18:20 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 08:55:35AM -0800
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010114231820.C6081@lyra.org>

On Sat, Jan 13, 2001 at 08:55:35AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv14586
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> SF Patch #103225 by Ping: httplib: smallest Python patch ever
>...

Not so small:

>...
> *** 333,337 ****
>               i = host.find(':')
>               if i >= 0:
> !                 port = int(host[i+1:])
>                   host = host[:i]
>               else:
> --- 333,340 ----
>               i = host.find(':')
>               if i >= 0:
> !                 try:
> !                     port = int(host[i+1:])
> !                 except ValueError, msg:
> !                     raise socket.error, str(msg)
>                   host = host[:i]
>               else:


Did you intend to commit this?
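
For reference, the logic of the changed lines restated as a standalone
function in current Python 3 syntax.  This is a sketch, not httplib
itself: the function name and the default_port argument are
assumptions (the else branch is not shown in the diff), and OSError is
used because socket.error is an alias for it in Python 3.

```python
def split_host_port(host, default_port=80):
    """Split 'host:port' and report a bad port as a socket-style error."""
    i = host.find(":")
    if i >= 0:
        try:
            port = int(host[i + 1:])
        except ValueError as exc:
            # Same idea as the patch: turn the ValueError into the
            # error type callers of the connection code already catch.
            raise OSError(str(exc)) from exc
        host = host[:i]
    else:
        port = default_port                # assumed behaviour of the else branch
    return host, port
```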

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Mon Jan 15 16:53:58 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 17:53:58 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>
References: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>  
            <14946.23981.694472.406438@mace.lucasdigital.com>
Message-ID: <20010115155358.86E5AA828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 21:24:36 -0500, Guido van Rossum <guido at python.org> wrote:

> But if you print a bunch of P=NP shirts, I'm sure you can sell them
> with a profit, both in Long Beach and in San Diego (at the O'Reilly
> Open Source conference)...

And the Libre Software Meeting (http://lsm.abul.org), which has a Python
subtopic too.
(Since it's in France, no one is calling it "free", so it's probable you
can sell those T-shirts there...)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan 15 10:44:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:44:14 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3A62C66E.2BB69E61@lemburg.com>

Fredrik Lundh wrote:
> 
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

Since the Unicode character names are probably
not used for performance sensitive tasks, I suggest checking
in the smallest version possible.

If it is too much work to get Finn's version recoded in C
(presuming it's written in Java), then I'd suggest checking
in your version until someone comes up with a yet smaller
edition.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 15 10:48:49 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:48:49 +0100
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
		<200101142055.PAA13041@cj20424-a.reston1.va.home.com> <14946.6552.542015.620760@beluga.mojam.com>
Message-ID: <3A62C781.22240D3C@lemburg.com>

Skip Montanaro wrote:
> 
>     Guido> Now, if we could somehow get this to show both the docs that Fred
>     Guido> maintains and the stuff that Ping extracts from the source code,
>     Guido> that would be even better!
> 
> I had exactly the same thought.  I suspect that if the install target were
> modified to install the html-ized sections of the lib reference manual pydoc
> could grovel around in sys and find the root of the library reference manual
> pretty easily.  If not, it could simply redirect to the relevant section of
> http://www.python.org/doc/current/lib/.

Since Fred remarked that the URLs for the different docs are
not fixed, how about adding a __onlinedocs__ attribute to the
standard Python modules providing the correct URL ?

Or, alternatively, pass the module's name through some Google
like "I feel lucky" documentation search engine...
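
A minimal sketch of the __onlinedocs__ idea, in present-day Python.
The attribute is only a proposal here, not something the standard
modules define, and the helper name online_docs_url is hypothetical.
When the attribute is missing, the sketch falls back to the library
reference root that Skip mentioned:

```python
import types

# Library reference root quoted earlier in this thread.
LIB_REFERENCE_ROOT = "http://www.python.org/doc/current/lib/"


def online_docs_url(module):
    """Return the module's documentation URL.

    Uses the proposed (hypothetical) __onlinedocs__ attribute when the
    module provides one, and the library reference root otherwise.
    """
    url = getattr(module, "__onlinedocs__", None)
    if url:
        return url
    return LIB_REFERENCE_ROOT


# A stand-in module that opts in to the proposed attribute.
documented = types.ModuleType("documented")
documented.__onlinedocs__ = "http://www.python.org/doc/current/lib/example.html"

# A stand-in module that does not.
undocumented = types.ModuleType("undocumented")

print(online_docs_url(documented))
print(online_docs_url(undocumented))
```

Keeping the URL on the module itself means pydoc would not need to
know how the documentation tree is laid out.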

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 15 10:51:40 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:51:40 +0100
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
Message-ID: <3A62C82C.EA25AAF5@lemburg.com>

[CCed to distutils, since it matters there]
Tim Peters wrote:
> 
> [MAL]
> > He seems to be offline and the people on the distutils list have
> > some patches and other things which would be nice to have in
> > distutils for 2.1.
> 
> Greg's somewhere near the end of the process of moving from Virginia to
> Canada; I expect he'll become visible again Real Soon.

Great :)
 
> > I suppose we could simply check in the patches, but we still want
> > to get his OK on things before applying patches to the distutils
> > tree.
> 
> The distutils SIG could elect a Shadow Dictator in his place; if everyone
> agrees to vote for Andrew, you save the effort of counting votes <wink>.

Ok, let's agree to vote for Andrew :)

Andrew, is that OK with you ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 15 11:52:09 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 05:52:09 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D602D.9DC991CB@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELAIIAA.tim.one@home.com>

[Mark Favas]
> ...
> The lines range in length from 96 to 747 characters, with
> 11% @ 233, 17% @ 252 and 52% @ 254 characters, so #1 [a vendor
> who actually optimized fgets()] looks promising - most lines are
> long enough to trigger a realloc.

Plus as soon as you spill over the stack buffer, I make you pay for filling
1024 new bytes with newlines before the next fgets() call, and almost all of
those are irrelevant to you.  It doesn't degrade gracefully.  Alas, I tried
several "adaptive" schemes (adjusting how much of the initial segment of a
larger stack buffer they would use, based on the actual line lengths seen in
the past), but the costs always exceeded the savings on my box.

> Cranking up INITBUFSIZE in ms_getline_hack to 260 from 200
> improves thing again, by another 25%:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.081  5.066
> readlines_sizehint    3.743  3.717
> using_fileinput      11.113 11.100
> while_readline        6.100  6.083
> for_xreadlines        3.027  3.033

Well, I couldn't let you forego *all* of 25%.  The current fileobject.c has
a stack buffer of 300 bytes, but only uses 100 of them on the first fgets()
call.  On a very quiet machine, that saved 3-4% of the runtime on *my* test
case, whose line lengths are typical of the text files I crunch over, so I'm
happy for me.  If 100 bytes aren't enough, it must call fgets() again, but
just appends the next call into the full 300-byte buffer.  So it saves the
realloc for lines under 300 chars.

> Apart from the name <grin>, I like ms_getline_hack...

Ya, it's now the non-pejorative getline_via_fgets().  I hate that I became a
grown-up <0.9 wink>.

time-to-pick-wings-off-of-flies-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 12:11:16 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 03:11:16 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 httplib.py,1.26,1.27
In-Reply-To: <20010114231820.C6081@lyra.org>
Message-ID: <Pine.LNX.4.10.10101150310100.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Greg Stein wrote:
> Not so small:
> 
> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:

The above changes were not part of the patch i submitted;
the patch i submitted was exactly a one-character change.
Guido has already edited the file, so there's no need to
commit anything further here.



-- ?!ng




From mal at lemburg.com  Mon Jan 15 12:56:37 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 12:56:37 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <3A62E575.9A584108@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > BTW, are there less English centric "sounds alike" matchers
> > around ?
> 
> Yes, but if anything there are far too many of them:  like Soundex, they're
> just heuristics, and *everybody* who cares adds their own unique twists,
> while proper studies are almost non-existent.  Few variants appear to be in
> use much beyond their inventor's friends; one notable exception in the
> Jewish community is the Daitch-Mokotoff variation, originally tailored to
> their unique needs but later generalized; a brief description here:
> 
>     http://www.avotaynu.com/soundex.html
> 
> The similarly involved NYSIIS algorithm (New York State Identification
> Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
> field of about two dozen competing algorithms, after measuring their
> effectiveness on assorted databases maintained by the state of New York.
> Since New York has a large immigrant population, NYSIIS isn't as
> Anglocentric as Soundex either.

Thanks for the pointer. I'll add that module to my lib :)

       http://metagram.webreply.com/downloads/nysiis.py

Perhaps Eric ought to add this one to his package as well  ?!
BTW, where can I find your package on the web, Eric ? I'd like
to give it a ride under German language conditions ;)
 
> But state-of-the-art has given up on purely computational algorithms for
> these purposes:  proper names are simply too much a mess.  For example, if I
> search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
> searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
> actually use just aren't reducible to pure computation -- it takes a large
> knowledge base to capture what people "just know".  You may enjoy visiting
> this commercial site (AFAIK, nobody is giving away state-of-the-art for
> free):
> 
>     http://www.las-inc.com/

Sad -- "patent pending" algorithms don't help anyone on this
planet :(
 
> > ...
> >     http://physics.nist.gov/cuu/Reference/soundex.html
> >
> > works fine for English texts,
> 
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Then Dutch must be the Python of human languages... ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:13:18 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:13:18 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one at users.sourceforge.net> wrote:
> Modified Files:
> 	tabnanny.py 
> Log Message:
> Whitespace normalization.

hmmmmmm.......
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan 15 13:10:30 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:10:30 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 
 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 
 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62E8B6.3DFC1FA2@lemburg.com>

Moshe Zadka wrote:
> 
> On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one at users.sourceforge.net> wrote:
> > Modified Files:
> >       tabnanny.py
> > Log Message:
> > Whitespace normalization.
> 
> hmmmmmm.......

Perhaps you ought to make this a CRON job ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:24:48 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:24:48 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <3A62E8B6.3DFC1FA2@lemburg.com>
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115202448.38F60A828@darjeeling.zadka.site.co.il>

I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
Of course, the real culprit is the person who fixed up the reply-to in
the checkin messages to point to python-dev. Why was it done, and
isn't there a better way? This makes it painful to personally comment
on people's checkin messages. I suggest instead to add a mail-followup-to
header

(Didn't anyone read "Reply-To Munging Considered Harmful"?)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From esr at thyrsus.com  Mon Jan 15 13:23:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 07:23:25 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A62E575.9A584108@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:56:37PM +0100
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com> <3A62E575.9A584108@lemburg.com>
Message-ID: <20010115072325.A10377@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Perhaps Eric ought to add this one to his package as well  ?!

Actually, at this point, my plan is to give Tim a decent interval to
refactor ndiff so his SequenceMatcher class is exposed and documented --
otherwise *I'll* go in and do it (har! waving a bloody knife!).

His turns out to be the same as the Ratcliff-Obershelp technique I was
using, except Tim had his bullshit threshold set too low (:-)) and let
through matches I wouldn't have.
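
A minimal sketch of this kind of matching, assuming a Python recent
enough to ship the class as difflib.SequenceMatcher (it was not yet
exposed when this was written).  The helper names and the 0.6 cutoff
are illustrative assumptions, not the threshold either of us used:

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a similarity score between 0.0 and 1.0 for two strings."""
    # The first argument is the "junk" predicate; None disables it.
    return SequenceMatcher(None, a, b).ratio()


def is_match(a, b, threshold=0.6):
    """Treat two strings as matching when their score reaches threshold.

    The default threshold is an assumption; raising it rejects more
    borderline matches.
    """
    return similarity(a, b) >= threshold


print(similarity("abcd", "bcde"))
print(is_match("abcd", "bcde"))
print(is_match("abc", "xyz"))
```

The score is 2*M/T, where M is the number of matching characters and
T is the total length of both strings, so the threshold is the only
knob that decides how permissive the matcher is.
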
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The only purpose for which power can be rightfully exercised over any
member of a civilized community, against his will, is to prevent harm
to others. His own good, either physical or moral, is not a sufficient
warrant
	-- John Stuart Mill, "On Liberty", 1859



From mal at lemburg.com  Mon Jan 15 13:26:59 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:26:59 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62EC93.9AA60ABA@lemburg.com>

Moshe Zadka wrote:
> 
> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead to add a mail-followup-to
> header
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Naa, no one needs to be shot in the foot ;)

In fact I like it, that replies go to python-dev ... after all,
that's where these things should be discussed.

BTW, in case you misunderstood my reply: it would indeed make
sense to automate these kinds of check (tabnanny et al).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:42:15 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:42:15 +0200 (IST)
Subject: [Python-Dev] Re: Someone should be shot
In-Reply-To: <3A62EC93.9AA60ABA@lemburg.com>
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
 
> In fact I like it, that replies go to python-dev ... after all,
> that's where these things should be discussed.

Well, that's the mailing list where things should be discussed.
But when I press the "Reply" button (as opposed to "Reply to List" button)
I expect my e-mail to go to the person originating the e-mail. 
Reply-To: means "I'd like to get replies to some other address".
What if, say, a checkin message relates to some private topic
I'd discussed with someone: I'd like to reply to him personally.

I agree that responses to Python-Checkins should be handled on Python-Dev:
that's what the mail-followup-to header is for.
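
A minimal sketch of what I mean, using the email package of a
present-day Python.  The addresses and the helper name are made up
for illustration, and Mail-Followup-To is a convention that only some
mailers honor, not a standard header:

```python
from email.message import EmailMessage


def build_checkin_notice(author, followup_list, subject, body):
    """Build a checkin notice that leaves Reply-To alone.

    "Reply" then goes to the author (the From header), while mailers
    that understand Mail-Followup-To send group replies to the list.
    """
    msg = EmailMessage()
    msg["From"] = author
    msg["To"] = "checkins@example.org"  # hypothetical checkins list
    msg["Subject"] = subject
    msg["Mail-Followup-To"] = followup_list
    msg.set_content(body)
    return msg


notice = build_checkin_notice(
    "committer@example.org",   # hypothetical author
    "dev-list@example.org",    # hypothetical discussion list
    "CVS: python/dist/src/Lib tabnanny.py,1.10,1.11",
    "Whitespace normalization.\n",
)
print(notice["Mail-Followup-To"])
print(notice["Reply-To"])
```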

> BTW, in case you misunderstood my reply: it would indeed make
> sense to automate these kinds of check (tabnanny et al).

Oh, ok. The "cron" part threw me off (why cron?) 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From barry at digicool.com  Mon Jan 15 14:15:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:15:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
	<3A62C82C.EA25AAF5@lemburg.com>
Message-ID: <14946.63472.282750.828218@anthem.wooz.org>

>>>>> "M" == M  <mal at lemburg.com> writes:

    >>  The distutils SIG could elect a Shadow Dictator in his place;
    >> if everyone agrees to vote for Andrew, you save the effort of
    >> counting votes <wink>.

    M> Ok, let's agree to vote for Andrew :)

    M> Andrew, is that OK with you ?

He's got my vote.  I've been experiencing some weird problems with the
distutils installation of pybsddb3 out of the current Python cvs
tree.  It'd be nice if the outstanding distutils patches are
integrated before I dive in.  I don't see anything relevant in patches
or bugs, but I don't know if there are other repositories of distutils
fixes (like the archives?).

-Barry




From barry at digicool.com  Mon Jan 15 14:27:02 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:27:02 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <14946.64166.348139.425223@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> I'm sorry! I meant to reply to tim alone, and ended up
    MZ> spamming python-dev!  Of course, the real culprit is the
    MZ> person who fixed up the reply-to in the checkin messages to
    MZ> point to python-dev. Why was it done, and isn't there a better
    MZ> way? This makes it painful to personally comment on people's
    MZ> checkin messages. I suggest instead to add a mail-followup-to
    MZ> header

    MZ> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Or how about

    http://www.metasystema.org/essays/reply-to-useful.mhtml

for a dissenting view.  Of course Mail-Followup-To is completely
non-standard, but even if it were, having the mailing list munge it in
isn't recommended:

    http://cr.yp.to/proto/replyto.html

Bottom line (IMHO), this is just something about email that is and
will forever remain broken.  Given that, it was voted a long while
back to make Reply-To for checkins point to python-dev so until
there's a hue and cry to change it back, I'll leave it as is.  And
yeah, it bites me sometimes too!

-Barry




From tony at lsl.co.uk  Mon Jan 15 15:18:36 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 15 Jan 2001 14:18:36 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>
Message-ID: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>

<fx: jumps up and down in glee>

Neat stuff. Ka-Ping Yee strikes again. And it works with Python 1.5.2.

<fx: more of the same>

Running on NT (4.00.1381) in an "MS-DOS" window, using Python 1.5.2
installed in the effbot manner, it works, with the slight strangeness
that if I do:

	python pydoc.py <name>

I get the documentation for <name> OK, but it is preceded with a line
claiming that:

	The system cannot find the path specified.

I don't have the time to pursue this at the moment - it's possibly an
artefact of our system?

(one minor "prettiness" hack - those of us who have been tainted by
Emacs Lisp programming tend to start module documentation off with a
line of the form:

	<name>.py -- information about the module

which, when pydoc'ed, results in a NAME line which starts with <name>
twice...
Of course, if I'm the only person doing this, I'll just have to, well,
stop...)

A request - a "-f" switch to allow the user to specify a particular
Python file (i.e., something not on the PYTHONPATH).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)




From jack at oratrix.nl  Mon Jan 15 15:32:02 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 15 Jan 2001 15:32:02 +0100
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore 
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
	     Sat, 13 Jan 2001 17:33:34 -0500 , <200101132233.RAA03229@cj20424-a.reston1.va.home.com> 
Message-ID: <20010115143203.A44B63C2031@snelboot.oratrix.nl>

Also note that the problem only occurs when trying to build a unix-Python 
out-of-the-box on MacOSX. If you're building a Carbon Python from the 
MacPython sources (something very few people can do right now:-) the 
executable isn't called "python". And when a real MacOSX-Python will be done 
it'll have all the nifty packaging stuff that will also make sure that there's 
nothing called "python" in the toplevel folder.

And the two workarounds (1-Use a UFS filesystem, 2-Put a ".exe" extension in 
the Makefile) work fine for the mean time.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From guido at python.org  Mon Jan 15 15:33:23 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:33:23 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: Your message of "Sun, 14 Jan 2001 23:18:20 PST."
             <20010114231820.C6081@lyra.org> 
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>  
            <20010114231820.C6081@lyra.org> 
Message-ID: <200101151433.JAA17944@cj20424-a.reston1.va.home.com>

> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:
> 
> Did you intend to commit this?

Oops.  That was a patch submitted a while ago that I applied as an
experiment but then decided I didn't like (argument: why bother).
I've reverted it.
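
For reference, a minimal sketch of what the reverted change did,
written in present-day Python syntax.  The function name and the
default_port parameter are assumptions made for illustration; the
diff above does not show the else branch:

```python
import socket


def split_host_port(host, default_port=80):
    """Split "host:port" into (host, port).

    A non-numeric port is reported as socket.error rather than
    ValueError, which is the behavior the reverted patch added.
    """
    i = host.find(':')
    if i >= 0:
        try:
            port = int(host[i + 1:])
        except ValueError as msg:
            raise socket.error(str(msg))
        host = host[:i]
    else:
        # Assumed fallback; not part of the quoted diff.
        port = default_port
    return host, port


print(split_host_port("www.python.org:8080"))
print(split_host_port("www.python.org"))
```

Without the try/except, a bad port simply surfaces as the ValueError
raised by int().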

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan 15 15:40:30 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:40:30 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:24:48 +0200."
             <20010115202448.38F60A828@darjeeling.zadka.site.co.il> 
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>  
            <20010115202448.38F60A828@darjeeling.zadka.site.co.il> 
Message-ID: <200101151440.JAA18045@cj20424-a.reston1.va.home.com>

> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead to add a mail-followup-to
> header
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

I agree with you, but Barry (who set this up) seems to believe that
there's a good reason to do it this way.  Barry, do you still feel
that way?  The auto-reply-all has probably tripped me up more than
anyone.  Anyone else have a strong reason why this should be set?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Tue Jan 16 00:03:25 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 01:03:25 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>
References: <14946.64166.348139.425223@anthem.wooz.org>, <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115230325.1C7F5A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 08:27:02 -0500, barry at digicool.com (Barry A. Warsaw) wrote:
> 
> Or how about
> 
>     http://www.metasystema.org/essays/reply-to-useful.mhtml

     If your mailer doesn't have this option, you should request it from
     its development team. Any mailer, whose development team refuses
     this simple request due to some ideological position, cannot be
     said to be reasonable.

As some people here know, I'm my mailer's "development team". I refuse to add
it due to an ideological position. Anyone who knows me knows I'm quite
unreasonable. Hmmm....I'm not making much headway, am I ;-)

> for a dissenting view.  Of course Mail-Followup-To is completely
> non-standard, but even if it were, having the mailing list munge it in
> isn't recommended:
> 
>     http://cr.yp.to/proto/replyto.html

This has no relevance to the current case, since python-checkin 
messages are machine-generated -- so this is closer to doing this in
the script generating the checkin message, and only differs in
implementation.

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I won't continue this thread, but remember that my vote is "no".
I simply shudder at the thought that I might send someone e-mail
with something like "nice bugfix. Didn't know you were back from
the sex-change operation", and it would be broadcast out to all
Python-Dev *and* the archives, for posterity.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From thomas at xs4all.net  Mon Jan 15 16:31:22 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 16:31:22 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:27:02AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org>
Message-ID: <20010115163122.I1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:27:02AM -0500, Barry A. Warsaw wrote:

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I've said this before, on the Mailman-devel list, but I'll repeat it here
for the record (in case this issue ever comes up for vote again :)

The main bite (for me) is that to reply to a person in private, you have to
cut&paste the 'From' header from the original mail, and edit your new mail's
headers, in order to reply to a specific person. My mailer is mature enough
to have a 'reply', 'reply-group' and 'reply-list' keybinding, so the
'Reply-To' only interferes. There probably is a
'reply-to-from-ignoring-replyto' keybinding in there, too, somewhere, or it
could be added, but remembering to type that different key is almost as much
trouble as typing the email address by hand ;P

So, my vote, like Moshe's, is just back from a sex change, and reads 'no'.

Recount-recount-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan 15 16:38:01 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 10:38:01 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 08:27:02 EST."
             <14946.64166.348139.425223@anthem.wooz.org> 
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>  
            <14946.64166.348139.425223@anthem.wooz.org> 
Message-ID: <200101151538.KAA21937@cj20424-a.reston1.va.home.com>

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

It sounds like a hue and cry to change it to me!  It looks like it's
time for a BDFL Pronouncement.  I pronounce:

Given that:

- we all know how to mail to python-dev;

- replying to the sender is by far the most common kind of reply;

- the mistake of replying to the sender when a reply-all was intended
  does much less potential harm than the mistake of replying to all
  when reply-to-sender was intended,

the reply-to header shall be removed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Mon Jan 15 17:57:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 11:57:19 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <14946.63472.282750.828218@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:15:28AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com> <3A62C82C.EA25AAF5@lemburg.com> <14946.63472.282750.828218@anthem.wooz.org>
Message-ID: <20010115115719.B919@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 08:15:28AM -0500, Barry A. Warsaw wrote:
>tree.  It'd be nice if the outstanding distutils patches are
>integrated before I dive in.  I don't see anything relevant in patches
>or bugs, but I don't know if there are other repositories of distutils
>fixes (like the archives?).

There are a few patches buried in the back archives, but I don't know
of any outstanding bugfixes, so please report whatever problem you're
seeing.

Oh, and Barry, did the issue holding up your patch for adding shar
support (#102313) ever get resolved?

--amk



From guido at python.org  Mon Jan 15 17:02:39 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:02:39 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 08 Jan 2001 18:20:56 PST."
             <20010108182056.C4640@lyra.org> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>  
            <20010108182056.C4640@lyra.org> 
Message-ID: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>

Greg Stein noticed me checking in *yet* another system that needs
the fallback TELL64() definition in fileobject.c, and wrote:

> All of those #ifdefs could be tossed and it would be more robust (long term)
> if an autoconf macro were used to specify when TELL64 should be defined.
> 
> [ I've looked thru fileobject.c and am a bit confused: the conditions for
>   defining TELL64 do not match the conditions for *using* it. that would
>   seem to imply a semantic error somewhere and/or a potential gotcha when
>   they get skewed (like I assume what happened to FreeBSD). simplifying with
>   an autoconf macro may help to rationalize it. ]

I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
universal fallback, why not just define TELL64 to be that if it's not
previously defined (currently only MS_WIN64 has a different
definition)?  It isn't always *used* (the conditions under which
_portable_fseek() uses it are quite complex), but *when* it is used,
this seems to be the most common definition...
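
For illustration only, the fallback amounts to asking lseek() to move
zero bytes from the current position and reporting where that leaves
the descriptor.  A minimal Python sketch of the same idea, using
os.lseek (the helper name tell64 is made up for this example and is
not part of the patch):

```python
import os
import tempfile


def tell64(fd):
    """Return the current offset of fd without moving it (hypothetical helper)."""
    # Seeking 0 bytes relative to the current position is a no-op that
    # simply reports where the descriptor currently points.
    return os.lseek(fd, 0, os.SEEK_CUR)


fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello")      # advances the offset by 5 bytes
    position = tell64(fd)
finally:
    os.close(fd)
    os.remove(path)
```

On platforms where the offset type is 64 bits wide, the same call keeps
working for files larger than 2 GB, which is the property the C macro
is after.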

Patch:

*** fileobject.c	2001/01/15 10:36:56	2.106
--- fileobject.c	2001/01/15 16:02:06
***************
*** 58,66 ****
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
! /* NOTE: this is only used on older
!    NetBSD prior to f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif
  
--- 58,65 ----
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #else
! /* Fallback for older systems that don't have the f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif


I'll check this in after 24 hours unless a better idea comes up.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan 15 17:17:07 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:17:07 -0500
Subject: [Python-Dev] PEP 205 comments
In-Reply-To: Your message of "Fri, 12 Jan 2001 23:19:57 +0100."
             <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de> 
References: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de> 
Message-ID: <200101151617.LAA22359@cj20424-a.reston1.va.home.com>

I'll leave most of this to Fred, but I'll reply to two items (Fred can
add these replies to the PEP):

> Again on proxies, there is no discussion or documentation of the
> ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
> AttributeError seem to be just as fine or better.

RuntimeError was my suggestion.  The error doesn't really qualify as a
LookupError in my view (there's no key that could be valid or invalid)
and ValueError seems too general (that's typically used for
out-of-range arguments and unparseable strings and the like).  Do you
have a reason why RuntimeError is inappropriate?
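
To make the situation concrete, here is a small sketch of when the
error shows up, written against the weakref module as it exists in
current Python releases (where ReferenceError is a built-in exception).
The class name Node is arbitrary, and the immediate cleanup after "del"
relies on CPython's reference counting; gc.collect() is called as a
precaution for other implementations:

```python
import gc
import weakref


class Node:
    """Arbitrary class for the example; its instances can be weakly referenced."""


node = Node()
proxy = weakref.proxy(node)   # behaves like node while node is alive

del node                      # drop the only strong reference
gc.collect()                  # precaution for non-refcounting implementations

try:
    proxy.anything            # any use of the dead proxy raises
    outcome = "still alive"
except ReferenceError:
    outcome = "ReferenceError"
```

Note that no key or argument value is involved: the proxy itself has
become unusable, which is the case being made above against
LookupError and ValueError.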

> On to the type type extensions: Should there be a type flag indicating
> presence of tp_weaklistoffset? It appears that the type structure had
> tp_xxx7 for a long time, so likely all in-use binary modules have
> that field set to zero. Is that sufficient?

Yes, that should be sufficient.  (I'm also going to claim tp_xxx7 for
the rich comparison function slot, but either patch can be modified to
use tp_xxx8 instead.)  Maybe it's time to add a bunch of new spares?

> Thanks for reading all of this message,

You're welcome.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 17:39:03 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:39:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
Message-ID: <14947.10151.575008.869188@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> the reply-to header shall be removed.

I'm more than happy to do this (I remember adding the reply-to munging
reluctantly).  Understand one thing: anybody who naively replies to
the whole list will send those replies to python-checkins, not
python-dev.

Still want it?

-Barry




From barry at digicool.com  Mon Jan 15 17:46:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:46:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
	<3A62C82C.EA25AAF5@lemburg.com>
	<14946.63472.282750.828218@anthem.wooz.org>
	<20010115115719.B919@kronos.cnri.reston.va.us>
Message-ID: <14947.10596.733726.995351@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

    AK> There are a few patches buried in the back archives, but I
    AK> don't know of any outstanding bugfixes, so please report
    AK> whatever problem you're seeing.

Okay, will do.

    AK> Oh, and Barry, did the issue holding up your patch for adding
    AK> shar support (#102313) ever get resolved?

No, but I'll try to take another poke at it.

-Barry




From moshez at zadka.site.co.il  Tue Jan 16 02:07:48 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 03:07:48 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010116010748.41869A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001, Guido van Rossum <gvanrossum at users.sourceforge.net> wrote:

> Modified Files:
> 	Meta.py 
> Log Message:
> Geoffrey Gerrietts discovered that a KeyError was caught that probably
> should have been a NameError.  I'm checking in a change that catches
> both, just to be sure -- I can't be bothered trying to understand this
> code any more. :-)
...
> !             except (KeyError, AttributeError):

Ummmm....can you be bothered to make sure you really meant AttributeError
when you said NameError? <wink>
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan 15 18:06:07 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 12:06:07 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 11:39:03 EST."
             <14947.10151.575008.869188@anthem.wooz.org> 
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com>  
            <14947.10151.575008.869188@anthem.wooz.org> 
Message-ID: <200101151706.MAA22884@cj20424-a.reston1.va.home.com>

> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.
> 
> Still want it?

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 18:11:29 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 12:11:29 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
	<14947.10151.575008.869188@anthem.wooz.org>
	<200101151706.MAA22884@cj20424-a.reston1.va.home.com>
Message-ID: <14947.12097.613433.580928@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    >> I'm more than happy to do this (I remember adding the reply-to
    >> munging reluctantly).  Understand one thing: anybody who
    >> naively replies to the whole list will send those replies to
    >> python-checkins, not python-dev.  Still want it?

    GvR> Yes.

Done.




From thomas at xs4all.net  Mon Jan 15 18:34:37 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 18:34:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib ftplib.py,1.47,1.48
In-Reply-To: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 15, 2001 at 08:32:52AM -0800
References: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115183437.J1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:32:52AM -0800, Guido van Rossum wrote:

> This is slightly controversial, but after reading the argumentation in
> the bug tracker for and against, I believe this is the right solution.

It's really only slightly controversial. 'mfisk' convinced me too, and I
used to use ftp to a server behind a firewall :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Mon Jan 15 19:21:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 19:21:54 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>
Message-ID: <3A633FC2.11F90E94@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> > In fact I like it, that replies go to python-dev ... after all,
> > that's where these things should be discussed.
> 
> Well, that's the mailing list where things should be discussed.
> But when I press the "Reply" button (as opposed to "Reply to List" button)
> I expect my e-mail to go to the person originating the e-mail.
> Reply-To: means "I'd like to get replies to some other address".
> What if, say, a checkin message relates to some private topic
> I'd discussed with someone: I'd like to reply to him personally.
> 
> I agree that responses to Python-Checkins should be handled on Python-Dev:
> that's what the mail-followup-to header is for.

Ah, ok. I thought you pressed Reply-All and then wondered why
your message got copied to python-dev...
 
> > BTW, in case you misunderstood my reply: it would indeed make
> > sense to automate these kinds of check (tabnanny et al).
> 
> Oh, ok. The "cron" part threw me off (why cron?)

CRON is what's used on Unix to implement jobs which run
on a regular basis... perhaps we just need to set up the
CRON job in timbot though ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Jan 15 19:35:54 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 13:35:54 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: Your message of "Tue, 16 Jan 2001 03:07:48 +0200."
             <20010116010748.41869A828@darjeeling.zadka.site.co.il> 
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>  
            <20010116010748.41869A828@darjeeling.zadka.site.co.il> 
Message-ID: <200101151835.NAA26712@cj20424-a.reston1.va.home.com>

> > Modified Files:
> > 	Meta.py 
> > Log Message:
> > Geoffrey Gerrietts discovered that a KeyError was caught that probably
> > should have been a NameError.  I'm checking in a change that catches
> > both, just to be sure -- I can't be bothered trying to understand this
> > code any more. :-)
> ...
> > !             except (KeyError, AttributeError):
> 
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

The code is correct.  Ignore the comment. :-(
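
For anyone puzzled by the construct, a tuple in an except clause
catches either exception.  The sketch below is not the Meta.py code;
Registry and lookup are invented names that only show how a missing
attribute and a missing key can be handled by one handler:

```python
class Registry:
    """Invented example class holding a small lookup table."""

    def __init__(self):
        self.table = {"spam": 1}


def lookup(obj, key):
    """Return obj.table[key], or None if either step of the lookup fails."""
    try:
        return obj.table[key]
    except (KeyError, AttributeError):
        # KeyError: the table exists but has no such key.
        # AttributeError: the object has no table at all.
        return None


found = lookup(Registry(), "spam")
missing_key = lookup(Registry(), "eggs")
missing_attr = lookup(object(), "spam")
```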

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 15 12:55:51 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 03:55:51 -0800
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 11:39:03AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <20010115035550.B4336@glacier.fnational.com>

[Barry on removing the reply-to header on python-checkins messages]
> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.

Could you make the script generate mail-followup-to instead of
reply-to?  I know it's not a standard header but some MUA
understand it and it is exactly what is needed to solve this
problem.  I think promoting it is a good thing.

  Neil



From thomas at xs4all.net  Mon Jan 15 19:59:12 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 19:59:12 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <20010115035550.B4336@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 15, 2001 at 03:55:51AM -0800
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org> <20010115035550.B4336@glacier.fnational.com>
Message-ID: <20010115195912.K1005@xs4all.nl>

On Mon, Jan 15, 2001 at 03:55:51AM -0800, Neil Schemenauer wrote:
> [Barry on removing the reply-to header on python-checkins messages]
> > I'm more than happy to do this (I remember adding the reply-to munging
> > reluctantly).  Understand one thing: anybody who naively replies to
> > the whole list will send those replies to python-checkins, not
> > python-dev.

> Could you make the script generate mail-followup-to instead of
> reply-to?  I know it's not a standard header but some MUA
> understand it and it is exactly what is needed to solve this
> problem.  I think promoting it is a good thing.

The script just calls '/bin/mail'. The Reply-To munging is done by Mailman,
which is slightly more than 'a script'. syncmail could do it, but that would
mean using sendmail instead of mail, and writing all headers itself.
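
As a rough sketch of what "writing all headers itself" could look like,
here is one way to build such a message with the email package from
today's standard library.  This is not syncmail's code; the function
name and all addresses are placeholders:

```python
from email.message import EmailMessage


def build_checkin_mail(sender, list_addr, followup_addr, subject, body):
    """Build a checkin notice that asks for follow-ups on another list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = list_addr
    msg["Subject"] = subject
    # Non-standard header: only some mail clients honour it.
    msg["Mail-Followup-To"] = followup_addr
    msg.set_content(body)
    return msg


msg = build_checkin_mail(
    "committer@example.org",      # placeholder addresses throughout
    "checkins@example.org",
    "dev@example.org",
    "CVS: example checkin",
    "Log message goes here.\n",
)
```

The resulting text (msg.as_string()) would then have to be handed to a
program that accepts a complete message, such as "sendmail -t", since
/bin/mail does not let the caller supply arbitrary headers.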

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan 15 20:17:27 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 14:17:27 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 14:14:49 EST."
             <14934.7465.360749.199433@localhost.localdomain> 
References: <14934.7465.360749.199433@localhost.localdomain> 
Message-ID: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>

There doesn't seem to be a lot of enthusiasm for a Unittest
bakeoff...  Certainly I don't think I'll get to this myself before the
conference.

How about the following though: talking of low-hanging fruit, Tim's
doctest module is an excellent thing even if it isn't a unit testing
framework!  (I found this out when I played with it -- it's real easy
to get used to...)

Would anyone object against Tim checking this in?  Since it isn't a
contender in the unit test bake-off, it shouldn't affect the outcome
there at all.
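
For those who haven't tried it, the idea is that the examples in a
docstring are run and their output compared with what the docstring
shows.  A minimal sketch using the doctest API found in current Python
releases (the function average is made up for the example):

```python
import doctest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / len(values)


# Collect the examples from the docstring and run them explicitly.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(average, name="average", module=False,
                        globs={"average": average}):
    runner.run(test)
results = runner.summarize(verbose=False)
```

In an ordinary module the usual shortcut is a single call to
doctest.testmod(), which finds and runs every docstring example in the
module.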

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 20:40:03 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 14:40:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
	<14947.10151.575008.869188@anthem.wooz.org>
	<20010115035550.B4336@glacier.fnational.com>
	<20010115195912.K1005@xs4all.nl>
Message-ID: <14947.21011.310090.686632@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

    >> Could you make the script generate mail-followup-to instead of
    >> reply-to?  I know it's not a standard header but some MUA
    >> understand it and it is exactly what is needed to solve this
    >> problem.  I think promoting it is a good thing.

    TW> The script just calls '/bin/mail'. The Reply-To munging is
    TW> done by Mailman, which is slightly more than 'a
    TW> script'. syncmail could do it, but that would mean using
    TW> sendmail instead of mail, and writing all headers itself.

I'm sure Fred or I would be happy to review such a patch to syncmail
<wink>.

-Barry



From jeremy at alum.mit.edu  Mon Jan 15 20:31:44 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 14:31:44 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
	<200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <14947.20512.140859.119597@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

  GvR> There doesn't seem to be a lot of enthusiasm for a Unittest
  GvR> bakeoff...  Certainly I don't think I'll get to this myself
  GvR> before the conference.

Let's have all the interested parties vote now, then.  It would
certainly be helpful to have the new unittest module in the alpha
release of 2.1.  I'd like to write some new tests and I'd rather use
the new stuff than the old stuff.

Jeremy



From tim.one at home.com  Mon Jan 15 21:01:52 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:01:52 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIIAA.tim.one@home.com>

[Barry]
> ...
> Understand one thing: anybody who naively replies to the whole
> list will send those replies to python-checkins, not python-dev.

IIRC, that's why the redirect to python-dev was added to begin with:  of
course people will reply to python-checkins, and then the next guy x-posts
to python-dev too, and the next three in turn variously remove one or the
other groups, or keep both or add c.l.py too.  In the end, no single archive
contains a coherent record on its own, and the random mix of "[Python-Dev]"
and "[Python-checkins]" Subject tags even make it impossible to sort by
(true) subject easily in your own mail client.

> Still want it?

Don't care <wink -- but simple tech approaches to human carelessness (the
true problem here!) don't work no matter which way you flip the switch>.




From tim.one at home.com  Mon Jan 15 21:08:15 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:08:15 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMOIIAA.tim.one@home.com>

[<tim_one at users.sourceforge.net>]
> Modified Files:
> 	tabnanny.py
> Log Message:
> Whitespace normalization.

[Moshe]
> hmmmmmm.......

LOL!  I was hoping nobody would notice that <0.7 wink>.  The appalling truth
is that late in tabnanny's development I deliberately indented a large block
of code by one column, and actually thought it was a good idea at the time.
I'm as delighted to see that finally fixed as I am embarrassed by the
necessity.

although-perhaps-more-appalled-that-was-there-was-followup-
    debate-about-followups-containing-more-msgs-than-there-
    were-characters-in-moshe's-followup-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 21:10:10 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 12:10:10 -0800 (PST)
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
 outside of Python)
In-Reply-To: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Tony J Ibbs (Tibs) wrote:
> I get the documentation for <name> OK, but it is preceded with a line
> claiming that:
> 
> 	The system cannot find the path specified.

Thanks for the NT testing.  That's funny -- i put in a special case
for Windows to avoid messages like the above a couple of days ago.
How recently did you download pydoc.py?  Does your copy contain:

    if hasattr(sys, 'winver'):
        return lambda text: tempfilepager(text, 'more')

?

> 	<name>.py -- information about the module
> 
> which, when pydoc'ed, results in a NAME line which starts with <name>
> twice...
> Of course, if I'm the only person doing this, I'll just have to, well,
> stop...)

I think i'm going to ask you to stop, unless Guido prefers
otherwise.  Guido, do you have a style pronouncement for module
docstrings?

> A request - a "-f" switch to allow the user to specify a particular
> Python file (i.e., something not on the PYTHONPATH).

Yes, it's on my to-do list.

So you can see what i'm up to, here's my current to-do list:

    make boldness optional (only if using more/less?  only Unix?)
    document a .py file given on the command line
  + webserver in background
    help should have a repr
    write a better htmlrepr (\n should look special, max length limit, etc.)
    generate docs from lib HTML
    generate HTML index from precis and __path__ and package contents list
    have help(...) produce a directory of available things to ask for help on
    curses.wrapper is broken: both function and package
    respect package __all__
    coherent answer to .py vs .pyc: do we show .pyc?
    fix getcomments() bug: last two lines stuck together
  + grey out shadowed modules/packages
    refactor .py/.pyc/.module.so/.module.so.1 listers in htmldoc, textdoc
    skip __main__ module
  + index built-in modules too
    Windows and Mac testing
    default to HTTP mode on GUI platforms?  (win, mac)

The ones marked with + i consider done.  Feel free to comment on
or suggest priorities for the others; in particular, what do you
think of the last one?  The idea is that double-clicking on
pydoc.py in Windows or MacOS could launch the server and then open
the localhost URL using webbrowser.py to display the documentation
index.  Should it do this by default?


-- ?!ng




From guido at python.org  Mon Jan 15 21:41:25 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 15:41:25 -0500
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Mon, 15 Jan 2001 12:10:10 PST."
             <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>

> > 	<name>.py -- information about the module
> > 
> > which, when pydoc'ed, results in a NAME line which starts with <name>
> > twice...
> > Of course, if I'm the only person doing this, I'll just have to, well,
> > stop...)
> 
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

I'm with Ping.  None of the examples in the style guide start the
docstring with the function name.  Almost none of the standard library
modules start their module docstring with the module name (codecs is
an exception, but I didn't write it :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Mon Jan 15 21:45:02 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 15 Jan 2001 20:45:02 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <3A62C66E.2BB69E61@lemburg.com>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com>
Message-ID: <3a636122.45847835@smtp.worldonline.dk>

[Fredrik Lundh]

> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

[M.-A. Lemburg]

>Since the Unicode character names are probably
>not used for performance sensitive tasks, I suggest to
>checkin the smallest version possible.
>
>If it is too much work to get Finn's version recoded in C
>(presuming it's written in Java), then I'd suggest checking
>in your version until someone comes up with a yet smaller
>edition.

FWIW, I agree that the 160k module should be used. Please, nobody should
use the jython compression as an argument to delay any improvements in
CPython. 

I certainly didn't post because I wanted to complicate your processes. I
just wanted to show off <wink>.

regards,
finn



From fredrik at effbot.org  Mon Jan 15 21:58:11 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 15 Jan 2001 21:58:11 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com> <3a636122.45847835@smtp.worldonline.dk>
Message-ID: <001f01c07f35$e2c09500$e46940d5@hagrid>

mal, finn:
> >If it is too much work to get Finn's version recoded in C
> >(presuming it's written in Java), then I'd suggest checking
> >in your version until someone comes up with a yet smaller
> >edition.
> 
> FWIW, I agree that the 160k module should be used. Please, nobody should
> use the jython compression as an argument to delay any improvements in
> CPython. 

okay, unless someone throws in a -1 vote, I'll check
this in tomorrow.

Cheers /F




From tim.one at home.com  Mon Jan 15 21:57:26 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:57:26 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>

[Fredrik Lundh]
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
>
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
>
> Should I check it in?

Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
days to deal with surprises before the alpha release.  With 300K sitting on
the table waiting to be taken, it's not worth delaying one hour to worry
about 60K additional that may or may not be achievable later.
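
For context, the name database is what maps code points to character
names and back.  A short sketch of the user-visible side, written
against the unicodedata module and the \N{...} escape as they exist in
current Python releases (it says nothing about how the compressed
tables are laid out):

```python
import unicodedata

# Code point -> name.
name = unicodedata.name("\u00e9")

# Name -> code point, via the function and via the string escape.
char = unicodedata.lookup("LATIN SMALL LETTER E WITH ACUTE")
escaped = "\N{LATIN SMALL LETTER E WITH ACUTE}"
```

Lookups like these tend to happen once, when source code is compiled
or a name is printed, which is why a smaller table is attractive even
if it is somewhat slower.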




From ping at lfw.org  Mon Jan 15 22:02:38 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 13:02:38 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses
 Meta.py,1.3,1.4
In-Reply-To: <20010116010748.41869A828@darjeeling.zadka.site.co.il>
Message-ID: <Pine.LNX.4.10.10101151302070.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Moshe Zadka wrote:
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

Nice bugfix.  Didn't know you were back from the sex-change operation.


-- ?!ng




From tim.one at home.com  Mon Jan 15 22:15:54 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 16:15:54 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENFIIAA.tim.one@home.com>

[Guido]
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...

I'm enthusiastic, but ...

> Certainly I don't think I'll get to this myself before the
> conference.

Ditto.  Takes time that's not there.

> ...
> Would anyone object against Tim checking [doctest] in?

You suggested that before, and so it was already on my 2.1a1 todo list.
Hoped to get to it over the weekend but didn't.  Hope to get to it today,
but won't <wink - I hope>.  On the chance that I do, anyone inclined to
object should do so before the sun sets in Reston.

or-if-it-never-sets-the-world-ends-anyway-ly y'rs  - tim




From akuchlin at mems-exchange.org  Mon Jan 15 22:26:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 16:26:19 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.20512.140859.119597@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 15, 2001 at 02:31:44PM -0500
References: <14934.7465.360749.199433@localhost.localdomain> <200101151917.OAA29687@cj20424-a.reston1.va.home.com> <14947.20512.140859.119597@localhost.localdomain>
Message-ID: <20010115162619.A19484@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>Let's have all the interested parties vote now, then.  It would
>certainly be helpful to have the new unittest module in the alpha
>release of 2.1.  I'd like to write some new tests and I'd rather use
>the new stuff than the old stuff.

Huh?  If no one has tried the different modules, what's the point of
having a vote?  (Given that doctest is going to be added, though, it 
should be checked in ASAP.)

--amk



From trentm at ActiveState.com  Mon Jan 15 23:10:26 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 14:10:26 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:02:39AM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com>
Message-ID: <20010115141026.I29870@ActiveState.com>

On Mon, Jan 15, 2001 at 11:02:39AM -0500, Guido van Rossum wrote:
> Greg Stein noticed me checking in *yet* another system that needs
> the fallback TELL64() definition in fileobject.c, and wrote:
> 
> > All of those #ifdefs could be tossed and it would be more robust (long term)
> > if an autoconf macro were used to specify when TELL64 should be defined.
> > 
> > [ I've looked thru fileobject.c and am a bit confused: the conditions for
> >   defining TELL64 do not match the conditions for *using* it. that would
> >   seem to imply a semantic error somewhere and/or a potential gotcha when
> >   they get skewed (like I assume what happened to FreeBSD). simplifying with
> >   an autoconf macro may help to rationalize it. ]

The problem is that these systems lie when they "say" (according to Python's
configure tests for HAVE_LARGEFILE_SUPPORT) that they have largefile support.
This seems to have happened for a particular release of BSD (which has since
been fixed). I think that the Right(tm) (meaning the cleanest solution where
the tests and definitions in the code actually represent the truth) answer is
a proper configure test (sort of as Greg suggests). I don't really feel
comfortable writing that patch (because (1) lack of time and (2) inability to
test, I don't have any access to any of these BSD machines).

[Guido]
> 
> I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
> universal fallback, why not just define TELL64 to be that if it's not
> previously defined (currently only MS_WIN64 has a different
> definition)?  It isn't always *used* (the conditions under which
> _portable_fseek() uses it are quite complex), but *when* it is used,
> this seems to be the most common definition...

While I agree that it is annoying that the build breaks for these platforms I
think that it is appropriate that the build breaks. Having to put these:
    #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
definitions here gives a nice list of those platforms that *do* lie. I would
prefer that to having an "#else" block that just captures all other cases,
but that is just my opinion.
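
For readers who want to see what the lseek-based fallback computes, here is a minimal sketch in Python rather than C. It assumes a modern Python 3 interpreter; the helper name tell_via_lseek is hypothetical and only mirrors the TELL64(fd) macro discussed above.

```python
import os
import tempfile


def tell_via_lseek(fd):
    """Report the current offset of fd by seeking zero bytes from the
    current position, which is what lseek((fd),0,SEEK_CUR) does in C."""
    return os.lseek(fd, 0, os.SEEK_CUR)


with tempfile.TemporaryFile() as tmp:
    fd = tmp.fileno()
    os.write(fd, b"hello")          # unbuffered write moves the offset to 5
    position = tell_via_lseek(fd)
    print(position)
```

The seek does not move the file position; it only reports it, which is why it can stand in for a missing tell function.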

Options (in order of preference):

(1) Update the configure test for HAVE_LARGEFILE_SUPPORT such that the proper
    versions of these OSes do *not* #define it.
(2) Guido's suggestion.
(2) Keep extending the "#elif" list.

 ^---- using (2) twice was intentional


Trent

> 
> *** fileobject.c	2001/01/15 10:36:56	2.106
> --- fileobject.c	2001/01/15 16:02:06
> ***************
> *** 58,66 ****
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
> ! /* NOTE: this is only used on older
> !    NetBSD prior to f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
>   
> --- 58,65 ----
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #else
> ! /* Fallback for older systems that don't have the f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
> 
> 
> I'll check this in after 24 hours unless a better idea comes up.
> 

Better idea but no patch. :(

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From skip at mojam.com  Mon Jan 15 23:10:36 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 15 Jan 2001 16:10:36 -0600 (CST)
Subject: [Python-Dev] should we start instrumenting modules with __all__?
Message-ID: <14947.30044.934204.951564@beluga.mojam.com>

I see the from-import-* patch for __all__ has been checked in.  Should we
make an effort to add __all__ to at least some modules before 2.1a1?
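
As a reminder of what __all__ changes, here is a small sketch assuming a modern Python 3 interpreter. The module name demo_all_module and the function names in it are made up for illustration; the module is built in memory so the example is self-contained.

```python
import sys
import types

# Source of a throwaway module that declares __all__.
source = """
__all__ = ['public']

def public():
    return 'exported'

def helper():
    return 'not exported'
"""

demo = types.ModuleType("demo_all_module")
exec(source, demo.__dict__)
sys.modules["demo_all_module"] = demo

# Run "from ... import *" into a fresh namespace and see what arrived.
namespace = {}
exec("from demo_all_module import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only the names listed in __all__ are copied by the star import; helper stays reachable as demo.helper but is not imported.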

Skip



From akuchlin at mems-exchange.org  Mon Jan 15 23:13:03 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 17:13:03 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 12, 2001 at 08:51:51AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>
Message-ID: <20010115171303.A23626@kronos.cnri.reston.va.us>

On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
>Ah.  It's very simple.  I create a directory "linux" as a subdirectory
>of the Python source tree (i.e. at the same level as Lib, Objects,
>etc.).  Then I chdir into that directory, and I say "../configure".
>The configure script creates subdirectories to hold the object files ...
>Then I say "make" and it builds Python.  

This doesn't work at all for me in my copy of the CVS tree.  Are there
other steps or requirements to make this work?  (Transcript available
upon request, but I suspect I'm missing something simple.)

--amk




From tim.one at home.com  Mon Jan 15 23:32:51 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 17:32:51 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>

[Jeremy]
> Let's have all the interested parties vote now, then.  It would
> certainly be helpful to have the new unittest module in the alpha
> release of 2.1.  I'd like to write some new tests and I'd rather use
> the new stuff than the old stuff.

[Andrew]
> Huh?  If no one has tried the different modules, what's the point of
> having a vote?

Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
would agree this is not an ideal decision procedure.

the-question-is-whether-it's-better-than-paralysis-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 23:35:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 14:35:47 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
Message-ID: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>

I don't know whether this is going to be obvious or controversial,
but here goes.  Most of the time we're used to seeing a newline as
'\n', not as '\012', and newlines are typed in as '\n'.

A newcomer to Python is likely to do

    >>> 'hello\n'
    'hello\012'

and ask "what's \012?" -- whereupon one has to explain that it's an
octal escape, that 012 in octal equals 10, and that chr(10) is
newline, which is the same as '\n'.  You're bound to run into this,
and you'll see \012 a lot, because \n is such a common character.
Aside from being slightly more frightening, '\012' also takes up
twice as many characters as necessary.

So... i'm submitting a patch that causes the three most common
special whitespace characters, '\n', '\r', and '\t', to appear in
their natural form rather than as octal escapes when strings are
printed and repr()ed.
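
The explanation one has to give a newcomer can be checked directly. This is a small sketch assuming a modern Python 3 interpreter, where repr() already shows the short escapes; it is not the patch itself.

```python
text = 'hello\n'

# The chain of facts behind '\012': octal 012 is decimal 10, and
# chr(10) is the newline character.
octal_value = int('012', 8)
newline = chr(octal_value)

shown = repr(text)
print(octal_value, newline == '\n', shown)
```

Here repr() renders the newline as a backslash followed by n, which is the form proposed above.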

Mm?


-- ?!ng




From esr at thyrsus.com  Tue Jan 16 00:15:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 18:15:50 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Mon, Jan 15, 2001 at 02:35:47PM -0800
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010115181550.A11566@thyrsus.com>

Ka-Ping Yee <ping at lfw.org>:
> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
standard escape set (hmmm...am I missing any?)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.



From thomas at xs4all.net  Tue Jan 16 00:49:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:49:30 +0100
Subject: [Python-Dev] time functions
Message-ID: <20010116004930.L1005@xs4all.nl>

Maybe this is a dead and buried subject, but I'm going to try anyway, since
everyone's been in such a wonderful 'let's fix ugly but harmless nits' mood
lately :)

Why do we need the following atrocity <wink>:

  timestr = time.strftime("<format>", time.localtime(time.time()))

To do the simple task of 'date +<format>' ?  I never really understood why
there isn't a way to get a timetuple directly from C, rather than converting
a float that we got from C a bytecode before, even though the higher level
almost always deals with timetuples. How about making the float-to-tuple
functions (time.localtime, time.gmtime) accept 0 arguments as well, and
defaulting to time.time() in that case ? Even better, how about doing the
same for the other functions, too ? (where it makes sense, of course :) 

Actually, I'll split it up in three proposals:

- Making the time in time.strftime default to 'now', so that the above
  becomes the ever so slightly confusing:

  timestr = time.strftime("<format>")
  (confusing because it looks a bit like a regexp constructor...)

- Making the time in time.asctime and time.ctime optional, defaulting to
  'now', so you can just call 'time.ctime()' without having to pass
  time.time() (which are about half the calls in my own code :)

- Making the time in time.localtime and time.gmtime default to 'now'.

I'm 0/+1/+1 myself :)
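
For comparison, here is how the three proposals look side by side with the long spelling. This sketch assumes a modern Python 3 interpreter, in which the time argument of these functions is optional; the format string is only an example.

```python
import time

fmt = "%Y-%m-%d %H:%M"

# The long spelling: float -> time tuple -> string.
explicit = time.strftime(fmt, time.localtime(time.time()))

# Proposal 1: strftime with the time defaulting to 'now'.
implicit = time.strftime(fmt)

# Proposal 2: ctime/asctime with no argument.
text_now = time.ctime()

# Proposal 3: localtime/gmtime with no argument.
tuple_now = time.localtime()

print(explicit, implicit, text_now, tuple_now.tm_year)
```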

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 16 00:55:36 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:55:36 +0100
Subject: [Python-Dev] TELL64
In-Reply-To: <20010115141026.I29870@ActiveState.com>; from trentm@ActiveState.com on Mon, Jan 15, 2001 at 02:10:26PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com>
Message-ID: <20010116005536.M1005@xs4all.nl>

On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:

> > > [ I've looked thru fileobject.c and am a bit confused: the conditions
> > >   for defining TELL64 do not match the conditions for *using* it. that
> > >   would seem to imply a semantic error somewhere and/or a potential
> > >   gotcha when they get skewed (like I assume what happened to
> > >   FreeBSD). simplifying with an autoconf macro may help to rationalize
> > >   it. ]

> The problem is that these systems lie when they "say" (according to
> Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> largefile support. This seems to have happened for a particular release of
> BSD (which has since been fixed). I think that the Right(tm) (meaning the
> cleanest solution where the tests and definitions in the code actually
> represent the truth) answer is a proper configure test (sort of as Greg
> suggests). I don't really feel comfortable writing that patch (because (1)
> lack of time and (2) inability to test, I don't have any access to any of
> these BSD machines).

There is no (longer any) 'single BSD release', so I doubt it has 'since been
fixed' :) We should consider the different BSD derived OSes as separate, if
slightly related, systems (much like SunOS <-> BSD.) The problem in the BSDI
case is really simple: the autoconf test doesn't test whether the fs really
supports large files, but rather whether the system has an off_t type that
is 64 bits. BSDI has that type, but does not actually use it in any of the
seek/tell functions. This has not been 'fixed' as far as I know, precisely
because it isn't 'broken' :)

I tried to fix the test, but I have been completely unable to find a proper
test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
out what, say, 'zsh' uses -- black autoconf magic, for sure.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From trentm at ActiveState.com  Tue Jan 16 01:24:54 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 16:24:54 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116005536.M1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:55:36AM +0100
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>
Message-ID: <20010115162454.D3864@ActiveState.com>

On Tue, Jan 16, 2001 at 12:55:36AM +0100, Thomas Wouters wrote:
> On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:
> 
> > The problem is that these systems lie when they "say" (according to
> > Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> > largefile support. This seems to have happened for a particular release of
> > BSD (which has since been fixed). I think that the Right(tm) (meaning the
> > cleanest solution where the tests and definitions in the code actually
> > represent the truth) answer is a proper configure test (sort of as Greg
> > suggests). I don't really feel comfortable writing that patch (because (1)
> > lack of time and (2) inability to test, I don't have any access to any of
> > these BSD machines).
> 
> There is no (longer any) 'single BSD release', so I doubt it has 'since been
> fixed' :) 

Okay sure (showing my ignorance). My only understanding was that this
"lying" was the case for some unspecified BSDs a while ago but that the
latest releases of any of them *did* have largefile support.

> 
> I tried to fix the test, but I have been completely unable to find a proper
> test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
> out what, say, 'zsh' uses -- black autoconf magic, for sure.

Hmmm... if one could encode whether or not a 64-bit fseek could be
implemented (either using fseek, fseeko, fseek64, _fseek, fsetpos/fgetpos,
etc.) in a short C program then that would be the test (or at least most of
the test, might have to see if ftell could be implemented as well). Or are
there other requirements?
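
The idea of probing for working large offsets, rather than checking a type size, can be sketched at run time. This is written in Python and is only an analogy under stated assumptions: it is not the configure-time C test discussed here, and the function name supports_large_offsets and its probe parameter are hypothetical.

```python
import tempfile


def supports_large_offsets(probe=2**31 + 1):
    """Seek past the 32-bit signed limit and check that tell() agrees.

    Nothing is written, so no disk space is used by the probe.
    """
    with tempfile.TemporaryFile() as tmp:
        try:
            tmp.seek(probe)
            return tmp.tell() == probe
        except (OverflowError, OSError):
            return False


print(supports_large_offsets())
```

The point of the design is that the probe exercises the seek and tell calls themselves, which is the property the existing test does not check.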


Trent

-- 
Trent Mick
TrentM at ActiveState.com



From esr at thyrsus.com  Tue Jan 16 02:26:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 20:26:14 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:49:30AM +0100
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <20010115202614.A11732@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Likewise.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Never trust a man who praises compassion while pointing a gun at you.



From barry at digicool.com  Tue Jan 16 03:14:33 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 21:14:33 -0500
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <14947.44681.254332.976234@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

    TW> I'm 0/+1/+1 myself :)

Maybe I'm an inch on the +0/+1/+1 side. :)



From jeremy at alum.mit.edu  Tue Jan 16 01:11:59 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 19:11:59 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
References: <14934.7465.360749.199433@localhost.localdomain>
	<200101151917.OAA29687@cj20424-a.reston1.va.home.com>
	<14947.20512.140859.119597@localhost.localdomain>
	<20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <14947.37327.395622.66435@localhost.localdomain>

>>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

  AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
  >> Let's have all the interested parties vote now, then.  It would
  >> certainly be helpful to have the new unittest module in the alpha
  >> release of 2.1.  I'd like to write some new tests and I'd rather
  >> use the new stuff than the old stuff.

  AMK> Huh?  If no one has tried the different modules, what's the
  AMK> point of having a vote?  (Given that doctest is going to be
  AMK> added, though, it should be checked in ASAP.)

Guido is the only person that said he hadn't tried anything.  If
others have given it a whirl, they ought to chime in now.  If very few
people have given them a try, we should decide whether we wait for
them or proceed without them.  We can't wait indefinitely.  I'm not
sure when we need to decide.

Jeremy



From nas at arctrix.com  Mon Jan 15 20:40:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 11:40:55 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Jan 13, 2001 at 05:25:12PM -0500
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net> <20010113071758.C28643@glacier.fnational.com> <200101132225.RAA03197@cj20424-a.reston1.va.home.com>
Message-ID: <20010115114055.A5879@glacier.fnational.com>

On Sat, Jan 13, 2001 at 05:25:12PM -0500, Guido van Rossum wrote:
> Do you have a tool that detects leaks?

debauch is showing promise although it is still pretty rough
around the edges.  memprof is another option.  It looks like
init_exceptions may be leaking memory.  Some debauch output:

 1      Leaked Memory 0x0849cf98, size 44 (from 0x0) AllocTime: 79269 FreeTime: 43436      
        return stack:
                ???:?? (0x40016005) 
                classobject.c:84 (0x805c16d) <PyClass_New+631>
                exceptions.c:337 (0x8088594) <make_Exception+250>
                exceptions.c:1061 (0x80898dc) <init_exceptions+232>
                pythonrun.c:151 (0x8053581) <Py_Initialize+573>
                loop.c:23 (0x8053305) <main+101>

I haven't figured out if this is a real leak yet.

  Neil



From michel at digicool.com  Tue Jan 16 07:33:00 2001
From: michel at digicool.com (Michel Pelletier)
Date: Mon, 15 Jan 2001 22:33:00 -0800 (PST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.37327.395622.66435@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101152216200.2373-100000@localhost.localdomain>

On Mon, 15 Jan 2001, Jeremy Hylton wrote:

> >>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:
> 
>   AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>   >> Let's have all the interested parties vote now, then.  It would
>   >> certainly be helpful to have the new unittest module in the alpha
>   >> release of 2.1.  I'd like to write some new tests and I'd rather
>   >> use the new stuff than the old stuff.
> 
>   AMK> Huh?  If no one has tried the different modules, what's the
>   AMK> point of having a vote?  (Given that doctest is going to be
>   AMK> added, though, it should be checked in ASAP.)
> 
> Guido is the only person that said he hadn't tried anything.  If
> others have given it a whirl, they ought to chime in now.  

I have used pyunit to create a simple set of tests.  It seemed to do the
job well and it was very easy. I'd never done it before and the docs were
fat and A+.

I can only give a one-sided opinion.  I know of AMK's work but I have not
used it, are there others?

-Michel




From akuchlin at mems-exchange.org  Tue Jan 16 04:03:31 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Mon, 15 Jan 2001 22:03:31 -0500
Subject: [Python-Dev] Detecting install time
Message-ID: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com>

For PEP 229, the setup.py script needs to figure out if it's running
from the build directory, because then distutils.sysconfig needs to
look at different config files; ./Modules/Makefile instead of
/usr/lib/python2.0/config/Makefile, and so forth.  Is there a
simple/clean way to do this?

--amk






From guido at python.org  Tue Jan 16 04:21:43 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:21:43 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Mon, 15 Jan 2001 17:13:03 EST."
             <20010115171303.A23626@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>  
            <20010115171303.A23626@kronos.cnri.reston.va.us> 
Message-ID: <200101160321.WAA00648@cj20424-a.reston1.va.home.com>

> On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
> >Ah.  It's very simple.  I create a directory "linux" as a subdirectory
> >of the Python source tree (i.e. at the same level as Lib, Objects,
> >etc.).  Then I chdir into that directory, and I say "../configure".
> >The configure script creates subdirectories to hold the object files ...
> >Then I say "make" and it builds Python.  
> 
> This doesn't work at all for me in my copy of the CVS tree.  Are there
> other steps or requirements to make this work?  (Transcript available
> upon request, but I suspect I'm missing something simple.)

You can't start doing this in a tree where you have already built
Python using the default way -- you have to use a pristine tree.  The
reason is the funny way Make's VPATH feature works: it sees the .o
files in the source directory and then thinks it doesn't have to create
the .o file in the build directory.  I think a "make clobber" at the
top level would probably eradicate everything that confuses Make.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Tue Jan 16 04:24:04 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:24:04 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 14:35:47 PST."
             <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101160324.WAA00677@cj20424-a.reston1.va.home.com>

> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

+1 on the idea; no time to study the patch tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:28:38 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:28:38 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 18:15:50 EST."
             <20010115181550.A11566@thyrsus.com> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>  
            <20010115181550.A11566@thyrsus.com> 
Message-ID: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>

> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> standard escape set (hmmm...am I missing any?)

You missed \f [*].  Unclear to me whether it's a good idea to add the
lesser-known ones; they are just as likely to be binary gobbledegook
as what their escapes stand for.

[*] http://www.python.org/doc/current/ref/strings.html

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:31:19 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:31:19 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 00:49:30 +0100."
             <20010116004930.L1005@xs4all.nl> 
References: <20010116004930.L1005@xs4all.nl> 
Message-ID: <200101160331.WAA00780@cj20424-a.reston1.va.home.com>

> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'let's fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :) 
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

I don't see the confusion.

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Yes, I've wondered this myself too.  I guess the current API is based
too much on the C API...

+1/+1/+1.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:47:32 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:47:32 -0500
Subject: [Python-Dev] Detecting install time
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:03:31 EST."
             <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> 
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> 
Message-ID: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>

> For PEP 229, the setup.py script needs to figure out if it's running
> from the build directory, because then distutils.sysconfig needs to
> look at different config files; ./Modules/Makefile instead of
> /usr/lib/python2.0/config/Makefile, and so forth.  Is there a
> simple/clean way to do this?

You could check for the presence of config.status -- that file is not
installed.
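
A minimal sketch of that check, assuming setup.py is given (or can work out) the directory it is running from. The function name running_from_build_dir is hypothetical, and the temporary directory below merely stands in for a configured build tree.

```python
import os
import tempfile


def running_from_build_dir(directory):
    """Return True if directory looks like a configured build tree,
    judged only by the presence of a config.status file."""
    return os.path.isfile(os.path.join(directory, "config.status"))


with tempfile.TemporaryDirectory() as fake_tree:
    before = running_from_build_dir(fake_tree)
    # Simulate what running configure leaves behind.
    with open(os.path.join(fake_tree, "config.status"), "w") as marker:
        marker.write("#! /bin/sh\n")
    after = running_from_build_dir(fake_tree)

print(before, after)
```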

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 16 04:53:16 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 22:53:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>

[?!ng]
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

-1 on doing that when they're printed (although I probably misunderstand
what you mean there).

+1 for changing repr() as suggested.

-0 on generalizing to \a \b \f \v too (I've never used one of those in a
string literal in my life, so would be more baffled by seeing one come back
than I would the octal equivalent).

I would also be +1 on using hex escapes instead of octal (I grew up on 36-
and 60-bit machines, but that was the last time octal looked *natural*!).
Octal and hex escapes both consume 4 characters, so I can't imagine what
octal has going for it in the 21st century <wink>.
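
The length claim is easy to confirm. This is a small check assuming a modern Python 3 interpreter, using raw strings to count the characters of the escape as typed.

```python
# The escapes as they would be typed, kept literal with raw strings.
octal_form = r'\012'
hex_form = r'\x0a'

# Both spellings denote the same single character.
same_char = ('\012' == '\x0a' == '\n')

print(len(octal_form), len(hex_form), same_char, '\377' == '\xff')
```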

377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim


PS:  Note that C doesn't define what numerical values \a etc have, just
that:

    Each of these escape sequences shall produce a unique
    implementation-defined value which can be stored in a single
    char object. The external representations in a text file need
    not be identical to the internal representations, and are
    outside the scope of this International Standard.

The current method does have the advantage of extreme clarity.




From guido at python.org  Tue Jan 16 05:08:46 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:08:46 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 15 Jan 2001 16:24:54 PST."
             <20010115162454.D3864@ActiveState.com> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>  
            <20010115162454.D3864@ActiveState.com> 
Message-ID: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>

Looking at the code (in _portable_fseek()) that uses TELL64, I don't
understand why it can't use fgetpos().  That code is used only when
fpos_t -- the type used by fgetpos() and fsetpos() -- is 64-bit.

Trent, you wrote that code.  Why wouldn't this work just as well?

(your code):
			if ((pos = TELL64(fileno(fp))) == -1L)
				return -1;
(my suggestion):
			if (fgetpos(fp, &pos) != 0)
				return -1;

It can't be because fgetpos() doesn't exist or is otherwise unusable,
because the SEEK_CUR case uses it.

We also know that offset is 64-bit capable (the #if around the
declaration of _portable_fseek() ensures that).

I would even go as far as to collapse the entire switch as follows:

	fpos_t pos;
	switch (whence) {
	case SEEK_END:
		/* do a "no-op" seek first to sync the buffering so that
		   the low-level tell() can be used correctly */
		if (fseek(fp, 0, SEEK_END) != 0)
			return -1;
		/* fall through */
	case SEEK_CUR:
		if (fgetpos(fp, &pos) != 0)
			return -1;
		offset += pos;
		break;
	/* case SEEK_SET: break; */
	}
	return fsetpos(fp, &offset);
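
The shape of that collapsed switch, resolving every whence to an absolute position and then doing one absolute seek, can be tried out in Python. This is only an analogy under stated assumptions: it uses Python file objects rather than fgetpos()/fsetpos(), and the name portable_seek is hypothetical.

```python
import io
import os


def portable_seek(fp, offset, whence=os.SEEK_SET):
    """Turn (offset, whence) into an absolute offset, then seek once."""
    if whence == os.SEEK_END:
        # Move to the end first so tell() reports the file size.
        fp.seek(0, os.SEEK_END)
        offset += fp.tell()
    elif whence == os.SEEK_CUR:
        offset += fp.tell()
    # SEEK_SET needs no adjustment.
    fp.seek(offset, os.SEEK_SET)
    return fp.tell()


buf = io.BytesIO(b"0123456789")
from_end = portable_seek(buf, -3, os.SEEK_END)
byte_read = buf.read(1)
from_cur = portable_seek(buf, 2, os.SEEK_CUR)
from_set = portable_seek(buf, 4)
print(from_end, byte_read, from_cur, from_set)
```

In the C version the SEEK_END case falls through into the SEEK_CUR case; the sketch spells out the same two steps in each branch because Python has no fall-through.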

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 05:13:40 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:13:40 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:53:16 EST."
             <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> 
Message-ID: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>

> [?!ng]
> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> -1 on doing that when they're printed (although I probably misunderstand
> what you mean there).

Ping was using imprecise language here -- he meant repr() and "printed
at the command line prompt."

> +1 for changing repr() as suggested.
> 
> -0 on generalizing to \a \b \f \v too (I've never used one of those in a
> string literal in my life, so would be more baffled by seeing one come back
> than I would the octal equivalent).
> 
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).

Me too.  One summer vacation while in college I had nothing better to
do than decode the Pascal runtime system for the University's CDC-6600
from an octal dump into assembly.  Learned lots!

> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Originally, using \x for these was impractical (at least) because of
the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
of the \x escape.  Now we've fixed this, I agree.

> 377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim
> 
> 
> PS:  Note that C doesn't define what numerical values \a etc have, just
> that:
> 
>     Each of these escape sequences shall produce a unique
>     implementation-defined value which can be stored in a single
>     char object. The external representations in a text file need
>     not be identical to the internal representations, and are
>     outside the scope of this International Standard.
> 
> The current method does have the advantage of extreme clarity.

Python doesn't support non-ASCII machines, like the C standard
(pretends to).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 16 05:26:13 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:26:13 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:28:38PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com>
Message-ID: <20010115232613.B12166@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > > So... i'm submitting a patch that causes the three most common
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> > standard escape set (hmmm...am I missing any?)
> 
> You missed \f [*].  Unclear to me whether it's a good idea to add the
> lesser-known ones; they are just as likely binary gobbledegook rather
> than what their escapes stand for.
> 
> [*] http://www.python.org/doc/current/ref/strings.html

Truth is, Guido, I'm kind of iffy about whether there'd be a gain in
clarity myself.  But I find I'm rather attached to the idea of
maintaining strictest possible symmetry between what Python handles on
input and what it emits on output.

So unless we think adding \f, \v, \b, and \a to the special set would
actually produce a *loss* of clarity relative to octal gibberish (!),
I say do 'em all.  Aesthetically, that feels to me like the right thing, 
and the *Pythonic* thing, to do here.

Have I erred in my intuition, O BDFL?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862



From nas at arctrix.com  Mon Jan 15 22:45:28 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 13:45:28 -0800
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115232613.B12166@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 15, 2001 at 11:26:13PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com> <20010115232613.B12166@thyrsus.com>
Message-ID: <20010115134528.B6193@glacier.fnational.com>

On Mon, Jan 15, 2001 at 11:26:13PM -0500, Eric S. Raymond wrote:
> [...] I find I'm rather attached to the idea of maintaining
> strictest possible symmetry between what Python handles on
> input and what it emits on output.
> 
> So unless we think adding \f, \v, \b, and \a to the special set would
> actually produce a *loss* of clarity relative to octal gibberish (!),
> I say do 'em all.

Symmetry is good but I bet most people who would see \f, \v, \b,
\a wouldn't have entered those characters using escapes.  Most
likely those characters would have been read from a binary file.

That said, I don't really mind either way.

  Neil



From tim.one at home.com  Tue Jan 16 05:43:06 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 23:43:06 -0500
Subject: [Python-Dev] Whitespace normalization
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOOIIAA.tim.one@home.com>

You may have noticed that I checked in changes to most of the modules in the
top level of Lib yesterday (Sunday).  This is part of a Crusade that was
supposed to happen before 2.0a1, but got dropped on the floor then due to
misunderstandings:  make the Python code we distribute adhere to Guido's
style guide (4-space indents, no hard tabs), + clean up minor whitespace
nits (no stray blank lines at the ends of files, no trailing whitespace on
lines, last line of the file should end with a newline).
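
The nits listed above can be checked mechanically.  What follows is a
minimal sketch, not reindent.py itself: the function name
whitespace_nits and its report format are made up for illustration, it
only reports problems rather than rewriting anything, and it flags a
hard tab anywhere in a line, which is stricter than an indentation-only
check.  It is written for a current Python 3 interpreter.

```python
def whitespace_nits(text):
    """Return a list of (line_number, description) pairs for whitespace nits.

    Hypothetical helper for illustration; it reports, it does not fix.
    """
    if not text:
        return []
    nits = []
    ends_with_newline = text.endswith("\n")
    lines = text.split("\n")
    # A text ending in "\n" splits into [..., ""]; drop that empty tail piece.
    if ends_with_newline:
        lines.pop()
    for number, line in enumerate(lines, start=1):
        if "\t" in line:
            nits.append((number, "hard tab"))
        if line != line.rstrip():
            nits.append((number, "trailing whitespace"))
    if not ends_with_newline:
        nits.append((len(lines), "last line lacks a newline"))
    if lines[-1].strip() == "":
        nits.append((len(lines), "stray blank line at end of file"))
    return nits
```

Calling whitespace_nits(open(path).read()) on a module gives a list to
eyeball before (or after) running the real tool.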

It would be nice if people cleaned up their code this way too; I'm not going
to go thru the entire distribution doing this.  So, if you give a rip, pick
a directory or some modules you're fond of, and clean 'em up.

The program Tools/scripts/reindent.py does all of the above for you, so it's
not hard.  But it takes some care in two areas, which is why I did the top
level of Lib one file at a time by hand, and studied diffs by eyeball before
checking in any changes:

+ It's unlikely but possible that some program file *depends* on trailing
whitespace.  That plain sucks (it's *going* to break sooner or later), but
reindent.py can't help you there.

+ While reindent should never otherwise damage program logic, very strange
commenting or docstring styles may get mangled by it, making code and/or
docs hard to read.  reindent works very hard to do a good job on that, and
indeed I found no need to make manual changes to anything it did in the top
level of Lib.  But check anyway.  Especially some of the very oldest modules
are littered with ugly stuff like

    #

all over the place, from back when nobody had an editor smart enough to skip
over preceding blank lines when suggesting indentation for the current line.
Then again, maybe we should just drop the Irix5 directory <wink>.

voice-in-the-wilderness-ly y'rs  - tim




From esr at thyrsus.com  Tue Jan 16 05:43:24 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:43:24 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 15, 2001 at 10:53:16PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>
Message-ID: <20010115234324.C12166@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).
> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Tim, on the level of aesthetic preference I'm totally with you.  I've always
found octal really ugly myself.  Hex fits my brain better; somehow I find it
easier to visualize the bit patterns from.

Sadly, there are so many other related ways in which Python
intelligently follows C/Unix conventions that I think changing to a default
of hex escapes rather than octal would violate the Rule of Least
Surprise.

One of the things I like about Python is precisely its conservatism in
areas like string escapes, that Guido refrained from inventing new OS
APIs or new conventions for things like string escapes in places where
Unix and C did them in a well-established and reasonable way.  He didn't
make the mistake, all too typical in academic languages, of confusing
novelty with value...

This conservatism is valuable because it frees the C-experienced
programmer's mind from having to think about where the language is
trivially different, so he can concentrate on where it's importantly
different.  It's worth maintaining.

On the other hand, the change would mesh well with the Unicode support.
Hmm.  Tough call.  I could go either way, I guess.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat



From tim.one at home.com  Tue Jan 16 06:07:16 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 16 Jan 2001 00:07:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115234324.C12166@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>

[Eric]
> Tim, on the level of aesthetic preference I'm totally with you.
> I've always found octal really ugly myself.  Hex fits my brain
> better;  somehow I find it easier to visualize the bit patterns from.
>
> Sadly, there are so many other related ways in which Python
> intelligently follows C/Unix conventions that I think changing to
> a default of hex escapes rather than octal would violate the Rule
> of Least Surprise.
>
> ... [and skipping nice stuff I *do* agree with <wink>] ...

The saving grace here is that repr() is a form of ASCII dump.  C has nothing
to say about that, while last time I used Unix it was real easy to get dumps
in hex (and indeed that's what everyone I knew routinely did).  I expect
that od retains both its name and its octal defaults on most systems simply
due to inertia.  An octal dump would be infinitely surprising on Windows
(I'm not sure I can even get one without writing it myself).

Do people actually use octal dumps on Unices anymore?  I'd be surprised, if
they're running on power-of-2 boxes.  Defaults aren't conventions when
*everyone* overrides them, they're just old and in the way.

takes-one-to-know-one<wink>-ly y'rs  - tim




From ping at lfw.org  Tue Jan 16 06:27:33 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:27:33 -0800 (PST)
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101152126120.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Thomas Wouters wrote:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.

I like all of these suggestions.  Go for it!
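
For the first proposal, the intended behaviour can be sketched in pure
Python as a wrapper.  The name strftime_now is hypothetical and exists
only for this illustration; the actual patch under discussion would
change the time module itself.  The sketch assumes "now" means local
time, as confirmed elsewhere in the thread.

```python
import time

def strftime_now(fmt, t=None):
    """Format time tuple t with fmt; default to the current local time."""
    if t is None:
        # The explicit spelling the proposal wants to make optional.
        t = time.localtime(time.time())
    return time.strftime(fmt, t)
```

With this, strftime_now("%H:%M") replaces the three nested calls, while
passing an explicit tuple still works as before.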


-- ?!ng




From esr at thyrsus.com  Tue Jan 16 06:31:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 00:31:14 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 16, 2001 at 12:07:16AM -0500
References: <20010115234324.C12166@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>
Message-ID: <20010116003114.A12365@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Do people actually use octal dumps on Unices anymore? 

Well, we do when we momentarily forget to give od(1) the -x escape :-)

This so annoyed me that back around 1983 I wrote my own hex dumper
specifically to emulate the 16-hex-bytes-with-midpage-gutter-and-ASCII-
over-on-the-right-side format that CP/M used and DOS inherited.  It's
still available at <http://www.tuxedo.org/~esr/hex/>.
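
The dump layout described above (16 hex bytes per row, a gutter in the
middle, ASCII on the right) is easy to approximate.  This is only a
sketch of that general layout in Python 3, with a made-up function name
and column spacing; it makes no claim to match the output of the tool
at the URL above.

```python
def hexdump(data, width=16):
    """Return a hex dump of a bytes object as a multi-line string."""
    rows = []
    half = width // 2
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        cells = ["%02x" % byte for byte in chunk]
        # Pad a short final row so the ASCII column stays aligned.
        cells += ["  "] * (width - len(chunk))
        # Two spaces between the halves form the mid-row gutter.
        hex_part = " ".join(cells[:half]) + "  " + " ".join(cells[half:])
        # Show printable ASCII as-is and everything else as a dot.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append("%08x  %s  %s" % (offset, hex_part, ascii_part))
    return "\n".join(rows)
```

print(hexdump(open(path, "rb").read())) gives a quick hex view of a
file without reaching for od.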

Do you know the history on this?  C speaks octal because a bunch of 
mode fields in the PDP-11 instruction word were three bits wide.
Time was it was actually useful to have the output from (say)
core files chunk that way. But I haven't seen an octal code dump 
in over a decade, probably pushing fifteen years now.  
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In the absence of any evidence tending to show that possession 
or use of a 'shotgun having a barrel of less than eighteen inches 
in length' at this time has some reasonable relationship to the 
preservation or efficiency of a well regulated militia, we cannot 
say that the Second Amendment guarantees the right to keep and bear 
such an instrument. [...] The Militia comprised all males 
physically capable of acting in concert for the common defense.  
        -- Majority Supreme Court opinion in "U.S. vs. Miller" (1939)



From ping at lfw.org  Tue Jan 16 06:33:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:33:42 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > -1 on doing that when they're printed (although I probably misunderstand
> > what you mean there).
> 
> Ping was using imprecise language here -- he meant repr() and "printed
> at the command line prompt."

Yes, i referred to "when strings are printed and repr()ed" as two cases
because both string_print() and string_repr() have to be changed.

(Side question: when are *_print() and *_repr() ever different, and why?)

> Originally, using \x for these was impractical (at least) because of
> the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> of the \x escape.  Now we've fixed this, I agree.

Oh, now i understand.  Good point.  I'll update the patch to do hex.
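
The behaviour being converged on here (natural form for \n, \r and \t,
hex escapes for everything else unprintable) can be sketched in Python
rather than in the C of string_repr().  The function name and the NATURAL
table are hypothetical, the sketch always uses single quotes, and it
assumes 8-bit characters only; it illustrates the idea and is not the
patch.  It is written for Python 3.

```python
# Characters the proposal would show in their natural escaped form.
NATURAL = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}

def escape_like_proposal(s):
    """Return a quoted, escaped form of s (characters assumed < 256)."""
    out = []
    for ch in s:
        if ch in NATURAL:
            out.append(NATURAL[ch])
        elif ch in ("\\", "'"):
            # Keep the result unambiguous: escape backslash and the quote.
            out.append("\\" + ch)
        elif 32 <= ord(ch) < 127:
            out.append(ch)
        else:
            # Hex instead of octal: always exactly two digits after \x.
            out.append("\\x%02x" % ord(ch))
    return "'" + "".join(out) + "'"
```

Under this scheme a form feed comes back as \x0c rather than \014, and
a newline comes back the way it was typed.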

0xdeadbeef-ly yours,


-- ?!ng




From fredrik at effbot.org  Tue Jan 16 08:11:38 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 08:11:38 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <00b201c07f8b$93996820$e46940d5@hagrid>

thomas wrote:
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

where "now" is local time, I assume?

since you're assuming a time zone, you could make it accept
an integer as well...

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)

same here.

</F>




From thomas at xs4all.net  Tue Jan 16 08:18:38 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 08:18:38 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <00b201c07f8b$93996820$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 16, 2001 at 08:11:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid>
Message-ID: <20010116081838.N1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:
> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)

> where "now" is local time, I assume?

Yes. See the patch I'll upload later today (meetings first, grrr)

> since you're assuming a time zone, you could make it accept
> an integer as well...

Could, yes... I'll include it in the 2nd revision of the patch, it can be
rejected (or accepted) separately.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 16 09:22:11 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 09:22:11 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <20010116081838.N1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 08:18:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>
Message-ID: <20010116092211.O1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:18:38AM +0100, Thomas Wouters wrote:
> On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:

> > >   timestr = time.strftime("<format>")

> > since you're assuming a time zone, you could make it accept
> > an integer as well...

> Could, yes... 

Actually, on second thought, let's not, not just yet anyway. Doing that for
all functions in the time module would continue to pollute the already toxic
waters of a C API translated into Python :P Who knows what 'ctime' stands
for, anyway ? And 'asctime' ? How can we expect Python programmers who think
'C' is a high note or average grade, to understand how the time module is
supposed to be used ? :)

We now have:
time() -- return current time in seconds since the Epoch as a float
gmtime() -- convert seconds since Epoch to UTC tuple
localtime() -- convert seconds since Epoch to local time tuple
asctime() -- convert time tuple to string
ctime() -- convert time in seconds to string
mktime() -- convert local time tuple to seconds since Epoch
strftime() -- convert time tuple to string according to format specification

where asctime and ctime are basically wrappers around strftime, and would do
the exact same thing if they both accepted tuples and floats. 

I think we should have something like:
time() -- current time in float
timetuple() -- current (local) time in timetuple
tuple2time(tuple) -- tuple -> float
time2tuple(float, tz=local) -- float -> tuple using timezone tz
stringtime(time=now, format="ctimeformat") -- convert time value to string

Those are just working names, to make the point, I don't have time to think
up better ones :) I'm not sure if the timezone support in the above list is
extensive enough, mostly because I hardly use timezones myself. Also,
tuple2time() could be merged with time(), and likewise for time2tuple() and
timetuple(). I think keeping strftime() and maybe ctime() for ease-of-use is
a good idea, but the rest could eventually be deprecated.
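
To make the shape of that list concrete, here is a rough sketch of the
working names as thin wrappers over the existing time module, written
for Python 3.  Several details are assumptions: the "tz" argument is
reduced to a boolean utc flag, the default format only approximates
what ctime() prints, and the parameter names are invented.

```python
import time

def timetuple():
    """Current local time as a time tuple."""
    return time.localtime()

def tuple2time(t):
    """Local time tuple -> seconds since the Epoch (float)."""
    return time.mktime(t)

def time2tuple(seconds, utc=False):
    """Seconds since the Epoch -> time tuple, local unless utc is true."""
    return time.gmtime(seconds) if utc else time.localtime(seconds)

def stringtime(when=None, fmt="%a %b %d %H:%M:%S %Y"):
    """Format a float or a time tuple; default to the current time."""
    if when is None:
        when = time.time()
    if isinstance(when, (int, float)):
        when = time.localtime(when)
    return time.strftime(fmt, when)
```

Accepting both floats and tuples in stringtime() is what lets it stand
in for asctime(), ctime() and strftime() at once.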

Off-to-important-meetings-*cough*-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Tue Jan 16 09:30:28 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 09:30:28 +0100
Subject: [Python-Dev] unit testing bake-off
References: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <01ba01c07f96$967b7870$e46940d5@hagrid>

Tim Peters wrote:
> At least you, Jeremy and Fredrik have tried them, and
> if that's all there can't be a tie <wink>.

let me guess:

    Jeremy: PyUnit
    Andrew: unittest
    Fredrik: unittest

(I find pyunit a bit unpythonic, and both overengineered
and underengineered at the same time...  hard to explain,
but I strongly prefer unittest)

> I would agree this is not an ideal decision procedure.

well, any decision procedure that comes up with what I
want just has to be ideal ;-)

</F>




From andy at reportlab.com  Tue Jan 16 10:20:45 2001
From: andy at reportlab.com (Andy Robinson)
Date: Tue, 16 Jan 2001 09:20:45 -0000
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115204701.11972EA6B@mail.python.org>
Message-ID: <PGECLPOBGNBNKHNAGIJHAEELCGAA.andy@reportlab.com>

> Subject: Re: [Python-Dev] unit testing bake-off
> From: Guido van Rossum <guido at python.org>
> Date: Mon, 15 Jan 2001 14:17:27 -0500
> 
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...  Certainly I don't think I'll get to this myself before the
> conference.
> 
> How about the following though: talking of low-hanging fruit, Tim's
> doctest module is an excellent thing even if it isn't a unit testing
> framework!  (I found this out when I played with it -- it's real easy
> to get used to...)
> 
> Would anyone object against Tim checking this in?  Since it isn't a
> contender in the unit test bake-off, it shouldn't affect the outcome
> there at all.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

I think it should definitely go in.  Ditto with whatever testing
framework and documentation tools (pydoc etc.) shortly emerge
as "best of breed".  I spend my time on corporate consulting
projects, and saying things like "Python has standard tools for
unit testing and documentation" is even better than saying 
"We have standard tools for unit testing and documentation".

BTW, ReportLab has recently adopted PyUnit's unittest.py
It feels a bit Java-like to me - a few more lines of code
than needed - but it certainly works.   One key feature is
aggregating test suites; a big app we installed on a
customer site can run the test suite for itself, the ReportLab
library (whose test suite we are just getting to work on)
and four or five dependent utilities; another is that
people have heard of JUnit.
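
The suite aggregation mentioned above looks roughly like this with the
unittest module as it ships in a current Python 3.  The two test
classes are placeholders standing in for an application's tests and a
library's tests; nothing here is taken from ReportLab's actual suite.

```python
import unittest

class AppTests(unittest.TestCase):
    """Placeholder for an application's own tests."""
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

class LibraryTests(unittest.TestCase):
    """Placeholder for a dependent library's tests."""
    def test_upper(self):
        self.assertEqual("abc".upper(), "ABC")

def combined_suite():
    """Aggregate several test cases into one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (AppTests, LibraryTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return suite

def run_combined():
    """Run the aggregated suite and return the result object."""
    return unittest.TextTestRunner(verbosity=0).run(combined_suite())
```

Each package can expose its own combined_suite(), and an installer-level
script can nest those suites in turn.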

Just my 2p worth,
Andy Robinson




From tony at lsl.co.uk  Tue Jan 16 10:47:01 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 09:47:01 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
         outside of Python)
In-Reply-To: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>
Message-ID: <003901c07fa1$46e10c70$f05aa8c0@lslp7o.int.lsl.co.uk>

In the context of my starting doc strings in an Emacs Lisp manner,
Ka-Ping Yee said:
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

and since Guido replied
> I'm with Ping.  None of the examples in the style guide start the
> docstring with the function name.  Almost none of the standard library
> modules start their module docstring with the module name (codecs is
> an exception, but I didn't write it :-).

I shall indeed stop (of course, my habit started before we HAD
documentation tools, and if we're going to browse things with pydoc, et
al, then there's no need for it). To be honest, it's the answer I
expected.

Oh dear, another item for my TO DO list (i.e., remove the offending
nits). Still, if it's only me it's hardly high impact!

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Which is safer, driving or cycling?
Cycling - it's harder to kill people with a bike...
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)




From tony at lsl.co.uk  Tue Jan 16 11:13:31 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 10:13:31 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
         outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>
Message-ID: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>

I mentioned a "spurious"
>	The system cannot find the path specified.

on NT, and Ka-Ping Yee said:
> Thanks for the NT testing.  That's funny -- i put in a special case
> for Windows to avoid messages like the above a couple of days ago.
> How recently did you download pydoc.py?  Does your copy contain:
>
>     if hasattr(sys, 'winver'):
>         return lambda text: tempfilepager(text, 'more')

Hmm. I downloaded it when I read the email message announcing it, which
was yesterday some time. But it doesn't look like the lines you mention
are there - I'll try re-downloading...

...I've redownloaded the files from http://www.lfw.org/python/pydoc.py,
etc., and done a grep for hasattr within them. There's no check such as
the one you mention, so I guess it's "download impedance".

> So you can see what i'm up to, here's my current to-do list:
>
>     make boldness optional (only if using more/less?  only Unix?)

probably sensible. By the way, I don't get boldness on the NT box - any
chance (he says, not intending to help *at all* in doing it!) of it
happening there as well? (or would that depend on what curses support is
built into the Python?)

>     document a .py file given on the command line

also allow for a directory module (i.e., something with __init__.py in
it) given on the command line?

>     write a better htmlrepr (\n should look special, max
>     length limit, etc.)

yes, but these things can always get better - the fact it's working
allows for improoooovement down the line.

>     generate HTML index from precis and __path__ and package

a neat idea - definitely Good Stuff!

>     contents list

well, I always do these, so I'm for this one as well

>     have help(...) produce a directory of available things to
>     ask for help on

bouncy fun!

>     Windows and Mac testing

I'm running Windows 98 with Python 1.5.2 at home, and will willingly try
it out on that (after all, it's not a very big download) - although it
might sometimes take a day or two to get round to it (for instance, I
haven't yet done so!). But I suspect I shan't be a very demanding
user...

>     default to HTTP mode on GUI platforms?  (win, mac)
>
> The ones marked with + i consider done.  Feel free to comment on
> or suggest priorities for the others; in particular, what do you
> think of the last one?  The idea is that double-clicking on
> pydoc.py in Windows or MacOS could launch the server and then open
> the localhost URL using webbrowser.py to display the documentation
> index.  Should it do this by default?

I'll leave that to better designers than myself (although if one is to
*have* a double click action, that seems sensible to me).

(looks up webbrowser.py - ah, a 2.0 module). Personally, I'd also like
to have the option of having a "mini-browser" supported directly,
perhaps in Tkinter, so I don't need to start up a whole web browser. But
again I may be odd in that wish (I can't remember what IDLE does).

Oh - that also means "integrate into IDLE" presumably goes on at least a
WishList as well...

Other ideas:
* command line switch to *output* HTML to a file (i.e., documentation
generation) (presumably something like "-o <name>.html", where the
"html" indicates the output format - an alternative being "txt")
* if I ever finish the docutils effort (I should be getting back to it
soon) then use that to format the texts (this would mean I need not
worry about the "frontend" to docutils too much, since pydoc is already
doing so much). Or maybe the docutils tool should be importing pydoc...

Tibs (must do some (paid) work now!)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)





From mal at lemburg.com  Tue Jan 16 11:18:44 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:18:44 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <3A642004.F6197E86@lemburg.com>

Thomas Wouters wrote:
> 
> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'lets fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :)
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

+1 all the way -- though these days I tend not to use the
time module anymore. mxDateTime already does everything I want
and there date/time values are objects rather than Python integers
or tuples... ok, I'm just showing off a little :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Tue Jan 16 11:32:21 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:32:21 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <3A642335.82358B02@lemburg.com>

Minor nit about this idea: it makes decoding repr() style
strings harder for external tools and it could cause breakage
(e.g. if "\n" is used by the encoding for some other purpose).

BTW, since there are a gazillion ways to encode strings into
7-bit ASCII, why not use the new codec design to add additional
output schemes for 8-bit strings ?!

Strings have an .encode() method as well...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From ping at lfw.org  Tue Jan 16 11:37:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:37:42 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101160236330.5846-100000@skuld.kingmanhall.org>

Before somebody decides to shoot us for spamming both lists,
i'm taking this thread off of python-dev and solely to doc-sig.
Please continue further discussion there...


-- ?!ng




From ping at lfw.org  Tue Jan 16 11:47:02 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:47:02 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Ka-Ping Yee wrote:
> On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > Originally, using \x for these was impractical (at least) because of
> > the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> > of the \x escape.  Now we've fixed this, I agree.
> 
> Oh, now i understand.  Good point.  I'll update the patch to do hex.

I assume you would like Unicode strings to do the same (\n, \t, \r,
and \xff rather than \377).

Guido, do you have a Pronouncement on \v, \f, \b, \a?

By the way, why do Unicode escapes appear in capitals?

    >>> u'\uface'
    u'\uFACE'

(If someone tells me that there happens to be a picture of a face at
that code point, i'll laugh.  Is there a cow at \uBEEF?)

Does anyone care that \x will be followed by lowercase and \u by uppercase?

I noticed that the tutorial claims Unicode strings can be str()-ified
and will encode themselves using UTF-8 as default.  But this doesn't
actually work for me:

    >>> us = u'\uface'
    >>> us
    u'\uFACE'
    >>> str(us)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode('UTF-8')
    '\xef\xab\x8e'

Assuming i have understood this correctly, i have submitted a patch
to correct tut.tex.
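
For readers following along today, a minimal sketch of the same
behaviour, assuming a Python 3 interpreter (where every str is Unicode
and encoded data lives in bytes); the variable names are illustrative
only:

```python
# Assumes Python 3, where "\uface" is already a Unicode string.
us = "\uface"

# Encoding to ASCII fails, just as in the session above.
try:
    us.encode("ascii")
    ascii_ok = True
except UnicodeError:
    ascii_ok = False

# Naming UTF-8 explicitly works and yields the same three bytes.
utf8_bytes = us.encode("utf-8")
print(ascii_ok, utf8_bytes)
```

The point is only that the codec has to be named (or be the default)
for non-ASCII text to encode.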


-- ?!ng





From bckfnn at worldonline.dk  Tue Jan 16 11:52:10 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 16 Jan 2001 10:52:10 GMT
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <3a642768.6426631@smtp.worldonline.dk>

[Ping]

>I don't know whether this is going to be obvious or controversial,
>but here goes.  Most of the time we're used to seeing a newline as
>'\n', not as '\012', and newlines are typed in as '\n'.
>
>A newcomer to Python is likely to do
>
>    >>> 'hello\n'
>    'hello\012'
>
>and ask "what's \012?" -- whereupon one has to explain that it's an
>octal escape, that 012 in octal equals 10, and that chr(10) is
>newline, which is the same as '\n'.  You're bound to run into this,
>and you'll see \012 a lot, because \n is such a common character.
>Aside from being slightly more frightening, '\012' also takes up
>twice as many characters as necessary.
>
>So... i'm submitting a patch that causes the three most common
>special whitespace characters, '\n', '\r', and '\t', to appear in
>their natural form rather than as octal escapes when strings are
>printed and repr()ed.

I like it, because it removes yet another difference between Python and
Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
and \r.
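
A rough sketch of the display rule being proposed, written in Python 3
rather than as the C patch itself.  The helper name show() and the
NATURAL table are hypothetical, and the use of hex escapes for the
remaining control characters is an assumption taken from elsewhere in
this thread:

```python
# Hypothetical helper, not the actual patch: it only illustrates the rule.
NATURAL = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}

def show(s):
    # Quote s using natural escapes for newline, CR and tab, and a
    # two-digit hex escape for any other non-printable 8-bit character.
    out = []
    for ch in s:
        if ch in NATURAL:
            out.append(NATURAL[ch])
        elif ch in ("'", "\\"):
            out.append("\\" + ch)
        elif " " <= ch <= "~":
            out.append(ch)
        else:
            out.append("\\x%02x" % ord(ch))
    return "'" + "".join(out) + "'"

print(show("hello\n"))
print(show("form\x0cfeed"))
```

Adding \b or \f to the special set, as Jython does, would just mean
two more entries in the table.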

regards,
finn



From esr at thyrsus.com  Tue Jan 16 11:53:00 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 05:53:00 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <3A642004.F6197E86@lemburg.com>; from mal@lemburg.com on Tue, Jan 16, 2001 at 11:18:44AM +0100
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com>
Message-ID: <20010116055300.C12847@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> +1 all the way -- though these days I tend not to use the
> time module anymore. mxDateTime already does everything I want
> and there date/time values are objects rather than Python integers
> or tuples... ok, I'm just showing off a little :)

mxDateTime is on my short list of "why isn't this in the Python library
already?"  Has it ever been discussed?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

You need only reflect that one of the best ways to get yourself 
a reputation as a dangerous citizen these days is to go about 
repeating the very phrases which our founding fathers used in the 
great struggle for independence.
	-- Attributed to Charles Austin Beard (1874-1948)



From mal at lemburg.com  Tue Jan 16 12:18:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 12:18:24 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com> <20010116055300.C12847@thyrsus.com>
Message-ID: <3A642E00.BD330647@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > +1 all the way -- though these days I tend not to use the
> > time module anymore. mxDateTime already does everything I want
> > and there date/time values are objects rather than Python integers
> > or tuples... ok, I'm just showing off a little :)
> 
> mxDateTime is on my short list of "why isn't this in the Python library
> already?"  Has it ever been discussed?

Yes. I'd rather keep it separate from the standard dist for
various reasons. One of these reasons is that I will be moving
the mx tools into a new packaging scheme built on distutils --
installing it should then boil down to a simple RPM install
or maybe a "python setup.py install" thanks to distutils. The
package will then become a subpackage of the mx package.

BTW, I see distutils as strong argument for *not* including
more exotic packages in Python's stdlib. If this catches on,
I expect that together with the Vaults we are not far away
from having our own CPAN style archive of add-on packages.
I also expect the commercial vendors like ActiveState et al.
to take care of wrapping SUMO distributions of Python and
the existing add-ons.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 16 12:20:18 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <3a642768.6426631@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 16, 2001 at 10:52:10AM +0000
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>
Message-ID: <20010116062018.A12935@thyrsus.com>

Finn Bock <bckfnn at worldonline.dk>:
> I like it, because it removes yet another difference between Python and
> Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> and \r.

This is an argument for adding \b and \f to the special set in
CPython.  If the BDFL looks benignly on adding \v and \a, those
should go into Jython's special set too.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address



From fredrik at pythonware.com  Tue Jan 16 12:37:10 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 16 Jan 2001 12:37:10 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>
Message-ID: <03eb01c07fb0$aaaa19e0$0900a8c0@SPIFF>

ping wrote:
> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'
> 
> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

iirc, 0xFACE and 0xBEEF are part of the CJK and
Hangul spaces.  not sure 0xFACE is assigned, but
0xBEEF glyph looks like a ribcage with four legs...

you'll find faces at 0x263A etc.
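
One way to check claims like these is the unicodedata module.  A small
sketch, assuming Python 3; the names printed depend on the Unicode
database shipped with the interpreter:

```python
import unicodedata

# Look up the category and name of the code points mentioned above.
for ch in ("\u263a", "\ubeef"):
    print("U+%04X" % ord(ch),
          unicodedata.category(ch),
          unicodedata.name(ch, "<unnamed>"))

smiley = unicodedata.name("\u263a")
hangul = unicodedata.name("\ubeef")
```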

</F>




From skip at mojam.com  Tue Jan 16 14:09:51 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 07:09:51 -0600 (CST)
Subject: [Python-Dev] bummer - regsub/regex no longer in module index
Message-ID: <14948.18463.971334.401426@beluga.mojam.com>

I am now getting deprecation warnings about regsub so I decided to start
replacing it with more zeal than I had previously.  First thing I wanted to
replace were some regsub.split calls.  I went to the module index to look up
the description but regsub was nowhere to be found.  (I know, I know.  I can
use pydoc.)

Still... how about continuing to include deprecated modules in the library
reference manual but in a separate Deprecated Modules section and annotate
them as such in the module index?
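
For anyone doing the same migration, a minimal sketch of a split done
with the re module instead.  The sample string and pattern are made
up, and the pattern is assumed to have been translated to re syntax
already:

```python
import re

# Split on commas, swallowing any whitespace around them.
fields = re.split(r"\s*,\s*", "spam, eggs ,ham")
print(fields)
```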

Skip



From guido at python.org  Tue Jan 16 14:44:01 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:44:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 08:11:38 +0100."
             <00b201c07f8b$93996820$e46940d5@hagrid> 
References: <20010116004930.L1005@xs4all.nl>  
            <00b201c07f8b$93996820$e46940d5@hagrid> 
Message-ID: <200101161344.IAA04513@cj20424-a.reston1.va.home.com>

> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)
> 
> where "now" is local time, I assume?
> 
> since you're assuming a time zone, you could make it accept
> an integer as well...

What would the integer mean?

> > - Making the time in time.asctime and time.ctime optional, defaulting to
> >   'now', so you can just call 'time.ctime()' without having to pass
> >   time.time() (which are about half the calls in my own code :)
> 
> same here.

Same what here?  "now" == local time, sure.  But accept an integer?
It already accepts an integer!
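
A short sketch of what the proposed defaults look like in use,
assuming an interpreter where the time argument is optional (true of
current Python versions); the format string is just an example:

```python
import time

fmt = "%Y-%m-%d %H:%M:%S"

# With the default, the time argument can simply be left out...
now_str = time.strftime(fmt)
# ...instead of spelling out the current local time by hand.
explicit = time.strftime(fmt, time.localtime(time.time()))

# ctime() and asctime() likewise default to "now".
stamp = time.ctime()
asc = time.asctime()

# ctime() also accepts seconds since the epoch, as noted above.
epoch_str = time.ctime(0)
print(now_str, stamp, epoch_str)
```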

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 14:55:01 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:55:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 09:22:11 +0100."
             <20010116092211.O1005@xs4all.nl> 
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>  
            <20010116092211.O1005@xs4all.nl> 
Message-ID: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>

Let's not redesign the time module API too much.  I'm all for adding
the default argument values that Thomas proposes.  Then, instead of
changing the API, we should look into a higher-level Python module.
That's how those things typically go.

Digital Creations has its own time extension type somewhere in Zope, a
bit similar to mxDateTime.  I looked into making this a standard
Python extension but quickly gave up.  The problem with these things
seems to be that it's hard to come up with a design that makes
everyone happy: some people want small objects (because they have a
lot of them around, e.g. a timestamp on almost every other object);
others want timezone support; yet others want microsecond resolution;
leap-second support; pre-Christian era support; support for
nonstandard calendars; interval arithmetic; support for dates without
times or times without dates...

Python could use a better time type, but we'll have to look into which
requirements make sense for a generalized type, and which don't.  I
fear that a committee could easily pee away years designing an
interface to satisfy absolutely every wish.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:02:29 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:02:29 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 21:33:42 PST."
             <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>

> Yes, i referred to "when strings are printed and repr()ed" as two cases
> because both string_print() and string_repr() have to be changed.
> 
> (Side question: when are *_print() and *_repr() ever different, and why?)

You mean the tp_print and tp_str function slots in type objects,
right?  tp_print *should* always render exactly the same as tp_str.
tp_print is used by the print statement, not by value display at the
interactive prompt.

tp_print and tp_str have differed historically for 3rd party extension
types by accident.

So, string_print most definitely should *not* be changed -- only
string_repr!
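
The distinction is visible from Python itself.  A minimal sketch,
assuming Python 3, where print() uses str() and the interactive prompt
uses repr():

```python
s = "hello\n"

printed = str(s)   # what print writes: the raw characters
shown = repr(s)    # what the interactive prompt displays: a literal

print(len(printed), len(shown))
print(shown)
```

Only the repr() form is affected by a change to how escapes are
displayed.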

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:06:23 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:06:23 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 02:47:02 PST."
             <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101161406.JAA05153@cj20424-a.reston1.va.home.com>

> I assume you would like Unicode strings to do the same (\n, \t, \r,
> and \xff rather than \377).

Yeah.

> Guido, do you have a Pronouncement on \v, \f, \b, \a?

Practicality beats purity: these will remain octal.

> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'

Could it be just that that's what Unicode folks are expecting?

> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

I'm laughing even though I don't see pictures. :-)

> Does anyone care that \x will be followed by lowercase and \u by uppercase?

It's mildly weird, and I think hex escapes in lowercase are more
Pythonic than in upper case.

> I noticed that the tutorial claims Unicode strings can be str()-ified
> and will encode themselves using UTF-8 as default.  But this doesn't
> actually work for me:
> 
>     >>> us = u'\uface'
>     >>> us
>     u'\uFACE'
>     >>> str(us)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode('UTF-8')
>     '\xef\xab\x8e'
> 
> Assuming i have understood this correctly, i have submitted a patch
> to correct tut.tex.

Yeah, I guess that part of the tutorial was written before we changed
our minds about this. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Tue Jan 16 15:09:56 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:09:56 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 11:32:21 +0100."
             <3A642335.82358B02@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>  
            <3A642335.82358B02@lemburg.com> 
Message-ID: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>

> Minor nit about this idea: it makes decoding repr() style
> strings harder for external tools and it could cause breakage
> (e.g. if "\n" is used by the encoding for some other purpose).

Such a tool would be broken.  If it accepts string literals it should
accept all forms of escapes.

> BTW, since there are a gazillion ways to encode strings into
> 7-bit ASCII, why not use the new codec design to add additional
> output schemes for 8-bit strings ?!
> 
> Strings have an .encode() method as well...

Good idea!  This could also be used to "hexify" a string, for which
currently one of the quickest ways is still the hack

    "%02x"*len(s) % tuple(s)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:11:53 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:11:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 06:20:18 EST."
             <20010116062018.A12935@thyrsus.com> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>  
            <20010116062018.A12935@thyrsus.com> 
Message-ID: <200101161411.JAA05336@cj20424-a.reston1.va.home.com>

> Finn Bock <bckfnn at worldonline.dk>:
> > I like it, because it removes yet another difference between Python and
> > Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> > and \r.

[ESR]
> This is an argument for adding \b and \f to the special set in
> CPython.  If the BDFL looks benignly on adding \v and \a, those
> should go into Jython's special set too.

No, I think Jython should remove \b and \f.  Or the language standard
could allow implementations some freedom here (as long as the output
is a string literal).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Tue Jan 16 16:06:34 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 16 Jan 2001 10:06:34 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
References: <20010115162619.A19484@kronos.cnri.reston.va.us>
	<LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <14948.25466.698063.240902@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
 > Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
 > would agree this is not an ideal decision procedure.

  I've been using PyUNIT some, but haven't tried the Quixote unittest
module, which tells me I can't make a particularly informed
recommendation (vote, whatever).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From thomas at xs4all.net  Tue Jan 16 16:23:52 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 16:23:52 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:55:01AM -0500
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl> <20010116092211.O1005@xs4all.nl> <200101161355.IAA04802@cj20424-a.reston1.va.home.com>
Message-ID: <20010116162350.A21010@xs4all.nl>

On Tue, Jan 16, 2001 at 08:55:01AM -0500, Guido van Rossum wrote:

> Let's not redesign the time module API too much.

[snip]

Agreed.

> I fear that a committee could easily pee away years designing an
> interface to satisfy absolutely every wish.

A committee is a life form with six or more legs and no brain.
    Lazarus Long in "Time Enough For Love", by R. A. Heinlein.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Tue Jan 16 18:23:56 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 11:23:56 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net>
	<m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14948.33708.332464.107009@beluga.mojam.com>

    Michael> ... (or I'll just call it pyttyinput)

Which, like "Guido", when properly pronounced should leave your monitor
slightly moist... ;-)

Skip




From thomas at xs4all.net  Tue Jan 16 18:36:03 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 18:36:03 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <14948.33708.332464.107009@beluga.mojam.com>; from skip@mojam.com on Tue, Jan 16, 2001 at 11:23:56AM -0600
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net> <m366jf4esw.fsf@atrus.jesus.cam.ac.uk> <14948.33708.332464.107009@beluga.mojam.com>
Message-ID: <20010116183603.B2776@xs4all.nl>

On Tue, Jan 16, 2001 at 11:23:56AM -0600, Skip Montanaro wrote:

> Which, like "Guido", when properly pronounced should leave your monitor
> slightly moist... ;-)

Nono, 'Guido' should be pronounced using a hard, back-of-your-throat 'G',
more like a growl than a hiss. The less moisture the better :)

You-were-thinking-of-Centraal-Wiskunde-Instituut-(cwi.nl)-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From trentm at ActiveState.com  Tue Jan 16 19:36:29 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 10:36:29 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:08:46PM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>
Message-ID: <20010116103626.D30209@ActiveState.com>

On Mon, Jan 15, 2001 at 11:08:46PM -0500, Guido van Rossum wrote:
> 
> Trent, you wrote that code.  Why wouldn't this work just as well?
> 
> (your code):
> 			if ((pos = TELL64(fileno(fp))) == -1L)
> 				return -1;
> (my suggestion):
> 			if (fgetpos(fp, &pos) != 0)
> 				return -1;

I agree, that looks to me like it would. I guess I just missed that when I
wrote it.

> 
> I would even go as far as to collapse the entire switch as follows:
> 
> 	fpos_t pos;
> 	switch (whence) {
> 	case SEEK_END:
> 		/* do a "no-op" seek first to sync the buffering so that
> 		   the low-level tell() can be used correctly */
> 		if (fseek(fp, 0, SEEK_END) != 0)
> 			return -1;
> 		/* fall through */
> 	case SEEK_CUR:
> 		if (fgetpos(fp, &pos) != 0)
> 			return -1;
> 		offset += pos;
> 		break;
> 	/* case SEEK_SET: break; */
> 	}
> 	return fsetpos(fp, &offset);

Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
longer applicable. I am not setup to test this on Win64 right and I don't
suppose there are a lot of you out there with your own Win64 setups. I will
be able to test this before the scheduled 2.1 beta (late Feb), though.
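
The logic of the collapsed switch can be sketched in Python for
illustration.  This is not the C code; absolute_offset() is a
hypothetical helper that uses the file object's own seek() and tell()
in place of fseek() and fgetpos():

```python
import io
import os

def absolute_offset(f, offset, whence):
    # Turn (offset, whence) into an absolute position.
    if whence == os.SEEK_END:
        # Move to the end first, then treat it like SEEK_CUR.
        f.seek(0, os.SEEK_END)
        return offset + f.tell()
    if whence == os.SEEK_CUR:
        return offset + f.tell()
    return offset  # SEEK_SET: the offset is already absolute

f = io.BytesIO(b"0123456789")
f.seek(4)
print(absolute_offset(f, 2, os.SEEK_CUR))
print(absolute_offset(f, -3, os.SEEK_END))
```

The final step in the C version, fsetpos(), corresponds to a single
f.seek(result) here.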

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From trentm at ActiveState.com  Tue Jan 16 20:34:17 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 11:34:17 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116103626.D30209@ActiveState.com>; from trentm@ActiveState.com on Tue, Jan 16, 2001 at 10:36:29AM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com> <20010116103626.D30209@ActiveState.com>
Message-ID: <20010116113417.I30209@ActiveState.com>

On Tue, Jan 16, 2001 at 10:36:29AM -0800, Trent Mick wrote:
> Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
> longer applicable. I am not setup to test this on Win64 right and I don't

s/right/right now/


Trent

-- 
Trent Mick
TrentM at ActiveState.com



From cgw at fnal.gov  Tue Jan 16 21:19:09 2001
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 16 Jan 2001 14:19:09 -0600 (CST)
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
Message-ID: <14948.44221.876681.838046@buffalo.fnal.gov>

Fredrik - I noticed that you chose to check in a slightly different
patch than the one I submitted.

I wonder why you chose to do this?  In particular at line 1238 I had:

    if (PyErr_Occurred()) {
        Py_DECREF(self);
        return NULL;
    }

and you changed this to 

    if (PyErr_Occurred()) {
        PyObject_DEL(self);
        return NULL;
    }

Can you explain why you made this (seemingly arbitrary) change? 

I think that since "self" was created via:

 self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);

which calls PyObject_INIT, which in turn calls _Py_NewReference, which
increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
to de-allocate it -- won't this screw up the value of _Py_RefTotal?

Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
defined - I just wanted to check to make sure my understanding of
reference counting w.r.t. memory allocation and deallocation is
correct - if the above is in error, I'd appreciate any corrections...




From guido at python.org  Tue Jan 16 21:53:41 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 15:53:41 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Tue, 16 Jan 2001 10:36:29 PST."
             <20010116103626.D30209@ActiveState.com> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>  
            <20010116103626.D30209@ActiveState.com> 
Message-ID: <200101162053.PAA13099@cj20424-a.reston1.va.home.com>

> I agree, that looks to me like it would. I guess I just missed that when I
> wrote it.

Excellent!  I've checked this in now -- we'll hear if it breaks
anywhere soon enough.

>I am not setup to test this on Win64 right [now] and I don't
> suppose there are a lot of you out there with your own Win64 setups.

What happened to ActiveState's Itanium boxes?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan 16 22:53:22 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 16 Jan 2001 16:53:22 -0500
Subject: [Python-Dev] Re: Detecting install time
In-Reply-To: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:47:32PM -0500
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> <200101160347.WAA01132@cj20424-a.reston1.va.home.com>
Message-ID: <20010116165322.B29674@kronos.cnri.reston.va.us>

[CC'ing to the distutils-sig]

On Mon, Jan 15, 2001 at 10:47:32PM -0500, Guido van Rossum wrote:
>> For PEP 229, the setup.py script needs to figure out if it's running
>> from the build directory, because then distutils.sysconfig needs to
>
>You could check for the presence of config.status -- that file is not
>installed.

This isn't a check suitable for inclusion in distutils.sysconfig,
though, because it's so liable to being fooled (consider a
Distutils-packaged module that comes with a configure script to build
some library).  Right now I'm using a hacked version of sysconfig with
several patches like this:

@@ -120,12 +121,16 @@
 def get_config_h_filename():
     """Return full pathname of installed config.h file."""
     inc_dir = get_python_inc(plat_specific=1)
+    # XXX
+    if 1: inc_dir = '.'
     return os.path.join(inc_dir, "config.h")
 
One hackish approach would be to add an assume_build_directories() to
distutils.sysconfig, a little back door to be used by the setup.py
script that comes with Python, so the above would become 'if
build_time_flag: ...'.  Anyone have a cleaner idea?
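
A self-contained sketch of that back door, not the real
distutils.sysconfig code.  The function name assume_build_directories()
comes from the suggestion above; the module-level flag and the
installed_inc_dir parameter are hypothetical stand-ins:

```python
import os

# Hypothetical module-level flag; False means "use installed locations".
_build_time = False

def assume_build_directories():
    # Back door for Python's own setup.py: look in the build tree.
    global _build_time
    _build_time = True

def get_config_h_filename(installed_inc_dir):
    # Return the path of config.h, preferring the build directory
    # when the flag has been set.
    inc_dir = "." if _build_time else installed_inc_dir
    return os.path.join(inc_dir, "config.h")

before = get_config_h_filename("/usr/include/python2.1")
assume_build_directories()
after = get_config_h_filename("/usr/include/python2.1")
print(before, after)
```

Ordinary Distutils-packaged modules would never call the back door, so
they could not be fooled by a stray config.status.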

--amk




From akuchlin at mems-exchange.org  Wed Jan 17 02:46:47 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Tue, 16 Jan 2001 20:46:47 -0500
Subject: [Python-Dev] PEP 229 issues
Message-ID: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com>

I'm in a quandary about the patch implementing PEP 229.  The patch is
quite close to being ready, with only a few minor issues remaining,
but to fix those issues, I need to make some changes to the Distutils,
such as the sysconfig modification I recently suggested. 

Problem: I believe the patch *must* go in at the alpha stage, because
there are bound to be lots of platform-specific problems that will
show up; it should not be added in the beta stage, because it'll need
time to get tested and debugged, and I wouldn't be surprised if it has
to be reverted later because of some insurmountable problem.

Problem: Greg Ward, the Distutils maintainer, is away at the moment.
I can check in changes to the Distutils without his say-so, but when
Greg gets back he might shriek in horror and rip all of the changes
out again.  (Or he's stuck with maintaining them until 2.2.)

Problem: 2.1alpha1 is due on Friday.

So, what to do?  If I know there's going to be an alpha2, that's
probably fine; Greg should have resurfaced by then, and the patch can
go in for alpha2.  

Or, I can check in the changes before Friday, and if they're
unacceptable, they can be fixed for alpha2/beta1, or simply backed
out.  

Or, I can leave Distutils alone and make setup.py a tissue of hacks
and workarounds.  For example, it might insert new versions of various
functions into the distutils.sysconfig module.  Icky and fragile, but
cleaning it up for beta1 would then be a priority.

Suggestions?  Pronouncements?

--amk



From guido at python.org  Wed Jan 17 02:39:35 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 20:39:35 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: Your message of "Tue, 16 Jan 2001 20:46:47 EST."
             <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> 
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> 
Message-ID: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>

I expect that there will be an alpha2, but I still recommend that you
check in *something* that works for alpha1, to get maximal testing
coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
with our big patches, respectively nested scopes and rich comparisons,
that we really want to have in alpha1).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 17 03:04:53 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 16 Jan 2001 21:04:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>

[Guido]
> Good idea [using string.encode()]!  This could also be used to
> "hexify" a string, for which currently one of the quickest ways
> is still the hack
>
>     "%02x"*len(s) % tuple(s)

Note that as of 2.0, a far quicker way is to use binascii.b2a_hex(), or its
absurdist (read "Barry" <wink>) synonym binascii.hexlify().

I'm wary of using string.encode() for this, because one normally hexlifies
binary data (e.g., like sha checksums), and 4 days of 7 we're more than not
in favor of moving away from strings to carry binary data.

Of course we can change our minds about this across releases, and have
even-numbered releases deprecate the function forms while odd-numbered ones
abjure methods.  Works for me <wink>.
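
A small comparison of the two approaches, assuming Python 3, where
binary data is a bytes object and iterating over it yields integers
(which is what lets the format-string hack work unchanged):

```python
import binascii

data = b"\x00\x7f\xff"

# The binascii route; hexlify() and b2a_hex() are synonyms.
via_binascii = binascii.hexlify(data).decode("ascii")

# The format-string hack: one "%02x" per byte.
via_format = "%02x" * len(data) % tuple(data)

print(via_binascii, via_format)
```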




From nas at arctrix.com  Tue Jan 16 22:08:23 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 16 Jan 2001 13:08:23 -0800
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
Message-ID: <20010116130823.C9640@glacier.fnational.com>

This message was on the debian-python list.  Does anyone know why
the patch is needed?

  Neil

----- Forwarded message from Danie Roux <droux at tuks.co.za> -----

Date: Tue, 16 Jan 2001 11:44:48 +0200
From: Danie Roux <droux at tuks.co.za>
Subject: Our application doesn't work with Debian packaged Python
To: Debian Python <debian-python at lists.debian.org>

Good day all,

Our program is an archiver for gnome that uses gnome-python with one
widget written in C.

I converted our program to autoconf and automake so anyone can (and please
do!) compile it and see what I mean.

Everything compiles fine. But when it runs it just throws a weird
exception.

The funny thing is, if I alien RedHat 6.2's python package, and install
that, it works! I need to change nothing else. Only the python package.

I then went and looked at the source rpm. They have this patch in there:

--- Python-1.5.2/Python/importdl.c.global	Sat Jul 17 16:52:26 1999
+++ Python-1.5.2/Python/importdl.c	Sat Jul 17 16:53:19 1999
@@ -441,13 +441,13 @@
 #ifdef RTLD_NOW
 		/* RTLD_NOW: resolve externals now
 		   (i.e. core dump now if some are missing) */
-		void *handle = dlopen(pathname, RTLD_NOW);
+		void *handle = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL);
 #else
 		void *handle;
 		if (Py_VerboseFlag)
 			printf("dlopen(\"%s\", %d);\n", pathname,
-			       RTLD_LAZY);
-		handle = dlopen(pathname, RTLD_LAZY);
+			       RTLD_LAZY | RTLD_GLOBAL);
+		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);
 #endif /* RTLD_NOW */
 		if (handle == NULL) {
 			PyErr_SetString(PyExc_ImportError, dlerror());

Sure enough this fixes my problem. The thing is that this means our
program only works on Redhat (and whoever patched python 1.5.2 with this).

So what can I do now? How can I get this patch into debian-python? How can
I change my program to not need the patch?

btw the program is garchiver, it will be hosted at sourceforge as soon as
they get back to me, in the mean time I will mail anyone a copy of the
sources.

-- 
Danie Roux *shuffle* Adore Unix


-- 
To UNSUBSCRIBE, email to debian-python-request at lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster at lists.debian.org


----- End forwarded message -----



From guido at python.org  Wed Jan 17 05:16:48 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:16:48 -0500
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
In-Reply-To: Your message of "Tue, 16 Jan 2001 13:08:23 PST."
             <20010116130823.C9640@glacier.fnational.com> 
References: <20010116130823.C9640@glacier.fnational.com> 
Message-ID: <200101170416.XAA20515@cj20424-a.reston1.va.home.com>

> This message was on the debian-python list.  Does anyone know why
> the patch is needed?

> -		handle = dlopen(pathname, RTLD_LAZY);

> +		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);

This comes back every once in a while.  It means that they have a
module whose shared library implementation exports symbols that are
needed by another shared library (probably another module).

IMO this approach is evil, because RTLD_GLOBAL means that *all*
external symbols defined by any module are exported to all other
shared libraries, and this will cause conflicts if the same symbol is
exported by two different modules -- which can happen quite easily.
(I don't know what happens on conflicts -- maybe you get an error,
maybe it links to the wrong symbol.)

The proper solution would be to put the needed entry points besides the
init<module> entry point in a separate shared library.  But that's
often not how quick-and-dirty extension modules are designed...
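
[Editorial sketch, not part of the original message.  Python releases
later than the 1.5.2 discussed here expose the dlopen() flags at run
time through sys.setdlopenflags(), so RTLD_GLOBAL can be requested
around one specific import instead of being patched in for every
module.  The following assumes a Unix platform and Python 3; the helper
name dlopen_flags is made up for illustration.]

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def dlopen_flags(flags):
    """Temporarily change the flags Python passes to dlopen()."""
    saved = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Restore the previous flags so unrelated modules keep
        # their symbols private.
        sys.setdlopenflags(saved)


# Usage: only imports performed inside the block see RTLD_GLOBAL.
with dlopen_flags(os.RTLD_NOW | os.RTLD_GLOBAL):
    pass  # e.g. import the one extension whose symbols must be shared
```

This keeps the symbol-conflict risk described above confined to the
modules that actually need to share symbols.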

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Wed Jan 17 05:22:54 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:22:54 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
Message-ID: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>

I've got a working version of the rich comparisons ready for preview.

The patch is here:

  http://www.python.org/~guido/richdiff.txt

It's also referenced at sourceforge:

  http://sourceforge.net/patch/?func=detailpatch&patch_id=103283&group_id=5470

Here's a summary:

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).  No other implications are made; in
  particular, Python does not assume that == is the inverse of !=, or
  that < is the inverse of >=.  This makes it possible to define types
  with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.

  XXX TO DO for this feature:

  - the test "test_compare" fails, because of the changed semantics
    for complex number comparisons (1j<2j raises an error now)
  - tuple, dict should implement EQ/NE so containers containing
    complex numbers can be compared for equality (list is already
    done) -- or complex numbers should be reverted to old behavior
  - list.sort() should use rich comparison
  - check for memory leaks
  - int, long, float contain new-style-cmp functions that aren't used
    to their full potential any more (the new-style-cmp functions
    introduced by Neil's coercion work are gone again)
  - decide on unresolved issues from PEP 207
  - documentation
  - more testing
  - compare performance to 2.0 (microbench?)

Please give this a good spin -- I'm hoping to check this in and
make it part of the alpha 1 release Friday...
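
[Editorial sketch, not part of the original message.  The non-Boolean
case from the summary can be illustrated with a toy class.  This
assumes Python 3, where the method called __nonzero__ above is spelled
__bool__; the class name Vector is invented for the example.]

```python
class Vector:
    """Toy sequence type whose '<' compares element by element."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        if not isinstance(other, Vector) or len(other.values) != len(self.values):
            # Let Python try the other operand or report the failure.
            return NotImplemented
        return Vector(x < y for x, y in zip(self.values, other.values))

    def __bool__(self):  # spelled __nonzero__ in the Python of this thread
        raise TypeError("truth value of an elementwise result is ambiguous")


# The result of '<' is another Vector, not a Boolean.
mask = Vector([1, 5, 3]) < Vector([2, 4, 6])
```

Because __bool__ raises, writing "if a < b:" with such objects fails
loudly instead of silently picking one interpretation.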

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Wed Jan 17 05:50:25 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 16 Jan 2001 23:50:25 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>
Message-ID: <14949.9361.591610.684695@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> Note that as of 2.0, a far quicker way is to use
    TP> binascii.b2a_hex(), or its absurdist (read "Barry" <wink>)
    TP> synonym binascii.hexlify().

Thanks for the compliment Tim, but I can't take credit for that name.
If it was me I'd have called it wudduptify() (and its inverse,
notmuchlify()).  I stole the name from Emacs's hexlify-buffer function
which kind of does the same thing.
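
[Editorial sketch, not part of the original message.  A short
illustration of the two names for the same function, assuming Python 3,
where the functions take and return bytes.]

```python
import binascii

data = b"\n"                        # the byte written '\012' in octal
hex_text = binascii.hexlify(data)   # b2a_hex is the same function
round_trip = binascii.unhexlify(hex_text)
```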

would-converting-to-octal-digits-be-called-octopuslify-ly y'rs,
-Barry



From fredrik at effbot.org  Wed Jan 17 09:12:32 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 09:12:32 +0100
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
References: <14948.44221.876681.838046@buffalo.fnal.gov>
Message-ID: <00fe01c0805d$432d4cd0$e46940d5@hagrid>

Charles G Waldman wrote:
> Can you explain why you made this (seemingly arbitrary) change? 
> 
> I think that since "self" was created via:
> 
>  self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);
> 
> which calls PyObject_INIT, which in turn calls _Py_NewReference, which
> increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
> to de-allocate it -- won't this screw up the value of _Py_RefTotal?

and what do you think will happen if you call the destructor before
you've initialized all pointer fields in the object?

(according to the docs, the NEW/New functions return uninitialized
memory.  in this case, we're bailing out before the object has been
fully initialized.  pattern_dealloc definitely isn't prepared to deal with
random pointer values...)

> Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
> defined - I just wanted to check to make sure my understanding of
> reference counting w.r.t. memory allocation and deallocation is
> correct - if the above is in error, I'd appreciate any corrections...

same here.  I don't doubt it's working as you say it does, but I find it
strange that you shouldn't be able to DEL an object you just created
with NEW...  maybe DEL should be fixed?

Cheers /F




From thomas at xs4all.net  Wed Jan 17 10:48:12 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 10:48:12 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules Setup.config.in,1.7,1.8 Setup.dist,1.7,1.8
In-Reply-To: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>; from esr@users.sourceforge.net on Wed, Jan 17, 2001 at 12:25:13AM -0800
References: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117104812.F2776@xs4all.nl>

On Wed, Jan 17, 2001 at 12:25:13AM -0800, Eric S. Raymond wrote:

> + # ndbm(3) may require -lndbm or similar
> + @USE_NDBM_MODULE@ndbm ndbmmodule.c @HAVE_LIBNDBM@

This is an interesting module... It's not in the Modules/ directory :-) Did
you mean 'dbmmodule.c' with a different library argument ? 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Wed Jan 17 16:17:39 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 09:17:39 -0600 (CST)
Subject: [Python-Dev] Rich comparison confusion
Message-ID: <14949.46995.259157.871323@beluga.mojam.com>

I'm a bit confused about Guido's rich comparison stuff.  In the description
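he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.  From
a boolean standpoint this just can't be so.  Guido mentions partial
orderings, but I'm still confused.  Consider this example: Objects of type A
implement rich comparisons.  Objects of type B don't.  If my code looks like

    a = A()
    b = B()
    ...
    if b < a:
        ...

My interpretation of the rich comparison stuff is that either

    1. Since b doesn't implement rich comparisons, the interpreter falls
       back to old fashioned comparisons which may or may not allow the
       comparison of B objects and A objects.

    or

    2. The sense of the inequality is switched (a > b) and the rich
       comparison code in A's implementation is called.

That's my reading of it.  It has to be wrong.  The inverse comparison should
be a >= b, not a > b, but the described pairing of comparison functions
would imply otherwise.

I'm sure I'm missing something obvious or revealing some fundamental failure
of my grade school education.  Please explain...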

From akuchlin at mems-exchange.org  Wed Jan 17 16:42:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 10:42:13 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:39:35PM -0500
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> <200101170139.UAA17954@cj20424-a.reston1.va.home.com>
Message-ID: <20010117104213.B490@kronos.cnri.reston.va.us>

On Tue, Jan 16, 2001 at 08:39:35PM -0500, Guido van Rossum wrote:
>I expect that there will be an alpha2, but I still recommend that you
>check in *something* that works for alpha1, to get maximal testing
>coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
>with our big patches, respectively nested scopes and rich comparisons,
>that we really want to have in alpha1).

OK; thanks for the pronouncement!

I've checked in all the smaller changes that shouldn't break anything.
All that's left now is to actually enable the new feature, which
requires the nasty changes:

     * In the top-level Makefile.in, the "sharedmods" target simply
       runs "./python setup.py build", and "sharedinstall" runs
       "./python setup.py install".  The "clobber" target also deletes
       the build/ subdirectory where Distutils puts its output.

     * Rip stuff out of the Setup files.  Modules/Setup.config.in only
       contains entries for the gc and thread modules; the readline,
       curses, and db modules are removed because it's now setup.py's
       job to handle them.
 
     * Modules/Setup.dist now contains entries for only 3 modules --
       _sre, posix, and strop.

Guido and Jeremy are rushing to finish their patches in time for the
alpha release, though Guido seems to be checking in the rich
comparison stuff now.  I don't want to impede them by making them stop
to debug build problems, so I can either wait until they've landed
their changes (at which point there's nothing major left, I think), or
they can simply not do a 'cvs update' after the serious changes go in.
Thoughts?

--amk



From barry at digicool.com  Wed Jan 17 16:54:06 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 10:54:06 -0500
Subject: [Python-Dev] Breakage in latest CVS
Message-ID: <14949.49182.636526.292265@anthem.wooz.org>

Looks like the latest CVS (updated just minutes ago) is broken.  I'm
trying to fix some of these complaints, but thought I'd at least
report what I've found...

-Barry

...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c floatobject.c -o floatobject.o
floatobject.c:675: warning: excess elements in struct initializer after `float_as_number'
floatobject.c:700: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
floatobject.c:700: initializer element for `PyFloat_Type.tp_flags' is not constant
...
intobject.c:800: warning: excess elements in struct initializer after `int_as_number'
intobject.c:825: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
intobject.c:825: initializer element for `PyInt_Type.tp_flags' is not constant
make[1]: *** [intobject.o] Error 1
...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c longobject.c -o longobject.o
longobject.c:1865: warning: excess elements in struct initializer after `long_as_number'
longobject.c:1890: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
longobject.c:1890: initializer element for `PyLong_Type.tp_flags' is not constant
make[1]: *** [longobject.o] Error 1



From guido at python.org  Wed Jan 17 17:09:27 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 11:09:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Wed, 17 Jan 2001 09:17:39 CST."
             <14949.46995.259157.871323@beluga.mojam.com> 
References: <14949.46995.259157.871323@beluga.mojam.com> 
Message-ID: <200101171609.LAA04102@cj20424-a.reston1.va.home.com>

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

> From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.

It's case 2.

> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.

We're trying very hard *not* to make any connections between a<b and
a>=b.  You've learned in grade school that these are each other's
Boolean inverse (a<b is true iff a>=b is false).  However, for partial
orderings this may not be true: for unordered a and b, none of a<b,
a<=b, a>b, a>=b, a==b may be true.

On the other hand, even for partially ordered types, a<b and b>a
(note: swapped arguments *and* swapped sense of comparison) always
give the same outcome!
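
[Editorial sketch, not part of the original message.  Sets ordered by
inclusion are a familiar partial ordering.  The example assumes Python
3 and its built-in frozenset type, which did not exist when this was
written.]

```python
a = frozenset({1})
b = frozenset({2})

# Neither set contains the other, so every comparison is False,
# including both a < b and a >= b.
unordered = [a < b, a <= b, a > b, a >= b, a == b]

# Swapping the arguments *and* the sense of the comparison never
# changes the answer, whether the operands are ordered or not.
small = frozenset({1})
big = frozenset({1, 2})
swapped_agree = ((small < big) == (big > small)) and ((a < b) == (b > a))
```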

> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

I think what threw you off was the ambiguity of "inverse".  This means
Boolean negation.  I'm not relying on Boolean negation here -- I'm
relying on the more fundamental property that a<b and b>a have the
same outcome.
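
[Editorial sketch, not part of the original message.  "Case 2" from the
question can be observed directly.  This assumes Python 3 semantics;
the classes A and B mirror the names used in the question, and the
calls list exists only to record which method ran.]

```python
calls = []


class A:
    """Implements one rich comparison method."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        calls.append("A.__gt__")
        return self.value > other.value


class B:
    """Defines no comparison methods at all."""

    def __init__(self, value):
        self.value = value


a = A(2)
b = B(1)
# B cannot answer b < a, so Python swaps the operands and asks
# A whether a > b.
result = b < a
```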

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh21 at cam.ac.uk  Wed Jan 17 17:13:32 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 17 Jan 2001 16:13:32 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Skip Montanaro's message of "Wed, 17 Jan 2001 09:17:39 -0600 (CST)"
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>

Skip Montanaro <skip at mojam.com> writes:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.
> 
> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.
> 
> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

For a total order:

a < b if and only if b > a.
This is what the rich comparison code does.

a < b if and only if not (a >= b).
This is what the rich comparison code doesn't do.

Does this make sense?

Cheers,
M.

-- 
  Presumably pronging in the wrong place zogs it.
                                        -- Aldabra Stoddart, ucam.chat




From moshez at zadka.site.co.il  Thu Jan 18 01:08:06 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 18 Jan 2001 02:08:06 +0200 (IST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <14949.46995.259157.871323@beluga.mojam.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>

On Wed, 17 Jan 2001 09:17:39 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

I think that you're confused between two meanings of inverses.

You think:
op is an inverse of op' if for every a,b  (a op b) = not (a op' b)

Guido meant (and I hope, implemented):
op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

And a<b iff b>a 
a<=b iff b>=a

Sounds sane.

Unless I'm the one confused....
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From fredrik at effbot.org  Wed Jan 17 17:47:29 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 17:47:29 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>
Message-ID: <012901c080a5$306023a0$e46940d5@hagrid>

tim wrote:
> > Should I check it in?
> 
> Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
> days to deal with surprises before the alpha release.

as it turned out, the source I had didn't build, and the table-
building python script generated something that wasn't quite
compatible with the C code.  bit rot.

I've almost sorted it all out.  will check it in later tonight (local
time).

</F>




From tim.one at home.com  Wed Jan 17 19:27:11 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 13:27:11 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Tools/idle CallTipWindow.py,1.2,1.3 CallTips.py,1.7,1.8 ClassBrowser.py,1.11,1.12 Debugger.py,1.14,1.15 Delegator.py,1.2,1.3 FileList.py,1.7,1.8 FormatParagraph.py,1.8,1.9 IdleConf.py,1.5,1.6 IdleHistory.py,1.3,1
In-Reply-To: <200101171358.IAA27661@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBIJAA.tim.one@home.com>

[an anonymous developer panics, after Tim "reindent"s the IDLE dir]

> Oh no!
>
> I have a whole slew of changes to IDLE sitting in my work directory.
> If I do an update half of these will turn into merge conflicts. :-(
>
> Don't worry, I'll get over it.

I imagine this will pop up from time to time until everything is normalized.
If it's about to burn you, run reindent.py on the affected directory
*before* you update ("python reindent.py -v .").  That will make all the
same changes to your local versions as were checked in, modulo the rare
hand-edit (of which there were none in the IDLE directory).




From akuchlin at mems-exchange.org  Wed Jan 17 20:04:04 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 14:04:04 -0500
Subject: [Python-Dev] PEP 229 checked in
Message-ID: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>

I've checked in the last bit of the PEP 229 changes.  Be sure to
rename your Modules/Setup file (or do a 'make distclean') before
rebuilding.  Squeal if you run into trouble, or file bugs on SF.

--am"Aieee!"k



From jeremy at alum.mit.edu  Wed Jan 17 20:12:47 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 14:12:47 -0500 (EST)
Subject: [Python-Dev] unexpected consequence of function attributes
Message-ID: <14949.61103.258714.325465@localhost.localdomain>

I have found one place in the library that depended on 
hasattr(func, '__dict__') to return false -- dis.dis.  You might want
to check and see if there is any other code that doesn't expect
functions to have extra attributes.  I expect that only introspective
code would be affected.
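
[Editorial sketch, not part of the original message.  One way for
introspective code to avoid depending on hasattr(func, '__dict__') is
to unwrap methods and functions down to a code object first.  This
assumes Python 3, where im_func and func_code are spelled __func__ and
__code__; the helper name code_object_for is hypothetical and is not
the actual dis implementation.]

```python
def code_object_for(x):
    """Return the code object behind a method, function, or code object."""
    if hasattr(x, "__func__"):   # bound method -> underlying function
        x = x.__func__
    if hasattr(x, "__code__"):   # function -> code object
        x = x.__code__
    if hasattr(x, "co_code"):
        return x
    raise TypeError(
        "don't know how to disassemble %s objects" % type(x).__name__)


def sample():
    return 42


# Functions accept arbitrary attributes, so they do have a __dict__.
sample.note = "extra attribute"


class Holder:
    def method(self):
        return None
```

Checking for the code object before looking at __dict__ means the extra
function attributes no longer change which branch is taken.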

Jeremy



From barry at wooz.org  Wed Jan 17 20:46:36 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:46:36 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63132.583025.303677@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

I guess we need a test_dis.py in the regression test suite, eh? :)

Here's an extremely quick and dirty fix to dis.py.
-Barry

-------------------- snip snip --------------------
Index: dis.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/dis.py,v
retrieving revision 1.28
diff -u -r1.28 dis.py
--- dis.py	2001/01/14 23:36:05	1.28
+++ dis.py	2001/01/17 19:45:40
@@ -15,6 +15,10 @@
         return
     if type(x) is types.InstanceType:
         x = x.__class__
+    if hasattr(x, 'func_code'):
+        x = x.func_code
+    if hasattr(x, 'im_func'):
+        x = x.im_func
     if hasattr(x, '__dict__'):
         items = x.__dict__.items()
         items.sort()
@@ -28,17 +32,12 @@
                 except TypeError, msg:
                     print "Sorry:", msg
                 print
+    elif hasattr(x, 'co_code'):
+        disassemble(x)
     else:
-        if hasattr(x, 'im_func'):
-            x = x.im_func
-        if hasattr(x, 'func_code'):
-            x = x.func_code
-        if hasattr(x, 'co_code'):
-            disassemble(x)
-        else:
-            raise TypeError, \
-                  "don't know how to disassemble %s objects" % \
-                  type(x).__name__
+        raise TypeError, \
+              "don't know how to disassemble %s objects" % \
+              type(x).__name__
 
 def distb(tb=None):
     """Disassemble a traceback (default: last traceback)."""



From barry at wooz.org  Wed Jan 17 20:49:51 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:49:51 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63327.22745.359978@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

Patch #103303

http://sourceforge.net/patch/?func=detailpatch&patch_id=103303&group_id=5470



From tim.one at home.com  Wed Jan 17 21:51:57 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 15:51:57 -0500
Subject: [Python-Dev] Windows Python totally hosed
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>

Failures range from

test test_winsound skipped --  Module use of python20.dll
    conflicts with this version of Python.

to

test test_tokenize crashed -- exceptions.AttributeError: 're' module
    has no attribute 'compile'

I suspect the latter is really a disguised version of

C:\Code\python\dist\src\PCbuild>python
Python 2.1a1 (#8, Jan 17 2001, 13:15:23) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import re
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\code\python\dist\src\lib\re.py", line 28, in ?
    from sre import *
  File "c:\code\python\dist\src\lib\sre.py", line 17, in ?
    import sre_compile
  File "c:\code\python\dist\src\lib\sre_compile.py", line 11, in ?
    import _sre
ImportError: Module use of python20.dll conflicts with this version of
Python.
>>>

Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
it out, but if anyone knows the cure off the top of their head, don't be
shy!




From akuchlin at mems-exchange.org  Wed Jan 17 22:00:56 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 16:00:56 -0500
Subject: [Python-Dev] Re: 'Setup' buglet
In-Reply-To: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 17, 2001 at 02:28:36PM -0500
References: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>
Message-ID: <20010117160056.A20603@kronos.cnri.reston.va.us>

[Taking this bug public]

On Wed, Jan 17, 2001 at 02:28:36PM -0500, Guido van Rossum wrote:
>One problem seems to be that the creation
>of the (minimal) Modules/Setup file doesn't seem to be doing the right
>thing.  When I delete Modules/Setup, the next "make" doesn't create
>it; it used to be copied from Setup.dist if it doesn't exist.

This seems to have been removed from Modules/Makefile.pre.in in
revision 1.69 by Fred; instead the configure script now copies
Setup.dist to Setup, so you have to rerun configure in order to create
Modules/Setup after deleting it.  

--amk



From mal at lemburg.com  Wed Jan 17 22:04:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 22:04:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
Message-ID: <3A6608DD.E12A2422@lemburg.com>

I've just checked in a patch which removes all uses of the
assert statement in the regression tests. This makes the
tests compatible with the -O mode of Python and also allows
centralizing error reporting (many tests already provide their
own little test function for this purpose).

I urge you to only check in tests which use the new API
verify() to verify a certain condition. The API is defined
in the regression tools module test_support.
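
[Editorial sketch, not part of the original message.  The reason for
avoiding assert is that "python -O" compiles assert statements away,
whereas an ordinary function call always runs.  The helper below is an
assumption about what such a function might look like, written for
Python 3; the real test_support code may differ in names and details.]

```python
class TestFailed(Exception):
    """Raised when a regression-test condition does not hold."""


def verify(condition, reason="test failed"):
    """Check a condition in a way that survives 'python -O'."""
    if not condition:
        raise TestFailed(reason)


verify(1 + 1 == 2, "arithmetic is broken")  # passes silently
```

Routing every check through one function is also what makes the
centralized error reporting mentioned above possible.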

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Wed Jan 17 22:21:56 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:21:56 +0100
Subject: [Python-Dev] Windows Python totally hosed
References: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>
Message-ID: <028801c080cb$86658350$e46940d5@hagrid>

tim wrote:
> Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
> it out, but if anyone knows the cure off the top of their head, don't be
> shy!

text.replace("python20", "python21") for all files in
the PCBuild directory, plus PC/config.h
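
[Editorial sketch, not part of the original message.  The one-liner
above is shorthand; spelled out as a script it might look like the
following.  This assumes Python 3, works on bytes so that line endings
are left untouched, and uses an invented helper name and an invented
sample file rather than the real project files.]

```python
import pathlib
import tempfile


def bump_dll_name(directory, old=b"python20", new=b"python21"):
    """Rewrite files in `directory` that mention `old`; return their names."""
    changed = []
    for path in sorted(pathlib.Path(directory).iterdir()):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if old in data:
            path.write_bytes(data.replace(old, new))
            changed.append(path.name)
    return changed


# Demonstration on a throwaway directory with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    sample = pathlib.Path(tmp, "example.dsp")
    sample.write_bytes(b"/out:python20.dll\r\n")
    changed = bump_dll_name(tmp)
    result = sample.read_bytes()
```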

</F>




From tim.one at home.com  Wed Jan 17 22:42:13 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 16:42:13 -0500
Subject: [Python-Dev] Windows Python totally hosed
In-Reply-To: <028801c080cb$86658350$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEJIJAA.tim.one@home.com>

[/F]
> text.replace("python20", "python21") for all files in
> the PCBuild directory, plus PC/config.h

Brrrr.  It strikes me as insane to have the core Python files in an MS
project file *named* after the release number (python20.dsp).  So I'm going
to change that to core.dsp so that at least that much never needs to be
changed again.

gratefully y'rs  - tim




From fredrik at effbot.org  Wed Jan 17 22:47:28 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:47:28 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com>
Message-ID: <02b401c080cf$1a3a5530$e46940d5@hagrid>

mal wrote:
> I urge you to only check in tests which use the new API
> verify() to verify a certain condition. The API is defined
> in the regression tools module test_support.

did you run the test yourself after applying that patch?

(a patch to the patch is on the way in.  please check
that the test suite still runs on non-Windows boxes...)

</F>




From gstein at lyra.org  Wed Jan 17 22:45:44 2001
From: gstein at lyra.org (Greg Stein)
Date: Wed, 17 Jan 2001 13:45:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Wed, Jan 17, 2001 at 01:27:04PM -0800
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117134544.H7731@lyra.org>

On Wed, Jan 17, 2001 at 01:27:04PM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv14991
> 
> Modified Files:
> 	object.c 
> Log Message:
> Deal properly (?) with comparing recursive datastructures.
>...
> - Change the in-progress code to use static variables instead of
>   globals (both the nesting level and the key for the thread dict were
>   globals but have no reason to be globals; the key can even be a
>   function-static variable in get_inprogress_dict()).

The "compare_nesting" variable is a bit troublesome long-term -- it will
cause threading issues in a free-threaded implementation. The solution is to
put the value into the thread-state.
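
[Editorial sketch, not part of the original message.  The idea of
keeping the nesting counter per thread can be modelled in Python with
threading.local.  This is an analogy for the C-level thread state, not
the interpreter's code, and all function names here are hypothetical.]

```python
import threading

# Stand-in for per-thread interpreter state.
_state = threading.local()


def enter_compare():
    """Increment and return this thread's comparison nesting level."""
    level = getattr(_state, "compare_nesting", 0) + 1
    _state.compare_nesting = level
    return level


def leave_compare():
    """Decrement this thread's comparison nesting level."""
    _state.compare_nesting -= 1


def current_nesting():
    """Return the nesting level as seen by the calling thread."""
    return getattr(_state, "compare_nesting", 0)
```

With a single static counter, two threads comparing at the same time
would update the same value; with per-thread storage each thread sees
only its own depth.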

[ not sure if it matters right now, but just bringing it up ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From fdrake at acm.org  Wed Jan 17 22:55:02 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 17 Jan 2001 16:55:02 -0500 (EST)
Subject: [Python-Dev] [PEP 205] weak references patch
Message-ID: <14950.5302.356566.778486@cj42289-a.reston1.va.home.com>

  I've updated the patch that implements PEP 205:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  The actual patch is too big for SF:

http://starship.python.net/crew/fdrake/patches/weakref.patch-5

  One thing about this is that it changes some of the low-level object
creation macros, so you'll need to do a "make clean" before "make"
when testing it.
  Have fun!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Wed Jan 17 23:16:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:16:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com> <02b401c080cf$1a3a5530$e46940d5@hagrid>
Message-ID: <3A6619BD.2AC8F6D3@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > I urge you to only check in tests which use the new API
> > verify() to verify a certain condition. The API is defined
> > in the regression tools module test_support.
> 
> did you run the test yourself after applying that patch?

Yes, but as I wrote in the SF patch message: I can only
test it on Linux and there not all tests are run due
to missing extensions. The alpha testing will hopefully catch all
possible bugs this patch introduced.
 
> (a patch to the patch is on the way in.  please check
> that the test suite still runs on non-Windows boxes...)

I'll have to leave that to the Windows wizards, sorry.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Wed Jan 17 23:49:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 23:49:25 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 02:04:04PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
Message-ID: <20010117234925.A17392@xs4all.nl>

On Wed, Jan 17, 2001 at 02:04:04PM -0500, Andrew Kuchling wrote:
> I've checked in the last bit of the PEP 229 changes.  Be sure to
> rename your Modules/Setup file (or do a 'make distclean' before
> rebuilding.

make distclean doesn't remove Modules/Setup anymore :) Also, I couldn't get
it to work with an old tree, even after several make distclean/reconfigures.
I got tired looking for it, so I just grabbed a new tree.

> Squeal if you run into trouble, or file bugs on SF.

I have a couple of questions: what to do when setup.py doesn't work ? Is
there a way to make it bypass a module ? What about specifying include dirs
manually, for some modules (for instance, when you have readline source in a
separate directory, and want to link it statically.)

Here are some specific squeals. See the bottom for the most important
one :)

On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
setup.py. Also, SSL support for the socket module was not enabled, though
OpenSSL is installed, in the default path.

On Debian GNU/Linux' 'woody', the 'testing' (soon 'stable') branch, I can't
compile dbmmodule:

building 'dbm' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.1/dbmmodule.o
/home/thomas/python/python/dist/src/Modules/dbmmodule.c:24: #error "No ndbm.h available!"
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

(ndbm.h does exist, as /usr/include/db1/ndbm.h. There is also
/usr/include/gdbm-ndbm.h, but I'm not sure if that's the same.)

Nor can I build the _tkinter module there:

building '_tkinter' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -DWITH_APPINIT=1 -I/usr/X11R6/include -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/_tkinter.c -o build/temp.linux-i686-2.1/_tkinter.o
/home/thomas/python/python/dist/src/Modules/_tkinter.c:44: tcl.h: No such file or directory
In file included from /home/thomas/python/python/dist/src/Modules/_tkinter.c:45:/usr/include/tk.h:66: tcl.h: No such file or directory
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
which I personally like a lot, though it's probably a bitch to autodetect.
(I tried, using autoconf ;-P)

On Debian GNU/Linux 'sid', the current unstable branch, I can't compile
Python at all, now:

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
mv python ../python
make[1]: Leaving directory `/home/thomas/python/python-write/dist/src/Modules'
./python ./setup.py build
running build
running build_ext
Traceback (most recent call last):
  File "./setup.py", line 460, in ?
    main()
  File "./setup.py", line 455, in main
    ext_modules=[Extension('struct', ['structmodule.c'])]
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/core.py", line 138, in setup
    dist.run_commands()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 871, in run_commands
    self.run_command(cmd)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build.py", line 106, in run
    self.run_command(cmd_name)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/cmd.py", line 328, in run_command
    self.distribution.run_command(command)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build_ext.py", line 202, in run
    customize_compiler(self.compiler)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 121, in customize_compiler
    (cc, opt, ccshared, ldshared, so_ext) = \
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 389, in get_config_vars
    func()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 302, in _init_posix
    raise DistutilsPlatformError, my_msg
distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)
make: *** [sharedmods] Error 1

For the record, I don't have a /usr/lib/python2.1 directory on the other
machines either.

I haven't been able to test FreeBSD yet, will get to that later tonight.

And most importantly(!), on all these machines, 'make test' stops
functioning. In fact, after setup.py started building, you can't run 'make'
without 'make clean' anymore. You get a lot of undefined-symbol warnings
(see below.) If you run 'make clean;make test' it also doesn't work, because
the build directory is not in the Python library path, and regrtest.py
requires (at least) the time module.

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python 
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
../libpython2.1.a(myreadline.o): In function `my_fgets':
/home/thomas/python/python/dist/src/Parser/myreadline.c:41: undefined reference to `PyOS_InterruptOccurred'
/home/thomas/python/python/dist/src/Parser/myreadline.c:35: undefined reference to `PyOS_InterruptOccurred'
../libpython2.1.a(errors.o): In function `PyErr_SetFromErrnoWithFilename':
/home/thomas/python/python/dist/src/Python/errors.c:260: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(pythonrun.o): In function `Py_Finalize':
/home/thomas/python/python/dist/src/Python/pythonrun.c:193: undefined reference to `PyOS_FiniInterrupts'
../libpython2.1.a(pythonrun.o): In function `initsigs':
/home/thomas/python/python/dist/src/Python/pythonrun.c:1161: undefined reference to `PyOS_InitInterrupts'
../libpython2.1.a(traceback.o): In function `tb_printinternal':
/home/thomas/python/python/dist/src/Python/traceback.c:213: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(fileobject.o): In function `get_line':
/home/thomas/python/python/dist/src/Objects/fileobject.c:883: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_format':
/home/thomas/python/python/dist/src/Objects/longobject.c:644: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `x_divrem':
/home/thomas/python/python/dist/src/Objects/longobject.c:855: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_mul':
/home/thomas/python/python/dist/src/Objects/longobject.c:1193: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(object.o):/home/thomas/python/python/dist/src/Objects/object.c:174: more undefined references to `PyErr_CheckSignals' follow
../libpython2.1.a(posixmodule.o): In function `posix_fork':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1666: undefined reference to `PyOS_AfterFork'
../libpython2.1.a(posixmodule.o): In function `posix_forkpty':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1733: undefined reference to `PyOS_AfterFork'
collect2: ld returned 1 exit status
make[1]: *** [link] Error 1
make[1]: Leaving directory `/home/thomas/python/python/dist/src/Modules'
make: *** [python] Error 2

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Wed Jan 17 23:56:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:56:58 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <3A66233A.A6AE07BD@lemburg.com>

I'm currently busy building new versions of my mx packages. While
trying to convert all of them to distutils I found that there
seems to be no standard for installing documentation or other
data files of Python extensions. I also noted, that for Windows
the standard extension installation defaults to \Python instead
of some \Python\Site-Packages. So the general question is:

Where should Python extensions install themselves and their docs ?

(On Linux the typical place for docs is /usr/doc/packages,
for Python code it is /usr/local/lib/pythonX.X/site-packages,
BTW)

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Thu Jan 18 00:04:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 18:04:09 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
In-Reply-To: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 11:22:54PM -0500
References: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>
Message-ID: <20010117180409.A17897@thyrsus.com>

Guido van Rossum <guido at python.org>:
>   This makes it possible to define types with partial orderings.

Guido's time machine is working again, and seems now to have been
augmented by telepathy.  I was just thinking about bugging him about
this...

I will definitely check this out with my set() class -- it was waiting on
rich comparisons so I could do partial-orderings properly.  If it works,
we'll have set algebra for the standard library.  Coolness.
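
A sketch of what a partial ordering looks like with rich comparisons,
in present-day Python. SimpleSet is a hypothetical toy class, not the
set() class mentioned above; it orders sets by inclusion, so two sets
can be neither <= nor >= each other:

```python
class SimpleSet:
    """Toy set wrapper ordered by subset inclusion (a partial order)."""

    def __init__(self, items=()):
        self._items = frozenset(items)

    def _other_items(self, other):
        # Only compare against other SimpleSet instances.
        return other._items if isinstance(other, SimpleSet) else None

    def __eq__(self, other):
        items = self._other_items(other)
        return NotImplemented if items is None else self._items == items

    def __le__(self, other):
        items = self._other_items(other)
        return NotImplemented if items is None else self._items <= items

    def __lt__(self, other):
        items = self._other_items(other)
        return NotImplemented if items is None else self._items < items

    def __ge__(self, other):
        items = self._other_items(other)
        return NotImplemented if items is None else self._items >= items

    def __gt__(self, other):
        items = self._other_items(other)
        return NotImplemented if items is None else self._items > items
```

With a single three-way comparison function every pair has to come out
as less, equal or greater; here SimpleSet([1, 2]) and SimpleSet([2, 3])
are simply incomparable.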
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Under democracy one party always devotes its chief energies
to trying to prove that the other party is unfit to rule--and
both commonly succeed, and are right... The United States
has never developed an aristocracy really disinterested or an
intelligentsia really intelligent. Its history is simply a record
of vacillations between two gangs of frauds. 
	--- H. L. Mencken



From akuchlin at mems-exchange.org  Thu Jan 18 00:09:47 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 18:09:47 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010117180947.E9384@kronos.cnri.reston.va.us>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>I have a couple of questions: what to do when setup.py doesn't work ? Is
>there a way to make it bypass a module ? What about specifying include dirs

There's a 'disabled_module_list' global in the code, but no way to set
it from the command-line yet, since I couldn't figure out how to do
that in time.

>On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>setup.py. Also, SSL support for the socket module was not enabled, though
>OpenSSL is installed, in the default path.

Can you take a look at the detection code in setup.py and see what's
going wrong?  I believe it should be found if OpenSSL is in
/usr/local/, but /usr/contrib isn't checked currently.

>The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
>which I personally like a lot, though it's probably a bitch to autodetect.
>(I tried, using autoconf ;-P)

There's code to handle Debian, though I have no way of testing it, and
it worked on Neil's Debian box for some reason.  Search for
debian_tcl_include in setup.py, and see if you can fix it.
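
For illustration only, a sketch of how a versioned header directory
could be searched in present-day Python. This is not the code in
setup.py; find_header and tcl_include_candidates are hypothetical
helpers, and the /usr/include/tcl<ver>/ layout is taken from the
description quoted above:

```python
import glob
import os


def find_header(header, candidate_dirs):
    """Return the first directory containing `header`, or None."""
    for directory in candidate_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


def tcl_include_candidates(root="/usr/include"):
    """List `root` plus any tcl<ver> subdirectories, newest name first."""
    pattern = os.path.join(root, "tcl*")
    versioned = [d for d in glob.glob(pattern) if os.path.isdir(d)]
    return [root] + sorted(versioned, reverse=True)
```

A caller would pass the result of find_header("tcl.h",
tcl_include_candidates()) on as an extra include directory, and skip the
module when it returns None.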

>distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

Are you sure setup.py is up to date?  Do a 'cvs update setup.py' to check.
You might get a "setup.py is in the way; remove it" message if you
downloaded the first setup.py script manually.

>without 'make clean' anymore. You get a lot of undefined-symbol warnings
>(see below.) If you run 'make clean;make test' it also doesn't work, because
>the build directory is not in the Python library path, and regrtest.py
>requires (at least) the time module.

Again, be sure the tree is up to date; I think this stems from
attempting to compile the signal module as shared, which doesn't work.
I know that "make test" doesn't work, but am not sure how to fix it
yet.

--amk



From tim.one at home.com  Thu Jan 18 00:42:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 18:42:24 -0500
Subject: [Python-Dev] Windows Python totally rad
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>

Windows Python runs normally again, modulo four test failures I figure are
due to the "get rid of assert" patch.

Note that the python20 DevStudio subproject is gone.  It's been replaced by
a new subproject named pythoncore.




From thomas at xs4all.net  Thu Jan 18 00:44:00 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 00:44:00 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010118004400.B17392@xs4all.nl>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:

I got around to testing on FreeBSD now, and it actually went pretty smooth!
However, some small points:

> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> setup.py. Also, SSL support for the socket module was not enabled, though
> OpenSSL is installed, in the default path.

Curiously enough, FreeBSD, with OpenSSL installed in /usr/include/openssl,
*did* get the socketmodule compiled with SSL support, but without the
necessary -I directive, so the compile failed. 

> And most importantly(!), on all these machines, 'make test' stops
> functioning. In fact, after setup.py started building, you can't run 'make'
> without 'make clean' anymore. You get a lot of undefined-symbol warnings

Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
or 'make test' after 'make' just fine. 'make test' still doesn't work
because of the incorrect library path, but it doesn't barf like the other
systems (BSDI and Debian Linux)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Thu Jan 18 01:32:53 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 19:32:53 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 18, 2001 at 02:08:06AM +0200
References: <14949.46995.259157.871323@beluga.mojam.com> <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>
Message-ID: <20010117193253.A18565@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> I think that you're confused between two meanings of inverses.
> 
> You think:
> op is an inverse of op' if for every a,b  (a op b) = not (a op' b)
> 
> Guido meant (and I hope, implemented):
> op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

I thought the same.

<pedantic role="defrocked mathematician">

if (a op1 b) <=> (b op2 a), op2 is properly described as the "reflection"
of op1, and vice-versa.

</pedantic>
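
A small sketch of reflection as present-day Python implements it.
OnlyGreater is a hypothetical class that defines > but not <; the
expression a < b is then answered by calling b > a, and the calls list
records which operands were actually compared:

```python
calls = []


class OnlyGreater:
    """Defines > but not <, so < must be answered by reflection."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        if not isinstance(other, OnlyGreater):
            return NotImplemented
        calls.append((self.value, other.value))
        return self.value > other.value


# The left operand has no __lt__ of its own, so Python falls back to
# the right operand's __gt__ with the arguments swapped.
result = OnlyGreater(1) < OnlyGreater(2)
```

Note that nothing here negates a result: reflection swaps the operands,
it does not turn < into >=.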
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"



From greg at cosc.canterbury.ac.nz  Thu Jan 18 01:22:11 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jan 2001 13:22:11 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21 at cam.ac.uk>:

> a < b if and only if b > a.
> This is what the rich comparison code does.

Someone is bound to come up with a use for comparison
operator overloading in which this isn't true, just
to be difficult!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at python.org  Thu Jan 18 04:40:31 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 22:40:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: Your message of "Wed, 17 Jan 2001 13:45:44 PST."
             <20010117134544.H7731@lyra.org> 
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>  
            <20010117134544.H7731@lyra.org> 
Message-ID: <200101180340.WAA00655@cj20424-a.reston1.va.home.com>

> > - Change the in-progress code to use static variables instead of
> >   globals (both the nesting level and the key for the thread dict were
> >   globals but have no reason to be globals; the key can even be a
> >   function-static variable in get_inprogress_dict()).
> 
> The "compare_nesting" variable is a bit troublesome long-term -- it will
> cause threading issues in a free-threaded implementation. The solution is to
> put the value into the thread-state.
> 
> [ not sure if it matters right now, but just bringing it up ]

Good point -- especially since the in-progress-dict is already part of
the thread state.  Jeremy explained to me that the compare_nesting
variable is mostly an optimization (avoiding the work with the
in-progress-dict when we don't know for sure that it's worth it) but
yes, mixing nesting levels (even if the dicts are separate) could
cause coupling or interference between threads...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Thu Jan 18 05:20:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 22:20:30 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
Message-ID: <14950.28430.572215.10643@beluga.mojam.com>

I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
repeated parameters properly.  If I call

    urllib.urlencode({"performers": ("U2","Lawrence Martin")})

instead of getting

    performers=U2&performers=Lawrence+Martin

I get a quoted stringified tuple:

    performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29

Obviously, fixing this will change the function's current semantics, but I
think it's worth treating lists and tuples (actually, any sequence) as
repeated values.  If the existing semantics are deemed valuable enough, a
third default parameter could be added to switch on the new behavior when
desired.

If others agree I'd be happy to whip up a patch.  I think it's a bug.
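
For comparison, in present-day Python the function lives in
urllib.parse and takes an optional doseq flag, much like the "third
default parameter" suggested above. A short sketch of both behaviours:

```python
from urllib.parse import urlencode

params = {"performers": ("U2", "Lawrence Martin")}

# Default: the whole tuple is stringified and quoted as one value.
flat = urlencode(params)

# doseq=True: each element of the sequence becomes its own key=value pair.
repeated = urlencode(params, doseq=True)
```

Keeping the flag off by default preserves the old semantics for callers
that rely on them.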

Skip



From jeremy at alum.mit.edu  Thu Jan 18 03:58:19 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 21:58:19 -0500 (EST)
Subject: [Python-Dev] bug in grammar
Message-ID: <14950.23499.275398.963621@localhost.localdomain>

As part of the implementation of PEP 227 (and in an attempt to reach
some low-hanging fruit Guido mentioned on the types-sig long ago), I
have been working on a compiler pass that generates a module-level
symbol table.  I recently discovered a bug in the handling of list
comprehensions that was giving me headaches.

I realize now that the problem is with the current grammar and/or
compiler.  Here's a simple demonstration; try it in your friendly
python 2.0 interpreter.

>>> [i for i in range(10)] = (1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: unpack list of wrong size

The generated bytecode is:

          0 SET_LINENO               0

          3 SET_LINENO               1
          6 LOAD_CONST               0 (1)
          9 LOAD_CONST               1 (2)
         12 LOAD_CONST               2 (3)
         15 BUILD_TUPLE              3
         18 UNPACK_SEQUENCE          1
         21 STORE_NAME               0 (i)
         24 LOAD_CONST               3 (None)
         27 RETURN_VALUE        

I assume this isn't intended :-).  The compiler is ignoring everything
after the initial atom in the list comprehension.  It's basically
compiling the code as if it were:

[i] = (1, 2, 3)

I'm not sure how to try and fix this.  Should the grammar allow one to
construct the example statement above?  If not, I'm not sure how to
fix the grammar.  If so, I suppose the compiler should detect that
the list comp is misplaced.  This seems fairly messy, since there are
about 10 nodes between the expr_stmt and the list_for.

Or is this a cool way to use list comprehensions to generate
ValueErrors?
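
One way to check for the "compiler should detect it" outcome, sketched
in present-day Python, where the statement is rejected at compile time.
The helper name compiles is hypothetical:

```python
def compiles(source):
    """Return True if `source` compiles as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A list comprehension is not a valid assignment target...
bad = compiles("[i for i in range(10)] = (1, 2, 3)")

# ...while a plain list display still is, and fails only at run time
# if the sizes do not match.
good = compiles("[i] = (1, 2, 3)")
```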

Jeremy



From akuchlin at mems-exchange.org  Thu Jan 18 06:19:31 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 18 Jan 2001 00:19:31 -0500
Subject: [Python-Dev] Embedded language discussion
Message-ID: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com>

http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280

The poster is on a project that's trying to use Python, but they're
encountering unspecified problems (perhaps because of the global
interpreter lock).

--amk



From mal at lemburg.com  Thu Jan 18 10:32:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 10:32:54 +0100
Subject: [Python-Dev] Windows Python totally rad
References: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>
Message-ID: <3A66B846.3D24B959@lemburg.com>

Tim Peters wrote:
> 
> Windows Python runs normally again, modulo four test failures I figure are
> due to the "get rid of assert" patch.

Could you tell me which these are ? The tests I tested all passed
just fine, so I guess these must be Windows-related problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Thu Jan 18 07:48:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 18 Jan 2001 07:48:41 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>
Message-ID: <008701c0811a$b3371c00$e46940d5@hagrid>

I wrote:
> I've almost sorted it all out.  will check it in later tonight (local
> time).

python build problems and real life got in the way.

will 2.1a1 be released according to plan?  will there
be a 2.1a2 release?  maybe I should postpone this?

</F>




From esr at thyrsus.com  Thu Jan 18 08:23:21 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 18 Jan 2001 02:23:21 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <20010118022321.A9021@thyrsus.com>

So I'm writing a module that needs to generate unique cookies.  The
module will run inside one of two environments: (1) a trivial test wrapper,
not threaded, and (2) a long-running multithreaded server.

Because Python garbage-collects, hash() of a just-created object isn't
good enough.  Because we may be threading, millisecond time isn't
good enough.  Because we may *not* be threading, thread ID isn't good
either.  

On the other hand, I'm on Linux getting millisecond time resolution.
And it's not hard to notice that an object hash is a memory address.

So, how about `time.time()` + hex(hash([]))?

It looks to me like this will remain unique forever, because another thread
would have to create an object at the same memory address during the same
millisecond to collide.

Furthermore, it looks to me like this hack might be portable to any OS
with a clock tick shorter than its timeslice.
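
For comparison, a sketch of a different technique that does not depend
on memory addresses: a lock-protected counter combined with the clock.
This is an alternative, not the scheme proposed above, and the names are
hypothetical. (In present-day Python lists are unhashable, so hash([])
raises TypeError, and the address of a temporary object can be reused
as soon as it is collected.)

```python
import itertools
import threading
import time

_counter = itertools.count()
_lock = threading.Lock()


def make_cookie():
    """Return a cookie that is unique within this process."""
    with _lock:
        serial = next(_counter)
    # The timestamp separates runs of the process; the serial number
    # separates cookies created within the same clock tick.
    return "%d-%x" % (int(time.time() * 1000), serial)
```

The serial number alone guarantees uniqueness inside one process, with
or without threads; the timestamp only helps across restarts and assumes
the clock is not set backwards.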

Comments?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Good intentions will always be pleaded for every assumption of
authority. It is hardly too strong to say that the Constitution was
made to guard the people against the dangers of good intentions. There
are men in all ages who mean to govern well, but they mean to
govern. They promise to be good masters, but they mean to be masters.
	-- Daniel Webster



From ping at lfw.org  Thu Jan 18 10:29:13 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 01:29:13 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Guido van Rossum wrote:
> You mean the tp_print and tp_str function slots in type objects,
> right?  tp_print *should* always render exactly the same as tp_str.
> tp_print is used by the print statement, not by value display at the
> interactive prompt.

Uh, i hate to disagree with you about your own interpreter, but:

    com_expr_stmt in Python/compile.c
        inserts a PRINT_EXPR opcode if c_interactive is true;
    eval_code2 in Python/ceval.c
        handles PRINT_EXPR by calling displayhook;
    sys_displayhook in Python/sysmodule.c
        prints the object by calling PyFile_WriteObject on sys.stdout;
    PyFile_WriteObject in Objects/fileobject.c
        calls PyObject_Print if the file is really a PyFileObject;
    PyObject_Print in Objects/object.c
        calls op->ob_type->tp_print if it's not NULL.

The print statement produces a PRINT_ITEM opcode, which invokes
PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
flag is propagated down to PyObject_Print and into string_print,
where it causes the string to fwrite itself directly without quoting.
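
The two paths can still be observed from present-day Python, where the
interactive display goes through sys.displayhook and uses repr(), while
print writes the string unquoted. A small sketch; the helper name
captured is hypothetical:

```python
import contextlib
import io
import sys


def captured(func, value):
    """Call func(value) and return whatever it wrote to sys.stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        func(value)
    return buf.getvalue()


# What the interactive prompt would show for the expression.
shown = captured(sys.__displayhook__, "a\nb")

# What printing the same string produces.
printed = captured(print, "a\nb")
```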

> So, string_print most definitely should *not* be changed -- only
> string_repr!

I had to change them both before i actually saw the change in the
interactive interpreter.  Actually, your statement above (that the
two should always render the same) seems to imply that if i change
one, i must also change the other.


-- ?!ng




From sjoerd at oratrix.nl  Thu Jan 18 11:11:09 2001
From: sjoerd at oratrix.nl (Sjoerd Mullender)
Date: Thu, 18 Jan 2001 11:11:09 +0100
Subject: [Python-Dev] distutils in Python 2.1 not ready for prime time
Message-ID: <20010118101110.6D29C31E1B8@bireme.oratrix.nl>

I just updated my copy of python with the current CVS version and I am
not happy.

The current version uses distutils for configuring and compiling most
modules that are written in C.  That is a nice idea in theory, but in
practice it's not ready for prime time yet.  The major advantage of
using a Setup file is that you can add your own -I and -L compiler
flags on a module-by-module basis.  I *need* those flags since not all
libraries and include files are in standard places (e.g. I need
-I/usr/local/include and -L/usr/local/lib for some modules which my
compiler doesn't provide by itself).  There seems to be no way to tell
distutils to supply those flags.  The documentation (only on the web
site, and also not great, though I assume more documentation, at least
an up-to-date README, will be provided in the final release) says that
this has not yet been implemented.

-- Sjoerd Mullender <sjoerd.mullender at oratrix.com>



From ping at lfw.org  Thu Jan 18 11:14:19 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:14:19 -0800 (PST)
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <3A66BCCC.14997FE3@lemburg.com>
Message-ID: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>

I hope you don't mind that i'm taking this over to python-dev,
because it led me to discover a more general issue (see below).

For the others on python-dev, here's the background: MAL was
about to check in the unistr() function, described as follows:

> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.
> 
> The patch also adds a new object level C API PyObject_Unicode()
> which complements PyObject_Str().

I responded:
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

MAL responded:
> unistr() is meant to complement str() very closely. unicode()
> works as constructor for Unicode objects which can also take
> care of decoding encoded data. str() and unistr() don't provide
> this capability but instead always assume the default encoding.
> 
> There's also a subtle difference in that str() and unistr() 
> try the tp_str slot which unicode() doesn't. unicode()
> supports any character buffer which str() and unistr() don't.

Okay, given this explanation, i still feel fairly confident
that unicode() should subsume unistr().  Many of the other
type-named functions try various slots:

    int() looks for __int__
    float() looks for __float__
    long() looks for __long__
    str() looks for __str__

In testing this i also discovered the following:

    >>> class Foo:
    ...     def __int__(self):
    ...         return 3
    ... 
    >>> f = Foo()
    >>> int(f)
    3
    >>> long(f) 
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__long__'
    >>> float(f)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__float__'

This is kind of surprising.  How about:

    int() looks for __int__
    float() looks for __float__, then tries __int__
    long() looks for __long__, then tries __int__
    str() looks for __str__
    unicode() looks for __unicode__, then tries __str__

The extra parameter to unicode() is very similar to the extra
parameter to int(), so i think there is a natural parallel here.
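
The lookup order proposed above can be sketched in present-day Python.
The helper names (convert, to_float, to_long) are made up for this
illustration and are not part of any patch; the sketch only shows the
"try one slot, then fall back to the next" idea, not how the builtins
are implemented.

```python
def convert(obj, slot_names):
    """Call the first special method in slot_names that obj's type defines."""
    for name in slot_names:
        method = getattr(type(obj), name, None)
        if method is not None:
            return method(obj)
    raise TypeError("%s defines none of: %s"
                    % (type(obj).__name__, ", ".join(slot_names)))


def to_float(obj):
    # Proposed rule: float() looks for __float__, then tries __int__.
    return float(convert(obj, ("__float__", "__int__")))


def to_long(obj):
    # Proposed rule: long() looks for __long__, then tries __int__.
    # (Modern Python has no separate long type, so int stands in for it.)
    return int(convert(obj, ("__long__", "__int__")))


class Foo:
    def __int__(self):
        return 3
```

With these hypothetical helpers, to_float(Foo()) and to_long(Foo())
both succeed through the __int__ fallback instead of raising.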

Hmm... what about the other types?

Wow!!  __complex__ can produce a segfault!

    >>> complex
    <built-in function complex>
    >>> class Foo:
    ...   def __complex__(self): return 3
    ... 
    >>> Foo()
    <__main__.Foo instance at 0x81e8684>
    >>> f = _
    >>> complex(f)
    Segmentation fault (core dumped)

This happens because builtin_complex first retrieves and saves
the PyNumberMethods of the argument (in this case, from the
instance), then tries to call __complex__ (in this case, returning 3),
and THEN coerces the result using nbr->nb_float if the result is
not complex!  (This calls the instance's nb_float method on the
integer object 3!!)

I think complex() should probably look for __complex__, then
__float__, then __int__.
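
A rough sketch of that order, written in present-day Python under the
assumption that a non-complex result is coerced using the result's own
type rather than anything saved from the original argument (which is
what goes wrong in the segfault above).  The name to_complex is
hypothetical, and this version is deliberately more lenient than the
real builtin.

```python
def to_complex(obj):
    """Try __complex__, then __float__, then __int__ on obj's type."""
    for name in ("__complex__", "__float__", "__int__"):
        method = getattr(type(obj), name, None)
        if method is None:
            continue
        result = method(obj)
        if isinstance(result, complex):
            return result
        # Coerce through the *result's* own type; never reuse conversion
        # methods that were looked up on the original argument.
        if isinstance(result, (int, float)):
            return complex(float(result))
        raise TypeError("%s returned a non-numeric value" % name)
    raise TypeError("cannot convert %s to complex" % type(obj).__name__)


class Foo:
    def __complex__(self):
        return 3
```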

One could argue for __list__, __tuple__, or __dict__, but that
seems much weaker; the Pythonic way has always been to implement
__getitem__ instead.  There is no built-in dict(); if it existed
i suppose it would do the opposite of x.items(); again a weak
argument, though i might have found such a function useful once
or twice.

And that about covers the built-in types for data.


-- ?!ng




From ping at lfw.org  Thu Jan 18 11:16:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:16:42 -0800 (PST)
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>

On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
>     str() looks for __str__

Oops.  I forgot that

      str() looks for __str__, then tries __repr__

So, presumably,

      unicode() should look for __unicode__, then __str__, then __repr__


-- ?!ng




From mal at lemburg.com  Thu Jan 18 11:51:46 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 11:51:46 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A66CAC2.74FC894@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> >     str() looks for __str__
> 
> Oops.  I forgot that
> 
>       str() looks for __str__, then tries __repr__
> 
> So, presumably,
> 
>       unicode() should look for __unicode__, then __str__, then __repr__

Not quite... str() does this:

1. strings are passed back as-is
2. the type slot tp_str is tried
3. the method __str__ is tried
4. Unicode returns are converted to strings
5. anything other than a string return value is rejected

unistr() does the same, but makes sure that the return
value is a Unicode object.

unicode() does the following:

1. for instances, __str__ is called
2. Unicode objects are returned as-is
3. string objects or character buffers are used as basis for decoding
4. decoding is applied to the character buffer and the results
   are returned

I think we should perhaps merge the two approaches into one
which then applies all of the above in unicode() (and then
forget about unistr()). This might hide some type errors,
but since all other generic constructors behave more or less
in the same way, I think unicode() should too.
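
As a rough illustration only, the merged behaviour could look like the
following, restated in present-day Python where str is the Unicode type
and bytes plays the role of the old 8-bit string.  The function name
to_unicode and the __unicode__ hook are assumptions made for this
sketch, and "ascii" merely stands in for the default encoding.

```python
def to_unicode(obj, encoding="ascii", errors="strict"):
    # Unicode objects are returned as-is.
    if isinstance(obj, str):
        return obj
    # Byte strings and character buffers are decoded.
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding, errors)
    # Anything else: try a (hypothetical) __unicode__ hook, then str().
    hook = getattr(type(obj), "__unicode__", None)
    result = hook(obj) if hook is not None else str(obj)
    # Encoded return values are decoded; other types are rejected.
    if isinstance(result, bytes):
        result = result.decode(encoding, errors)
    if not isinstance(result, str):
        raise TypeError("conversion did not return a string")
    return result
```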

Thoughts ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From martin at mira.cs.tu-berlin.de  Thu Jan 18 11:48:30 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:48:30 +0100
Subject: [Python-Dev] Having extensions builtin
Message-ID: <200101181048.f0IAmU210251@mira.informatik.hu-berlin.de>

With the new distutils configuration scheme, it appears to be
difficult to build modules in a non-shared way. Building modules
non-shared is desirable when freezing is attempted, and also to reduce
the startup time and memory consumption.

It is still possible to add modules to Setup or Setup.local, so that
they will be built into the interpreter. However, setup.py will still
build them in a shared way afterwards. I propose that setup.py builds
only those modules that are not builtin.
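
A minimal sketch of the idea, assuming setup.py has a list of candidate
module names at hand.  sys.builtin_module_names is a real attribute
listing the modules compiled into the interpreter; the helper name
modules_to_build is hypothetical.

```python
import sys


def modules_to_build(candidates, builtin=sys.builtin_module_names):
    """Return the candidate module names that are not already builtin."""
    return [name for name in candidates if name not in builtin]
```

A module added to Setup or Setup.local shows up in
sys.builtin_module_names of the freshly built interpreter, so it would
be filtered out before any shared build is attempted.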

Regards,
Martin




From martin at mira.cs.tu-berlin.de  Thu Jan 18 13:20:06 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:20:06 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>

> Where should Python extensions install themselves and their docs?

I feel that extensions should not need to care. For extensions,
distutils will pick a location, and the system administrator
configuring the package can choose a different location.

Unfortunately, distutils does not support the installation of
documentation, which I think it should.

Now switching sides, as an administrator, I'd wish distutils to follow
the system conventions by default. 

That means on Linux, documentation should go into the system's <doc>
directory, which is /usr/share/doc according to latest
standards. Distributions vary, so distutils should find out - e.g. by
querying the location from rpm. In addition, when building RPMs,
distutils should declare these files as %doc in the spec file, so RPM
will install it following the system conventions.

On Windows, the convention apparently is to put the documentation
"nearby" the software, so it should probably go into Doc or a
subdirectory thereof.

On Unix, there appears to be no standard location, unless the
documentation consists of man pages or perhaps info files. So
<prefix>/share/doc is probably a place as good as any other.

Regards,
Martin



From martin at mira.cs.tu-berlin.de  Thu Jan 18 11:39:30 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:39:30 +0100
Subject: [Python-Dev] SSL detection problem
Message-ID: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>

The distutils-based configuration fails to build on my system (SuSE
7.0) with the error

/usr/src/python/Modules/socketmodule.c:159: rsa.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:160: crypto.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:161: x509.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:162: pem.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:163: ssl.h: No such file or directory

The problem is that these header files are in /usr/include/openssl,
which is not in the standard include search path.

So the obvious request is: could this be fixed? I guess when setup.py
finds the openssl library, it should also try to find ssl.h, in some
obvious locations.
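
Something along these lines might do for the header search.  This is
only a sketch: find_header is a made-up helper, not the code in
setup.py, and the candidate directories are just the ones that come to
mind, not a complete list.

```python
import os


def find_header(header, search_dirs, exists=os.path.isfile):
    """Return the first directory in search_dirs containing header, or None.

    The exists parameter is only there so the check can be replaced
    when no real filesystem is available.
    """
    for directory in search_dirs:
        if exists(os.path.join(directory, header)):
            return directory
    return None


# Example locations only; a real list would have to be longer.
CANDIDATE_DIRS = [
    "/usr/include",
    "/usr/include/openssl",
    "/usr/local/ssl/include",
    "/usr/local/ssl/include/openssl",
]
```

The result of find_header("ssl.h", CANDIDATE_DIRS) could then be passed
as the include directory for the SSL extension, or SSL support skipped
if it returns None.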

The not-so-obvious question: How can one work-around such a problem
with the new setup scheme? In the old scheme, I could have chosen to
either provide the right -I option in Modules/Setup, to disable SSL
support, or to disable the _socket module altogether. How can I
achieve either configuration with the new scheme?

Regards,
Martin

P.S. As a quick hack, I added a custom include_dirs parameter to the
SSL extension.



From martin at mira.cs.tu-berlin.de  Thu Jan 18 13:39:54 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:39:54 +0100
Subject: [Python-Dev] bug in grammar
Message-ID: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>

> Should the grammar allow one to construct the example statement
> above?

It should not. Please note that the grammar allows a number of other
things, e.g.

  a+b = c

(pass this to parser.suite to see details)

> If not, I'm not sure how to fix the grammar.

The central problem is that it allows testlist on the LHS of an
augassign or '=', whereas the language only allows a small subset in
that position. It is not possible to restrict the grammar in itself,
as that will necessarily produce a conflict - you only know that the
'+' was incorrect when you see the '='.

> I suppose the compiler should detect that the list comp is misplaced

I think there should be a well-formedness pass in-between. I.e. after
the AST has been built, a single pass should descend through the tree,
looking for an expr_statement with more than a single testlist. Once
it finds one, it should confirm that this really is a well-formed
lvalue (in C speak). In this case, the test should be that each term
is an atom without factors.
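
To illustrate the kind of check meant here, the following sketch uses
the ast module of present-day Python (which did not exist in this form
at the time) to decide whether an expression would be acceptable on the
left-hand side of '='.  The function names are hypothetical, and
starred targets are left out for brevity.

```python
import ast


def _is_target(node):
    # Plain names, attribute references and subscriptions are lvalues.
    if isinstance(node, (ast.Name, ast.Attribute, ast.Subscript)):
        return True
    # Tuple and list displays are lvalues if every element is one.
    if isinstance(node, (ast.Tuple, ast.List)):
        return all(_is_target(element) for element in node.elts)
    # Everything else (a+b, f(), list comprehensions, ...) is rejected.
    return False


def is_assignable(source):
    """Return True if the expression in source is a well-formed lvalue."""
    return _is_target(ast.parse(source, mode="eval").body)
```

Note that a list comprehension parses to a different node type than a
list display, so it is rejected without any special casing.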

If the parser itself performs such checks, the compiler could be
simplified in many places, I guess.

Regards,
Martin



From thomas at xs4all.net  Thu Jan 18 10:53:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 10:53:14 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117180947.E9384@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 06:09:47PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010117180947.E9384@kronos.cnri.reston.va.us>
Message-ID: <20010118105314.D17392@xs4all.nl>

On Wed, Jan 17, 2001 at 06:09:47PM -0500, Andrew Kuchling wrote:

> >On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >setup.py. Also, SSL support for the socket module was not enabled, though
> >OpenSSL is installed, in the default path.
> 
> Can you take a look at the detection code in setup.py and see what's
> going wrong.  I believe it should be found if OpenSSL is in
> /usr/local/, but /usr/contrib isn't checked currently.

Well, OpenSSL rests in the default location, which is
/usr/local/ssl/include/openssl. Haven't the time to look into it right now,
sorry.

> >The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
> >which I personally like a lot, though it's probably a bitch to autodetect.
> >(I tried, using autoconf ;-P)

> There's code to handle Debian, though I have no way of testing it, and
> it worked on Neil's Debian box for some reason.  Search for
> debian_tcl_include in setup.py, and see if you can fix it.

Ah, yes. The problem in my case is that the *library* files are just in
/usr/lib, but the include files are not. I re-indented the code to pull the
debian-specific code out of the 'if prefix + os.sep + 'lib' not in
lib_dirs' block, and it works now. Haven't tested it on other code yet, but
I think it should work regardless.

> >distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

> Are you sure setup.py is up to date; do a 'cvs update setup.py' to check.  
> You might get a "setup.py is in the way; remove it" message if you
> downloaded the first setup.py script manually.

D'oh, I guess not. I thought I did (I did on all other platforms :) but I
guess I didn't, 'cause it works now. Thanx.

> >without 'make clean' anymore. You get a lot of undefined-symbol warnings
> >(see below.) If you run 'make clean;make test' it also doesn't work, because
> >the build directory is not in the Python library path, and regrtest.py
> >requires (at least) the time module.

> Again, be sure the tree is up to date; I think this stems from
> attempting to compile the signal module as shared, which doesn't work.

This happened even with completely fresh, newly checked out trees, on all
but FreeBSD (three different trees: Debian woody, BSDI 4.0 and BSDI 4.1) so
I'm pretty sure that's not it.

It works now, though, so I guess the move from a dynamic signalmodule to a
static one does the trick ;) I got 'make test' working by applying the
following patch to Makefile{,.in}, and running 'make PYTHONPATH=.:<builddir>
test' (determining builddir by hand, for now.):

***************
*** 216,223 ****
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall
--- 216,223 ----
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall

And because of that, I also noticed something funny: BSDI calls itself
'BSD/OS <version>', so distutils actually makes a directory called 'lib.bsd'
and 'temp.bsd', with inside those a directory 'os-<version>-i386-2.1'. Is
that a distutils bug, a setup.py bug, or intentional behaviour of one of the
two ?
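
To make the effect concrete, here is a small sketch of how a slash in
the OS name ends up as a nested directory.  The build_dir helper and
its string format are guesses based on the directory names I see, not
the actual distutils code; the sanitize option just shows one possible
way around it.

```python
def build_dir(prefix, osname, release, machine, pyver, sanitize=False):
    """Compose a build directory name from platform details."""
    if sanitize:
        # Drop the path separator so the name stays a single component.
        osname = osname.replace("/", "")
    return "%s.%s-%s-%s-%s" % (prefix, osname.lower(), release, machine, pyver)


nested = build_dir("lib", "BSD/OS", "4.1", "i386", "2.1")
flat = build_dir("lib", "BSD/OS", "4.1", "i386", "2.1", sanitize=True)
```

Here nested contains a '/', so creating it yields 'lib.bsd' with
'os-4.1-i386-2.1' inside, while flat stays a single directory name.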


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Thu Jan 18 08:59:22 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 17 Jan 2001 23:59:22 -0800
Subject: [Python-Dev] new Makefile.in
Message-ID: <20010117235922.A12356@glacier.fnational.com>

Spurred on by comments made by Andrew, I spent some time last
night overhauling the Python Makefiles.  I now have a toplevel
non-recursive Makefile.in that seems to work fairly well.  I'm
pretty sure it still should be portable.  It doesn't use includes
or any special GNU make features.  It is half the size of the old
Makefiles.  The build is faster and it's now easier to follow if
something goes wrong.

A question: is it possible to break the Python static library up?
For example, instead of having libpython<version>.a have
Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
would still only be one shared library.  This would speed up
incremental builds and also help Andrew with PEP 229.  I'm
thinking that the Makefile would do something like this:

    all: python$(EXE)

    PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a

    python$(EXE): $(PYLIBS)
        $(LINKCC) -o python$(EXE) $(PYLIBS) ...

    Modules/modules.a: minpython$(EXE)
        ./minpython$(EXE) setup.py


AFAICT, the only thing affected by splitting up the static library
is Misc/Makefile.pre.in.  Is this correct?

  Neil



From guido at digicool.com  Thu Jan 18 15:52:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 09:52:23 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:22:11 +1300."
             <200101180022.NAA00898@s454.cosc.canterbury.ac.nz> 
References: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz> 
Message-ID: <200101181452.JAA06899@cj20424-a.reston1.va.home.com>

> > a < b if and only if b > a.
> > This is what the rich comparison code does.
> 
> Someone is bound to come up with a use for comparison
> operator overloading in which this isn't true, just
> to be difficult!

They'll get what they deserve -- this will be clearly documented!
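
A small illustration of the rule, as it behaves in present-day Python:
a class that defines only __gt__ still supports '<', because the
interpreter falls back to the reflected operation on the other
operand.  The Version class is made up for the example.

```python
class Version:
    def __init__(self, number):
        self.number = number

    def __gt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.number > other.number


# Version defines no __lt__, so a < b is answered by b.__gt__(a).
older, newer = Version(1), Version(2)
older_is_less = older < newer
```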

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Jan 18 16:15:25 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 10:15:25 -0500 (EST)
Subject: [Python-Dev] Re: bug in grammar
In-Reply-To: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
Message-ID: <14951.2189.14393.52725@localhost.localdomain>

If I summarize your suggestion, I think you've said that ideally the
grammar should not allow assignment to list comprehensions (or a
variety of other constructs) -- but it doesn't so the compiler has to
deal with it.

This morning it seemed a lot easier to fix the bug than it did last
night :-).  com_assign() already has a number of checks for syntax
errors in assignments.  A test for list comprehensions belongs at the
same place as tests for assignment to [] and augmented assignments
applied to lists.

I'll include a fix for assignment to list comprehensions in my big
compiler patch.

Jeremy




From akuchlin at mems-exchange.org  Thu Jan 18 16:28:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:28:19 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>; from esr@thyrsus.com on Thu, Jan 18, 2001 at 02:23:21AM -0500
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <20010118102819.A21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 02:23:21AM -0500, Eric S. Raymond wrote:
>And it's not hard to notice that an object hash is a memory address.

Unless the object defines __hash__()!  If you want the memory address, 
use id() instead.
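
For example (a sketch with a made-up Point class): two equal objects
that define __hash__ share a hash value, while id() still tells them
apart.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # The hash depends on the value, not on where the object lives.
        return hash((self.x, self.y))


a = Point(1, 2)
b = Point(1, 2)
same_hash = hash(a) == hash(b)   # equal values, equal hashes
same_identity = id(a) == id(b)   # two distinct live objects
```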

--amk



From akuchlin at mems-exchange.org  Thu Jan 18 16:30:36 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:30:36 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118004400.B17392@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 18, 2001 at 12:44:00AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl>
Message-ID: <20010118103036.B21503@kronos.cnri.reston.va.us>

>On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>> setup.py. Also, SSL support for the socket module was not enabled, though
>> OpenSSL is installed, in the default path.

What does the layout of /usr/contrib look like?  Is it
/usr/contrib/openssl/include/, /usr/contrib/include/, or something
else?

>Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
>or 'make test' after 'make' just fine. 'make test' still doesn't work
>because of the incorrect library path, but it doesn't barf like the other
>systems (BSDI and Debian Linux)

Have you already run "make install"?  Perhaps it's picking up the
already-installed modules when running "make test", because it really
shouldn't be working.

--amk




From gward at cnri.reston.va.us  Thu Jan 18 16:42:51 2001
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 18 Jan 2001 10:42:51 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:35:51AM +0100
References: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <20010118104250.A27049@thrak.cnri.reston.va.us>

On 15 January 2001, M.-A. Lemburg said:
> He seems to be offline and the people on the distutils list have some
> patches and other things which would be nice to have in distutils 
> for 2.1.

Tim was right -- I'm *really* close to being back online.  Just have to
figure out why qmail's not answering port 25 and why LILO doesn't like my
newly repartitioned hard drive, and all will be well.  Oh yeah, and getting
insurance, and a credit card, and unpacking all these cardboard boxes, and
getting some furniture, ...

(If anyone is considering it, I do *not* recommend buying a new computer,
moving internationally, and getting a high speed home Internet connection
all at the same time.)

BTW I quite approve of Andrew being temporary Distutils dictator.  Should
have done it in December, but I didn't think I'd be out of commission for so
long.  Sigh.

        Greg



From moshez at zadka.site.co.il  Fri Jan 19 01:19:45 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 02:19:45 +0200 (IST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
Message-ID: <20010119001945.80DC8A83E@darjeeling.zadka.site.co.il>

On Thu, 18 Jan 2001 01:29:13 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:
> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.
> 
> 
> -- ?!ng
> 
> 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Thu Jan 18 17:23:19 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:23:19 -0500
Subject: [Python-Dev] unistr() vs. unicode()
Message-ID: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>

Ping wrote in response to a SourceForge mail about MAL's unistr()
checkin:

------- Forwarded Message

Date:    Wed, 17 Jan 2001 23:51:48 -0800
From:    Ka-Ping Yee <ping at lfw.org>
To:      noreply at sourceforge.net
cc:      mal at lemburg.com, guido at python.org, patches at python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin + PyObject_Unicode() C API

On Wed, 17 Jan 2001 noreply at sourceforge.net wrote:
> Comment:
> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.

Sorry for barging in, but i have an issue/question:

Why are unistr() and unicode() two separate functions?

str() performs one task: convert to string.  It can convert anything,
including strings or Unicode strings, numbers, instances, etc.

The other type-named functions e.g. int(), long(), float(), list(),
tuple() are similar in intent.

Why have unicode() just for converting strings to Unicode strings,
and unistr() for converting everything else to a Unicode string?
What does unistr(x) do differently from unicode(x) if x is a string?


- -- ?!ng

------- End of Forwarded Message

(And no, Tim, this did *not* end up in the patches list because I made
Barry remove the reply-to.  SourceForge mails never had reply-to to
begin with.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 17:28:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:28:12 -0500
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: Your message of "Wed, 17 Jan 2001 22:20:30 CST."
             <14950.28430.572215.10643@beluga.mojam.com> 
References: <14950.28430.572215.10643@beluga.mojam.com> 
Message-ID: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>

> I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
> repeated parameters properly.  If I call
> 
>     urllib.urlencode({"performers": ("U2","Lawrence Martin")})
> 
> instead of getting
> 
>     performers=U2&performers=Lawrence+Martin
> 
> I get a quoted stringified tuple:
> 
>     performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29
> 
> Obviously, fixing this will change the function's current semantics, but I
> think it's worth treating lists and tuples (actually, any sequence) as
> repeated values.  If the existing semantics are deemed valuable enough, a
> third default parameter could be added to switch on the new behavior when
> desired.
> 
> If others agree I'd be happy to whip up a patch.  I think it's a bug.

Agreed.  If you can come up with something that supports all sequence
types, and treats singleton sequences the same as their one and only
item, it would even be the inverse of cgi.parse_qs()!
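
For reference, this is how the round trip looks with the present-day
standard library, where urlencode() takes a doseq flag and parse_qs()
lives in urllib.parse.  This shows the modern API only, not the patch
discussed here.

```python
from urllib.parse import parse_qs, urlencode

query = {"performers": ("U2", "Lawrence Martin")}

# doseq=True emits one key=value pair per element of a sequence value.
encoded = urlencode(query, doseq=True)

# parse_qs() collects repeated keys back into lists.
decoded = parse_qs(encoded)
```

Here encoded is "performers=U2&performers=Lawrence+Martin", and decoded
maps "performers" back to a list of the two names.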

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan 18 17:43:49 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 17:43:49 +0100
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Thu, Jan 18, 2001 at 02:14:19AM -0800
References: <3A66BCCC.14997FE3@lemburg.com> <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010118174349.E17392@xs4all.nl>

On Thu, Jan 18, 2001 at 02:14:19AM -0800, Ka-Ping Yee wrote:

> Wow!!  __complex__ can produce a segfault!

>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)

> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

I've noticed that lurking bug in the coercion code when I added augmented
assignment, though I don't recall whether I fixed it then, nor do I know if
that part's been "touched" by the recent coercion changes. If none of the
coercion champions speak up, I'll look at this sometime this weekend.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan 18 17:50:28 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 11:50:28 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 11:39:30AM +0100
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>
Message-ID: <20010118115028.D21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
>The problem is that these header files are in /usr/include/openssl,
>which is not in the standard include search path.

I have an improved version of setup.py (not checked in yet) that tries
to do better, checking for both header and library files.  One point:
the OpenSSL docs imply that the headers should be loaded as
<openssl/rsa.h>, not as <rsa.h>; the header files themselves use the
openssl/*.h form, which means you'd need two -I directives.  I'll
patch the socket module accordingly.

>The not-so-obvious question: How can one work-around such a problem
>with the new setup scheme? In the old scheme, I could have chosen to
>either provide the right -I option in Modules/Setup, to disable SSL
>support, or to disable the _socket module altogether. How can I
>achieve either configuration with the new scheme?

I still need to implement command-line options to specify such
overrides, but that couldn't possibly get done in time for alpha1.  I
was thinking of something like --<modulename>-libs="foo bar",
--<modulename>-includes="/usr/include/blah/", and so forth.
Suggestions for a better interface welcomed...

--amk



From guido at digicool.com  Thu Jan 18 17:55:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:55:39 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Wed, 17 Jan 2001 21:58:19 EST."
             <14950.23499.275398.963621@localhost.localdomain> 
References: <14950.23499.275398.963621@localhost.localdomain> 
Message-ID: <200101181655.LAA08001@cj20424-a.reston1.va.home.com>

> As part of the implementation of PEP 227 (and in an attempt to reach
> some low-hanging fruit Guido mentioned on the types-sig long ago), I
> have been working on a compiler pass that generates a module-level
> symbol table.  I recently discovered a bug in the handling of list
> comprehensions that was giving me headaches.
> 
> I realize now that the problem is with the current grammar and/or
> compiler.  Here's a simple demonstration; try it in your friendly
> python 2.0 interpreter.
> 
> >>> [i for i in range(10)] = (1, 2, 3)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: unpack list of wrong size
> 
> The generated bytecode is:
> 
>           0 SET_LINENO               0
> 
>           3 SET_LINENO               1
>           6 LOAD_CONST               0 (1)
>           9 LOAD_CONST               1 (2)
>          12 LOAD_CONST               2 (3)
>          15 BUILD_TUPLE              3
>          18 UNPACK_SEQUENCE          1
>          21 STORE_NAME               0 (i)
>          24 LOAD_CONST               3 (None)
>          27 RETURN_VALUE        
> 
> I assume this isn't intended :-).  The compiler is ignoring everything
> after the initial atom in the list comprehension.  It's basically
> compiling the code as if it were:
> 
> [i] = (1, 2, 3)
> 
> I'm not sure how to try and fix this.  Should the grammar allow one to
> construct the example statement above?  If not, I'm not sure how to
> fix the grammar.  If not, I suppose the compiler should detect that
> the list comp is misplaced.  This seems fairly messy, since there are
> about 10 nodes between the expr_stmt and the list_for.
> 
> Or is this a cool way to use list comprehensions to generate
> ValueErrors?

Good catch!  Not everything cool deserves to be preserved.

It looks like this happens because the code that traverses lists on
the left-hand side of an assignment was never told about list
comprehensions.  You're right that the grammar can't be fixed; it's
for the same reason that it can't be fixed to disallow "f() = 1".

The solution is to add a test for this to the compiler that flags this
as an error.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 18:01:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:01:02 -0500
Subject: [Python-Dev] Embedded language discussion
In-Reply-To: Your message of "Thu, 18 Jan 2001 00:19:31 EST."
             <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com> 
References: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com> 
Message-ID: <200101181701.MAA08046@cj20424-a.reston1.va.home.com>

> http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280
> 
> The poster is on a project that's trying to use Python, but they're
> encountering unspecified problems (perhaps because of the global
> interpreter lock).

I've sent the poster an email asking to be more specific about his
questions; probably doing the right dance when calling Python from a
thread created in C++ should do the trick.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 18:04:43 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:04:43 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Thu, 18 Jan 2001 01:29:13 PST."
             <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org> 
Message-ID: <200101181704.MAA08074@cj20424-a.reston1.va.home.com>

> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.

Oops.  I'm so grateful that we have a collective memory! :-)

You're right: tp_print() can be invoked in two modes: with or without
Py_PRINT_RAW flag.  In raw mode, it should behave exactly like str();
in cooked mode exactly like repr().
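
At the Python level the two modes correspond to str() and repr().  A
small sketch, assuming Python 3 syntax; the tp_print slot itself is a
C-level detail and is not visible from this code:

```python
s = "line one\nline two"

raw = str(s)      # raw mode: the characters themselves, newline included
cooked = repr(s)  # cooked mode: quoted, with the newline shown as an escape

print(raw)
print(cooked)
```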

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at mira.cs.tu-berlin.de  Thu Jan 18 20:31:29 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:31:29 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <200101181931.f0IJVTc00932@mira.informatik.hu-berlin.de>

> Comments?

Yes, three of them:

1. To guarantee uniqueness at least within the process, the easiest
   solution would be

   import time

   if using_threads:      # assumed to be set by the caller
     import thread
     lock=thread.allocate_lock()
     _acquire = lock.acquire_lock
     _release = lock.release_lock
   else:
     _acquire = _release = lambda:None   # no-op stand-ins for the lock

   _cookie = time.time()
   def getCookie():
     global _cookie
     _acquire()           # serialize the increment and the read
     _cookie+=1
     result = _cookie
     _release()
     return result

2. Invoking [] repeatedly likely returns an object with the same
   id() when called twice in a row (i.e. with no intermediate objects
   allocated in-between).

3. Why did you send this question to python-dev? python-list is more
   appropriate.

Regards,
Martin




From tim.one at home.com  Thu Jan 18 20:49:12 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 14:49:12 -0500
Subject: [Python-Dev] Windows Python totally rad
In-Reply-To: <3A66B846.3D24B959@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGKIJAA.tim.one@home.com>

[MAL]
> Could you tell me which these are [new test failures on Windows]?
> The tests tested all passed just fine, so I guess these must be
> Windows-related problems.

Not to worry, all the tests pass now.  Don't want to spend time
backtracking, as I'm not the one who fixed them and don't know who did.
FWIW, they "smelled like" shallow failures (== easy to diagnose & fix).

onward!-ly y'rs  - tim




From martin at mira.cs.tu-berlin.de  Thu Jan 18 20:37:04 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:37:04 +0100
Subject: [Python-Dev] new Makefile.in
Message-ID: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?

Please, no. It was that way in Python 1.4 (libModules, libObjects, and
I forgot which the others were :-). We had that all documented in our
book, then Guido tried to build an extension module for the first
time, saw that these many libraries were terrible, and combined them
into a single one. That was a good thing, and we have it documented in
our book. I'm not at all looking forward to answering all the
questions why the build infrastructure of Python changed yet again...

Regards,
Martin




From fdrake at acm.org  Thu Jan 18 21:22:30 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 18 Jan 2001 15:22:30 -0500 (EST)
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <14951.20614.176140.672447@cj42289-a.reston1.va.home.com>

  I'd like to put the weak references patch into the alpha, but
haven't received any feedback on the latest patch.  I have some
comments from Martin von Löwis on the PEP that need to be addressed,
and that could change the implementation a bit, but the basic
machinery seems to be pretty reasonable and works for me.
  Does anyone have any objections to it going into the alpha?  I'd
like to enable more wide-spread testing.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Thu Jan 18 18:10:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:10:14 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A672376.4B951848@lemburg.com>

"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

A combination of time.time(), process id and counter should
work in all cases. Make sure you use a lock around the counter,
though.
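
A minimal sketch of that combination, assuming a current Python 3 with
the threading module; the names get_cookie, _counter and _counter_lock
are invented for the example:

```python
import os
import threading
import time

_counter_lock = threading.Lock()
_counter = 0


def get_cookie():
    """Return a cookie string built from time, process id and a counter."""
    global _counter
    with _counter_lock:      # makes the increment-and-read step atomic
        _counter += 1
        serial = _counter
    # The serial number alone separates calls within one process; the
    # time and pid parts are there to tell processes and restarts apart.
    return "%r-%d-%d" % (time.time(), os.getpid(), serial)
```

Within one process the serial number already guarantees distinct
cookies.  Across processes the pid and timestamp make a clash unlikely
but this sketch does not prove it impossible.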

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Jan 18 18:30:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:30:52 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <3A67284C.B6C617A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Where should Python extensions install themselves and their docs?
> 
> I feel that extensions should not need to care. For extensions,
> distutils will pick a location, and the system administrator
> configuring the package can choose a different location.
> 
> Unfortunately, distutils does not support the installation of
> documentation, which I think it should.

Right.
 
> Now switching sides, as an administrator, I'd wish distutils to follow
> the system conventions by default.
> 
> That means on Linux, documentation should go into the system's <doc>
> directory, which is /usr/share/doc according to latest
> standards. Distributions vary, so distutils should find out - e.g. by
> querying the location from rpm. In addition, when building RPMs,
> distutils should declare these files as %doc in the spec file, so RPM
> will install it following the system conventions.

You currently have to do this by hand (e.g. in setup.cfg or
using the doc_files option). It should be fairly easy to add
a command similar to install_data though which then applies
all the necessary magic to the paths.

Is there a common landmark to look for on Unix (e.g. in case the
system does not use RPM) ?

Which paths should distutils check ?

(/usr/share/doc/packages, /usr/share/doc, /usr/doc/packages,
/usr/doc in that order ?)
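
The check itself could be as small as the sketch below.  It is not
distutils code: find_doc_dir and DOC_DIR_CANDIDATES are hypothetical
names, Python 3 is assumed, and the list is simply the four paths above
in the order given.

```python
import os

# Candidate directories from the message above, most preferred first.
DOC_DIR_CANDIDATES = [
    "/usr/share/doc/packages",
    "/usr/share/doc",
    "/usr/doc/packages",
    "/usr/doc",
]


def find_doc_dir(candidates=DOC_DIR_CANDIDATES, default=None):
    """Return the first candidate that is an existing directory."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return default   # nothing matched; let the caller decide


print(find_doc_dir(default="<prefix>/share/doc"))
```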
 
> On Windows, the convention apparently is to put the documentation
> "nearby" the software, so it should probably go into Doc or a
> subdirectory thereof.

Na, I'd rather have \Python\Site-Packages and \Python\Site-Docs
for that purpose.
 
> On Unix, there appears to be no standard location, unless the
> documentation consists of man pages or perhaps info files. So
> <prefix>/share/doc is probably a place as good as any other.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Thu Jan 18 18:45:29 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 18 Jan 2001 11:45:29 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>
References: <14950.28430.572215.10643@beluga.mojam.com>
	<200101181628.LAA07406@cj20424-a.reston1.va.home.com>
Message-ID: <14951.11193.150232.564700@beluga.mojam.com>

    >> If others agree I'd be happy to whip up a patch.  I think it's a bug.

    Guido> Agreed.

Patch #103314:

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103314&group_id=5470

I assigned it to Fred for doc review.
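
The patch itself is not shown in this thread.  For reference, in current
Python 3 repeated values are handled through the doseq argument of
urllib.parse.urlencode; whether that matches the patch as submitted is
an assumption.

```python
from urllib.parse import urlencode

query = {"tag": ["python", "dev"], "page": 1}

# With doseq=True every element of a sequence value gets its own
# key=value pair instead of the whole list being quoted as one string.
print(urlencode(query, doseq=True))   # tag=python&tag=dev&page=1
```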

Skip





From akuchlin at mems-exchange.org  Thu Jan 18 19:56:40 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 13:56:40 -0500
Subject: [Python-Dev] Standard install locations for Python ?
In-Reply-To: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 01:20:06PM +0100
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <20010118135640.G21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
>On Unix, there appears to be no standard location, unless the
>documentation consists of man pages or perhaps info files. So
><prefix>/share/doc is probably a place as good as any other.

This seems like a good suggestion.  Should docs go in
<prefix>/share/doc/python<version>/, then?  Perhaps with
subdirectories for different extensions?

--amk





From tismer at tismer.com  Thu Jan 18 22:39:18 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:39:18 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
Message-ID: <3A676286.C33823B4@tismer.com>


Guido van Rossum wrote:
> 
> > I'm a bit confused about Guido's rich comparison stuff.  In the description
> > he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> 
> Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
> A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

...

> I think what threw you off was the ambiguity of "inverse".  This means
> Boolean negation.  I'm not relying on Boolean negation here -- I'm
> relying on the more fundamental property that a<b and b>a have the
> same outcome.

Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
Like the other right-side operators __radd__, is it correct to
think of

   __ge__  == __rle__

if __rle__ was written in the same fashion like __radd__ ?
It looks semantically the same, although the reason for a
call might be different.
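
A small sketch of that reading, assuming the behaviour of current
Python 3 (the details in the 2.1 alpha may differ).  The toy class Num
is invented for the example; it defines only __lt__ and __le__ and logs
every call.

```python
class Num:
    """Toy value that defines only __lt__ and __le__ and logs each call."""

    calls = []

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Num.calls.append(("__lt__", self.value, other.value))
        return self.value < other.value

    def __le__(self, other):
        Num.calls.append(("__le__", self.value, other.value))
        return self.value <= other.value


a, b = Num(1), Num(2)
result_gt = b > a     # no __gt__ defined, so a.__lt__(b) is tried instead
result_ge = b >= a    # no __ge__ defined, so a.__le__(b) is tried instead
print(result_gt, result_ge, Num.calls)
```

The log shows both calls arriving with the operands swapped, which is
the "reverse" sense rather than Boolean negation.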

And if my above view is right, would it perhaps be less
confusing to use in fact __rle__ and __rlt__,
or would it be more confusing, since __rlt__ would also be
invoked left-to-right, implementing ">".

Not sure if I added even more confusion.

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tim.one at home.com  Thu Jan 18 22:53:44 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 16:53:44 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGPIJAA.tim.one@home.com>

[Eric S. Raymond, in search of uniqueness]
> ...
> So, how about `time.time()` + hex(hash([]))?
>
> It looks to me like this will remain unique forever, because
> another thread would have to create an object at the same memory
> address during the same millisecond to collide.

I'm afraid it's much more vulnerable than that:  Python's thread granularity
is at the bytecode level, not the statement level.  It's very easy for
thread A and B to see the same `time.time()` value, and after that
arbitrarily long amounts of time may pass before they get around to doing
the hash([]) business.  When hash() completes, the storage for [] is
immediately reclaimed under CPython, and it's again very easy for another
thread to reuse the storage.
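
The storage-reuse half of this can be seen without any threads.  The
sketch below assumes CPython 3, where hash([]) raises TypeError, so
id() stands in for the memory-address hash; count_distinct_ids is a
made-up helper and the exact count depends on the allocator.

```python
def count_distinct_ids(n):
    """Create n short-lived lists and count the distinct id() values seen."""
    seen = set()
    for _ in range(n):
        seen.add(id([]))   # the list is freed as soon as id() returns
    return len(seen)


print(count_distinct_ids(10000), "distinct ids in 10000 calls")
```

On CPython the count is typically far below 10000, because each freed
list makes its storage available to the next one.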

I'm attaching an executable test case.  It uses time.clock() because that
has much higher resolution than time.time() on Windows (better than
microsecond), but rounds it back to three decimal places to simulate
millisecond resolution.  The first three runs:

    saw 14600 unique in 30000 total
    saw 14597 unique in 30000 total
    saw 14645 unique in 30000 total

So it sucks bigtime on my box.

Better idea:  borrow the _ThreadSafeCounter class from the tail end of the
current CVS tempfile.py.  The code works whether or not threads are
available.  Then

    `time.time()` + str(_counter.get_next())

is thread-safe.  For that matter, plain old

    str(_counter.get_next())

will always be unique within a single run.  However, in either case you're
still not safe against concurrent *processes* generating the same cookies.

tempfile.py has to worry about that too, of course, so the *best* idea is to
call tempfile.mktemp() and leave it at that.  It wastes some time checking
the filesystem for a file of the same name (which, btw, goes much quicker on
Linux than on Windows).


From tismer at tismer.com  Thu Jan 18 22:56:08 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:56:08 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A676678.7E4AF278@tismer.com>


"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.

What do you mean by "unique"? Unique regarding your long-running server?

If so, then I wonder why one should do
> 
> So, how about `time.time()` + hex(hash([]))?
>
instead of using a single, simple counter for all sessions?

> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

If I'm not overlooking something fundamental, the counter approach
seems to be simpler and most portable. :-)

but-sometimes-my-brain-malfunctions-badly-ly y'rs  - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From nas at arctrix.com  Thu Jan 18 16:07:13 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 07:07:13 -0800
Subject: [Python-Dev] Re: new Makefile.in
In-Reply-To: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 08:37:04PM +0100
References: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>
Message-ID: <20010118070713.A13581@glacier.fnational.com>

On Thu, Jan 18, 2001 at 08:37:04PM +0100, Martin v. Loewis wrote:
> > A question: is it possible to break the Python static library up?
> > For example, instead of having libpython<version>.a have
> > Parser/parser<version>.a, Objects/objects<version>.a, etc?
> 
> Please, no.

Okay.

> I'm not at all looking forward to answering all the questions
> why the build infrastructure of Python changed yet again...

My Makefile patch shouldn't change the way you build extensions.

  Neil



From tim.one at home.com  Fri Jan 19 02:45:42 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 20:45:42 -0500
Subject: [Python-Dev] unistr() vs. unicode()
In-Reply-To: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHLIJAA.tim.one@home.com>

[Guido]
> (And no, Tim, this did *not* end up in the patches list because I made
> Barry remove the reply-to.  SourceForge mails never had reply-to to
> begin with.)

Aha!  Another thing to blame Barry for <wink>.




From tim.one at home.com  Thu Jan 18 23:11:23 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:11:23 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <008701c0811a$b3371c00$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGPIJAA.tim.one@home.com>

[/F]
> python build problems and real life got in the way.
>
> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Depends on how confident you are.  Since this is purely an optimization, I
don't think it *needs* to get into a1 in order to make the final release;
postponing a few days would be better than pushing too hard on something
that's proved hairier than anticipated.

do-the-right-thing-whatever-that-is<wink>-ly y'rs  - tim




From guido at digicool.com  Fri Jan 19 03:17:36 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 21:17:36 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:14:19 PST."
             <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org> 
Message-ID: <200101190217.VAA01497@cj20424-a.reston1.va.home.com>

> I hope you don't mind that i'm taking this over to python-dev,
> because it led me to discover a more general issue (see below).

No -- in fact I wanted to see this here!  (My mail backlog seems to be
clearing -- or maybe it was only a temporary unclogging... :-)

> For the others on python-dev, here's the background: MAL was
> about to check in the unistr() function, described as follows:
> 
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> > 
> > The patch also adds a new object level C API PyObject_Unicode()
> > which complements PyObject_Str().
> 
> I responded:
> > Why are unistr() and unicode() two separate functions?
> > 
> > str() performs one task: convert to string.  It can convert anything,
> > including strings or Unicode strings, numbers, instances, etc.
> > 
> > The other type-named functions e.g. int(), long(), float(), list(),
> > tuple() are similar in intent.
> > 
> > Why have unicode() just for converting strings to Unicode strings,
> > and unistr() for converting everything else to a Unicode string?
> > What does unistr(x) do differently from unicode(x) if x is a string?
> 
> MAL responded:
> > unistr() is meant to complement str() very closely. unicode()
> > works as constructor for Unicode objects which can also take
> > care of decoding encoded data. str() and unistr() don't provide
> > this capability but instead always assume the default encoding.
> > 
> > There's also a subtle difference in that str() and unistr() 
> > try the tp_str slot which unicode() doesn't. unicode()
> > supports any character buffer which str() and unistr() don't.
> 
> Okay, given this explanation, i still feel fairly confident
> that unicode() should subsume unistr().  Many of the other
> type-named functions try various slots:
> 
>     int() looks for __int__
>     float() looks for __float__
>     long() looks for __long__
>     str() looks for __str__
> 
> In testing this i also discovered the following:
> 
>     >>> class Foo:
>     ...     def __int__(self):
>     ...         return 3
>     ... 
>     >>> f = Foo()
>     >>> int(f)
>     3
>     >>> long(f) 
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__long__'
>     >>> float(f)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__float__'
> 
> This is kind of surprising.  How about:
> 
>     int() looks for __int__
>     float() looks for __float__, then tries __int__
>     long() looks for __long__, then tries __int__
>     str() looks for __str__
>     unicode() looks for __unicode__, then tries __str__

For the numeric types this could perhaps be done by calling
PyNumber_Long() from PyNumber_Float(), calling PyNumber_Int() from
PyNumber_Long().  Complex is a bit of an exception -- there's no
PyNumber_Complex(), just because I felt that nobody would need it. :-)
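
The proposed lookup order for float() can be modelled in pure Python.
This is only a sketch of the idea, not the C-level PyNumber_* change
described above; to_float is a hypothetical helper and Python 3 is
assumed.

```python
def to_float(obj):
    """Hypothetical helper: try __float__ first, then fall back to __int__."""
    for name in ("__float__", "__int__"):
        method = getattr(type(obj), name, None)
        if method is not None:
            return float(method(obj))
    raise TypeError("cannot convert %s to float" % type(obj).__name__)


class Foo:
    def __int__(self):
        return 3


print(to_float(Foo()))   # 3.0, reached through __int__
print(to_float(2.5))     # 2.5, reached through __float__
```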

> The extra parameter to unicode() is very similar to the extra
> parameter to int(), so i think there is a natural parallel here.

Makes sense.

> Hmm... what about the other types?
> 
> Wow!!  __complex__ can produce a segfault!
> 
>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)
> 
> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

Thanks!  Fixed now in CVS.
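
For comparison, the same class run against a current CPython 3 (an
assumption; the fix referred to here is the 2001 one in CVS) gives an
ordinary exception rather than a crash:

```python
class Foo:
    def __complex__(self):
        return 3   # deliberately wrong: not a complex number


try:
    complex(Foo())
except TypeError as exc:
    outcome = "TypeError: %s" % exc
else:
    outcome = "no error"

print(outcome)
```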

> I think __complex__ should probably look for __complex__, then
> __float__, then __int__.

I made it call PyNumber_Float(), which could be made smarter as
explained above.

> One could argue for __list__, __tuple__, or __dict__, but that
> seems much weaker; the Pythonic way has always been to implement
> __getitem__ instead.

Yes -- since __list__ etc. aren't used, let's not add them.

> There is no built-in dict(); if it existed
> i suppose it would do the opposite of x.items(); again a weak
> argument, though i might have found such a function useful once
> or twice.

Yeah, it's not very common.  Dict comprehensions anyone?

    d = {k:v for k,v in zip(range(10), range(10))}    # :-)
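
Both ideas exist in later Python releases, which is a fact about those
releases and not something decided in this thread.  A short sketch,
assuming Python 3:

```python
# The comprehension from the message, in its modern spelling.
d = {k: v for k, v in zip(range(10), range(10))}

# dict() applied to items() is the "opposite of x.items()" mentioned above.
rebuilt = dict(d.items())

print(rebuilt == d)   # True
```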

> And that about covers the built-in types for data.

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Thu Jan 18 23:13:14 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:13:14 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>

BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised

    TypeError: unhashable type

Did someone change this deliberately?




From tim.one at home.com  Thu Jan 18 23:58:22 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:58:22 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIJAA.tim.one@home.com>

[Tim whined]
> BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised
>
>     TypeError: unhashable type
>
> Did someone change this deliberately?

Answer:  it's an unintended consequence of the rich-comparison changes.
Guido knows how to fix it and probably <wink> will.  The list type grew a
tp_richcompare slot but lost its non-NULL tp_compare pointer.  PyObject_Hash
wasn't changed accordingly (it now believes lists support neither direct
hashing nor comparison, so does them a favor and hashes their memory
addresses).  Something trickier is probably going wrong elsewhere too, but I
won't try to remember what that is unless Guido gets hit by a bus tonight.
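
A quick way to check for the intended behaviour, sketched for a current
Python 3 where the message text is slightly longer than the 2.0 one
quoted above:

```python
try:
    hash([])
except TypeError as exc:
    message = str(exc)   # lists are mutable, so they refuse to be hashed
else:
    message = None       # would indicate the regression described here

print(message)
```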

in-which-case-we-can-push-off-the-funeral-until-after-the-release-ly
    y'rs  - tim




From thomas at xs4all.net  Fri Jan 19 00:02:09 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:02:09 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118103036.B21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 10:30:36AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>
Message-ID: <20010119000209.F17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:30:36AM -0500, Andrew Kuchling wrote:
> >On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
> >> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >> setup.py. Also, SSL support for the socket module was not enabled, though
> >> OpenSSL is installed, in the default path.

> What does the layout of /usr/contrib look like?  Is it
> /usr/contrib/openssl/include/, /usr/contrib/include/, or something
> else?

Actually, it's /usr/local, not /usr/contrib. I've never installed OpenSSL in
/usr/contrib, though I could, and maybe BSDI will, in the future. (BSDI
installs its own software in /usr, and optional free, pre-compiled software
in /usr/contrib.) OpenSSL installs into
/usr/local/ssl/include/openssl by default, and installing into /usr/contrib
would make it /usr/contrib/ssl/include/openssl.

> >Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
> >or 'make test' after 'make' just fine. 'make test' still doesn't work
> >because of the incorrect library path, but it doesn't barf like the other
> >systems (BSDI and Debian Linux)

> Have you already run "make install"?  Perhaps it's picking up the
> already-installed modules when running "make test", because it really
> shouldn't be working.

Hm, I think you misread my statement. 'make test' *doesn't* work. But it
doesn't barf on the signal module being built dynamically either. You fixed
that for every platform now, I was just pointing out that this was not a
problem for FreeBSD for some reason.

'make test' still doesn't work, but I can make it work by specifying a
hand-tweaked PYTHONPATH that includes the OS/arch-dependent build directory.

This brings me to another point: how can 'make test' work at all ? Does
python always check for './Lib' (and './Modules') for modules ? If that's
specific for 'make test' and running python in the source distribution, that
sounds like a bit of a weird hack. I can't find any such hackery in the
source, but I also can't figure out how else it's working :)

More-later--Meteor-((c)-1979)-is-on-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at mira.cs.tu-berlin.de  Fri Jan 19 00:14:05 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 19 Jan 2001 00:14:05 +0100
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <200101182314.f0INE5B00338@mira.informatik.hu-berlin.de>

> Does anyone have any objections to it going into the alpha? 

I'd like to request that the .clear() method is removed from the patch
for this alpha, and also that the weak dictionaries are removed until
their semantics is clarified.

It's always easier to add stuff later than to remove it.

Regards,
Martin




From nas at arctrix.com  Thu Jan 18 17:31:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 08:31:09 -0800
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <20010118115028.D21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 11:50:28AM -0500
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> <20010118115028.D21503@kronos.cnri.reston.va.us>
Message-ID: <20010118083109.A13972@glacier.fnational.com>

On Thu, Jan 18, 2001 at 11:50:28AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
> >The not-so-obvious question: How can one work-around such a problem
> >with the new setup scheme?
> 
> I still need to implement command-line options to specify such
> overrides, but that couldn't possibly get done in time for alpha1.

My non-recursive makefile patch allows you to use both Setup and
setup.py.  It's not quite ready for prime time but it's getting
close.

I would be interested if someone could point me to the source for
some crappy makes.  I've tried GNU make, BSD 4.4 pmake and
whatever comes with SunOS 5.6.  Searching for "make" doesn't work
too well. :-(

  Neil



From thomas at xs4all.net  Fri Jan 19 00:45:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:45:32 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 08:46:54AM -0800
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119004532.G17392@xs4all.nl>

On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:

> filename = '/tmp/delete_me'

This reminds me: we need a portable way to handle test-files :)
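
One possible shape for this, sketched with facilities from current
Python 3 (tempfile.TemporaryDirectory and dbm.dumb, the later name of
dumbdbm) and not with anything available in the 2.1 test suite.  The
helper name roundtrip is made up.

```python
import dbm.dumb
import os
import tempfile


def roundtrip(key, value):
    """Write one record to a dumb dbm file in a scratch dir and read it back."""
    with tempfile.TemporaryDirectory() as scratch:
        # The scratch directory is chosen per platform and removed on exit,
        # so no hard-coded /tmp path is needed.
        filename = os.path.join(scratch, "delete_me")
        db = dbm.dumb.open(filename, "c")
        db[key] = value
        db.close()

        db = dbm.dumb.open(filename, "r")
        try:
            return db[key]
        finally:
            db.close()


print(roundtrip(b"key", b"value"))
```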

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 00:56:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 18:56:04 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Wed, 17 Jan 2001 23:59:22 PST."
             <20010117235922.A12356@glacier.fnational.com> 
References: <20010117235922.A12356@glacier.fnational.com> 
Message-ID: <200101182356.SAA19616@cj20424-a.reston1.va.home.com>

Hi Neil,

My mail suffers delays of 12-24 hours while mail.python.org is working
on some enormous backlog.  So I just saw your message about a new
Makefile...

> Spurred on by comments made by Andrew, I spent some time last
> night overhauling the Python Makefiles.  I now have a toplevel
> non-recursive Makefile.in that seems to work fairly well.  I'm
> pretty sure it still should be portable.  It doesn't use includes
> or any special GNU make features.  It is half the size of the old
> Makefiles.  The build is faster and it's now easier to follow if
> something goes wrong.

I'd like to see this!

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
> would still only be one shared library.  This would speed up
> incremental builds and also help Andrew with PEP 229.  I'm
> thinking that the Makefile do something like this:
> 
>     all: python$(EXE)
> 
>     PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a
> 
>     python$(EXE): $(PYLIBS)
>         $(LINKCC) -o python$(EXE) $(PYLIBS) ...
> 
>     Modules/modules.a: minpython$(EXE)
>         ./minpython$(EXE) setup.py

Sounds cool to me.  (Where's the patch for a shared libpython???)

> AFAICT, the only thing affected by splitting up the static library
> is Misc/Makefile.pre.in.  Is this correct?

Yeah, and that should be phased out in favor of distutils anyway.  Now
would be a great time!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:34:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:34:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
Message-ID: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>

Through no fault of my own, email to guido at python.org (which includes
the python-dev list) is currently suffering delays of 12-24 hours.  I
have a feeling this is probably true for all mail going through
python.org, so checkin messages and python-dev discussion have been
greatly frustrated, with about 1 day to go until the planned 2.1a1
release date!

On top of that, the SourceForge bug manager has developed a problem:
all references to http://sourceforge.net/bugs/?group_id=5470/ come
back with this error:

  An error occured in the logger. ERROR: pg_atoi: error in "5470/":
  can't parse "/"

I'm still hoping to release Python 2.1a1 tomorrow, unless Jeremy tells
me that he needs more time for his nested scopes patch.

In the meantime, please everybody, do check out the latest CVS
version and give it a good workout!  Andrew's setup.py still has some
rough edges; I believe that in order to run it from the build
directory you still have to point PYTHONPATH to the build/lib*
directory, where he hides the shared libraries for all modules.
Andrew, are you planning to fix this?

If there's anything that you need me to know about, please mail to
guido at digicool.com -- that address suffers no delays.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Jan 19 01:51:19 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 19:51:19 -0500
Subject: [Python-Dev] RE: [Pycabal] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHGIJAA.tim.one@home.com>

[Guido notes current woes w/ python.org email, and SourceForge]

Note too that, over the past two days, it's not possible to follow
Python-Dev email via

http://mail.python.org/pipermail/python-dev/2001-January/date.html

either, as (unlike during previous occurrences of python.org email delays)
msgs aren't showing up there in a timely fashion either (for example, the
msg of Guido's to which I'm replying isn't there).

good-thing-guido's-so-easy-to-channel<wink>-ly y'rs  - tim




From guido at digicool.com  Fri Jan 19 01:52:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:52:02 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:23:21 EST."
             <20010118022321.A9021@thyrsus.com> 
References: <20010118022321.A9021@thyrsus.com> 
Message-ID: <200101190052.TAA26849@cj20424-a.reston1.va.home.com>

> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.  
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.

Argh!  hash([]) should raise TypeError, since lists are not hashable
objects -- mutable objects can't be allowed as dictionary keys.  This
(hash([])) accidentally returned a value for a brief period after I
checked in the rich comparisons -- I've fixed that now.

But not to worry: instead of using hash([]), you can use hex(id([])).
Same thing.

On the other hand, remember how much you can do in a millisecond!
(E.g. I can call tempfile.mktemp() 5 times in that time.)  And when
you create an object and immediately delete it, the next object
created is very likely to have the same address.

But what's wrong with this:

    try:
        from thread import get_ident as unique_id
    except ImportError:
        def unique_id(): return id([])
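
For comparison, a hedged sketch of a cookie generator that does not
depend on memory addresses at all, assuming a current Python 3 (where
the thread ident is available as threading.get_ident).  The names
list_is_hashable and unique_cookie are made up for illustration:

```python
import itertools
import threading


def list_is_hashable():
    """Report whether hash([]) succeeds; lists are mutable, so it should not."""
    try:
        hash([])
    except TypeError:
        return False
    return True


_counter = itertools.count()
_counter_lock = threading.Lock()


def unique_cookie():
    """Return a string that is unique within this process."""
    # The lock makes the increment safe in a threaded server; in an
    # unthreaded test wrapper it costs almost nothing.
    with _counter_lock:
        n = next(_counter)
    return "%x-%x" % (threading.get_ident(), n)
```

A counter avoids the address-reuse problem described above: an object
that is created and immediately deleted may hand its address to the next
object, but a counter value is never handed out twice.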

--Guido van Rossum (home page: http://www.python.org/~guido/)



From billtut at microsoft.com  Fri Jan 19 01:53:15 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Thu, 18 Jan 2001 16:53:15 -0800
Subject: [Python-Dev] MS CRT crashing:
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com>


From guido at digicool.com  Fri Jan 19 01:53:13 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:53:13 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: Your message of "Thu, 18 Jan 2001 07:48:41 +0100."
             <008701c0811a$b3371c00$e46940d5@hagrid> 
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>  
            <008701c0811a$b3371c00$e46940d5@hagrid> 
Message-ID: <200101190053.TAA26862@cj20424-a.reston1.va.home.com>

> I wrote:
> > I've almost sorted it all out.  will check it in later tonight (local
> > time).
> 
> python build problems and real life got in the way.

What?  You've got a real life?  Can't be allowed, not when we're
working on a release!

> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Please check it in, there's still time (2.1a1 won't go out before
Friday night, possibly it'll be delayed until Monday).

And yes, there will be a 2.1a2.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:55:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:55:15 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:39:30 +0100."
             <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> 
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> 
Message-ID: <200101190055.TAA26905@cj20424-a.reston1.va.home.com>

> The distutils-based configuration fails to build on my system (SuSE
> 7.0) with the error
> 
> /usr/src/python/Modules/socketmodule.c:159: rsa.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:160: crypto.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:161: x509.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:162: pem.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:163: ssl.h: Datei oder Verzeichnis nicht gefunden

The same happened to Fred on Mandrake 7.0 (except for the German
messages :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:58:16 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:58:16 -0500
Subject: [Python-Dev] Re: unistr() vs. unicode()
Message-ID: <200101190058.TAA26931@cj20424-a.reston1.va.home.com>

MAL's reply to Ping in this thread.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Thu, 18 Jan 2001 10:52:12 +0100
From:    "M.-A. Lemburg" <mal at lemburg.com>
To:      Ka-Ping Yee <ping at lfw.org>
cc:      guido at python.org, patches at python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin + PyObject_Unicode() C API

Ka-Ping Yee wrote:
> 
> On Wed, 17 Jan 2001 noreply at sourceforge.net wrote:
> > Comment:
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> 
> Sorry for barging in, but i have an issue/question:
> 
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

unistr() is meant to complement str() very closely. unicode()
works as constructor for Unicode objects which can also take
care of decoding encoded data. str() and unistr() don't provide
this capability but instead always assume the default encoding.

There's also a subtle difference in that str() and unistr() 
try the tp_str slot which unicode() doesn't. unicode()
supports any character buffer which str() and unistr() don't.

Perhaps you are right though in that we should make all three
APIs behave in the same way with respect to coercing their
arguments. This could hide some errors... still in the long
run, I agree that the existing setup probably causes more confusion
than good.

Guido ?

- -- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

_______________________________________________
Patches mailing list
Patches at python.org
http://mail.python.org/mailman/listinfo/patches

------- End of Forwarded Message




From guido at digicool.com  Fri Jan 19 02:04:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:04:22 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:51:46 +0100."
             <3A66CAC2.74FC894@lemburg.com> 
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>  
            <3A66CAC2.74FC894@lemburg.com> 
Message-ID: <200101190104.UAA27056@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > >     str() looks for __str__
> > 
> > Oops.  I forgot that
> > 
> >       str() looks for __str__, then tries __repr__
> > 
> > So, presumably,
> > 
> >       unicode() should look for __unicode__, then __str__, then __repr__
> 
> Not quite... str() does this:
> 
> 1. strings are passed back as-is
> 2. the type slot tp_str is tried
> 3. the method __str__ is tried
> 4. Unicode returns are converted to strings
> 5. anything other than a string return value is rejected
> 
> unistr() does the same, but makes sure that the return
> value is a Unicode object.
> 
> unicode() does the following:
> 
> 1. for instances, __str__ is called
> 2. Unicode objects are returned as-is
> 3. string objects or character buffers are used as basis for decoding
> 4. decoding is applied to the character buffer and the results
>    are returned
> 
> I think we should perhaps merge the two approaches into one
> which then applies all of the above in unicode() (and then
> forget about unistr()). This might hide some type errors,
> but since all other generic constructors behave more or less
> in the same way, I think unicode() should too.

Yes, I would like to see these merged.  I noticed that e.g. there is
special code to compare Unicode strings in the comparison code (I
think I *could* get rid of this now we have rich comparisons, but I
decided to put that off), and when I looked at it it uses the same set
of conversions as unicode().  Some of these seem questionable to me --
why do you try so many ways to get a string out of an object?  (On the
other hand the merge of unicode() and unistr() might have this effect
anyway...)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at digicool.com  Fri Jan 19 02:06:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:06:23 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:39:54 +0100."
             <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de> 
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de> 
Message-ID: <200101190106.UAA27073@cj20424-a.reston1.va.home.com>

> I think there should be a well-formedness pass in-between. I.e. after
> the AST has been built, a single pass should descend through the tree,
> looking for an expr_statement with more than a single testlist. Once
> it finds one, it should confirm that this really is a well-formed
> lvalue (in C speak). In this case, the test should be that each term
> is an atom without factors.

Good idea.

> If the parser itself performs such checks, the compiler could be
> simplified in many places, I guess.

Not sure that in practice it makes much of a difference: there aren't
that many of these kinds of checks, and writing a separate pass is
expensive.  On the other hand, Jeremy is just writing a separate pass
anyway, to collect name usage information for the nested scopes.
Maybe it could be folded into that pass...
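
To make the idea of a separate well-formedness pass concrete, here is a
rough sketch using the ast module of a current Python 3.  It is not the
compiler's actual code: the function names are invented, and today's
parser already rejects bad targets on its own, so the malformed tree has
to be built by hand:

```python
import ast

# Node types that may legally appear on the left of an assignment.
_SIMPLE_TARGETS = (ast.Name, ast.Attribute, ast.Subscript)


def is_valid_target(node):
    """Return True if node is a well-formed assignment target."""
    if isinstance(node, (ast.Tuple, ast.List)):
        return all(is_valid_target(elt) for elt in node.elts)
    if isinstance(node, ast.Starred):
        return is_valid_target(node.value)
    return isinstance(node, _SIMPLE_TARGETS)


def find_bad_targets(tree):
    """Walk the whole tree once and collect malformed assignment targets."""
    bad = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            bad.extend(t for t in node.targets if not is_valid_target(t))
    return bad
```

Because the check is a plain tree walk, it could share a traversal with
another pass (such as one collecting name usage) instead of costing a
pass of its own.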

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Fri Jan 19 04:20:08 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:20:08 -0500 (EST)
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
Message-ID: <14951.45672.806978.600944@localhost.localdomain>

There are several modules in the standard library that use the regex
module.  When they are imported, they print a warning about using a
deprecated module.  I think this is bad form.  Either the modules that
depend on regex should be updated to use re or they should be
deprecated themselves.  

I discovered the following offenders:
asynchat
knee
poplib
reconvert

I would suggest fixing asynchat and poplib and deprecating knee.  The
reconvert module may be a special case.

Jeremy



From jeremy at alum.mit.edu  Fri Jan 19 04:31:02 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:31:02 -0500 (EST)
Subject: [Python-Dev] setup.py and build subdirectories
Message-ID: <14951.46326.743921.988828@localhost.localdomain>

I have a bunch of build directories under the source tree, e.g.
src/python/dist/src/build
src/python/dist/src/build-pg
src/python/dist/src/build-O3
...

The new setup.py did not successfully build in these directories.  I
hacked distutils a tiny bit and had some success.  Patch below.  I'm
not sure if the approach is kosher, but it allows me to build
successfully.

I also have a problem running 'make test' from these build
directories.  The reference to the distutils build directory has '..'
prepended to it that shouldn't exist.

Jeremy


Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.8
diff -c -r1.8 setup.py
*** setup.py	2001/01/18 20:39:34	1.8
--- setup.py	2001/01/19 03:26:55
***************
*** 536,540 ****
            
  # --install-platlib
  if __name__ == '__main__':
!     sysconfig.set_python_build()
      main()
--- 536,541 ----
            
  # --install-platlib
  if __name__ == '__main__':
!     path, file = os.path.split(sys.argv[0])
!     sysconfig.set_python_build(path)
      main()
Index: Lib/distutils/sysconfig.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/sysconfig.py,v
retrieving revision 1.31
diff -c -r1.31 sysconfig.py
*** Lib/distutils/sysconfig.py	2001/01/17 15:16:52	1.31
--- Lib/distutils/sysconfig.py	2001/01/19 03:27:01
***************
*** 24,37 ****
  
  python_build = 0
  
! def set_python_build():
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = 1
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
--- 24,37 ----
  
  python_build = 0
  
! def set_python_build(loc):
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = loc + "/"
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
***************
*** 48,54 ****
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
--- 48,54 ----
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return python_build + "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
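
A small sketch of the same idea written with os.path.join instead of
string concatenation, assuming a current Python 3.  The helper name
include_dir is hypothetical and is not part of distutils; the sketch only
shows how the include directory can be derived from the script location:

```python
import os


def include_dir(script_path):
    """Return the Include directory next to the given setup.py path."""
    src_dir = os.path.dirname(script_path)
    # os.path.join drops an empty first component, so a bare "setup.py"
    # yields the relative path "Include" rather than an absolute one.
    return os.path.join(src_dir, "Include")
```

One thing worth checking in the concatenation version is the case where
the script is invoked as plain "setup.py": the split then gives an empty
directory, and appending "/" to it produces a path rooted at "/".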





From tim.one at home.com  Fri Jan 19 04:46:16 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 22:46:16 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr() 
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIBIJAA.tim.one@home.com>

[attribution lost]
> There is no built-in dict(); if it existed i suppose it would do
> the opposite of x.items(); again a weak argument, though i might
> have found such a function useful once or twice.

[Guido]
> Yeah, it's not very common.  Dict comprehensions anyone?
>
>    d = {k:v for k,v in zip(range(10), range(10))}    # :-)

It's very common in Perl code, but is in no sense the inverse of .items()
there:   when you build a dict from a list L in Perl, it acts like Python

   {L[0]: L[1],
    L[2]: L[3],
    L[4]: L[5],
    ...
   }

That's what seems most practical most often; e.g., when crunching over text
files with records of the form

    key value

(e.g., mail headers are of this form; simple contact databases; to-do lists
segregated by date; etc), whatever fancy re.split() is used to break things
apart naturally returns a flat list.  A list of two-tuples is natural only
if it was obtained from another dict's .items() <0.9 wink>.
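
A short sketch of the Perl-style construction in a current Python 3,
which does have a dict() builtin.  The function name pairs_to_dict is
just for illustration:

```python
def pairs_to_dict(flat):
    """Build {flat[0]: flat[1], flat[2]: flat[3], ...} from a flat list."""
    if len(flat) % 2:
        raise ValueError("flat list must hold an even number of items")
    # Even-indexed items are the keys, odd-indexed items the values.
    return dict(zip(flat[0::2], flat[1::2]))
```

For example, pairs_to_dict("From tim Subject hello".split()) gives
{'From': 'tim', 'Subject': 'hello'}.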

pushing-the-limits-of-"practicality-beats-purity"?-ly y'rs  - tim




From tim.one at home.com  Fri Jan 19 07:00:27 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:00:27 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIGIJAA.tim.one@home.com>

test test_urllib crashed 
    -- exceptions.AssertionError: urllib.quote problem




From tim.one at home.com  Fri Jan 19 07:39:30 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:39:30 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIIIJAA.tim.one@home.com>

[some MS internal support group]
> Turns out the C standard explicitly says you can't have an input
> follow output on a stream without doing fflush or fseek in-between,
> to make sure the stdio buffer is cleared.  So this program is illegal.

It's undefined (there are no "illegal" programs -- that word doesn't appear
in the std; "undefined" does and has a precise technical meaning).

In the presence of threads-- which the C std doesn't mention --you have to
address issues the std doesn't touch.  To date, MS's is the only C runtime
we've seen that corrupts itself in this situation.  It can do anything it
likes short of blowing up and still be considered a good threaded
implementation.  As is, it has to be considered sub-standard, in the
ordinary sense of displaying worse behavior than other threaded C stdio
implementations.  It falls short there on other counts too (like the lack of
getc_unlocked() & friends), but internal corruption is a particularly
egregious failing.

and-that's-the-end-of-it-for-me-ly y'rs  - tim




From mwh21 at cam.ac.uk  Fri Jan 19 09:31:18 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 08:31:18 +0000
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Thomas Wouters's message of "Fri, 19 Jan 2001 00:02:09 +0100"
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <m3vgrc0wq1.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas at xs4all.net> writes:

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ? If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's in Modules/getpath.c

Cheers,
M.

-- 
  I really hope there's a catastrophic bug in some future e-mail
  program where if you try and send an attachment it cancels your
  ISP account, deletes your harddrive, and pisses in your coffee
                                                         -- Adam Rixey




From gstein at lyra.org  Fri Jan 19 09:38:54 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 00:38:54 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 04:28:10PM -0800
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119003854.F7731@lyra.org>

On Thu, Jan 18, 2001 at 04:28:10PM -0800, Guido van Rossum wrote:
>...
>   PyTypeObject PyCursesWindow_Type = {
> ! 	PyObject_HEAD_INIT(NULL)
>   	0,			/*ob_size*/
>   	"curses window",	/*tp_name*/
>...
> --- 2432,2443 ----
>   /* Initialization function for the module */
>   
> ! DL_EXPORT(void)
>   init_curses(void)
>   {
>   	PyObject *m, *d, *v, *c_api_object;
>   	static void *PyCurses_API[PyCurses_API_pointers];
> + 
> + 	/* Initialize object type */
> + 	PyCursesWindow_Type.ob_type = &PyType_Type;
>   
>   	/* Initialize the C API pointer array */


I've never truly understood this. Is it because Windows cannot initialize
(at load-time) a pointer to a data structure that is located in a different
DLL?

It is a bit painful to keep moving inits from load-time to run-time.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Fri Jan 19 10:01:22 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 04:01:22 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEINIJAA.tim.one@home.com>

Bet it was failing everywhere; it's fixed now.




From moshez at zadka.site.co.il  Fri Jan 19 18:53:36 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 19:53:36 +0200 (IST)
Subject: [Python-Dev] Dbm failure
Message-ID: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>

test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey

Did it happen to anyone else? 
Anything else you need to know?

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From mal at lemburg.com  Fri Jan 19 10:58:08 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 10:58:08 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. 
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>  
	            <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>
Message-ID: <3A680FB0.AED2DB55@lemburg.com>

Guido van Rossum wrote:
> 
> > Ka-Ping Yee wrote:
> > >
> > > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > > >     str() looks for __str__
> > >
> > > Oops.  I forgot that
> > >
> > >       str() looks for __str__, then tries __repr__
> > >
> > > So, presumably,
> > >
> > >       unicode() should look for __unicode__, then __str__, then __repr__
> >
> > Not quite... str() does this:
> >
> > 1. strings are passed back as-is
> > 2. the type slot tp_str is tried
> > 3. the method __str__ is tried
> > 4. Unicode returns are converted to strings
> > 5. anything other than a string return value is rejected
> >
> > unistr() does the same, but makes sure that the return
> > value is a Unicode object.
> >
> > unicode() does the following:
> >
> > 1. for instances, __str__ is called
> > 2. Unicode objects are returned as-is
> > 3. string objects or character buffers are used as basis for decoding
> > 4. decoding is applied to the character buffer and the results
> >    are returned
> >
> > I think we should perhaps merge the two approaches into one
> > which then applies all of the above in unicode() (and then
> > forget about unistr()). This might hide some type errors,
> > but since all other generic constructors behave more or less
> > in the same way, I think unicode() should too.
> 
> Yes, I would like to see these merged.  I noticed that e.g. there is
> special code to compare Unicode strings in the comparison code (I
> think I *could* get rid of this now we have rich comparisons, but I
> decided to put that off), and when I looked at it it uses the same set
> of conversions as unicode().  Some of these seem questionable to me --
> why do you try so many ways to get a string out of an object?  (On the
> other hand the merge of unicode() and unistr() might have this effect
> anyway...)

... because there are so many ways to get at string
representations of objects in Python at C level.

If we agree to merge the semantics of the two APIs, then str()
would have to change too: is this desirable ? (IMHO, yes)

Here's what we could do:

a) merge the semantics of unistr() into unicode()
b) apply the same semantics in str()
c) remove unistr() -- how's that for a short-living builtin ;)

About the semantics:

These should be backward compatible to str() in that everything
that worked before should continue to work after the merge.

A strawman for processing str() and unicode():

1. strings/Unicode is passed back as-is
2. tp_str is tried
3. the method __str__ is tried
4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
5. for str(): Unicode return values are converted to strings using
              the default encoding
   for unicode(): Unicode return values are passed back as-is;
              string return values are decoded according to the
              encoding parameter
6. the return object is type-checked: str() will always return
   a string object, unicode() always a Unicode object

Note that passing back Unicode is only allowed in case no encoding
was given. Otherwise an exception is raised: you can't decode
Unicode.
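
As a loose illustration only: in a current Python 3 the single text
constructor str() shows behavior along these lines, with bytes playing
the role of the old string type.  The sketch below just observes the
builtin; the Point class and the describe() helper are made up for the
example and are not part of the strawman:

```python
class Point:
    """Tiny class whose __str__ supplies its text form."""

    def __str__(self):
        return "Point(1, 2)"


def describe():
    """Collect what str() does for each kind of argument."""
    results = {}
    # Text is passed back as-is.
    results["as_is"] = str("abc")
    # Other objects go through __str__.
    results["via_str_method"] = str(Point())
    # Encoded data is decoded when an encoding is given.
    results["decoded"] = str(b"caf\xc3\xa9", "utf-8")
    # Already-decoded text cannot be decoded again.
    try:
        str("abc", "utf-8")
    except TypeError:
        results["decode_text"] = "TypeError"
    return results
```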

As an extension, we could add encoding and error parameters to str()
as well. The result would be either an encoding of Unicode objects
passed back by tp_str or __str__ or a recoding of string objects
returned by checks 2, 3 or 4.

If we agree to take this approach, then we should remove the
unistr() Python API before the alpha ships.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Fri Jan 19 11:19:06 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 11:19:06 +0100
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net> <20010119003854.F7731@lyra.org>
Message-ID: <010c01c08201$4b0ec050$e46940d5@hagrid>

greg wrote:
> I've never truly understood this. Is it because Windows cannot initialize
> (at load-time) a pointer to a data structure that is located in a different
> DLL?

Windows can do it (via DLL initialization code), but the compiler
doesn't generate initialization code for C programs.

you can compile the module as C++, but that's also a bit painful...

</F>




From jack at oratrix.nl  Fri Jan 19 12:02:00 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 19 Jan 2001 12:02:00 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
Message-ID: <20010119110200.9E455373C95@snelboot.oratrix.nl>

I get the impression that I'm currently seeing a non-NULL third argument in my 
(C) methods even though the method is called without keyword arguments.

Is this new semantics that I missed the discussion about, or is this a bug?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++





From thomas at xs4all.net  Fri Jan 19 13:22:06 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:22:06 +0100
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: <14951.45672.806978.600944@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 18, 2001 at 10:20:08PM -0500
References: <14951.45672.806978.600944@localhost.localdomain>
Message-ID: <20010119132206.H17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:20:08PM -0500, Jeremy Hylton wrote:

> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Can't reconvert just disable the warning before importing regex ? That would
seem the sane thing to do, at least to me.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Jan 19 13:26:31 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:26:31 +0100
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 18, 2001 at 07:34:02PM -0500
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <20010119132631.I17392@xs4all.nl>

On Thu, Jan 18, 2001 at 07:34:02PM -0500, Guido van Rossum wrote:

> Through no fault of my own, email to guido at python.org (which includes
> the python-dev list) is currently suffering delays of 12-24 hours.  I
> have a feeling this is probably true for all mail going through
> python.org, so checkin messages and python-dev discussion have been
> greatly frustrated, with about 1 day to go until the planned 2.1a1
> release date!

I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
talked with Barry about it, too. It looks like it's clearing up a bit, now,
but it's confusing as hell, for sure ;)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Jan 19 13:33:47 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:33:47 +0100
Subject: [Python-Dev] Dbm failure
In-Reply-To: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 19, 2001 at 07:53:36PM +0200
References: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010119133347.J17392@xs4all.nl>

On Fri, Jan 19, 2001 at 07:53:36PM +0200, Moshe Zadka wrote:
> test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey
> Did it happen to anyone else? 

Yes, to me. You're suffering from the same thing I did: GNU sucks. Okay,
okay, not as much as MS products or most other UNIX software, but still ;)
The problem is a conflict between gdbm and glibc.

gdbm (1.7.3, which is what woody currently carries, not sure why it isn't
updated) offers a dbm interface/replacement, which includes a libdbm.(so|a)
and /usr/include/gdbm-ndbm.h. Glibc (or at least the debian package) *also*
offers a dbm interface/replacement, which consists of libdb1.(so|a) and
/usr/include/db1/ndbm.h (which needs /usr/include/db1/*.h). If you add
/usr/include/db1 to your include path, and -ldbm to the dbmmodule, you end
up with the wrong versions. You need either to include /usr/include/db1 in
your includepath and use -ldb1, or fix up dbmmodule.c so it includes
gdbm-ndbm.h and uses -ldbm.
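
The two consistent pairings described above can be written down as a small
table. This is only a sketch: DBM_VARIANTS and check_combo are hypothetical
names made up for illustration, and the paths are the Debian ones quoted in
this message, not something a build script is known to contain.

```python
# Hypothetical table of header/library pairings, taken from the message above.
DBM_VARIANTS = {
    "glibc-db1": {
        "include_dir": "/usr/include/db1",
        "header": "ndbm.h",
        "library": "db1",
    },
    "gdbm": {
        "include_dir": "/usr/include",
        "header": "gdbm-ndbm.h",
        "library": "dbm",
    },
}


def check_combo(header, library):
    """Return the variant name if header and library belong together, else None."""
    for name, variant in DBM_VARIANTS.items():
        if variant["header"] == header and variant["library"] == library:
            return name
    return None


print(check_combo("ndbm.h", "db1"))       # glibc-db1
print(check_combo("gdbm-ndbm.h", "dbm"))  # gdbm
print(check_combo("ndbm.h", "dbm"))       # None: the mismatched pairing
```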

I only figured this out yesterday, and sent Andrew a mail about that... I'm
not sure what the Right(tm) way to fix this is :( I've always loathed these
library/version mismatches :P

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Fri Jan 19 14:07:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 14:07:00 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de> <20010118135640.G21503@kronos.cnri.reston.va.us>
Message-ID: <3A683BF4.BD74A979@lemburg.com>

Andrew Kuchling wrote:
> 
> On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
> >On Unix, there appears to be no standard location, unless the
> >documentation consists of man pages or perhaps info files. So
> ><prefix>/share/doc is probably a place as good as any other.
> 
> This seems like a good suggestion.  Should docs go in
> <prefix>/share/doc/python<version>/, then?  Perhaps with
> subdirectories for different extensions?

Hmm, I guess it's better to follow bdist_rpm here: put
the docs into a subdir under .../doc/ using the package
name and version.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Fri Jan 19 15:39:13 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:39:13 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <20010119110200.9E455373C95@snelboot.oratrix.nl>
References: <20010119110200.9E455373C95@snelboot.oratrix.nl>
Message-ID: <14952.20881.848489.869512@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack at oratrix.nl> writes:

  JJ> I get the impression that I'm currently seeing a non-NULL third
  JJ> argument in my (C) methods even though the method is called
  JJ> without keyword arguments.

  JJ> Is this new semantics that I missed the discussion about, or is
  JJ> this a bug? 

This is a bug in the changes I made to the call function
implementation.  I wasn't sure what was supposed to happen to a
function that expected a kw argument but was called without one.  I
thought I saw some crashes when I passed NULL, so I changed the
implementation to pass an empty dictionary.

(Is the correct behavior documented anywhere?)

If a NULL value is correct, I'll update the implementation and see if
I can rediscover those crashes.
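
A rough Python analogue of the two conventions, assuming None stands in for
the C-level NULL. The function names are hypothetical; the point is only that
a callee written one way cannot tell the two cases apart, while a callee
written the other way can.

```python
def tolerant(args, kwds=None):
    # Treats "no dictionary" and "empty dictionary" the same way.
    if not kwds:
        return "no keywords"
    return "keywords: " + ", ".join(sorted(kwds))


def strict(args, kwds=None):
    # Distinguishes a missing dictionary from an empty one,
    # much as a C method testing the third argument against NULL would.
    if kwds is None:
        return "no keywords"
    return "got a dictionary with %d entries" % len(kwds)


print(tolerant((), None), "/", tolerant((), {}))
print(strict((), None), "/", strict((), {}))
```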

Jeremy



From nas at arctrix.com  Fri Jan 19 08:39:50 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 23:39:50 -0800
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119000209.F17392@xs4all.nl>; from thomas@xs4all.net on Fri, Jan 19, 2001 at 12:02:09AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <20010118233950.A15636@glacier.fnational.com>

On Fri, Jan 19, 2001 at 12:02:09AM +0100, Thomas Wouters wrote:
> I can't find any such hackery in the source, but I also can't
> figure out how else it's working :)

I think you want to look at getpath.c.

  Neil



From jeremy at alum.mit.edu  Fri Jan 19 15:44:50 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:44:50 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.107,2.108
In-Reply-To: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
References: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14952.21218.416551.695660@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <gvanrossum at users.sourceforge.net> writes:

  GvR> Log Message: Changes to recursive-object comparisons, having to
  GvR> do with a test case I found where rich comparison of unequal
  GvR> recursive objects gave unintuitive results.  In a discussion
  GvR> with Tim, where we discovered that our intuition on when a<=b
  GvR> should be true was failing, we decided to outlaw ordering
  GvR> comparisons on recursive objects.  (Once we have fixed our
  GvR> intuition and designed a matching algorithm that's practical
  GvR> and reasonable to implement, we can allow such orderings
  GvR> again.)

Sounds sensible to me!  I was quite puzzled about what <= should
return for recursive objects.

  GvR> - Changed the nesting limit to a more reasonable small 20; this
  GvR>   only slows down comparisons of very deeply nested objects
  GvR>   (unlikely to occur in practice), while speeding up
  GvR>   comparisons of recursive objects (previously, this would
  GvR>   first waste time and space on 500 nested comparisons before
  GvR>   it would start detecting recursion).

After we talked through this code yesterday, I was also thinking that
the limit was too high :-).
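
To make the nesting-limit idea concrete, here is a simplified sketch in modern
Python. It is not the object.c code: the equal() helper, its restriction to
lists, and the choice to treat an already-in-progress pair as equal are all
assumptions made for illustration. Only the limit of 20 comes from the log
message.

```python
NESTING_LIMIT = 20  # the value mentioned in the checkin message


def equal(a, b, _depth=0, _in_progress=None):
    """Equality on nested lists that survives self-referencing lists."""
    if _in_progress is None:
        _in_progress = set()
    if not (isinstance(a, list) and isinstance(b, list)):
        return a == b
    if len(a) != len(b):
        return False
    key = (id(a), id(b))
    # Only pay for bookkeeping once the comparison is suspiciously deep.
    track = _depth >= NESTING_LIMIT
    if track:
        if key in _in_progress:
            return True  # already comparing this pair: assume equal
        _in_progress.add(key)
    try:
        return all(equal(x, y, _depth + 1, _in_progress)
                   for x, y in zip(a, b))
    finally:
        if track:
            _in_progress.discard(key)


a = []
a.append(a)
b = []
b.append(b)
print(equal(a, b))                        # True
print(equal([1, [2, [3]]], [1, [2, [4]]]))  # False
```

With a lower limit the recursion is detected sooner, which is the speed-up
described in the log message; shallow, non-recursive objects never reach the
bookkeeping at all.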

Jeremy



From guido at digicool.com  Fri Jan 19 16:49:54 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:49:54 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Thu, 18 Jan 2001 18:56:04 EST."
             <200101182356.SAA19616@cj20424-a.reston1.va.home.com> 
References: <20010117235922.A12356@glacier.fnational.com>  
            <200101182356.SAA19616@cj20424-a.reston1.va.home.com> 
Message-ID: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>

[Neil]
> > A question: is it possible to break the Python static library up?

[me]
> Sounds cool to me.

Of course after Martin's response I agree with him -- let's keep it
one library.  (Although I expect that the combined effect of setup.py
and Neil's flat Makefile will still affect the infrastructure to build
extensions... :-( )

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 16:56:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:56:58 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: Your message of "Thu, 18 Jan 2001 16:53:15 PST."
             <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com> 
References: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com> 
Message-ID: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>

Bill Tutt writes:
> From the internal support squad:
> Turns out the C standard explicitly says you can't have an input follow
> output on a stream without doing fflush or fseek in-between, to make sure
> the stdio buffer is cleared.  So this program is illegal.
> 
> They've gone and resolved it by design.

I'd just like to note for the record that this is exactly what I had
predicted.

I'd also like to note that I *agree*.  Tim seems to think there's a
race condition in the threading code, but it's really much simpler
than that: the same bug can easily be provoked with a single-threaded
program: just randomly read and write alternatingly.  So obviously the
people who wrote the threading code aren't interested in the bug,
because it's not in their code -- and the people who wrote the code
that doesn't behave well when abused are protected by the C standard...
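
The discipline the standard asks for (a flush or a seek between output and
input on the same stream) looks like this in a modern Python sketch. Note the
assumptions: Python 3's io layer does its own buffering rather than using C
stdio, so this only illustrates the pattern and does not reproduce the crash;
write_then_read is a hypothetical helper name.

```python
import os
import tempfile


def write_then_read(payload):
    """Write payload to an update-mode file, reposition, then read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w+b") as f:
            f.write(payload)
            f.seek(0)        # reposition between the write and the read
            return f.read()
    finally:
        os.remove(path)


print(write_then_read(b"hello"))  # b'hello'
```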

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:00:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:00:30 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:39:18 +0100."
             <3A676286.C33823B4@tismer.com> 
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>  
            <3A676286.C33823B4@tismer.com> 
Message-ID: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>

> Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> Like the other right-side operators __radd__, is it correct to
> think of
> 
>    __ge__  == __rle__
> 
> if __rle__ was written in the same fashion like __radd__ ?
> It looks semantically the same, although the reason for a
> call might be different.

Yes, it's semantically the same, and the reason for the call is the
same too ("the left argument doesn't support the operator so let's try
if the right one knows").

> And if my above view is right, would it perhaps be less
> confusing to use in fact __rle__ and __rlt__,
> or would it be more confusing, since __rlt__ would also be
> invoked left-to-right, implementing ">".

I prefer 6 new operators over 12 any day.  I can see no valid reason
why someone would want to overload a>b different than b<a, while
there are plenty of reasons why a+b and b+a should be different:
e.g. string concatenation.
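
A small sketch of the reflection rule, written for present-day Python. The
Version class is made up for illustration. It defines only __lt__, and the
interpreter still answers b > a by calling a.__lt__(b) with the arguments
swapped.

```python
class Version:
    """Hypothetical example class that defines only __lt__."""

    def __init__(self, n):
        self.n = n

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.n < other.n


a, b = Version(1), Version(2)
print(a < b)   # True, direct call to a.__lt__(b)
print(b > a)   # True, b has no __gt__ of its own, so a.__lt__(b) is tried

# Addition is different: swapping the operands changes the answer.
print("ab" + "cd" == "cd" + "ab")  # False
```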

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Fri Jan 19 17:14:55 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 19 Jan 2001 11:14:55 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 10:49:54AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com>
Message-ID: <20010119111455.C25056@kronos.cnri.reston.va.us>

On Fri, Jan 19, 2001 at 10:49:54AM -0500, Guido van Rossum wrote:
>Of course after Martin's response I agree with him -- let's keep it
>one library.  (Although I expect that the combined effect of setup.py
>and Neil's flat Makefile will still affect the infrastructure to build
>extensions... :-( )

Which reminds me... there should really be a way to ignore the
setup.py stuff and use the old method.  How should that be done?  A
--use-makesetup flag to configure, maybe?

--amk




From guido at digicool.com  Fri Jan 19 17:14:20 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:14:20 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Thu, 18 Jan 2001 21:59:23 PST."
             <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> 
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101191614.LAA28881@cj20424-a.reston1.va.home.com>

>       if not condition:
> !         raise AssertionError(reason)

Wouldn't it be better if this raised TestFailed rather than
AssertionError?  Or is there code that catches the AssertionError?

[...grep...]

Yes, there's code that catches AssertionError:

(1) in Marc-Andre's own test_unicode.py;

(2) in test_re, which catches AssertionError and raises TestFailed
    instead.

Proposal:

(1) change verify() to raise TestFailed;

(2) change test_unicode.py to catch TestFailed instead.
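
A minimal sketch of what proposal (1) amounts to. The default reason string
and the exact class body are assumptions, not a copy of test_support.py.

```python
class TestFailed(Exception):
    """Raised when a regression test fails."""


def verify(condition, reason="test failed"):
    """Raise TestFailed (not AssertionError) when condition is false."""
    if not condition:
        raise TestFailed(reason)


try:
    verify(1 + 1 == 3, "arithmetic is broken")
except TestFailed as exc:
    print("caught:", exc)   # caught: arithmetic is broken
```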

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tismer at tismer.com  Fri Jan 19 17:17:06 2001
From: tismer at tismer.com (Christian Tismer)
Date: Fri, 19 Jan 2001 17:17:06 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>  
	            <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <3A686882.F78C1268@tismer.com>


Guido van Rossum wrote:
> 
> > Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> > Like the other right-side operators __radd__, is it correct to
> > think of
> >
> >    __ge__  == __rle__
> >
> > if __rle__ was written in the same fashion like __radd__ ?
> > It looks semantically the same, although the reason for a
> > call might be different.
> 
> Yes, it's semantically the same, and the reason for the call is the
> same too ("the left argument doesn't support the operator so let's try
> if the right one knows").
> 
> > And if my above view is right, would it perhaps be less
> > confusing to use in fact __rle__ and __rlt__,
> > or would it be more confusing, since __rlt__ would also be
> > invoked left-to-right, implementing ">".
> 
> I prefer 6 new operators over 12 any day.  I can see no valid reason
> why someone would want to overload a>b different than b<a, while
> there are plenty of reasons why a+b and b+a should be different:
> e.g. string concatenation.

Sure, I didn't want to introduce new operators, but use the
"r" versions for three of the six new operators. But I should have
read your proposal before. The confusion is not due to you,
but Skip had a read error, since you don't talk about inverses
at all:

Skip=="""
In the description
he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
"""

Truth=="""
There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).
"""

No reason for confusion at all > python-dev/null - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From thomas at xs4all.net  Fri Jan 19 17:20:56 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:20:56 +0100
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <20010119172056.K17392@xs4all.nl>

I'm currently seeing a failure in test_ucn:

test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
error: Illegal Unicode character

It looks like one of the unicode literals in test_ucn is invalid, but it's
damned hard to pin down which:

Python 2.1a1 (#7, Jan 19 2001, 17:06:32) 
[GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import test.test_ucn
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: Unicode-Escape decoding error: Illegal Unicode character
>>> 

I get the same crashes on FreeBSD and (Debian) Linux.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 17:26:34 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:26:34 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:02:09 +0100."
             <20010119000209.F17392@xs4all.nl> 
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>  
            <20010119000209.F17392@xs4all.nl> 
Message-ID: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ?

Look at the logic in Modules/getpath.c, which calculates the initial
(default) sys.path.  It detects that it's running from the build tree
and then modifies the default path a bit to include Lib and Modules
relative to where the python executable was found.
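
The idea can be sketched in a few lines of Python. This is not a translation
of getpath.c: the build_tree_path helper is hypothetical, and using
Modules/Setup as the landmark file that marks a build tree is an assumption
made for the example.

```python
import os


def build_tree_path(exe_dir, landmark=os.path.join("Modules", "Setup")):
    """Return extra sys.path entries if exe_dir looks like a build tree."""
    if os.path.isfile(os.path.join(exe_dir, landmark)):
        # Running from the build tree: use Lib and Modules next to the binary.
        return [os.path.join(exe_dir, "Lib"),
                os.path.join(exe_dir, "Modules")]
    # Installed interpreter: fall back to the normal defaults (not shown).
    return []
```

Called with the directory holding the freshly built executable, it returns the
two source-tree directories; called with an installed location, it returns an
empty list.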

> If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's not just for 'make test' -- it's to make life easy for developers
in general (and me in particular :-) who want to try out their hacks
without going through 'make install'.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 19 17:34:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 17:34:58 +0100
Subject: [Python-Dev] Re: test_support.py
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>
Message-ID: <3A686CB2.C75D184D@lemburg.com>

Guido van Rossum wrote:
> 
> >       if not condition:
> > !         raise AssertionError(reason)
> 
> Wouldn't it be better if this raised TestFailed rather than
> AssertionError?  Or is there code that catches the AssertionError?
> 
> [...grep...]
> 
> Yes, there's code that catches AssertionError:
> 
> (1) in Marc-Andre's own test_unicode.py;
> 
> (2) in test_re, which catches AssertionError and raises TestFailed
>     instead.
> 
> Proposal:
> 
> (1) change verify() to raise TestFailed;
> 
> (2) change test_unicode.py to catch TestFailed instead.

+1

Why not simply make TestFailed a subclass of AssertionError ?
Then we wouldn't have to fear about breaking test code...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Fri Jan 19 17:34:15 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:34:15 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 11:26:34AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>
Message-ID: <20010119173415.M17295@xs4all.nl>

On Fri, Jan 19, 2001 at 11:26:34AM -0500, Guido van Rossum wrote:
> > This brings me to another point: how can 'make test' work at all ? Does
> > python always check for './Lib' (and './Modules') for modules ?

> Look at the logic in Modules/getpath.c, which calculates the initial
> (default) sys.path.  It detects that it's running from the build tree
> and then modifies the default path a bit to include Lib and Modules
> relative to where the python executable was found.

Aye, I found it now.

> > If that's
> > specific for 'make test' and running python in the source distribution, that
> > sounds like a bit of a weird hack. I can't find any such hackery in the
> > source, but I also can't figure out how else it's working :)

> It's not just for 'make test' -- it's to make life easy for developers
> in general (and me in particular :-) who want to try out their hacks
> without going through 'make install'.

Well, after some old SF movies & some sleep, I realized that :) But it is
going to have to change: you now have to include the build tree as well, and
that is quite a bit more difficult to figure out. I'd suggest a 'make run'
that calls python with the appropriate PYTHONPATH environment variable, but
that doesn't cover test-scripts (which I use a lot myself.)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 17:34:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:34:45 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:02:00 +0100."
             <20010119110200.9E455373C95@snelboot.oratrix.nl> 
References: <20010119110200.9E455373C95@snelboot.oratrix.nl> 
Message-ID: <200101191634.LAA29239@cj20424-a.reston1.va.home.com>

> I get the impression that I'm currently seeing a non-NULL third
> argument in my (C) methods even though the method is called without
> keyword arguments.

> Is this new semantics that I missed the discussion about, or is this a bug?

Can't tell without spending more time looking at the code and
experimenting than I can afford today; but Jeremy refactored the
calling code, and it could be that you're seeing an empty dictionary
instead of a NULL.

Do you really need the NULL?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:41:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:41:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: Your message of "Fri, 19 Jan 2001 13:26:31 +0100."
             <20010119132631.I17392@xs4all.nl> 
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>  
            <20010119132631.I17392@xs4all.nl> 
Message-ID: <200101191641.LAA29324@cj20424-a.reston1.va.home.com>

> I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
> talked with Barry about it, too. It looks like it's clearing up a bit, now,
> but it's confusing as hell, for sure ;)

It's worse for me though than for most people: for others, only mail
sent through mailman at mail.python.org is affected.  For me, mail
sent directly to guido at python.org is affected too (which is why I've
changed my From address again to that old standby,
guido at digicool.com).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:53:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:53:39 -0500
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:20:08 EST."
             <14951.45672.806978.600944@localhost.localdomain> 
References: <14951.45672.806978.600944@localhost.localdomain> 
Message-ID: <200101191653.LAA29774@cj20424-a.reston1.va.home.com>

> There are several modules in the standard library that use the regex
> module.  When they are imported, they print a warning about using a
> deprecated module.  I think this is bad form.  Either the modules that
> depend on regex should be updated to use re or they should be
> deprecated themselves.  
> 
> I discovered the following offenders:
> asynchat
> knee
> poplib
> reconvert
> 
> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Agreed.  There's an idiom to disable the warning, which you can find
in regsub.py:

    import warnings
    warnings.filterwarnings("ignore", "", DeprecationWarning, __name__)

(The "" should be replaced by the specific warning message though.)
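
Here is a self-contained sketch of the idiom with a specific message filled
in, runnable on a modern Python. The old_api function and its warning text are
stand-ins invented for the example (not the real regex module's message), and
the module argument is left out to keep the sketch independent of where it is
run.

```python
import warnings


def old_api():
    # Stand-in for importing a deprecated module.
    warnings.warn("the regex module is deprecated; please use re",
                  DeprecationWarning)


def count_warnings(install_filter):
    """Return how many warnings old_api() emits, with or without the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if install_filter:
            # The message argument is a regular expression matched against
            # the start of the warning text.
            warnings.filterwarnings("ignore",
                                    "the regex module is deprecated",
                                    DeprecationWarning)
        old_api()
    return len(caught)


print(count_warnings(False))  # 1
print(count_warnings(True))   # 0
```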

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 18:21:28 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 12:21:28 -0500
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:20:56 +0100."
             <20010119172056.K17392@xs4all.nl> 
References: <20010119172056.K17392@xs4all.nl> 
Message-ID: <200101191721.MAA31937@cj20424-a.reston1.va.home.com>

> I'm currently seeing a failure in test_ucn:
> 
> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character
> 
> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

Feels to me like there's a bug in the string literal processing that
makes *any* string literal containing \N{...} fail during code
generation.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Fri Jan 19 18:37:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 18:37:41 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>
Message-ID: <023801c0823e$86fcedc0$e46940d5@hagrid>

> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character

Make sure you rebuild Objects/unicodeobject.o and the
ucnhash extension.  If they build without warnings, run
the following script.

import ucnhash
count = 0
for code in range(65536):
    try:
        name = ucnhash.getname(code)
        if ucnhash.getcode(name) != code:
            print name
        count += 1
    except ValueError:
        pass
print count

if it prints anything but "10538", let me know.
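
For readers on a current Python, where there is no ucnhash module to import, a
comparable round-trip check can be sketched with the standard unicodedata
module instead. This swaps in a different API, so the count it prints depends
on the Unicode database shipped with the interpreter and will not be 10538.

```python
import unicodedata

count = 0
mismatches = []
for code in range(0x10000):
    try:
        name = unicodedata.name(chr(code))
    except ValueError:
        continue  # code point has no name (controls, surrogates, unassigned)
    count += 1
    try:
        if ord(unicodedata.lookup(name)) != code:
            mismatches.append(name)
    except KeyError:
        mismatches.append(name)  # name did not survive the round trip

print(count, len(mismatches))
```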

> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

If the ucnhash extension cannot be found, the script won't
even compile...  shouldn't be too hard to fix.

</F>




From Barrett at stsci.edu  Fri Jan 19 18:32:26 2001
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 19 Jan 2001 12:32:26 -0500 (EST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
	<200101171609.LAA04102@cj20424-a.reston1.va.home.com>
	<3A676286.C33823B4@tismer.com>
	<200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <14952.30800.112503.123675@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > 
 > ... I can see no valid reason why someone would want to overload
 > a>b different than b<a, ... 
 > 

I agree.  But this assumes that the result of A<B and B>A is a
collection of Booleans.  In the Interactive Data Language (IDL) these
operators are essentially mapped to ceiling and floor functions which
are not commutative.  I personally find this silly, but IDL users
coming to Python may be surprised when the comparison of two Numeric
arrays returns a Boolean-like result.
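
What an elementwise, array-style comparison looks like can be sketched without
Numeric. The Vec class below is a made-up toy, not the Numeric API; it simply
returns a list of Booleans from the comparison methods.

```python
class Vec:
    """Toy vector whose comparisons work element by element."""

    def __init__(self, *items):
        self.items = list(items)

    def __lt__(self, other):
        return [x < y for x, y in zip(self.items, other.items)]

    def __gt__(self, other):
        return [x > y for x, y in zip(self.items, other.items)]


a = Vec(1, 5, 3)
b = Vec(2, 4, 3)
print(a < b)  # [True, False, False]
print(b > a)  # [True, False, False]
```

Both spellings give the same collection of Booleans here, which is the
assumption the quoted statement rests on.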

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From nas at arctrix.com  Fri Jan 19 11:43:12 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 19 Jan 2001 02:43:12 -0800
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <20010119111455.C25056@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Jan 19, 2001 at 11:14:55AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com> <20010119111455.C25056@kronos.cnri.reston.va.us>
Message-ID: <20010119024312.A16179@glacier.fnational.com>

On Fri, Jan 19, 2001 at 11:14:55AM -0500, Andrew Kuchling wrote:
> Which reminds me... there should really be a way to ignore the
> setup.py stuff and use the old method.  How should that be done?  A
> --use-makesetup flag to configure, maybe?

A different target for make would be easy.

  Neil



From fredrik at effbot.org  Fri Jan 19 19:13:15 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:13:15 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03a201c08243$7fa62af0$e46940d5@hagrid>

thomas wrote:
> > I'm currently seeing a failure in test_ucn:
> > 
> > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > error: Illegal Unicode character
> > 
> > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > damned hard to pin down which:
> 
> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

I took another look at the error message: the only explanation
I can see here is that the lookup succeeds, but the call to ucnhash
returns a value larger than 0x10ffff.

What is Py_UCS4 set to under gcc?
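
The 0x10ffff ceiling is easy to see from a current Python prompt. This sketch
assumes Python 3.3 or later, where sys.maxunicode is always 0x10ffff; the
is_valid_code_point helper is a hypothetical name.

```python
import sys


def is_valid_code_point(value):
    """True if value can be turned into a one-character string."""
    try:
        chr(value)
    except (ValueError, OverflowError):
        return False
    return True


print(hex(sys.maxunicode))            # 0x10ffff
print(is_valid_code_point(0x10FFFF))  # True
print(is_valid_code_point(0x110000))  # False
```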

Confusing /F




From guido at digicool.com  Fri Jan 19 19:11:21 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:11:21 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:58 +0100."
             <3A686CB2.C75D184D@lemburg.com> 
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>  
            <3A686CB2.C75D184D@lemburg.com> 
Message-ID: <200101191811.NAA32539@cj20424-a.reston1.va.home.com>

> > Proposal:
> > 
> > (1) change verify() to raise TestFailed;
> > 
> > (2) change test_unicode.py to catch TestFailed instead.
> 
> +1
> 
> Why not simply make TestFailed a subclass of AssertionError ?
> Then we wouldn't have to fear about breaking test code...

No, I'd rather see the two separated.  There can be assert statements
in the modules we're testing, and I'd prefer not to see those caught
by test code that is trying to catch TestFailed.
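
The distinction can be shown with a short sketch. All names are hypothetical,
and the explicit raise stands in for an assert statement inside the module
being tested.

```python
class TestFailed(Exception):
    """Deliberately not a subclass of AssertionError."""


def module_under_test():
    # Stand-in for an assert statement firing inside the tested module.
    raise AssertionError("internal invariant broken")


def run_test():
    try:
        module_under_test()
    except TestFailed:
        return "test failed"
    except AssertionError:
        return "assertion inside the module under test"
    return "ok"


print(run_test())  # assertion inside the module under test
```

Had TestFailed derived from AssertionError, a handler written the other way
round (catching AssertionError to detect test failures) would also swallow the
module's own assertion and report it as an ordinary test failure.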

I'll check this in momentarily.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Fri Jan 19 19:19:37 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:19:37 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03b301c08244$627f22a0$e46940d5@hagrid>

> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

umm.  can anyone explain how this can happen:

python ../lib/test/regrtest.py test_ucn
test_ucn
1 test OK.

python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

how can a test that works under regrtest.py fail when
it's run separately?  what am I missing here?

</F>




From mal at lemburg.com  Fri Jan 19 19:48:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 19:48:53 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03a201c08243$7fa62af0$e46940d5@hagrid>
Message-ID: <3A688C15.8C9CFF46@lemburg.com>

Fredrik Lundh wrote:
> 
> thomas wrote:
> > > I'm currently seeing a failure in test_ucn:
> > >
> > > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > > error: Illegal Unicode character
> > >
> > > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > > damned hard to pin down which:
> >
> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> I took another look at the error message: the only explanation
> I can see here is that the lookup succeeds, but the call to ucnhash
> returns a value larger than 0x10ffff.
> 
> What is Py_UCS4 set to under gcc?

Should be "unsigned int" on all modern Intel platforms.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 19 19:48:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:48:45 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:32:26 EST."
             <14952.30800.112503.123675@nem-srvr.stsci.edu> 
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com> <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>  
            <14952.30800.112503.123675@nem-srvr.stsci.edu> 
Message-ID: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>

>  > ... I can see no valid reason why someone would want to overload
>  > a>b different than b<a, ... 
>  > 
> 
> I agree.  But this assumes that the result of A<B and B>A is a
> collection of Booleans.  In the Interactive Data Language (IDL) these
> operators are essentially mapped to ceiling and floor functions which
> are not commutative.  I personally find this silly, but IDL users
> coming to Python may be surprised when the comparison of two Numeric
> arrays returns a Boolean-like result.

This means that Python can't be used to emulate this part of IDL.  I
don't understand how these can be not commutative unless they have a
side effect on the left argument, and that's not possible in Python
anyway.
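
For illustration, here is a small sketch of how the a>b / b<a pairing
behaves in current Python releases, where rich comparisons fall back to
the reflected method.  The Version class and its call log are
hypothetical names, not anything from the thread:

```python
class Version:
    calls = []  # records which comparison methods were actually invoked

    def __init__(self, n):
        self.n = n

    def __lt__(self, other):
        Version.calls.append("__lt__")
        return self.n < other.n


a, b = Version(2), Version(1)

# Version defines no __gt__, so a > b is answered by the reflected
# call b.__lt__(a).
result = a > b
```

Only __lt__ is defined, yet a > b still produces an answer, because the
interpreter treats it as the mirror image of b < a.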

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Jan 19 20:18:04 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 14:18:04 -0500
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <LNBBLJKPBEHFEDALKOLCGELEIJAA.tim.one@home.com>

[/F]
> umm.  can anyone explain how this can happen:
>
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> test OK.
>
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name
>
> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Dunno, but add to the pile of mysteries that you're unique.  Here on
Win98SE:

python ../lib/test/regrtest.py test_ucn
test_ucn
test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name
1 test failed: test_ucn


python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name


I suggest you reformat your hard drive, and reinstall Windows <wink>.




From mwh21 at cam.ac.uk  Fri Jan 19 20:25:03 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 19:25:03 +0000
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: "Fredrik Lundh"'s message of "Fri, 19 Jan 2001 19:19:37 +0100"
References: <20010119172056.K17392@xs4all.nl> <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03b301c08244$627f22a0$e46940d5@hagrid>
Message-ID: <m3n1cn1h0w.fsf@atrus.jesus.cam.ac.uk>

"Fredrik Lundh" <fredrik at effbot.org> writes:

> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> umm.  can anyone explain how this can happen:
> 
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> 1 test OK.

This will run the .pyc if present?
 
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

This won't?  

Note: no traceback -> (in effect, if not design) compile time error.

> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Well, this is just my guess.
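
A small sketch of the "compile time error" point, written against
current Python 3 rather than the 2.1 tree under discussion.  There a bad
\N{...} name in a string literal is reported as a SyntaxError while the
source is being compiled, before any statement of the file runs.  The
helper name compiles_cleanly is an assumption made for this example:

```python
def compiles_cleanly(source):
    """Return True if source compiles; nothing in it is executed."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        # Raised by the compiler itself, so there is no frame from the
        # script to show in a traceback.
        return False
    return True


good = 'name = "\\N{LATIN SMALL LETTER A}"\n'
bad = 'print("never reached")\nname = "\\N{NO SUCH CHARACTER NAME}"\n'
```

compiles_cleanly(bad) is false even though the offending literal sits
after a print call: the whole file is rejected up front, which would
match seeing an error with no traceback.  A previously written .pyc
skips this compilation step entirely.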

Cheers,
M.

-- 
  Well, you pretty much need Microsoft stuff to get misbehaviours bad
  enough to actually tear the time-space continuum.  Luckily for you,
  MS Internet Explorer is available for Solaris.
                              -- Calle Dybedahl, alt.sysadmin.recovery




From skip at mojam.com  Fri Jan 19 20:55:29 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 19 Jan 2001 13:55:29 -0600 (CST)
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119173415.M17295@xs4all.nl>
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
	<20010117234925.A17392@xs4all.nl>
	<20010118004400.B17392@xs4all.nl>
	<20010118103036.B21503@kronos.cnri.reston.va.us>
	<20010119000209.F17392@xs4all.nl>
	<200101191626.LAA29165@cj20424-a.reston1.va.home.com>
	<20010119173415.M17295@xs4all.nl>
Message-ID: <14952.39857.83065.24889@beluga.mojam.com>

    Thomas> But it is going to have to change: you now have to include the
    Thomas> build tree as well, and that is quite a bit more difficult to
    Thomas> figure out. I'd suggest a 'make run' that calls python with the
    Thomas> appropriate PYTHONPATH environment variable, but that doesn't
    Thomas> cover test-scripts (which I use a lot myself.)

Doesn't Andrew's new "platform" target in the top-level Makefile do the
right thing?  It *should* generate a platform-specific path to the correct
build subdirectory.

Skip



From MarkH at ActiveState.com  Fri Jan 19 21:11:02 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 19 Jan 2001 12:11:02 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <010c01c08201$4b0ec050$e46940d5@hagrid>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>

> you can compile the module as C++, but that's also a bit painful...

My understanding is that the C std doesn't guarantee the order of static
object initialization, whereas C++ does provide these semantics.  At least
that is the excuse I found when digging into this some years ago.

Can't-believe-I-mentioned-the-C-standard-while-Tim-is-listening ly,

Mark.




From guido at digicool.com  Fri Jan 19 21:44:53 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 15:44:53 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Fri, 19 Jan 2001 10:58:08 +0100."
             <3A680FB0.AED2DB55@lemburg.com> 
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>  
            <3A680FB0.AED2DB55@lemburg.com> 
Message-ID: <200101192044.PAA04154@cj20424-a.reston1.va.home.com>

> If we agree to merge the semantics of the two APIs, then str()
> would have to change too: is this desirable ? (IMHO, yes)

Not clear.  Which is why I'm backing off from my initial support for
merging the two.

I believe unicode() (which is really just an interface to
PyUnicode_FromEncodedObject()) currently already does too much.  In
particular this whole business with calling __str__ on instances seems
to me to be unnecessary.  I think it should *only* bother to look for
something that supports the buffer interface (checking for regular
strings only as a tiny optimization), or existing unicode objects.

> Here's what we could do:
> 
> a) merge the semantics of unistr() into unicode()
> b) apply the same semantics in str()
> c) remove unistr() -- how's that for a short-living builtin ;)
> 
> About the semantics:
> 
> These should be backward compatible to str() in that everything
> that worked before should continue to work after the merge.
> 
> A strawman for processing str() and unicode():
> 
> 1. strings/Unicode is passed back as-is

I hope you mean str() passes 8-bit strings back as-is, unicode()
passes Unicode strings back as-is, right?

> 2. tp_str is tried
> 3. the method __str__ is tried

Shouldn't have to -- instances should define tp_str and all the magic
for calling __str__ should be there.  I don't understand why it's not
done that way, probably just for historical reasons.  I also don't
think __str__ should be tried for non-instance types.

But, more seriously, I believe tp_str or __str__ shouldn't be tried at
all by unicode().

> 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> 5. for str(): Unicode return values are converted to strings using
>               the default encoding
>    for unicode(): Unicode return values are passed back as-is;
>               string return values are decoded according to the
>               encoding parameter
> 6. the return object is type-checked: str() will always return
>    a string object, unicode() always a Unicode object
> 
> Note that passing back Unicode is only allowed in case no encoding
> was given. Otherwise an exception is raised: you can't decode
> Unicode.
> 
> As extension we could add encoding and error parameters to str()
> as well. The result would be either an encoding of Unicode objects
> passed back by tp_str or __str__ or a recoding of string objects
> returned by checks 2, 3 or 4.

Naaaah!

> If we agree to take this approach, then we should remove the
> unistr() Python API before the alpha ships.

Frankly, I believe we need more time to sort this out, and therefore I
propose to remove the unistr() built-in before the release.  Marc,
would you do the honors?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Fri Jan 19 21:55:53 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 21:55:53 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <14952.39857.83065.24889@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 19, 2001 at 01:55:29PM -0600
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <14952.39857.83065.24889@beluga.mojam.com>
Message-ID: <20010119215552.O17295@xs4all.nl>

On Fri, Jan 19, 2001 at 01:55:29PM -0600, Skip Montanaro wrote:
> 
>     Thomas> But it is going to have to change: you now have to include the
>     Thomas> build tree as well, and that is quite a bit more difficult to
>     Thomas> figure out. I'd suggest a 'make run' that calls python with the
>     Thomas> appropriate PYTHONPATH environment variable, but that doesn't
>     Thomas> cover test-scripts (which I use a lot myself.)

> Doesn't Andrew's new "platform" target in the top-level Makefile do the
> right thing?  It *should* generate a platform-specific path to the correct
> build subdirectory.

Yes, it does, that's what I meant with 'make run'. But that isn't quite as
user-friendly as the current method. How would you run a script with the
current python ? 'make SCRIPT=./spamtest.py runscript' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 23:06:03 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:06:03 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:15 +0100."
             <20010119173415.M17295@xs4all.nl> 
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>  
            <20010119173415.M17295@xs4all.nl> 
Message-ID: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>

I finally figured the best way to fix sys.path to find shared modules
built by setup.py.  At first I thought I had to add it to getpath.c,
but the problem is that the name is calculated by calling
distutils.util.get_platform(), and that requires a working Python
interpreter, so we'd end up with a chicken-or-egg situation.

So instead I added 5 lines to site.py, which tests for
os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
test only succeeds when running from the build directory.  Then it
calls distutils.util.get_platform() and uses the result to calculate
the correct directory name, which is then appended to sys.path.
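
A rough sketch of the check described above, not the actual five lines
added to site.py.  The function name add_build_dir, its parameters, and
the "build/lib.<platform>-<version>" layout are assumptions made for
illustration; sysconfig.get_platform() is used here as a stand-in for
distutils.util.get_platform():

```python
import os
import posixpath
import sysconfig


def add_build_dir(path, platform=None, version="2.1", os_name=os.name):
    """Return a copy of path, extended when run from a build tree."""
    path = list(path)
    # Only act on POSIX, and only when the last entry is .../Modules,
    # which is taken as the sign of running from the build directory.
    if os_name != "posix" or not path or not path[-1].endswith("/Modules"):
        return path
    if platform is None:
        platform = sysconfig.get_platform()
    top = posixpath.dirname(path[-1])
    # Assumed layout of the directory that setup.py builds into.
    build_dir = posixpath.join(top, "build", "lib.%s-%s" % (platform, version))
    path.append(build_dir)
    return path
```

For example, add_build_dir(["/src/Lib", "/src/Modules"],
platform="linux-i686", os_name="posix") appends
"/src/build/lib.linux-i686-2.1", while a path that does not end in
/Modules comes back unchanged.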

Yes, this slows down startup (it imports a large portion of the
distutils package), but I don't care -- after all this is mostly for
me so I can play with the interpreter right after I've built it,
right?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 19 22:32:34 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:32:34 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. 
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>  
	            <3A680FB0.AED2DB55@lemburg.com> <200101192044.PAA04154@cj20424-a.reston1.va.home.com>
Message-ID: <3A68B272.BBBAECD1@lemburg.com>

Guido van Rossum wrote:
> 
> > If we agree to merge the semantics of the two APIs, then str()
> > would have to change too: is this desirable ? (IMHO, yes)
> 
> Not clear.  Which is why I'm backing off from my initial support for
> merging the two.
> 
> I believe unicode() (which is really just an interface to
> PyUnicode_FromEncodedObject()) currently already does too much.  In
> particular this whole business with calling __str__ on instances seems
> to me to be unnecessary.  I think it should *only* bother to look for
> something that supports the buffer interface (checking for regular
> strings only as a tiny optimization), or existing unicode objects.

Hmm, unicode() should (just like str()) take an object and
convert it to a Unicode string. Since many objects either don't
support the tp_str slot (instances don't for some reason -- just
like they don't tp_call), I had to add some special cases to
make Python instances compatible to Unicode in the same way
str() does.

What I think is really needed is a concept for "stringification"
in Python. We currently have these schemes:

1. tp_str
2. method __str__ (not only of Python instances, but any object)
3. character buffer interface

These three could easily be unified into the tp_str slot:
e.g. tp_str could do the necessary magic to call __str__
or the buffer interface.

Note that the same is true for e.g. tp_call -- the special
cases we have in ceval.c for the different builtin callable
objects would not be necessary if they would implement tp_call.

> > Here's what we could do:
> >
> > a) merge the semantics of unistr() into unicode()
> > b) apply the same semantics in str()
> > c) remove unistr() -- how's that for a short-living builtin ;)
> >
> > About the semantics:
> >
> > These should be backward compatible to str() in that everything
> > that worked before should continue to work after the merge.
> >
> > A strawman for processing str() and unicode():
> >
> > 1. strings/Unicode is passed back as-is
> 
> I hope you mean str() passes 8-bit strings back as-is, unicode()
> passes Unicode strings back as-is, right?

Right.
 
> > 2. tp_str is tried
> > 3. the method __str__ is tried
> 
> Shouldn't have to -- instances should define tp_str and all the magic
> for calling __str__ should be there.  I don't understand why it's not
> done that way, probably just for historical reasons.  I also don't
> think __str__ should be tried for non-instance types.

Ok.
 
> But, more seriously, I believe tp_str or __str__ shouldn't be tried at
> all by unicode().

Hmm, but how would you implement generic conversion to Unicode 
then ? 

We'll need some way for instances (and other types) to
provide a conversion to Unicode. Some time ago we discussed this
issue and came to the conclusion that tp_str should be allowed
to return Unicode data instead of inventing a new tp_unicode
slot for this purpose.

> > 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> > 5. for str(): Unicode return values are converted to strings using
> >               the default encoding
> >    for unicode(): Unicode return values are passed back as-is;
> >               string return values are decoded according to the
> >               encoding parameter
> > 6. the return object is type-checked: str() will always return
> >    a string object, unicode() always a Unicode object
> >
> > Note that passing back Unicode is only allowed in case no encoding
> > was given. Otherwise an exception is raised: you can't decode
> > Unicode.
> >
> > As extension we could add encoding and error parameters to str()
> > as well. The result would be either an encoding of Unicode objects
> > passed back by tp_str or __str__ or a recoding of string objects
> > returned by checks 2, 3 or 4.
> 
> Naaaah!

Would be nice for symmetry and useful in the light of making
Unicode the only string type in Py4k ;-)
 
> > If we agree to take this approach, then we should remove the
> > unistr() Python API before the alpha ships.
> 
> Frankly, I believe we need more time to sort this out, and therefore I
> propose to remove the unistr() built-in before the release.  Marc,
> would you do the honors?

Ok. 

I'll remove the builtin and the docs, but will leave the
PyObject_Unicode() API enabled.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From uche.ogbuji at fourthought.com  Fri Jan 19 22:42:40 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Fri, 19 Jan 2001 14:42:40 -0700
Subject: [Python-Dev] Extension doc bugs
Message-ID: <200101192142.OAA29168@localhost.localdomain>

I'm using the bleeding-edge documentation at 

http://python.sourceforge.net/devel-docs/api/api.html

I know that it's not complete until someone has the time to do so, but I've 
run into a few places where it's completely wrong.

For instance, from the object protocol docs: 

"""
int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
      Compare the values of o1 and o2 using a routine provided by o1, if one   
       exists, otherwise with a routine provided by o2. The result of the
      comparison is returned in result. Returns -1 on failure. This is the     
       equivalent of the Python statement "result = cmp(o1, o2)".
"""

After getting weird behavior implementing this, and then squinting at the 
relevant Python 2.0 code, it appears that in actuality the Cmp function is to 
return the direct comparison results (-1, 0, 1 based on ordering of the 
parameters)  furthermore, there is no such "result" argument.

4Suite has a lot of C extension code developed by squinting at Python sources 
and long gdb sessions and I have a feeling that in many cases we're taking up 
hacks that would get us into trouble across versions, and all that; but the 
"official" interfaces and behaviors are not documented (or only poorly 
documented).  In general, the C API docs are in a rather sorry state and 
though I doubt I could do a great deal about fixing it, I'd be interested in 
discussion of the matter, and perhaps making what contribution I can.

Is the doc-sig the best place for this?  My experience there wouldn't seem to 
encourage this conclusion (most of the discussion is of docstring syntax and 
neat-o automagic document generators).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From mal at lemburg.com  Fri Jan 19 22:46:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:46:24 +0100
Subject: [Python-Dev] readline and setup.py
Message-ID: <3A68B5B0.771412F7@lemburg.com>

The new setup.py procedure for Python causes readline not to
be built on my machine. Instead I get a linker error telling
me that termcap is not found.

Looking at my old Setup file, I have this line:

readline readline.c \
	 -I/usr/include/readline -L/usr/lib/termcap \
	 -lreadline -lterm

I guess, setup.py should be modified to include additional
library search paths -- shouldn't hurt on platforms which
don't need them.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Jan 19 22:50:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:50:53 +0100
Subject: [Python-Dev] _tkinter and setup.py
Message-ID: <3A68B6BD.BAD038D6@lemburg.com>

Why does setup.py stop with an error in case _tkinter cannot
be built (due to an old Tk/Tcl version in my case) ?

I think the policy in setup.py should be to output warnings,
but continue building the rest of the Python modules.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 19 23:38:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:38:22 -0500
Subject: [Python-Dev] 2.1 alpha 1 release schedule
Message-ID: <200101192238.RAA12413@cj20424-a.reston1.va.home.com>

Practicality beats purity: we're very close to a release, but I've
decided to hold off to give Jeremy a chance to finish the nested
scopes, to give Fred a chance to revise the weak references according
to Martin's wishes, and in general for things to settle.

Most likely we'll be able to release Monday night (Jan 22).

Unfortunately email through python.org seems to be wedged again (I
swear, it seems like it starts getting wedged every afternoon between
3 and 4!) so I don't have a clear view of what the latest checkins
were; but from cvs update it seems that the following things happened
this afternoon:

- Barry fixed a core dump in function attribute assignments

- Marc-Andre withdrew unistr(), pending more discussion

- Fredrik fixed the ucnhash problem

- I fixed two path problems in the new build process that only
  occurred when you were building in a subdirectory of the source tree

Good work, crew!  I'm taking the weekend off.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jack at oratrix.nl  Sat Jan 20 00:23:18 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Sat, 20 Jan 2001 00:23:18 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments 
In-Reply-To: Message by Guido van Rossum <guido@digicool.com> ,
	     Fri, 19 Jan 2001 11:34:45 -0500 , <200101191634.LAA29239@cj20424-a.reston1.va.home.com> 
Message-ID: <20010119232323.70B03116392@oratrix.oratrix.nl>

Recently, Guido van Rossum <guido at digicool.com> said:
> > I get the impression that I'm currently seeing a non-NULL third
> > argument in my (C) methods even though the method is called without
> > keyword arguments.
> 
> > Is this new semantics that I missed the discussion about, or is this a bug?
> 
> [...] 
> Do you really need the NULL?

The places that I know I was counting on the NULL now have "if ( kw && 
PyObject_IsTrue(kw))", so I'll just have to hope there aren't any more 
lingering in there.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 



From tim.one at home.com  Sat Jan 20 01:04:10 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 19:04:10 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENLIJAA.tim.one@home.com>

[Guido]
> I'd just like to note for the record that this is exactly what I had
> predicted.

I would have hoped you'd be content to let the record speak for itself
<wink>.

> I'd also like to note that I *agree*.

With what?  That the program is undefined by the C std was never in dispute.

> Tim seems to think there's a race condition in the threading code,
> but it's really much simpler than that: the same bug can easily be
> provoked with a single-threaded program: just randomly read and
> write alternatingly.

And this is a point in their favor?!  "It's OK that the MT library corrupts
itself, because even the single-threaded library does"?

> So obviously the people who wrote the threading code aren't interested
> in the bug,

I don't know that it ever got as far as the people who wrote the threading
code, but I sure doubt it:  when the reply starts "Turns out the C standard
explicitly says  ...", it strongly suggests it was written by someone who
didn't already know what the C std says, and went looking for an excuse to
get it off their plate without further effort.  Par for the course, if so.

> because it's not in their code -- and the people who wrote the code
> that doesn't behave well when abused are protected by the C standard...

The behavior of things designated "undefined" and "implementation-defined"
by the std fall under "quality of implementation".  In the real world, the
latter is what vendors compete on; meeting the letter of the std is a bare
minimum for playing the game at all.

The plain fact is that their library is less robust than others in this
case.  I worked on a multithreaded stdio implementation at KSR, and that
sure couldn't corrupt itself.  Looks like no flavor of Linux does either.
It's not *reasonable* for a library to corrupt itself in this case, although
it's certainly reasonable for its behavior to vary from run to run.  There's
nothing in the C std that says a conforming implementation can't *crash* on
the program

void main() {int i = 1;}

either <wink>.

a-std-is-a-floor-on-acceptable-behavior-not-a-ceiling-ly y'rs  - tim




From gstein at lyra.org  Sat Jan 20 02:21:56 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 17:21:56 -0800
Subject: [Python-Dev] initializing ob_type
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Fri, Jan 19, 2001 at 12:11:02PM -0800
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>
Message-ID: <20010119172156.Y7731@lyra.org>

On Fri, Jan 19, 2001 at 12:11:02PM -0800, Mark Hammond wrote:
> > you can compile the module as C++, but that's also a bit painful...
> 
> My understanding is that the C std doesn't guarantee the order of static
> object initialization, whereas C++ does provide these semantics.  At least
> that is the excuse I found when digging into this some years ago.

True, but when PyWhatever_Type is initialized, &PyType_Type ought to be
ready (even if it isn't initialized). Heck, &PyType_Type points into the
Python core which is *definitely* loaded by that point.

Now, if "initialization" also means "relocation to a specific address" then
I can understand.

Hrm... I've just spent some time with the Windows SDK docs, and I can't find
anything that really discusses the problem and resolution. There certainly
isn't any warning about "don't do this." It all talks about how fixups are
stored with the DLL, how you can optionally use BIND to pre-bind the values,
blah blah blah. But nothing saying "it doesn't work."

It would be interesting to know more about the actual symptoms that appear
when the ob_type init is performed by the structure (rather than at
runtime). What happens? Bad address? NULL value? Failure to resolve and
load? Is PyType_Type not exported correctly or something?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at digicool.com  Sat Jan 20 03:05:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 21:05:39 -0500
Subject: [Python-Dev] How to get setup.py to build expat?
Message-ID: <200101200205.VAA13299@cj20424-a.reston1.va.home.com>

The setup.py script does not build the expat module for me.

I have expat installed in /usr/local, at least I believe so: I have
/usr/local/include/xmlparse.h and /usr/local/lib/libexpat.a -- do I
need more?

How can I get setup.py to spit out what it tries, and why it fails?
setup.py -v build doesn't give any extra output.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sat Jan 20 03:41:43 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sat, 20 Jan 2001 03:41:43 +0100
Subject: [Python-Dev] initializing ob_type
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com> <20010119172156.Y7731@lyra.org>
Message-ID: <00f001c0828a$bc903900$e46940d5@hagrid>

greg wrote:

> It would be interesting to know more about the actual symptoms that appear
> when the ob_type init is performed by the structure (rather than at runtime).
> What happens?

    http://www.python.org/doc/FAQ.html#3.24
    "3.24. "Initializer not a constant" while building DLL
    on MS-Windows

    "Static type object initializers in extension modules
    may cause compiles to fail with an error message
    like "initializer not a constant"

Cheers /F




From uche.ogbuji at fourthought.com  Sat Jan 20 06:29:23 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Fri, 19 Jan 2001 22:29:23 -0700
Subject: [Python-Dev] Extension doc bugs 
In-Reply-To: Message from uche.ogbuji@fourthought.com 
   of "Fri, 19 Jan 2001 14:42:40 MST." <200101192142.OAA29168@localhost.localdomain> 
Message-ID: <200101200529.WAA30349@localhost.localdomain>

> For instance, from the object protocol docs: 
> 
> """
> int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
>       Compare the values of o1 and o2 using a routine provided by o1, if one   
>        exists, otherwise with a routine provided by o2. The result of the
>       comparison is returned in result. Returns -1 on failure. This is the     
>        equivalent of the Python statement "result = cmp(o1, o2)".
> """
> 
> After getting weird behavior implementing this, and then squinting at the 
> relevant Python 2.0 code, it appears that in actuality the Cmp function is to 
> return the direct comparison results (-1, 0, 1 based on ordering of the 
> parameters)  furthermore, there is no such "result" argument.

Bother.  I didn't squint hard enough.  I mistook the tp_compare slot for the 
PyObject_Cmp equivalent.  I have indeed run into what I'm sure are nits in the 
Python/C API but given that my greatest alarm was false, I'll be more careful 
before bringing up the others.

I'm still curious as to the best forum for this.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tim.one at home.com  Sat Jan 20 06:36:12 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 00:36:12 -0500
Subject: [Python-Dev] Extension doc bugs
In-Reply-To: <200101192142.OAA29168@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPKIJAA.tim.one@home.com>

[uche.ogbuji at fourthought.com]
> ...
> In general, the C API docs are in a rather sorry state and though
> I doubt I could do a great deal about fixing it, I'd be interested in
> discussion of the matter, and perhaps making what contribution I can.
>
> Is the doc-sig the best place for this?

Nope!  Discussing it won't do any good, there or anywhere else.  What it
needs is for people to send better docs to python-docs at python.org or upload
LaTeX patches to SourceForge, and to report doc bugs on SourceForge (which
is where the start of this msg should have gone!).  Most days we just work
on whatever is backed up at SourceForge; if doc bugs don't show up there,
they won't get repaired.

the-docs-are-only-10x-better-than-the-sum-of-the-individual-
    contributions<wink>-ly y'rs  - tim




From tim.one at home.com  Sat Jan 20 07:17:04 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:17:04 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Objects object.c,2.109,2.110
In-Reply-To: <E14JrC9-00056U-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPNIJAA.tim.one@home.com>

[Barry]
> Modified Files:
> 	object.c
> Log Message:
> default_3way_compare(): When comparing the pointers, they must be cast
> to integer types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).
> ANSI specifies that pointer compares other than == and != to
> non-related structures are undefined.  This quiets an Insure
> portability warning.

Barry, that comment belongs in the code, not in the checkin msg.  The code
*used* to do this correctly (as you well know, since you & I went thru
considerable pain to fix this the first time).  However, because the
*reason* for the convolution wasn't recorded in the code as a comment,
somebody threw it all away the first time it got reworked.

c-code-isn't-often-self-explanatory-ly y'rs  - tim




From tim.one at home.com  Sat Jan 20 07:30:42 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:30:42 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com>

I had a huge string and wanted to put a double-quote on each end.  The
boring:

    '"' + huge + '"'

does the job, but is inefficient <snort>.  Then this transparent variation
sprang unbidden from my hoary brow:

    huge.join('""')

*That* should put to rest the argument over whether .join() is more properly
a method of the separator or the sequence -- '""'.join(huge) instead would
look plain silly <wink>.

not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim





From tim.one at home.com  Sat Jan 20 10:28:18 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 04:28:18 -0500
Subject: [Python-Dev] Comparison of recursive objects
In-Reply-To: <14952.21218.416551.695660@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>

[Guido's checkin msg]
> ...
> In a discussion with Tim, where we discovered that our intuition
> on when a<=b should be true was failing, we decided to outlaw
> ordering comparisons on recursive objects.  (Once we have fixed our
> intuition and designed a matching algorithm that's practical and
> reasonable to implement, we can allow such orderings again.)

[Jeremy]
> Sounds sensible to me!  I was quite puzzled about what <= should
> return for recursive objects.

That's easy:  x <= y for recursive objects should return true if and only if
x < y or x == y return true <0.9 wink>.

x == y isn't a problem, although Python gives a remarkable answer:
recursive objects in Python are instances of rooted, ordered, directed,
finite, node-labeled graphs, and "x == y" in Python answers whether their
graphs are isomorphic.

Viewed that way (which is the correct way <0.5 wink>), the *natural* meaning
for "x <= y" is "y contains a subgraph isomorphic to x".  And that has
*almost* all the nice properties we like:

    x <= x is true
    (x <= y and y <= z) implies x <= z
    (x <= y and y <= x) if and only if x == y

However,

1. That's much harder to compute.
2. It implies, e.g., [2] <= [1, 2], and that's not what we *want*
   non-recursive sequence comparison to mean.
3. It's a partial ordering:  given arbitrary x and y, it may be that
   neither contains an isomorphic image of the other.
4. We've again given up on avoiding surprises in *simple* comparisons
   among builtin types, like (under current CVS):

>>> 1 < [1] < 0L < 1
1
>>> 1 < 1
0
>>>

   so it's hard to see why we should do any work at all to avoid
   violating "intuition" when comparing recursive objects:  we're
   already scrubbing the face of intuition with steel wool,
   setting it on fire, then putting it out with an axe <wink>.

Now let's look at Guido's example (or one of them, anyway):

>>> a = []
>>> a.append(a)
>>> a.append("x")
>>> b = []
>>> b.append(b)
>>> b.append("y")
>>> a
[[...], 'x']
>>> b
[[...], 'y']
>>>

I think it's a trick of *typography* that caused my first thought to be
"well, clearly, a < b".  That is, the *display* shows me two 2-element
lists, each with the same "blob" as the first element, and where a[1] is
obviously less than b[1].  Since "the blobs" are the same, the second
elements control the outcome.

But those "blobs" aren't really the same:  a[0] is a, and b[0] is b, so
asking whether a < b by looking first at their first elements just leads
back to the original question:  asking whether a[0] < b[0] is again asking
whether a < b, and that makes no progress.  Saying that a is less than b by
fiat is *consistent* with the rules for lexicographic ordering, but so is
insisting that a is greater than b.  There's no basis for picking one over
the other, and so no clear hope of coming up with a generally consistent
scheme.  Well, one clear hope:  if recursive comparison says "not equal", it
could resolve the dilemma by comparing object id instead.  That would be
consistent (I mostly think at the moment ...), but if you run the program
above multiple times it may say a < b on some runs and b < a on others.

WRT "the right way", it should be clear from the attached picture that
neither a nor b contains an isomorphic image of the other, so from that POV
they're not comparable (a != b, but neither a <= b nor b <= a holds).

So this is what Guido made Python do:

>>> a == b  # still cool:  they're not isomorphic and Python knows it
0
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values
>>> a <= b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values

In light of that, I still find these mildly surprising:

>>> a < a
0
>>> a <= a
1
>>>

I guess some recursive values are more orderable than others <wink -- but
that's true!  the ones Python can prove are equal are indeed "more
orderable">.

>>> import copy
>>> c = copy.deepcopy(a)
>>> c
[[...], 'x']
>>> a == c
1
>>> a <= c
1
>>> a < c
0
>>>
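
For readers who want to experiment with the equality results above, here is a
minimal sketch of equality on self-referential lists.  It is not the logic in
object.c; it assumes one particular strategy (a pair of lists that is already
being compared is provisionally treated as equal), and the name rec_eq is
made up for illustration.

```python
import copy

def rec_eq(x, y, _active=None):
    """Equality for possibly self-referential lists (illustrative only)."""
    if _active is None:
        _active = set()
    if isinstance(x, list) and isinstance(y, list):
        key = (id(x), id(y))
        if key in _active:
            # Assumption: a pair already under comparison counts as equal,
            # so following a cycle does not recurse forever.
            return True
        if len(x) != len(y):
            return False
        _active.add(key)
        try:
            return all(rec_eq(p, q, _active) for p, q in zip(x, y))
        finally:
            _active.discard(key)
    # Anything that is not a pair of lists falls back to ordinary ==.
    return x == y

a = []
a.append(a)
a.append("x")
b = []
b.append(b)
b.append("y")
c = copy.deepcopy(a)

print(rec_eq(a, b), rec_eq(a, c))
```

The sketch only answers "equal or not"; it says nothing about ordering, which
is the part that has no obviously right answer.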

BTW, this kind of construction appears to give equality-testing that's at
best(!) exponential-time in the size of the dicts:

def timeeq(x, y):
    from time import clock
    import sys
    s = clock()
    result = x == y
    f = clock()
    print x, result, round(f-s, 1), "seconds"
    sys.stdout.flush()

d = {}
e = {}
timeeq(d, e)
d[0] = d
e[0] = e
timeeq(d, e)
d[1] = d
e[1] = e
timeeq(d, e)
d[2] = d
e[2] = e
timeeq(d, e)

Output:

{} 1 0.0 seconds
{0: {...}} 1 0.0 seconds
{1: {...}, 0: {...}} 1 6.5 seconds

After more than 15 minutes, the 3-element dict comparison still hasn't
completed (yikes!).

ackermann's-function-eat-your-heart-out-ly y'rs  - tim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loopy.jpg
Type: image/jpeg
Size: 11363 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010120/fce25c79/attachment.jpg>

From thomas at xs4all.net  Sat Jan 20 15:30:26 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 15:30:26 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 05:06:03PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <200101192206.RAA12072@cj20424-a.reston1.va.home.com>
Message-ID: <20010120153026.L17392@xs4all.nl>

On Fri, Jan 19, 2001 at 05:06:03PM -0500, Guido van Rossum wrote:

> So instead I added 5 lines to site.py, which tests for
> os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
> test only succeeds when running from the build directory.  Then it
> calls distutils.util.get_platform() and uses the result to calculate
> the correct directory name, which is then appended to sys.path.

> Yes, this slows down startup (it imports a large portion of the
> distutils package), but I don't care -- after all this is mostly for
> me so I can play with the interpreter right after I've built it,
> right?

Right. The only downside (as far as I can tell) is that 'python -S' no
longer works, in the build tree. I don't think that's that big a deal, but
it should be documented somewhere, so we don't end up being boggled by it
once we forget about it :)
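
A rough sketch of the check Guido describes, for reference.  It is not the
actual site.py code: the function name is made up, sysconfig.get_platform()
stands in for distutils.util.get_platform(), and the "build/lib.<platform>-
<version>" directory layout is an assumption.  Since the logic lives in
site.py, starting the interpreter with -S skips it entirely.

```python
import os
import posixpath
import sys
import sysconfig

def build_dir_for(path_entries, os_name=os.name):
    """Return an extra sys.path entry when running from a build tree, else None."""
    if os_name != "posix" or not path_entries:
        return None
    last = path_entries[-1]
    if not last.endswith("/Modules"):
        return None
    platform = sysconfig.get_platform()
    version = "%d.%d" % sys.version_info[:2]
    # Assumed layout: <top>/build/lib.<platform>-<version>
    top = posixpath.dirname(last)
    return posixpath.join(top, "build", "lib.%s-%s" % (platform, version))

extra = build_dir_for(sys.path)
if extra is not None:
    print("would append", extra)
```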

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Sat Jan 20 17:18:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:18:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:45:32 +0100."
             <20010119004532.G17392@xs4all.nl> 
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>  
            <20010119004532.G17392@xs4all.nl> 
Message-ID: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>

> On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> 
> > filename = '/tmp/delete_me'
> 
> This reminds me: we need a portable way to handle test-files :)

Yeah, I noticed that this test failed on Windows -- fixed now.

The test_support module exports TESTFN; there's also tempfile.mktemp()
which should generate temporary files on all platforms.

Is that enough?
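
A small sketch of a portable scratch file for a test, in place of a hard-coded
'/tmp/delete_me'.  It uses tempfile.mkstemp() rather than the mktemp() call
mentioned above, because mkstemp() creates the file as well as naming it; the
helper name and the "delete_me" prefix are illustrative only.

```python
import os
import tempfile

def write_and_read(data):
    """Write data to a temporary file, read it back, then remove the file."""
    fd, path = tempfile.mkstemp(prefix="delete_me")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        with open(path) as f:
            return f.read()
    finally:
        # Clean up on every platform, whether or not the test body failed.
        os.remove(path)

print(write_and_read("hello"))
```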

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Sat Jan 20 17:36:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 17:36:05 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 11:18:39AM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>
Message-ID: <20010120173605.P17295@xs4all.nl>

On Sat, Jan 20, 2001 at 11:18:39AM -0500, Guido van Rossum wrote:
> > On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> > 
> > > filename = '/tmp/delete_me'
> > 
> > This reminds me: we need a portable way to handle test-files :)
> Yeah, I noticed that this test failed on Windows -- fixed now.

> The test_support module exports TESTFN; there's also tempfile.mktemp()
> which should generate temporary files on all platforms.
> Is that enough?

Well, there is one more issue, which we can't fix terribly easily: test_fcntl
tries to flock() the file. flock() doesn't work on all filesystems (like
NFS) :P If we cared a lot, we could try several alternatives (current dir,
/tmp, /var/tmp) in the specific case of flock, but personally I don't want to
bother, and real sysadmins (who should care about the test failure) are more
likely to build Python on a local disk than in their NFS-mounted
homedirectory. At least that's how we do it :-) 
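
The "try several alternatives" idea could look roughly like the sketch below.
It is Unix-only (fcntl is not available on Windows), the function name is made
up, and it is not what test_fcntl does: it simply returns the first candidate
directory in which a scratch file can be created and flock()ed.

```python
import fcntl
import os
import tempfile

def first_lockable_dir(candidates):
    """Return the first directory where flock() works on a scratch file."""
    for directory in candidates:
        try:
            with tempfile.TemporaryFile(dir=directory) as f:
                # Non-blocking exclusive lock, released again right away.
                fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
                fcntl.flock(f.fileno(), fcntl.LOCK_UN)
            return directory
        except OSError:
            # Missing directory, no write permission, or locking unsupported.
            continue
    return None

print(first_lockable_dir([os.curdir, "/tmp", "/var/tmp"]))
```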

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Sat Jan 20 17:43:49 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:43:49 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: Your message of "Sat, 20 Jan 2001 01:30:42 EST."
             <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com> 
Message-ID: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>

> I had a huge string and wanted to put a double-quote on each end.  The
> boring:
> 
>     '"' + huge + '"'
> 
> does the job, but is inefficient <snort>.  Then this transparent variation
> sprang unbidden from my hoary brow:
> 
>     huge.join('""')

Points off for obscurity though!  My favorite for this is:

    '"%s"' % huge

Worth a microbenchmark?

> *That* should put to rest the argument over whether .join() is more properly
> a method of the separator or the sequence -- '""'.join(huge) instead would
> look plain silly <wink>.
> 
> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

Give up the channeling for a while -- there's too much interference in
the air from the Microsoft threaded stdio debate still. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Sat Jan 20 17:47:44 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 10:47:44 -0600 (CST)
Subject: [Python-Dev] how to test my __all__ lists?
Message-ID: <14953.49456.654121.987189@beluga.mojam.com>

How do I test the __all__ lists I'm building?  I'm worried about a couple
things:

    1. I may have typos
    2. I may leave something out of a list that should be imported by
       from-module-import-*.

Thoughts?

Skip



From guido at digicool.com  Sat Jan 20 18:00:05 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:00:05 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Sat, 20 Jan 2001 17:36:05 +0100."
             <20010120173605.P17295@xs4all.nl> 
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>  
            <20010120173605.P17295@xs4all.nl> 
Message-ID: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>

> > > > filename = '/tmp/delete_me'
> > > 
> > > This reminds me: we need a portable way to handle test-files :)
> > Yeah, I noticed that this test failed on Windows -- fixed now.
> 
> > The test_support module exports TESTFN; there's also tempfile.mktemp()
> > which should generate temporary files on all platforms.
> > Is that enough?
> 
> Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> tries to flock() the file. flock() doesn't work on all filesystems (like
> NFS) :P If we cared a lot, we could try several alternatives (current dir,
> /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> bother, and real sysadmins (who should care about the test failure) are more
> likely to build Python on a local disk than in their NFS-mounted
> homedirectory. At least that's how we do it :-) 

These days, I would think that it's a pretty sure bet that the
system's tmp directory is not on NFS.  Then we could just use
tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
on Linux (which AFAIK is a RAM disk implemented in virtual memory so
it uses swap space when it runs out of RAM) not support locking?

I don't particularly care about fixing this -- I haven't seen bug
reports about this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 20 18:38:38 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:38:38 -0500
Subject: [Python-Dev] how to test my __all__ lists?
In-Reply-To: Your message of "Sat, 20 Jan 2001 10:47:44 CST."
             <14953.49456.654121.987189@beluga.mojam.com> 
References: <14953.49456.654121.987189@beluga.mojam.com> 
Message-ID: <200101201738.MAA16636@cj20424-a.reston1.va.home.com>

> How do I test the __all__ lists I'm building?  I'm worried about a couple
> things:
> 
>     1. I may have typos

Do "from M import *" -- this will raise an AttributeError if there's
something in __all__ that's not defined in the module.

>     2. I may leave something out of a list that should be imported by
>        from-module-import-*.

That's what alpha-testing's for.
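
Both worries can be partly checked without doing the import by hand.  The
sketch below uses a made-up helper, check_all: it reports names listed in
__all__ that the module does not define (likely typos), and public names the
module defines but does not list.  The second list is only a hint, since it
also picks up imported modules and helpers that were left out on purpose.

```python
import types

def check_all(module):
    """Return (undefined, unlisted) name lists for a module's __all__."""
    exported = list(getattr(module, "__all__", []))
    # Names in __all__ that the module does not actually define.
    undefined = [name for name in exported if not hasattr(module, name)]
    # Public-looking names that are defined but missing from __all__.
    public = [name for name in vars(module) if not name.startswith("_")]
    unlisted = sorted(set(public) - set(exported))
    return undefined, unlisted

demo = types.ModuleType("demo")
demo.spam = 1
demo.eggs = 2
demo.__all__ = ["spam", "egs"]  # deliberate typo for the demonstration

print(check_all(demo))
```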

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at netaxs.com  Sat Jan 20 18:49:43 2001
From: esr at netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 12:49:43 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <3A672376.4B951848@lemburg.com>; from M.-A. Lemburg on Thu, Jan 18, 2001 at 06:10:14PM +0100
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>
Message-ID: <20010120124943.C6073@unix3.netaxs.com>

> A combination of time.time(), process id and counter should
> work in all cases. Make sure you use a lock around the counter,
> though.

Yes, but...this hack has to work in a multithreaded environment,
so process ID isn't good enough.  And I don't want to keep a counter
around if I don't have to.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>



From guido at digicool.com  Sat Jan 20 19:01:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 13:01:04 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Sat, 20 Jan 2001 12:49:43 EST."
             <20010120124943.C6073@unix3.netaxs.com> 
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>  
            <20010120124943.C6073@unix3.netaxs.com> 
Message-ID: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>

> > A combination of time.time(), process id and counter should
> > work in all cases. Make sure you use a lock around the counter,
> > though.
> 
> Yes, but...this hack has to work in a multithreaded environment,
> so process ID isn't good enough.  And I don't want to keep a counter
> around if I don't have to.

Sorry Eric, this just doesn't make sense.  Keeping a counter around in
your module (protected by a semaphore) is obviously the right
solution.  Why are you fighting it?
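
A minimal sketch of that approach, combining time.time(), the process id and
a module-level counter guarded by a lock.  The function name and the dotted
string format are illustrative only.

```python
import itertools
import os
import threading
import time

_counter = itertools.count()
_counter_lock = threading.Lock()

def unique_id():
    """Return a string that is unique within this process, across threads."""
    with _counter_lock:
        # Only the counter needs protecting; time and pid are read-only here.
        n = next(_counter)
    return "%d.%d.%d" % (int(time.time()), os.getpid(), n)

print(unique_id())
print(unique_id())
```

The counter is what guarantees uniqueness between threads of one process; the
timestamp and pid are there to separate different processes and runs.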

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at netaxs.com  Sat Jan 20 19:20:26 2001
From: esr at netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 13:20:26 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>; from Guido van Rossum on Sat, Jan 20, 2001 at 01:01:04PM -0500
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com> <20010120124943.C6073@unix3.netaxs.com> <200101201801.NAA16880@cj20424-a.reston1.va.home.com>
Message-ID: <20010120132026.E6073@unix3.netaxs.com>

On Sat, Jan 20, 2001 at 01:01:04PM -0500, Guido van Rossum wrote:
> > Yes, but...this hack has to work in a multithreaded environment,
> > so process ID isn't good enough.  And I don't want to keep a counter
> > around if I don't have to.
> 
> Sorry Eric, this just doesn't make sense.  Keeping a counter around in
> your module (protected by a semaphore) is obviously the right
> solution.  Why are you fighting it?

Actually, I'm not fighting it any more.  I changed my mind a few minutes
after shipping that response.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>



From thomas at xs4all.net  Sat Jan 20 19:37:10 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 19:37:10 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 12:00:05PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <20010120193710.Q17295@xs4all.nl>

On Sat, Jan 20, 2001 at 12:00:05PM -0500, Guido van Rossum wrote:

> > Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> > tries to flock() the file. flock() doesn't work on all filesystems (like
> > NFS) :P If we cared a lot, we could try several alternatives (current dir,
> > /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> > bother, and real sysadmins (who should care about the test failure) are
> > more likely to build Python on a local disk than in their NFS-mounted
> > homedirectory. At least that's how we do it :-)

> These days, I would think that it's a pretty sure bet that the
> system's tmp directory is not on NFS.  Then we could just use
> tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
> on Linux (which AFAIK is a RAM disk implemented in virtual memory so
> it uses swap space when it runs out of RAM) not support locking?

Actually, most Linux distributions don't care enough about /tmp to make it a
RAM-based filesystem. At least Debian and RedHat don't :) (There's a good
reason for that: Linux's disk-data cache rocks if you have enough RAM, so
there's no real gain in using a ramdisk) BSDI does (optionally) have such a
/tmp, and probably the other BSD derived systems as well. But that doesn't
mean it doesn't support locking, so that's not a real excuse.

But like I said, I don't care enough to worry about it. I'll look at it
before alpha2.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Sat Jan 20 21:10:51 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 15:10:51 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>

[Tim]
> ...
> 4. We've again given up on avoiding surprises in *simple* comparisons
>    among builtin types, like (under current CVS):
>
> >>> 1 < [1] < 0L < 1
> 1
> >>> 1 < 1
> 0
> >>>

I really dislike that.  Here's a consequence at a higher level:

N = 5
x = [1 for i in range(N)] + \
    [[1] for i in range(N)] + \
    [0L for i in range(N)]

x.sort()
print x

from random import shuffle
tries = failures = 0
while failures < 5:
    tries += 1
    y = x[:]
    shuffle(y)
    y.sort()
    if x != y:
        print "oops, on try number", tries
        print y
        failures += 1

and here's a typical run (2.1a1):

[1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L, 0L]
oops, on try number 3
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 5
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 6
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 7
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 8
[0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L]

I've often used list.sort() on a heterogeneous list simply to bring the
elements of the same type next to each other.  But as "try number 5" shows,
I can no longer rely on even getting all the lists together.  Indeed,
heterogeneous list.sort() has become a very bad (biased and slow)
implementation of random.shuffle() <wink>.

Under 2.0, the program never prints "oops", because the only violations of
transitivity in 2.0's ordering of builtin types were bugs in the
implementation (none of which show up in this simple test case); 2.0's
.sort() *always* produces

[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]

The base trick in 2.0 was sound:  when falling back to the "compare by name
of the type" last resort, treat all numeric types as if they had the same
name.
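
The same trick can be imitated with an explicit sort key: compare by type
name first, but give every real-number type the same (empty) name.  This is
only a sketch of the idea, not the comparison code in 2.0; the helper name
mixed_key is made up, and it assumes values of the same non-numeric type can
be ordered among themselves.

```python
import numbers
import random

def mixed_key(obj):
    """Sort key: numbers share one type name, everything else uses its own."""
    if isinstance(obj, numbers.Real):
        name = ""  # all real-number types compare as one group
    else:
        name = type(obj).__name__
    return (name, obj)

data = [1, [1], 0, 2.5, [0], "b", "a"]
random.shuffle(data)
print(sorted(data, key=mixed_key))
```

Because the key gives a consistent total order for these values, the result
does not depend on how the input was shuffled.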

While Python can't enforce that any user-defined __cmp__ is consistent, I
think it should continue to set a good example in the way it implements its
own comparisons.

grumblingly y'rs  - tim




From skip at mojam.com  Sat Jan 20 21:42:27 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 14:42:27 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
Message-ID: <14953.63539.629197.232848@beluga.mojam.com>

A bit late for 2.1alpha1, but it just occurred to me that perhaps there
should be an annotation in the documentation that indicates whether or not a
module is thread-safe.  For example, many functions in fileinput rely on a
module global called _state.  It strikes me that this module is not likely
to be thread-safe, yet the documentation doesn't appear to mention this,
certainly not in an obvious fashion.

Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
LaTeX macros in Fred's arsenal?  This would make documenting these
properties both easy and consistent across modules.

Skip




From tim.one at home.com  Sat Jan 20 22:13:41 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 16:13:41 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBEIKAA.tim.one@home.com>

[Tim]
>     huge.join('""')

[Guido]
> Points off for obscurity though!

The Subject line was "Stupid Python Tricks" for a reason <wink>.  Those who
don't know the language inside-out should be tickled by figuring out why it
even *works* (hint for the baffled:  you have to view '""' as a sequence
rather than as an atomic string).
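
Spelled out for the baffled, with a deliberately small "huge": the two-character
string '""' is treated as a sequence of two one-character strings, and huge
becomes the separator placed between them.

```python
huge = "x" * 10
quote_pair = '""'

# A string is a sequence of its characters: here, two double-quotes.
print(list(quote_pair))

# join() puts the separator (huge) between the two items.
joined = huge.join(quote_pair)
print(joined)

# All three spellings from this thread build the same string.
print(joined == '"' + huge + '"' == '"%s"' % huge)
```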

> My favorite for this is:
>
>     '"%s"' % huge
>
> Worth a microbenchmark?

Absolutely!  I get:

     obvious  15.574
     obscure   8.165
     sprintf   8.133

after running:

ITERS = 1000
indices = [0] * ITERS

def obvious(huge):
    for i in indices:  '"' + huge + '"'

def obscure(huge):
    for i in indices:  huge.join('""')

def sprintf(huge):
    for i in indices:  '"%s"' % huge

def runtimes(huge):
    from time import clock
    for f in obvious, obscure, sprintf:
        start = clock()
        f(huge)
        finish = clock()
        print "%12s %7.3f" % (f.__name__, finish - start)

runtimes("x" * 1000000)

under current 2.1a1.  Not a dead-quiet machine, but the difference is too
small to care.  Speed up huge.join attr lookup, and it would probably be
faster <wink>.  Hmm:  if I boost ITERS high enough and cut back the size of
huge, "obscure" eventually becomes *slower* than "obvious", even if the
"huge.join" lookup is floated out of the loop.  I guess that points to the
relative burden of calling a bound method.  So, in real life, the huge.join
approach may well be the slowest!
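
For anyone repeating the experiment on a current interpreter, here is a
comparable sketch using the timeit module instead of hand-rolled clock()
calls.  The function name and default sizes are arbitrary, and the numbers
will vary by machine and Python version, so no particular ranking is implied.

```python
import timeit

def time_quoting(huge, number=1000):
    """Time the three ways of wrapping a string in double-quotes."""
    candidates = {
        "obvious": lambda: '"' + huge + '"',
        "obscure": lambda: huge.join('""'),
        "sprintf": lambda: '"%s"' % huge,
    }
    return {name: timeit.timeit(func, number=number)
            for name, func in candidates.items()}

for name, seconds in time_quoting("x" * 100000, number=200).items():
    print("%12s %7.3f" % (name, seconds))
```

Varying both the string size and number makes it easier to see where call
overhead, rather than copying, starts to dominate.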

>> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

> Give up the channeling for a while -- there's too much interference in
> the air from the Microsoft threaded stdio debate still. :-)

What debate?  You need two arguably valid points of view for a debate to
even start <wink>.

gloating-in-victory-vicious-in-defeat-but-simply-unbearable-in-
    ambiguity-ly y'rs  - tim




From fdrake at acm.org  Sat Jan 20 22:23:58 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 16:23:58 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
	<20010119004532.G17392@xs4all.nl>
	<200101201618.LAA15675@cj20424-a.reston1.va.home.com>
	<20010120173605.P17295@xs4all.nl>
	<200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
 > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
 > it uses swap space when it runs out of RAM) not support locking?

  I thought it was Solaris that used available+virtual memory for
/tmp; that was what we ran into at CNRI.  (Which doesn't preclude
Linux from doing the same, I just don't recall that we've encountered
that.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From fdrake at acm.org  Sat Jan 20 23:05:27 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 17:05:27 -0500 (EST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
 > should be an annotation in the documentation that indicates whether or not a
 > module is thread-safe.  For example, many functions in fileinput rely on a

  If you can create a list of the known thread safe and known thread
unsafe modules, I'll come up with appropriate annotations for the
documentation.

 > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
 > LaTeX macros in Fred's arsenal?  This would make documenting these
 > properties both easy and consistent across modules.

  Not sure that this is exactly the right approach to the markup; I'll
think about this one.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From skip at mojam.com  Sat Jan 20 23:31:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 16:31:52 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
	<14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
Message-ID: <14954.4568.460875.662560@beluga.mojam.com>

    Fred> If you can create a list of the known thread safe and known thread
    Fred> unsafe modules, I'll come up with appropriate annotations for the
    Fred> documentation.

I think that's going to be a significant undertaking, requiring examination
of a lot of Python and C code.  I'd rather approach it incrementally, which
was why I suggested the LaTeX macros.  As modules are determined to be safe
or unsafe, the appropriate safety macro could just be inserted into the
correct lib*.tex file.  It would (in my mind) expand to a stock bit of text
inserted at a standard place in the file.

Skip



From tim.one at home.com  Sat Jan 20 23:52:09 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 17:52:09 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBNIKAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to
> the litany of LaTeX macros in Fred's arsenal?  This would make
> documenting these properties both easy and consistent across
> modules.

When a module is *not* threadsafe, that's usually considered "a bug" in the
module.  So we should just point out modules that aren't threadsafe by
design.  Alas, that's A Project.




From nas at arctrix.com  Sat Jan 20 16:59:14 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 07:59:14 -0800
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 20, 2001 at 03:10:51PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>
Message-ID: <20010120075914.B18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 03:10:51PM -0500, Tim Peters wrote:
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think the 2.0 behavior should be fairly easy to restore.  I'll
leave it up to Guido though since he's "Mr. Comparison" now and I
haven't looked at the code since I checked in the coercion patch.

  Neil



From nas at arctrix.com  Sat Jan 20 17:03:36 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 08:03:36 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Sat, Jan 20, 2001 at 04:23:58PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com> <14954.494.223724.705495@cj42289-a.reston1.va.home.com>
Message-ID: <20010120080336.C18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 04:23:58PM -0500, Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
>  > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
>  > it uses swap space when it runs out of RAM) not support locking?
> 
>   I thought it was Solaris that used available+virtual memory for
> /tmp; that was what we ran into at CNRI.  (Which doesn't preclude
> Linux from doing the same, I just don't recall that we've encountered
> that.)

I don't know of any Linux system that uses a RAM based /tmp.  The
Linux implementation of ext2 is so fast it doesn't make any sense.
If you have enough memory all the data is stored in the buffer,
page, and inode caches anyhow.


  Neil



From trentm at ActiveState.com  Sun Jan 21 00:35:56 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Sat, 20 Jan 2001 15:35:56 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
Message-ID: <20010120153556.C18375@ActiveState.com>

... or am I missing something?

With Python 2.0 on Windows 2000, when playing with sys.exit() and sys.argv
I get some unexpected results.

First here is a simple case that shows what I expect. I run "caller_good.py"
which calls "callee_good.py" and prints its return value. "callee_good.py"
returns 42 so "42" is printed:
    ----------------- caller_good.py --------------------
    import os
    retval = os.system("python callee_good.py")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_good.py --------------------
    import sys
    sys.exit(42)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_good.py
    caller: the retval is 42


Now here is what I didn't expect. I changed "caller_bad.py" to pass, as an
argument, the value that "callee_bad.py" should return.

    ----------------- caller_bad.py ---------------------
    import os
    retval = os.system("python callee_bad.py 42")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_bad.py ---------------------
    import sys
    firstarg = sys.argv[1]
    print "callee_bad: firstarg is", firstarg
    sys.exit(firstarg)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_bad.py
    callee_bad: firstarg is 42
    42                             # <---- where did *this* print come from?
    caller: the retval is 1        # <---- and this retval is incorrect


Any ideas? I have not tried to track this down yet nor have I tried the
latest Python-CVS state.

Trent

-- 
Trent Mick
TrentM at ActiveState.com



From moshez at zadka.site.co.il  Sun Jan 21 13:37:57 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sun, 21 Jan 2001 14:37:57 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010121123757.D897BA83E@darjeeling.zadka.site.co.il>

Yay! I can change to python-dev manually!
(hear sounds of the timbot's teeth grinding)

On Sat, 20 Jan 2001, Skip Montanaro <montanaro at users.sourceforge.net> wrote:
> def check_all(_modname):
>     exec "import %s" % _modname
>     verify(hasattr(sys.modules[_modname],"__all__"),
>            "%s has no __all__ attribute" % _modname)
>     exec "del %s" % _modname
>     exec "from %s import *" % _modname
>     
>     _keys = locals().keys()
....

Wouldn't it be better to use the

d = {}
exec "foo", d

And verify "d" instead?
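
A minimal sketch of that idea, written for a current Python 3 interpreter
(where exec is a function) rather than the Python 2 of this thread. The
helper names public_names_via_exec and check_all are made up for this
illustration and are not the code that was checked in:

```python
import importlib


def public_names_via_exec(modname):
    """Run 'from <modname> import *' in a private dict and report what landed there."""
    namespace = {}
    exec("from %s import *" % modname, namespace)
    # exec() adds __builtins__ to the dict it is given; that is not an export.
    namespace.pop("__builtins__", None)
    return set(namespace)


def check_all(modname):
    """Return True if the star-import exports exactly the names listed in __all__."""
    module = importlib.import_module(modname)
    assert hasattr(module, "__all__"), "%s has no __all__ attribute" % modname
    return public_names_via_exec(modname) == set(module.__all__)
```

Because the star-import runs against a fresh dictionary, there is no need
to filter the test function's own locals or to delete the module name
afterwards.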

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Sun Jan 21 17:51:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 11:51:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sat, 20 Jan 2001 15:10:51 EST."
             <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com> 
Message-ID: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>

[Tim, complaining that numerical types are no longer lumped together
in default comparisons:]
> I've often used list.sort() on a heterogeneous list simply to bring the
> elements of the same type next to each other.  But as "try number 5" shows,
> I can no longer rely on even getting all the lists together.  Indeed,
> heterogenous list.sort() has become a very bad (biased and slow)
> implementation of random.shuffle() <wink>.
> 
> Under 2.0, the program never prints "oops", because the only violations of
> transitivity in 2.0's ordering of builtin types were bugs in the
> implementation (none of which show up in this simple test case); 2.0's
> .sort() *always* produces
> 
> [0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
> 
> The base trick in 2.0 was sound:  when falling back to the "compare by name
> of the type" last resort, treat all numeric types as if they had the same
> name.
> 
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think I can put this behavior back.  (I believe that before I
reorganized the comparison code, it seemed really tricky to do this,
but after refactoring the code, it's quite easy to do.)

My only concern is that under the old scheme, two different numeric
extension types that somehow can't be compared will end up being
*equal*.  To fix this, I propose that if the names compare equal, as a
last resort we compare the type pointers -- this should be consistent
too.

Here's a patch that stops your test program from reporting failures:

*** object.c	2001/01/21 16:25:18	2.112
--- object.c	2001/01/21 16:50:16
***************
*** 522,527 ****
--- 522,528 ----
  default_3way_compare(PyObject *v, PyObject *w)
  {
  	int c;
+ 	char *vname, *wname;
  
  	if (v->ob_type == w->ob_type) {
  		/* When comparing these pointers, they must be cast to
***************
*** 550,557 ****
  	}
  
  	/* different type: compare type names */
! 	c = strcmp(v->ob_type->tp_name, w->ob_type->tp_name);
! 	return (c < 0) ? -1 : (c > 0) ? 1 : 0;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)
--- 551,571 ----
  	}
  
  	/* different type: compare type names */
! 	if (v->ob_type->tp_as_number)
! 		vname = "";
! 	else
! 		vname = v->ob_type->tp_name;
! 	if (w->ob_type->tp_as_number)
! 		wname = "";
! 	else
! 		wname = w->ob_type->tp_name;
! 	c = strcmp(vname, wname);
! 	if (c < 0)
! 		return -1;
! 	if (c > 0)
! 		return 1;
! 	/* Same type name, or (more likely) incomparable numeric types */
! 	return (v->ob_type < w->ob_type) ? -1 : 1;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)

Let me know if you agree with this.
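
For readers who would rather not parse the diff, here is a rough Python 3
sketch of the ordering rule the patch describes. It is an illustration
only, not the C implementation: fallback_name and this Python
default_3way_compare are hypothetical helpers, isinstance checks against
the numbers ABCs stand in for the tp_as_number test, and id(type(x))
stands in for the type pointer.

```python
import functools
import numbers


def fallback_name(obj):
    # All numeric types share the empty name, like vname = "" in the patch.
    return "" if isinstance(obj, numbers.Number) else type(obj).__name__


def default_3way_compare(v, w):
    both_real = isinstance(v, numbers.Real) and isinstance(w, numbers.Real)
    if type(v) is type(w) or both_real:
        # Assumption for this sketch: these operands support < and >.
        return (v > w) - (v < w)
    vname, wname = fallback_name(v), fallback_name(w)
    if vname != wname:
        return -1 if vname < wname else 1
    # Same name, or incomparable numeric types: fall back to type identity.
    return -1 if id(type(v)) < id(type(w)) else 1


mixed = [[1], 1, 0.0, [1], 1, 0.0]
ordered = sorted(mixed, key=functools.cmp_to_key(default_3way_compare))
```

With this rule the numbers sort ahead of the lists, because the empty name
compares less than "list", and the final fallback never reports two
different types as equal.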

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sun Jan 21 18:00:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 12:00:02 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: Your message of "Sat, 20 Jan 2001 14:42:27 CST."
             <14953.63539.629197.232848@beluga.mojam.com> 
References: <14953.63539.629197.232848@beluga.mojam.com> 
Message-ID: <200101211700.MAA25479@cj20424-a.reston1.va.home.com>

> A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> should be an annotation in the documentation that indicates whether or not a
> module is thread-safe.  For example, many functions in fileinput rely on a
> module global called _state.  It strikes me that this module is not likely
> to be thread-safe, yet the documentation doesn't appear to mention this,
> certainly not in an obvious fashion.
> 
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> LaTeX macros in Fred's arsenal?  This would make documenting these
> properties both easy and consistent across modules.

It's hard to say whether a *whole module* is threadsafe.  E.g. in the
fileinput example, there's the clear implication that if you use this
in multiple threads, you should instantiate your own FileInput
instances, and then you're totally thread-safe.  Clearly the semantics
of the module-global functions are thread-unsafe though.
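
A small sketch of the per-thread pattern, assuming a current Python 3
interpreter. The count_lines and demo helpers, and the temporary files
they create, are invented for this illustration:

```python
import fileinput
import os
import tempfile
import threading


def count_lines(paths, results, key):
    # Each thread builds its own FileInput, so nothing touches the
    # module-global state used by fileinput.input() and friends.
    with fileinput.FileInput(files=paths) as reader:
        results[key] = sum(1 for _ in reader)


def demo():
    with tempfile.TemporaryDirectory() as tmpdir:
        paths = []
        for index, line_count in enumerate((3, 5)):
            path = os.path.join(tmpdir, "part%d.txt" % index)
            with open(path, "w") as handle:
                handle.write("line\n" * line_count)
            paths.append(path)

        results = {}
        threads = [
            threading.Thread(target=count_lines, args=([path], results, path))
            for path in paths
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
        return [results[path] for path in paths]
```

The module-level functions share one hidden reader, which is why they are
the part that would need a "not thread-safe" note.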

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jan 21 19:45:07 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 13:45:07 -0500
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDGIKAA.tim.one@home.com>

test test_sax crashed -- 
    exceptions.SystemError: 'finally' pops bad exception

Sometimes it crashes (some flavor of memory fault) instead.

Elsewhere?




From nas at arctrix.com  Sun Jan 21 13:28:35 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 04:28:35 -0800
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <20010121042835.A19774@glacier.fnational.com>

I've been working a bit on the build process lately.  I came
across this in the autoconf documentation:


    If a software package has optional compile-time features, the
    user can give `configure' command line options to specify
    whether to compile them. The options have one of these forms:

        --enable-FEATURE[=ARG]
        --disable-FEATURE

    Some packages require, or can optionally use, other software
    packages which are already installed.  The user can give
    `configure' command line options to specify which such
    external software to use.  The options have one of these
    forms:

        --with-package[=ARG]
        --without-package


Is it worth fixing the Python configure script to comply with
these definitions?  It looks like with-cycle-gc and maybe
with-pydebug would have to be changed.

  Neil

    AC_ARG_ENABLE

    



From tim.one at home.com  Sun Jan 21 20:44:38 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 14:44:38 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>

[Guido, on again lumping numbers together]
> I think I can put this behavior back.  (I believe that before I
> reorganized the comparison code, it seemed really tricky to do this,
> but after refactoring the code, it's quite easy to do.)

I can believe that; and I believe the "bugs" in 2.0 ended up somewhere in or
around the bowels of the xxxHalfBinOp-like routines (which were really
tricky to my eyes -- the interactions among coercions and comparisons were
hard to keep straight).

> My only concern is that under the old scheme, two different numeric
> extension types that somehow can't be compared will end up being
> *equal*.  To fix this, I propose that if the names compare equal, as a
> last resort we compare the type pointers -- this should be consistent
> too.

Agreed, and sounds fine!  Save Barry a little work, though:

> ! 	/* Same type name, or (more likely) incomparable numeric types */
> ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

That's non-std C in a way Insure complains about elsewhere; change to

	return ((Py_uintptr_t)v->ob_type <
		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
     compile-at-all<wink>-ly y'rs  - tim




From trentm at ActiveState.com  Sun Jan 21 21:01:44 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Sun, 21 Jan 2001 12:01:44 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010120153556.C18375@ActiveState.com>; from trentm@ActiveState.com on Sat, Jan 20, 2001 at 03:35:56PM -0800
References: <20010120153556.C18375@ActiveState.com>
Message-ID: <20010121120144.B28643@ActiveState.com>

On Sat, Jan 20, 2001 at 03:35:56PM -0800, Trent Mick wrote:
> 
> ... or am I missing something?

Ignore me. RTFM (sys.exit), Trent.

Sorry,
Trent


-- 
Trent Mick
TrentM at ActiveState.com



From tim.one at home.com  Sun Jan 21 21:13:02 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 15:13:02 -0500
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010121120144.B28643@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDKIKAA.tim.one@home.com>

[Trent, quoting Trent]
>>
>> ... or am I missing something?

[and back to Trent]
> Ignore me. RTFM (sys.exit), Trent.

Nobody wants to ignore *you*, Trent!  If it's not the case that you wanted
to code

sys.exit(int(firstarg))

instead, holler, cuz if that wasn't the problem I'm still baffled.

or-if-it-was-it-caught-you-because-sys.exit's-tricks-aren't-
    really-pythonic-ly y'rs  - tim
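
The behaviour is easy to reproduce. The sketch below assumes a current
Python 3 interpreter and uses subprocess in place of the os.system call
from the original report; run_child is a hypothetical helper written for
this note:

```python
import subprocess
import sys


def run_child(exit_expr, arg):
    """Run a child interpreter that calls sys.exit(<exit_expr>) with one argument."""
    code = "import sys; sys.exit(%s)" % exit_expr
    return subprocess.run(
        [sys.executable, "-c", code, arg],
        capture_output=True,
        text=True,
    )


# A string argument is printed to stderr and the exit status becomes 1.
as_string = run_child("sys.argv[1]", "42")

# An integer argument is used as the exit status itself.
as_int = run_child("int(sys.argv[1])", "42")
```

So the stray "42" in the report is sys.exit echoing its string argument,
and the status of 1 is the generic failure code used in that case.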




From loewis at informatik.hu-berlin.de  Sun Jan 21 22:21:24 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:21:24 +0100 (MET)
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>

> Elsewhere?

Not for me, on either Solaris or Linux. What expat version?

Regards,
Martin



From loewis at informatik.hu-berlin.de  Sun Jan 21 22:22:44 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:22:44 +0100 (MET)
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>

> It looks like with-cycle-gc and maybe with-pydebug would have to be
> changed.

I'm in favour of changing it.

Regards,
Martin



From loewis at informatik.hu-berlin.de  Sun Jan 21 22:34:08 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:34:08 +0100 (MET)
Subject: [Python-Dev] test___all__ fails with no bsddb
Message-ID: <200101212134.WAA16446@pandora.informatik.hu-berlin.de>

On my Solaris 2.6 installation, with no bsddb module, I get

test test___all__ failed -- dbhash has no __all__ attribute

This is caused by anydbm importing dbhash first. After that fails,
dbhash is still in sys.modules, and the next import of dbhash silently
loads an incomplete module.
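
The mechanism can be sketched in a few lines, assuming a current Python 3
interpreter. Recent Python versions generally remove a module from
sys.modules when its import fails, so the sketch plants the leftover
entry by hand; "hypothetical_dbhash" is a made-up name, not a real module:

```python
import importlib
import sys
import types

# Stand-in for a module whose import failed part-way and left an
# incomplete entry behind in sys.modules.
leftover = types.ModuleType("hypothetical_dbhash")
sys.modules["hypothetical_dbhash"] = leftover

# The import system consults sys.modules first, so this raises no error
# and simply hands back the incomplete module object.
again = importlib.import_module("hypothetical_dbhash")
reused_leftover = again is leftover
has_all = hasattr(again, "__all__")

# Clean up so the fake entry does not leak into later imports.
del sys.modules["hypothetical_dbhash"]
```

A test that imports the name a second time therefore sees a module with
none of its expected attributes, which matches the reported failure.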

Regards,
Martin



From tim.one at home.com  Sun Jan 21 22:38:11 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 16:38:11 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>

[Martin von Loewis]
> Not for me, on either Solaris or Linux. What expat version?

Tell me how to answer the question, and I'll be happy to (I have no idea
what any of this stuff is or does).

My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in particular
is revision 2.33.

xmltok.dll and xmlparse.dll were obtained from

    ftp://ftp.jclark.com/pub/xml/expat.zip

for the 2.0 release.

Is any of that relevant?

The tests passed in the wee hours (EST; UTC -0500) this morning.  They began
failing after I updated around 1pm EST today.




From thomas at xs4all.net  Sun Jan 21 22:54:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 22:54:05 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 02:44:38PM -0500
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>
Message-ID: <20010121225405.M17392@xs4all.nl>

On Sun, Jan 21, 2001 at 02:44:38PM -0500, Tim Peters wrote:

> > ! 	/* Same type name, or (more likely) incomparable numeric types */
> > ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

> That's non-std C in a way Insure complains about elsewhere; change to

> 	return ((Py_uintptr_t)v->ob_type <
> 		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

Why is comparing v->ob_type with w->ob_type illegal ? They're both pointers
to the same type, aren't they ?

> if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
>      compile-at-all<wink>-ly y'rs  - tim

That's easy to check, gcc has these nice (and from a user's point of view,
fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
'-ansi' disables some GCC-specific features, -pedantic turns gcc into a
whiny pedant I'm sure you'd get along with just fine <wink>, and
-pedantic-errors turns those whines into errors.

Doing a quick check I see one error I added myself (but haven't committed) in
the continue-inside-try patch (a trailing comma in an enumerator
definition), and one error in configure (it mis-detects the arguments to
setpgrp() in strict-ANSI mode, for some reason.) I don't see any errors in
the core Python. I see an error in the nis module (missing function
prototype, and broken system-include file) and a *lot* of errors in
linuxaudiodev, but nothing else in the set of modules I can compile. Not
bad!

Note that this was tested in a current tree. I couldn't find either Guido's
'broken' code or your proposed 'good' code, so I don't know if you checked
in a fix yet. If you didn't, don't bother, it's not broken :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Sun Jan 21 23:00:47 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 23:00:47 +0100 (MET)
Subject: [Python-Dev] Re: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>

> [Martin von Loewis]
> > Not for me, on either Solaris or Linux. What expat version?
> 
> Tell me how to answer the question, and I'll be happy to (I have no idea
> what any of this stuff is or does).
>
> My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in
> particular is revision 2.33.

That's good; mine too.

> xmltok.dll and xmlparse.dll were obtained from
> 
>     ftp://ftp.jclark.com/pub/xml/expat.zip
> 
> for the 2.0 release.
> 
> Is any of that relevant?

That gives some clue, yes. Unfortunately, that URL itself is a symlink
that was expat1_1.zip (157936 bytes) at some point, and now is
expat1_2.zip (153591 bytes). The files themselves are not
self-identifying, it's hard to tell once unzipped...

Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
works for me. I never tested 1.95.x (which is also not available from
jclark.com).

> The tests passed in the wee hours (EST; UTC -0500) this morning.
> They began failing after I updated around 1pm EST today.

I just merged pyexpat changes from PyXML into Python 2 so that could
be the cause. However, this very code has been used for some time by
PyXML users, why it crashes for you is a mystery to me.

Any chance of producing a C backtrace?

Regards,
Martin



From tim.one at home.com  Sun Jan 21 23:09:30 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:09:30 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>

FYI, under the debug-build Python, running test_sax.py under the debugger
dies like so:

Passed test_attrs_empty
Passed test_attrs_wattr
Passed test_escape_all
Passed test_escape_basic
Passed test_escape_extra
Passed test_expat_attrs_empty
Passed test_expat_attrs_wattr
Passed test_expat_dtdhandler
Passed test_expat_entityresolver
Passed test_expat_file
Traceback (most recent call last):
  File "../lib/test/test_sax.py", line 603, in ?
    confirm(value(), name)
  File "../lib/test/test_sax.py", line 435, in test_expat_incomplete
    parser.parse(StringIO("<foo>"))
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 42, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\code\python\dist\src\lib\xml\sax\xmlreader.py", line 122, in parse
    self.close()
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 91, in close
    self.feed("", isFinal = 1)
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 82, in feed
    except expat.error:
SystemError: 'finally' pops bad exception

Running it from a command line instead produces the same output up to but
not including the traceback, and Python crashes with a memory fault then.
Attaching to the process with a debugger at that point shows it trying to do
_Py_Dealloc on an op whose op->ob_type member is NULL.  Here's the call
stack at that point:

_Py_Dealloc(_object * 0x007af100) line 1304 + 6 bytes
insertdict(dictobject * 0x007637ec, _object * 0x007a8270,
           long -1601350627, _object * 0x1e1eff18 __Py_NoneStruct)
           line 364 + 48 bytes
PyDict_SetItem(_object * 0x007637ec, _object * 0x007a8270,
          _object * 0x1e1eff18 __Py_NoneStruct) line 498 + 21 bytes
PyDict_SetItemString(_object * 0x007637ec, char * 0x1e1d84fc,
          _object * 0x1e1eff18 __Py_NoneStruct) line 1272 + 17 bytes
PySys_SetObject(char * 0x1e1d84fc, _object * 0x1e1eff18 __Py_NoneStruct)
          line 67 + 17 bytes
reset_exc_info(_ts * 0x00760630) line 2207 + 17 bytes
eval_code2(PyCodeObject * 0x00993df0, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a9d28, int 2,
          _object * * 0x007a9d30, int 1, _object * * 0x009a0b60,
          int 1) line 2125 + 9 bytes
fast_function(_object * 0x009a4f6c, _object * * * 0x0063f5a0, int 4,
          int 2, int 1) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x00993910, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a05e8, int 1,
          _object * * 0x007a05ec, int 0, _object * * 0x00000000,
         int 0) line 1860 + 37 bytes
fast_function(_object * 0x009a549c, _object * * * 0x0063f738, int 1,
         int 1, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x007b35e0, _object * 0x0098110c,
          _object * 0x00000000, _object * * 0x009beb10, int 2,
          _object * * 0x00000000, int 0, _object * * 0x00000000,
          int 0) line 1860 + 37 bytes
call_eval_code2(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2765 + 57 bytes
call_object(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2594 + 17 bytes
call_method(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2717 + 17 bytes
call_object(_object * 0x007e125c, _object * 0x009beafc,
         _object * 0x00000000) line 2592 + 17 bytes
do_call(_object * 0x007e125c, _object * * * 0x0063f96c, int 2,
        int 0) line 2915 + 17 bytes
eval_code2(PyCodeObject * 0x00991560, _object * 0x0098794c,
        _object * 0x00000000, _object * * 0x009bce98, int 2,
        _object * * 0x009bcea0, int 0, _object * * 0x00000000,
        int 0) line 1863 + 30 bytes
fast_function(_object * 0x009a7dfc, _object * * * 0x0063fb04, int 2,
        int 2, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f7e00, _object * 0x0076f14c,
       _object * 0x00000000, _object * * 0x00775904, int 0,
       _object * * 0x00775904, int 0, _object * * 0x00000000,
       int 0) line 1860 + 37 bytes
fast_function(_object * 0x009bc8ac, _object * * * 0x0063fc9c, int 0,
       int 0, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c, _object * * 0x00000000, int 0,
      _object * * 0x00000000, int 0, _object * * 0x00000000,
      int 0) line 1860 + 37 bytes
PyEval_EvalCode(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c) line 338 + 29 bytes
run_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 919 + 17 bytes
run_err_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 907 + 21 bytes
PyRun_FileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 257,
     _object * 0x0076f14c, _object * 0x0076f14c, int 1) line 899 + 21 bytes
PyRun_SimpleFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 612 + 30 bytes
PyRun_AnyFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 466 + 17 bytes
Py_Main(int 2, char * * 0x00760da0) line 295 + 44 bytes
main(int 2, char * * 0x00760da0) line 10 + 13 bytes

insertdict is doing

    Py_DECREF(old_value);

reset_exc_info is doing

    PySys_SetObject("exc_type", frame->f_exc_type);

Bet that's as helpful to you as it was to me <wink>.




From thomas at xs4all.net  Sun Jan 21 23:13:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 23:13:02 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>; from thomas@xs4all.net on Sun, Jan 21, 2001 at 10:54:05PM +0100
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <20010121225405.M17392@xs4all.nl>
Message-ID: <20010121231302.N17392@xs4all.nl>

On Sun, Jan 21, 2001 at 10:54:05PM +0100, Thomas Wouters wrote:
> I see an error in the nis module (missing function prototype, and broken
> system-include file) and a *lot* of errors in linuxaudiodev

The errors in linuxaudiodev are only errors because for some reason, in
-ansi -pedantic-errors mode, gcc doesn't define the 'linux' symbol. IMHO,
not worth fixing. The nismodule is 'broken' because of this:

static
nismaplist *
nis_maplist (void)
{
        nisresp_maplist *list;
        char *dom;
        CLIENT *cl, *clnt_create();

clnt_create() should be declared by the system include files. Anyone have
objections to me moving it to pyport.h, inside the '#if 0' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Sun Jan 21 23:28:45 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:28:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>

[Thomas Wouters]
> Why is comparing v->ob_type with w->ob_type illegal ? They're
> both pointers to the same type, aren't they ?

Non-equality comparison of pointers is defined if and only if the pointers
are both addresses in the same contiguous structure (think struct or array);
an exception is made for a pointer "one beyond the end" of an array, i.e. if

    sometype a[N];

then (&a[0] < &a[N]) == 1 is guaranteed even though &a[N] is outside the
bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
e.g., it's OK if they compare equal, or if the comparison causes a hardware
fault, or ...).

> That's easy to check, gcc has these nice (and from a user's point of view,
> fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
> '-ansi' disables some GCC-specific features, -pedantic turns gcc into a
> whiny pedant I'm sure you'd get along with just fine <wink>, and
> -pedantic-errors turns those whines into errors.

Your faith in gcc is as charming as it is naive <wink>:  the most
interesting cases of undefined behavior can't be checked no-way, no-how at
compile-time.  That's why Barry keeps talking employers into dumping
thousands of dollars into a single Insure++ license.  Insure++ actually tags
every pointer at runtime with its source, and gripes if non-equality
comparisons are done on a pair not derived from the same array or malloc
etc.  Since Python type objects are individually allocated (not taken from a
preallocated contiguous vector), Insure++ should complain about that
compare.

> ...
> Note that this was tested in a current tree. I couldn't find
> either Guido's 'broken' code or your proposed 'good' code, so I
> don't know if you checked in a fix yet. If you didn't, don't bother,
> it's not broken :-)

Guido hasn't checked it in yet, but gcc isn't smart enough to detect *this*
breakage anyway.





From fredrik at effbot.org  Mon Jan 22 00:02:10 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 00:02:10 +0100
Subject: [Python-Dev] more unicode database changes
Message-ID: <030501c083fe$2fe7dbf0$e46940d5@hagrid>

Just checked in another unicode database patch, which
saves another ~60k.  On my Windows box, the Unicode
tables are now about 200k (down from 600k in 2.0).

After this change, Modules/unicodedatabase.[ch] are no
longer used.

Since I'm on a Windows box with MSVC 5.0, I don't really
want to try removing them from the official build files. Instead,
I've checked in empty versions of the files.

Can anyone help me get rid of all references to them from
the build files (and CVS)?

</F>

PS. btw, if my changes broke the build somewhere, let me
know asap!




From tim.one at home.com  Mon Jan 22 00:07:14 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:07:14 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
> ...
> That gives some clue, yes. Unfortunately, that URL itself is a symlink
> that was expat1_1.zip (157936 bytes) at some point,

That's the one I've been using.

> and now is expat1_2.zip (153591 bytes).

I'm assuming you're recommending that one!  Based on that assumption, I've
downloaded a new one and will put that in the 2.1a1 Windows release.  Scream
if that's not what you want.

> ...
> Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
> works for me. I never tested 1.95.x (which is also not available from
> jclark.com).

If you do and love it, let me know where to get it and I'll ship that
instead.

>> The tests passed in the wee hours (EST; UTC -0500) this morning.
>> They began failing after I updated around 1pm EST today.

> I just merged pyexpat changes from PyXML into Python 2 so that could
> be the cause. However, this very code has been used for some time by
> PyXML users, why it crashes for you is a mystery to me.

Perhaps gc, perhaps uninitialized vars, ..., hard to say.  Unfortunately,
it's not unusual for flawed code to display different behavior across
platforms; or, from the long-term QA perspective, it's *great* that flawed
code doesn't always appear to work on all platforms <wink>.

> Any chance of producing a C backtrace?

Sent that before; doesn't look like much help; we're seeing a NULL type
pointer, but at that stage there's no telling when or where or why it
*became* NULL.

I'm going to rebuild the world from scratch, and use the new DLLs.  You
should assume that didn't help unless I say otherwise within 15 minutes.




From tim.one at home.com  Mon Jan 22 00:09:51 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:09:51 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEEIKAA.tim.one@home.com>

[/F]
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Yay!  I take it CNRI wasn't paying you by the byte <wink>.

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.
>
> Since I'm on a Windows box with MSVC 5.0, I don't really
> want to try removing them from the official build files. In-
> stead, I've checked in empty versions of the files.

That's fine.

> Can anyone help me get rid of all references to them from
> the build files (and CVS)?
>
> </F>
>
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

I'll take care of the MS project files -- and I was just about to rebuild
the world from scratch anyway.




From tim.one at home.com  Mon Jan 22 00:20:03 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:20:03 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEFIKAA.tim.one@home.com>

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.

Not so:  unicodedata.c still #includes unicodedatabase.h.




From tim.one at home.com  Mon Jan 22 00:53:13 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:53:13 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIKAA.tim.one@home.com>

[/F]
> ...
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

The Windows build is fine now and changes checked-in.  You can remove

    Modules/unicodedatabase.[ch]

from the project without hurting it (although I imagine the Unixish builds
still need to learn about this!).




From tim.one at home.com  Mon Jan 22 01:12:21 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:12:21 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>

More FYI:  With the new expat1_2.zip (153591 bytes) DLLs, all tests pass on
Windows except for test_sax.  No change in symptoms.  The failure modes for
test_sax depend on all of:

+ Whether run in release or debug builds.

+ Whether test_sax.py is run directly or via regrtest.py.

+ Whether I delete all .pyc/.pyo files first, or use precompiled ones.

+ In debug builds, whether the test is started from within the
  debugger, or I start it via cmdline and attach to the process after
  it crashes (with a memory fault).

Here's a new failure mode:

test test_sax crashed -- XMLParserType: no element found: line 1, column 5

So this smells to high heaven of either a nasty gc problem or referencing
uninitialized memory.  Symptoms don't change if I stick

    import gc
    gc.disable()

at the start of test_sax.py.

Barry, can you try running test_sax under Insure?  I've got little chance of
making enough time tonight to figure this out the hard way ...




From nas at arctrix.com  Sun Jan 21 18:28:52 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 09:28:52 -0800
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 07:12:21PM -0500
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de> <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>
Message-ID: <20010121092852.A24605@glacier.fnational.com>

On Sun, Jan 21, 2001 at 07:12:21PM -0500, Tim Peters wrote:
> So this smells to high heaven of either a nasty gc problem or referencing
> uninitialized memory.  Symptoms don't change if I stick
> 
>     import gc
>     gc.disable()
> 
> at the start of test_sax.py.

Can you try it with WITH_CYCLE_GC undefined?

  Neil



From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:25:08 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:25:08 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>

Suppose I have a class which checks whether it knows
how to do a comparison, and if not, wants to pass it
on to the other operand in case it knows:

  class Foo:

    def __lt__(self, other):
      if I_know_about(other):
        # do the comparison
      else:
        return other.__gt__(self)

If the other operand has a __gt__ method which is
doing similar tricks, infinite recursion could result.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:36:51 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>
Message-ID: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>

Guido:

> I don't understand how these can be not commutative unless they have a
> side effect on the left argument

I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
ceil(a,b), then clearly a<b != b>a.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From mwh21 at cam.ac.uk  Mon Jan 22 01:48:16 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 22 Jan 2001 00:48:16 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Greg Ewing's message of "Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)"
References: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>
Message-ID: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>

Greg Ewing <greg at cosc.canterbury.ac.nz> writes:

> Guido:
> 
> > I don't understand how these can be not commutative unless they have a
> > side effect on the left argument
> 
> I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
> ceil(a,b), then clearly a<b != b>a.

What's floor of two arguments?  In common lisp, (floor a b) is the
largest integer n such that (<= n (/ a b)), in Python it's a type
error...  if you meant min(a,b), then I think the programmer who
thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
deal with (if min has a symbol it's /\, but never mind that).
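
For the record, a small sketch of that difference on a current Python 3.
The helper names lisp_floor and python_floor_of_two are made up here, and
lisp_floor only models the quotient, not the remainder that Common Lisp's
floor also returns:

```python
import math


def lisp_floor(a, b):
    """Largest integer n such that n <= a / b (quotient only)."""
    return math.floor(a / b)


def python_floor_of_two(a, b):
    """Call math.floor with two arguments and report what happens."""
    try:
        return math.floor(a, b)
    except TypeError as exc:
        # math.floor accepts exactly one argument.
        return type(exc).__name__
```

Under those assumptions lisp_floor(7, 2) gives 3, while the two-argument
call to math.floor is rejected with a TypeError.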

More generally, people who define their comparison operators in
non-intuitive ways shouldn't really expect intuitive behaviour.  I
thought Guido threatened to document this fact in large letters
somewhere...

Cheers,
M.

-- 
  Premature optimization is the root of all evil in programming.  
                                                       -- C.A.R. Hoare




From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:52:25 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:52:25 +1300 (NZDT)
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure

I'm not sure that the proposed alternative (casting both
pointers to ints and comparing the ints) is any better.
Does the C std define the result of doing that to two
unrelated pointers?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 01:56:16 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:56:16 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <20010121092852.A24605@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEJIKAA.tim.one@home.com>

[Neil Schemenauer]
> Can you try it with WITH_CYCLE_GC undefined?

Good idea -- for someone with an infinite amount of free time <wink>.

But being a good sport, I did as you asked with giddy cheer.  Alas, it
didn't help (all the same bizarre context-dependent test_sax failure modes).
I'm sure I disabled WITH_CYCLE_GC correctly, because "import gc" now fails
with ImportError in both release and debug builds.

BTW, a refcount-too-low problem is another good candidate.




From greg at cosc.canterbury.ac.nz  Mon Jan 22 02:00:46 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 14:00:46 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101220100.OAA01820@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21 at cam.ac.uk>:

> if you meant min(a,b),

Yes, sorry, that's what I meant. Or at least that's what
I thought the original poster meant - if he didn't, then
I'm confused, too!

Anyway, I agree that it's a silly thing to want to make
a>b mean, and I'm not all that disappointed that it won't
be possible.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 02:11:52 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:11:52 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEKIKAA.tim.one@home.com>

[Greg Ewing]
> I'm not sure that the proposed alternative (casting both
> pointers to ints and comparing the ints) is any better.
> Does the C std define the result of doing that to two
> unrelated pointers?

C99 guarantees that, if the type exists, casting a pointer to type uintptr_t
won't blow up, and also guarantees that comparisons between (at least) ints
of the same type won't blow up.  Beyond that, we don't care what it returns.
Mostly we're trying to eliminate warnings Barry has to wade thru from
Insure++ -- same reason we have a "no compiler warnings!" build policy.
Doing the cast is obviously "better" when viewed through Barry's 4AM eyes.

You can find out *why* C has this rule (which was in C89, not new in C99) by
reading the C FAQ.




From tim.one at home.com  Mon Jan 22 02:23:27 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:23:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEELIKAA.tim.one@home.com>

[Michael Hudson]
> ...
> if you meant min(a,b), then I think the programmer who
> thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
> deal with (if min has a symbol it's /\, but never mind that).

Curiously, in the Icon language, if a is less than b then

   a < b

returns b while

   b > a

returns a.

In this way they get the same effect as Python's chained comparisons

   a < b < c < d

via purely binary operators (if a is *not* less than b, a < b in Icon
"fails", which is a silent event that causes the expression's context to
backtrack -- but we won't go into that here <wink>).

Anyway, that accounts for this curious Icon idiom:

   a <:= b

which is short for

   a := a < b

and binds a to max(a, b) (if a is smaller, a < b returns b and the
assignment proceeds; but if a is not smaller, a < b fails and that
propagates into its context, which here has no other possibilities to
backtrack into, so the stmt just ends leaving a alone).
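
For readers without an Icon interpreter handy, here is a rough Python
rendering of that behaviour.  It is only a sketch:  the names Failure,
icon_lt, chained_lt and augmented_lt are invented for illustration, and
Icon's failure-with-backtracking is approximated by a plain exception.

```python
class Failure(Exception):
    """Stands in for Icon's silent failure (no backtracking here)."""


def icon_lt(a, b):
    """Icon-style a < b: produce the right operand on success, else fail."""
    if a < b:
        return b
    raise Failure


def chained_lt(*values):
    """a < b < c < d built from nothing but the binary icon_lt."""
    result = values[0]
    try:
        for nxt in values[1:]:
            # Each success hands the right operand on to the next test.
            result = icon_lt(result, nxt)
    except Failure:
        return False
    return True


def augmented_lt(a, b):
    """Icon-style a <:= b: the new value of a."""
    try:
        return icon_lt(a, b)  # a was smaller, so a becomes b
    except Failure:
        return a              # comparison failed, a is left alone
```

With these definitions chained_lt(1, 2, 3, 4) agrees with Python's
1 < 2 < 3 < 4, and augmented_lt(a, b) returns the same value as max(a, b).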

"<"-and-">"-are-just-bags-of-pixels-ly y'rs  - tim




From uche.ogbuji at fourthought.com  Mon Jan 22 02:24:46 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Sun, 21 Jan 2001 18:24:46 -0700
Subject: [Python-Dev] should a module's thread safety be documented? 
In-Reply-To: Message from Guido van Rossum <guido@digicool.com> 
   of "Sun, 21 Jan 2001 12:00:02 EST." <200101211700.MAA25479@cj20424-a.reston1.va.home.com> 
Message-ID: <200101220124.SAA08868@localhost.localdomain>

> > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> > should be an annotation in the documentation that indicates whether or not a
> > module is thread-safe.  For example, many functions in fileinput rely on a
> > module global called _state.  It strikes me that this module is not likely
> > to be thread-safe, yet the documentation doesn't appear to mention this,
> > certainly not in an obvious fashion.
> > 
> > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> > LaTex macros in Fred's arsenal?  This would make documenting these
> > properties both easy and consistent across modules.
> 
> It's hard to say whether a *whole module* is threadsafe.  E.g. in the
> fileinput example, there's the clear implication that if you use this
> in multiple threads, you should instantiate your own FileInput
> instances, and then you're totally thread-safe.  Clearly the semantics
> of the module-global functions are thread-unsafe though.

Perhaps what is needed rather is a prose annotation for thread-safety issues.

My TeX is rusty, but in DocBook, with the use of role attributes, one could 
have, taking your FileInput example:

<sect1 role="thread-safety"><para>
  The module-global functions are not safe, but if you instantiate your own 
FileInput instances, they will be totally thread-safe.
</para></sect1>

That way the MT issues could be styled differently on rendering, gathered into 
separate documentation, stripped by those who don't care, etc.  I imagine this 
is also possible in TeX.
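
To make the FileInput point concrete, here is a minimal sketch of the
per-thread pattern on a current Python 3.  The helper names count_lines,
count_in_threads and demo are hypothetical, and the sketch only shows the
usage pattern; it does not prove anything about thread safety.

```python
import fileinput
import os
import tempfile
import threading


def count_lines(paths, results, key):
    # A private FileInput instance, so the module-global _state used by
    # fileinput.input() and friends is never touched.
    with fileinput.FileInput(files=paths) as stream:
        results[key] = sum(1 for _ in stream)


def count_in_threads(groups):
    """Count lines for each named group of files, one thread per group."""
    results = {}
    threads = [
        threading.Thread(target=count_lines, args=(paths, results, key))
        for key, paths in groups.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def demo():
    """Build two throwaway files and count their lines concurrently."""
    with tempfile.TemporaryDirectory() as tmp:
        groups = {}
        for name, n_lines in (("a", 3), ("b", 5)):
            path = os.path.join(tmp, name + ".txt")
            with open(path, "w") as handle:
                handle.write("line\n" * n_lines)
            groups[name] = [path]
        return count_in_threads(groups)
```

Each thread owns its FileInput object; the only thing shared is the
results dictionary, and every thread writes a different key.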


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tim.one at home.com  Mon Jan 22 02:32:30 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:32:30 -0500
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>

[Greg Ewing]
> Suppose I have a class which checks whether it knows
> how to do a comparison, and if not, wants to pass it
> on to the other operand in case it knows:
>
>   class Foo:
>
>     def __lt__(self, other):
>       if I_know_about(other):
>         # do the comparison
>       else:
>         return other.__gt__(self)
>
> If the other operand has a __gt__ method which is
> doing similar tricks, infinite recursion could result.

Does this have something to do with comparisons?  That is, wouldn't the same
be true if you coded two methods named "spam" and "eggs" in this way?

whatever = 0

class Foo:
    def spam(self, other):
       if whatever:
           return 1
       else:
           return other.eggs(self)

class Bar:
    def eggs(self, other):
       if whatever:
           return 1
       else:
           return other.spam(self)

Foo().spam(Bar())  # RuntimeError: Maximum recursion depth exceeded

If that's all there is to it, you got what you asked for.




From greg at cosc.canterbury.ac.nz  Mon Jan 22 04:31:41 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 16:31:41 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>
Message-ID: <200101220331.QAA01833@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> Does this have something to do with comparisons?  That is, wouldn't the same
> be true if you coded two methods named "spam" and "eggs" in this
> way?

Yes, but Guido hasn't decreed that a.spam(b) and b.eggs(a) are
to have a reflective relationship with each other.

But don't worry - I've belatedly realised that the correct way
to do what I was talking about is to return NotImplemented and
let the interpreter take care of calling the reflected method.
So I withdraw my objection.
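
A minimal sketch of that approach, written for a current Python 3.  The
classes Metres and Feet and the conversion constant are made up for
illustration.  Each method handles only the types it knows about and
returns NotImplemented otherwise, so the interpreter tries the reflected
method once and then raises TypeError rather than recursing:

```python
FEET_PER_METRE = 3.28084  # illustrative constant


class Metres:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if isinstance(other, Metres):
            return self.value < other.value
        # Unknown operand: let the interpreter try other.__gt__(self).
        return NotImplemented

    def __gt__(self, other):
        if isinstance(other, Metres):
            return self.value > other.value
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if isinstance(other, Metres):
            return self.value < other.value * FEET_PER_METRE
        if isinstance(other, Feet):
            return self.value < other.value
        return NotImplemented

    def __gt__(self, other):
        if isinstance(other, Metres):
            return self.value > other.value * FEET_PER_METRE
        if isinstance(other, Feet):
            return self.value > other.value
        return NotImplemented


def compares(a, b):
    """Return a < b, or the string 'TypeError' if neither side knows how."""
    try:
        return a < b
    except TypeError:
        return "TypeError"
```

Here Metres(1) < Feet(5) is answered by Feet.__gt__, and comparing a
Metres with an unrelated object ends in TypeError after both sides decline.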

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 08:54:32 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 02:54:32 -0500
Subject: [Python-Dev] Worse news
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>

I still don't have a clue about test_sax, but have stumbled into more
failure modes.  Most of them seem related to the SystemError ("'finally'
pops bad exception").  Around that part of ceval.c, sometimes the v popped
off the stack has a NULL type pointer, other times it's a pointer to a
damaged PyTuple_Type (for example, with a tp_dealloc field of 0x61, which
leads to an illegal instruction exception).

The MS debug heap routines fill all newly malloc'ed memory with 0xcd ("clean
landfill"), fill free'ed memory with 0xdd ("dead landfill"), and *pad*
malloc'ed memory with some number of 0xfd bytes on both sides ("no-man's
land").  The clean landfill and no-man's land patterns are showing up more
often than they should "by chance", and especially in high-order bytes.  Just
more evidence of the obvious:  something is really screwed up <wink>.
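
For anyone who wants to eyeball suspicious pointer values the same way,
here is a small sketch that flags those three fill bytes in a 32-bit
value.  The function name fill_bytes_in and the sample value in the note
below are invented for illustration; only the byte patterns themselves
come from the description above.

```python
import struct

# Fill patterns described above for the MS debug heap.
FILL_BYTES = {
    0xCD: "clean landfill (fresh malloc)",
    0xDD: "dead landfill (freed)",
    0xFD: "no-man's land (padding)",
}


def fill_bytes_in(value):
    """List (byte_index, label) for each fill byte in a 32-bit value.

    Index 0 is the low-order byte, index 3 the high-order byte.
    """
    raw = struct.pack("<I", value & 0xFFFFFFFF)
    return [(i, FILL_BYTES[b]) for i, b in enumerate(raw) if b in FILL_BYTES]
```

A made-up value such as 0xCDCD0061 would be reported as having clean
landfill in its two high-order bytes, which is the kind of pattern
described above.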

I cannot get the subtest that test_sax is calling (test_expat_incomplete) to
fail in isolation.

Next headache:  If I delete all .pyc files from Lib/ and Lib/test/, and then
run:

python ../lib/test/regrtest.py -x test_sax

by hand, all the 98 tests that *should* run on Windows (excluding, of
course, test_sax, which is no longer tried) pass.  If I immediately run them
again (without deleting .pyc) by hand:

python ../lib/test/regrtest.py -x test_sax

then they again pass.  However, if I do

rt -x test_sax

which does exactly the steps (delete .pyc, run regrtest excluding test_sax,
run regrtest again) via the little MS batch file rt.bat, then on the second
time thru regrtest, and 5 times out of 5, it died in test_extcall with an
"illegal operation", while executing

		if (TYPE(c) == DOUBLESTAR) {

near the end of symtable_params in compile.c.  This is an optimized build,
and the debugger has no idea what's in c at this point; to judge from the
offending machine instruction and register contents, though, c is a bad
pointer.
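
The steps attributed to rt.bat above can be approximated in Python, for
anyone who wants to repeat them without the batch file.  This is a guess
at the sequence, not a transcription of rt.bat:  the function names
delete_pyc and run_twice are hypothetical, the default paths are the ones
from the commands shown above, and modern interpreters keep compiled
files under __pycache__ rather than next to the sources.

```python
import pathlib
import subprocess
import sys


def delete_pyc(*directories):
    """Remove *.pyc files directly inside each directory (not recursive)."""
    removed = []
    for directory in directories:
        for path in sorted(pathlib.Path(directory).glob("*.pyc")):
            path.unlink()
            removed.append(path.name)
    return removed


def run_twice(regrtest="../lib/test/regrtest.py",
              extra_args=("-x", "test_sax")):
    """Run regrtest two times in a row and return both exit codes."""
    command = [sys.executable, regrtest, *extra_args]
    return [subprocess.call(command) for _ in range(2)]
```

Calling delete_pyc("../lib", "../lib/test") followed by run_twice() would
then mirror the delete, run, run-again sequence described above.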

Have not been able to get test_extcall to fail in isolation.

Have also been unable to get test_extcall to fail in the debug build.


So there's evidence of Deep Rot beyond test_sax, but test_sax remains the
only test that fails every time and under both build types.

Running regrtest with -r (randomize test order) is also "interesting":
first time I tried that, test_cpickle failed (truncated output) as well as
test_sax.

I doubt anyone has run the tests more often than me over the last week, so
I'm not surprised I'm seeing the most problems.  However, since *nobody* is
seeing anything on Linux, I'd at least like to get *someone* else to run the
tests on Windows.  While I'm not having any unusual problems with my box,
it's certainly possible that I've got a corrupted file or a flaky memory
chip etc, or that MSVC is generating bad code for some recent change
(although that's unlikely since the debug build generates *really*
straightforward code).

Deleting my entire PCbuild subtree and refetching it from CVS didn't make
any difference.




From esr at thyrsus.com  Mon Jan 22 09:01:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 03:01:27 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Sun, Jan 21, 2001 at 10:22:44PM +0100
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>
Message-ID: <20010122030127.C20804@thyrsus.com>

Martin von Loewis <loewis at informatik.hu-berlin.de>:
> > It looks like with-cycle-gc and maybe with-pydebug would have to be
> > changed.
> 
> I'm in favour of changing it.

Likewise.  Let's be good neighbors.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Where rights secured by the Constitution are involved, there can be no
rule making or legislation which would abrogate them.
        -- Miranda vs. Arizona, 384 US 436 p. 491



From loewis at informatik.hu-berlin.de  Mon Jan 22 09:26:15 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 09:26:15 +0100 (MET)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
Message-ID: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>

> Running it from a command line instead produces the same output up to but
> not including the traceback, and Python crashes with a memory fault then.
> Attaching to the process with a debugger at that point shows it trying to do
> _Py_Dealloc on an op whose op->op_type member is NULL.
[...]
> Bet that's as helpful to you as it was to me <wink>.

Well, it was at least motivating enough to try it out on my Whistler
installation. Purify would probably find this rather quickly; the code
writes into the 257th element of a 256-element array. I've committed
a fix.

Depending on the exact organization of globals, this could have easily
gone unnoticed. MSVC packs variables more than gcc does, so the write
would overwrite one byte in ErrorObject, which would then not point to
a PyObject anymore.
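
The effect can be imitated in Python with a flat byte buffer.  This is
only a simulation under stated assumptions:  a 256-byte table followed
immediately by a 32-bit little-endian pointer, a made-up address for the
pointer, and a made-up stray byte.  The real layout depends on the
compiler and linker.

```python
import struct

TABLE_SIZE = 256
POINTER_SIZE = 4  # assumption: 32-bit build

# One flat region of "globals": the table, then the pointer right after it.
memory = bytearray(TABLE_SIZE + POINTER_SIZE)

error_object = 0x00A3F2C0  # made-up address
struct.pack_into("<I", memory, TABLE_SIZE, error_object)


def write_table(index, byte):
    """Unchecked store into the table, the way C would do it."""
    memory[index] = byte


write_table(256, 0x01)  # the "257th element": one past the end

# The low-order byte of the neighbouring pointer has been replaced.
corrupted = struct.unpack_from("<I", memory, TABLE_SIZE)[0]
```

After the stray write, corrupted is 0x00A3F201 instead of 0x00A3F2C0, so
the pointer no longer refers to the start of the original object.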

Thanks for your patience,
Martin



From tim.one at home.com  Mon Jan 22 10:18:04 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 04:18:04 -0500
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing (Windows))
In-Reply-To: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>

[Martin]
> Well, it was at least motivating enough to try it out on my Whistler
> installation. Purify would probably find this rather quickly; the code
> writes into the 257th element of a 256-element array.

Ah!  You shouldn't do that <wink>.

> I've committed a fix.

But you should do that.  Thank you!

Here's where I am now:

=========================================================================
All test_sax failures have gone away (yay!).
=========================================================================
Running

    rt -x test_sax

on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
up:

    using the debug build; or
    if test_sax is *not* excluded; or
    in the 1st pass; or
    when running test_extcall in isolation; or
    if the steps rt performs are done by hand
=========================================================================
Running

    rt -r

on Windows still sees test_cpickle fail in the first pass (with truncated
output), but succeed in the second pass.  First-pass failure is always like
so (modulo line breaks I'm inserting by hand):

test test_cpickle failed -- Tail of expected stdout unseen:
'dumps()\012
loads()\012
ok\012
loads() DATA\012
ok\012
dumps() binary\012
loads() binary\012
ok\012
loads() BINDATA\012
ok\012
dumps() RECURSIVE\012
ok\012'

I've also seen it fail at least once when doing the same thing by hand:

    del ..\lib\*.pyc
    del ..\lib\test\*.pyc
    python ../lib/test/regrtest.py -r

else-i-would-have-asked-martin-to-look-for-a digit-to-change-in-
    command.com<wink>-ly y'rs  - tim




From mal at lemburg.com  Mon Jan 22 11:19:18 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:19:18 +0100
Subject: [Python-Dev] more unicode database changes
References: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <3A6C0926.D0A004E4@lemburg.com>

Fredrik Lundh wrote:
> 
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Great work, Fredrik :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 11:42:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:42:52 +0100
Subject: [Python-Dev] readline and setup.py
References: <3A68B5B0.771412F7@lemburg.com>
Message-ID: <3A6C0EAC.7D322174@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> The new setup.py procedure for Python causes readline not to
> be built on my machine. Instead I get a linker error telling
> me that termcap is not found.
> 
> Looking at my old Setup file, I have this line:
> 
> readline readline.c \
>          -I/usr/include/readline -L/usr/lib/termcap \
>          -lreadline -lterm
> 
> I guess, setup.py should be modified to include additional
> library search paths -- shouldn't hurt on platforms which
> don't need them.

Here's a patch which works for me:

projects/Python> diff CVS-Python/setup.py Dev-Python/
--- CVS-Python/setup.py Mon Jan 22 11:36:56 2001
+++ Dev-Python/setup.py Mon Jan 22 11:40:15 2001
@@ -216,10 +216,11 @@ class PyBuildExt(build_ext):
             exts.append( Extension('rgbimg', ['rgbimgmodule.c']) )
 
         # readline
         if (self.compiler.find_library_file(lib_dirs, 'readline')):
             exts.append( Extension('readline', ['readline.c'],
+                                   library_dirs=['/usr/lib/termcap'],
                                    libraries=['readline', 'termcap']) )
 
         # The crypt module is now disabled by default because it breaks builds
         # on many systems (where -lcrypt is needed), e.g. Linux (I believe).
 


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 11:52:17 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:52:17 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com>
Message-ID: <3A6C10E1.EF890356@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Why does setup.py stop with an error in case _tkinter cannot
> be built (due to an old Tk/Tcl version in my case) ?
> 
> I think the policy in setup.py should be to output warnings,
> but continue building the rest of the Python modules.

I haven't heard anything from the powers that be... what should the
policy be for auto-detected and -configured modules ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan 22 13:37:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 13:37:04 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C10E1.EF890356@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 11:52:17AM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com>
Message-ID: <20010122133704.O17392@xs4all.nl>

On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> "M.-A. Lemburg" wrote:

> > I think the policy in setup.py should be to output warnings,
> > but continue building the rest of the Python modules.

> I haven't heard anything from the powers that be... what should the
> policy be for auto-detected and -configured modules ?

I think Andrew is still working on a way to disable modules from the command
line somehow. (I think moving setup.py to setup.py.in, and using autoconf
--options would be easiest on both developer and user, but that's just me.)
I also think everyone agrees with you that a module that can't be built
shouldn't stop the entire process in the final release (and possibly the
betas) but that it's definitely a good way to debug setup.py in the alphas.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Mon Jan 22 14:13:46 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 14:13:46 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>
Message-ID: <3A6C320A.37CBB4E5@tismer.com>

Maybe I can help.

Tim Peters wrote:
...
> Here's where I am now:
> 
> =========================================================================
> All test_sax failures have gone away (yay!).
> =========================================================================
> Running
> 
>     rt -x test_sax
> 
> on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
> up:
> 
>     using the debug build; or
>     if test_sax is *not* excluded; or
>     in the 1st pass; or
>     when running test_extcall in isolation; or
>     if the steps rt performs are done by hand
...

I got problems with XML as well. I'm not using SAX, but plain
expat for speed. The following error happens after parsing
thousands of small XML files:

from_my_log_window="""
\\bned-s1\tismer\pxml\sdf\mdl\DisplayRGB\1
\\bned-s1\tismer\pxml\sdf\mdl\DisplayVideo\1
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 149, in getall
    res.append(p.parse())
  File "D:\crml_doc\pxml\clean.py", line 81, in parse
    self.parsers[0].Parse(self.txt1, 1)
  File "D:\crml_doc\pxml\clean.py", line 53, in endElementMaster
    if self.txt2: self.parsers[1].Parse(self.txt2, 1)
  File "D:\crml_doc\pxml\clean.py", line 46, in startElementOther
    if name <> "MASTER":
UnicodeError: UTF-8 decoding error: invalid data
"""

The good news: The error is reproducible, happens the same under
PythonWin and DOS Python, and I can reduce it to a single XML file.
That suggests I am close to the cause of the bug,
not chasing late, indirect effects.
It also *might* be related to Unicode.

I will now try to create a minimized script and XML data that
produces the above again.

back in an hour - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From thomas at xs4all.net  Mon Jan 22 14:52:44 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 14:52:44 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 05:28:45PM -0500
References: <20010121225405.M17392@xs4all.nl> <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <20010122145244.Y17295@xs4all.nl>

On Sun, Jan 21, 2001 at 05:28:45PM -0500, Tim Peters wrote:
> [Thomas Wouters]
> > Why is comparing v->ob_type with w->ob_type illegal ? They're
> > both pointers to the same type, aren't they ?

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure (think struct or array);
> an exception is made for a pointer "one beyond the end" of an array, i.e. if

>     sometype a[N];

> then &a[0] < &a[N] == 1 is guaranteed despite that &a[N] is outside the
> bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
> e.g., it's OK if they compare equal, or if the comparison causes a hardware
> fault, or ...).

Ok, I guess I stand corrected. I was confused by the name of Py_uintptr_t: I
thought it was a pointer-to-int, not an int large enough to hold a pointer.
I'm also positively appalled by the fact the standard refuses to define sane
behaviour for out-of-bounds access on an array, but attaches some weird
significance to what pointers are pointing *to*, when comparing the values
of those pointers, regardless of what type of object they are stored in. But
I guess I don't have to whine about that to you, Tim :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Mon Jan 22 15:03:25 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 15:03:25 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com>
Message-ID: <3A6C3DAD.522CE623@tismer.com>


Christian Tismer wrote:
> 
> Maybe I can help.

...

...
> I will now try to create a minimized script and XML data that
> produces the above again.
> 
> back in an hour - chris

Here we go.
The following session produces the mentioned UTF8 error:

>>> txt = "<master desc='blah\325weird' />"
>>> def startelt(name, dic):
... 	print name, dic
... 	
>>> p=expat.ParserCreate()
>>> p.StartElementHandler = startelt
>>> p.Parse(txt)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

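Behavior depends on the ASCII code. From code 128 (0200) to 191 (0277)
the parser gives a not well-formed exception, as it should be.

The codes from 192 to 236, 238-243 produce
"UTF-8 decoding error: invalid data",
the rest gives "not well-formed".

I would like to know if this happens with your (Tim) modified
version as well. I'm using plain vanilla BeOpen Python 2.0 .
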

From jeremy at alum.mit.edu  Mon Jan 22 15:19:34 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 09:19:34 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.16758.68050.257212@localhost.localdomain>

Tim,

Funny (strange or haha?) that test_extcall is failing since the two
pieces of code I've modified most recently are compile.c and the
section of ceval.c that handles extended call syntax.  I just got
through my mail this morning and I'll see what I can reproduce on
Linux.

As for the test_sax failure, is any of the Python code being executed
conditional on platform?  The compiler may be generating bad bytecode
for a code path that is only executed on Windows.

Jeremy




From mal at lemburg.com  Mon Jan 22 15:27:38 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:27:38 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com>
Message-ID: <3A6C4359.BCB06252@lemburg.com>

Christian Tismer wrote:
> 
> Christian Tismer wrote:
> >
> > Maybe I can help.
> 
> ...
> 
> ...
> > I will now try to create a minimized script and XML data that
> > produces the above again.
> >
> > back in an hour - chris
> 
> Here we go.
> The following session produces the mentioned UTF8 error:
> 
> >>> txt = "<master desc='blah\325weird' />"
> >>> def startelt(name, dic):
> ...     print name, dic
> ...
> >>> p=expat.ParserCreate()
> >>> p.StartElementHandler = startelt
> >>> p.Parse(txt)
> Traceback (innermost last):
>   File "<interactive input>", line 1, in ?
> UnicodeError: UTF-8 decoding error: invalid data
> 
> Behavior depends on the ASCII code.
> From code 128 (0200) to 191 (0277) the parser gives a
> not well-formed exception, as it should be.
> 
> The codes from 192 to 236, 238-243 produce
> "UTF-8 decoding error: invalid data",
> the rest gives "not well-formed".
> 
> I would like to know if this happens with your (Tim) modified
> version as well. I'm using plain vanilla BeOpen Python 2.0 .

This has nothing to do with Python. UTF-8 marks the codes 
from 128-191 as illegal prefix. See Objects/unicodeobject.c:

static 
char utf8_code_length[256] = {
    /* Map UTF-8 encoded prefix byte to sequence length.  zero means
       illegal prefix.  see RFC 2279 for details */
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
};

Perhaps the parser should catch the UnicodeError and
instead return a not-wellformed exception ?!
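
The table above can be read as a small function. The following is a
minimal Python 3 sketch (the thread itself is about Python 2.0), and
the name utf8_sequence_length is made up for illustration; it is not
the code in unicodeobject.c. It follows the RFC 2279 layout shown in
the table, including the 5- and 6-byte forms that later revisions of
UTF-8 dropped.

```python
def utf8_sequence_length(lead_byte):
    """Return the sequence length announced by a lead byte, or 0 if illegal.

    Hypothetical helper mirroring the utf8_code_length table quoted above.
    """
    if lead_byte < 0x80:
        return 1      # plain ASCII
    if lead_byte < 0xC0:
        return 0      # 128-191: continuation bytes, illegal as a prefix
    if lead_byte < 0xE0:
        return 2
    if lead_byte < 0xF0:
        return 3
    if lead_byte < 0xF8:
        return 4
    if lead_byte < 0xFC:
        return 5      # RFC 2279 only
    if lead_byte < 0xFE:
        return 6      # RFC 2279 only
    return 0          # 0xFE and 0xFF are never valid


# Byte \325 (0xD5) announces a two-byte sequence ...
print(utf8_sequence_length(0o325))

# ... but the next byte, "w", is not a continuation byte, so decoding fails.
try:
    b"blah\325weird".decode("utf-8")
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)
```

This is why Christian's byte 0325 gets past the prefix check but still
cannot be decoded: the lead byte is legal, the byte after it is not.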
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 15:38:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:38:14 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl>
Message-ID: <3A6C45D5.9A6FA25C@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> > "M.-A. Lemburg" wrote:
> 
> > > I think the policy in setup.py should be to output warnings,
> > > but continue building the rest of the Python modules.
> 
> > I haven't heard anything from the powers that be... what should the
> > policy be for auto-detected and -configured modules ?
> 
> I think Andrew is still working on a way to disable modules from the command
> line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> --options would be easiest on both developer and user, but that's just me.)

This is fairly simple to do: distutils allows great flexibility
when it comes to adding user options, e.g. we could have

python setup.py --enable-tkinter --disable-readline

or more generic

python setup.py --enable-package tkinter --disable-package readline

The options could then be edited in setup.cfg.

> I also think everyone agrees with you that a module that can't be built
> shouldn't stop the entire process in the final release (and possibly the
> betas) but that it's definitely a good way to debug setup.py in the alphas.

True... but currently the only way to get Python to compile is
to hand-edit setup.py and this is not easy for people with no 
prior distutils experience.

BTW, in my case, setup.py did find the TK-libs for 8.0, but for
a beta version -- as a result, _tkinter.c's version #error line 
triggered and the build failed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan 22 15:38:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 09:38:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:37:57 +0200."
             <20010121123757.D897BA83E@darjeeling.zadka.site.co.il> 
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>  
            <20010121123757.D897BA83E@darjeeling.zadka.site.co.il> 
Message-ID: <200101221438.JAA29303@cj20424-a.reston1.va.home.com>

> Wouldn't it be better to use the
> 
> d = {}
> exec "foo", d

Surely you meant

    exec "foo" in d
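
For readers on Python 3, where exec is a built-in function rather than
a statement, the same idea (run code in a scratch dictionary instead of
the caller's namespace) might look like the sketch below. The code
string and the name d are only illustrative.

```python
# Python 3 sketch: run a snippet in a separate namespace dictionary.
d = {}
exec("foo = 42", d)

# The assignment landed in d, not in the surrounding module namespace.
print(d["foo"])

# exec() also stores a "__builtins__" entry in the dictionary it is given.
print(sorted(key for key in d if not key.startswith("__")))
```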

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Jan 22 15:43:42 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 15:43:42 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C45D5.9A6FA25C@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 03:38:14PM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com>
Message-ID: <20010122154342.B17295@xs4all.nl>

On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:

> > I think Andrew is still working on a way to disable modules from the command
> > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > --options would be easiest on both developer and user, but that's just me.)

> This is fairly simple to do: distutils allows great flexibility
> when it comes to adding user options, e.g. we could have
> 
> python setup.py --enable-tkinter --disable-readline
> 
> or more generic
> 
> python setup.py --enable-package tkinter --disable-package readline
> 
> The options could then be edited in setup.cfg.

Note that the 'user' only has 'configure' and 'make' to run, so optimally,
the options would have to be given to one of those (preferably to
'configure', to keep it similar to 90% of the packages out there.)

> but currently the only way to get Python to compile is
> to hand-edit setup.py and this is not easy for people with no 
> prior distutils experience.

You only have to edit the 'disabled_module_list' variable... not too hard
even if you don't have distutils experience (though you do need some python
experience.) I don't think it's wrong to expect people who compile alpha
versions to have at least that much knowledge (though it should be noted in
the README somewhere.)
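
To make the two policies in this thread concrete (skip whatever is
listed in 'disabled_module_list', and warn but keep going when a module
fails to build), here is a rough Python 3 sketch. It is not the real
setup.py: build_modules, build_one and fake_build are hypothetical
names, and the failure is simulated.

```python
import sys


def build_modules(names, build_one, disabled_module_list=()):
    """Build each module; skip disabled ones, warn and continue on failure."""
    built, skipped, failed = [], [], []
    for name in names:
        if name in disabled_module_list:
            skipped.append(name)
            continue
        try:
            build_one(name)
        except Exception as exc:
            # Policy under discussion: report the problem, keep building.
            print("warning: building %s failed: %s" % (name, exc),
                  file=sys.stderr)
            failed.append(name)
        else:
            built.append(name)
    return built, skipped, failed


def fake_build(name):
    """Stand-in for a real compile step; pretends _tkinter cannot build."""
    if name == "_tkinter":
        raise RuntimeError("Tk version check failed")


result = build_modules(["readline", "_tkinter", "zlib"], fake_build,
                       disabled_module_list=["readline"])
print(result)
```

Whether a failure should stop an alpha build, as suggested above, would
then be a matter of re-raising inside the except branch.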

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Mon Jan 22 15:46:39 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 15:46:39 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: <3A6C4359.BCB06252@lemburg.com> (mal@lemburg.com)
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <200101221446.PAA05164@pandora.informatik.hu-berlin.de>

> This has nothing to do with Python. UTF-8 marks the codes 
> from 128-191 as illegal prefix. 
[...]
> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

Right on both counts. If no encoding is specified, and if the
document appears not to be UTF-16 in any endianness, an XML processor
shall assume it is UTF-8. As Marc-Andre explains, your document is not
proper UTF-8, hence the error.

The confusing thing is that expat itself does not care about it not
being UTF-8; that is only detected when the callback is invoked in
pyexpat, and therefore conversion to a Unicode object is attempted.

The right solution probably would be to change expat so that it
determines correctness of the encoding for each string it gets as part
of the wellformedness analysis, and produces illformedness exceptions
when an encoding error occurs. Patches are welcome, although they
probably should go to sourceforge.net/projects/expat.
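
A small Python 3 sketch of the encoding rule described above, using the
xml.parsers.expat module from the standard library. The helper name
try_parse is made up for illustration. With a reasonably recent Expat I
would expect the undeclared document to be rejected, though the exact
message, and whether the parser or the decoding step reports it,
depends on the versions involved, so the sketch only distinguishes
"ok" from "error".

```python
import xml.parsers.expat as expat


def try_parse(data):
    """Parse one document; return ("ok", attrs) or ("error", message)."""
    seen = {}

    def start_element(name, attrs):
        seen.update(attrs)

    parser = expat.ParserCreate()
    parser.StartElementHandler = start_element
    try:
        parser.Parse(data, True)
    except (expat.ExpatError, UnicodeError) as exc:
        return ("error", str(exc))
    return ("ok", seen)


# No encoding declaration: the document must be valid UTF-8, and it is not.
bad = b"<master desc='blah\325weird' />"

# Same bytes, but declared as Latin-1, where byte 0325 is a legal character.
declared = b"<?xml version='1.0' encoding='iso-8859-1'?>" + bad

print(try_parse(bad))
print(try_parse(declared))
```

Declaring the encoding the data is really in is the fix on the document
side; the parser-side fix is the one proposed above.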

Regards,
Martin



From jack at oratrix.nl  Mon Jan 22 15:57:33 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 22 Jan 2001 15:57:33 +0100
Subject: [Python-Dev] test_sax and site-python
Message-ID: <20010122145733.85E51373C95@snelboot.oratrix.nl>

I'm not sure whether this is really a bug, but I had the problem that there 
was something wrong with the xml package I had installed into my 
Lib/site-python, and this caused test_sax to complain.

If the test stuff is expected to test only the core functionality maybe 
sys.path should be edited so that it only contains directories that are part 
of the core distribution?
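
One way the suggestion could be sketched in Python 3: filter the search
path before running the tests. The function name core_only and the
directory markers are assumptions for illustration, not what regrtest
actually does.

```python
def core_only(path_entries, markers=("site-packages", "site-python")):
    """Drop path entries that look like locally installed add-ons."""
    return [entry for entry in path_entries
            if not any(marker in entry for marker in markers)]


example = [
    "/usr/lib/python2.1",
    "/usr/lib/python2.1/lib-dynload",
    "/usr/lib/site-python",
    "/usr/lib/python2.1/site-packages",
]
print(core_only(example))
```

A test driver could assign the filtered list back to sys.path before
importing anything it wants to test.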
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++





From tismer at tismer.com  Mon Jan 22 16:05:24 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 16:05:24 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <3A6C4C34.4D1252C9@tismer.com>


"M.-A. Lemburg" wrote:
...
> > The codes from 192 to 236, 238-243 produce
> > "UTF-8 decoding error: invalid data",
> > the rest gives "not well-formed".
> >
> > I would like to know if this happens with your (Tim) modified
> > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> 
> This has nothing to do with Python. UTF-8 marks the codes
> from 128-191 as illegal prefix. See Objects/unicodeobject.c:
...

Too bad.

> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

I believe that would be better.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Mon Jan 22 16:06:06 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:06:06 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: Your message of "Sun, 21 Jan 2001 15:34:14 PST."
             <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> 
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>

> Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> supposed to be declared in system include files (with a proper prototype.)
> Should be moved to a platform-specific block if anyone finds out which
> broken platforms need it :-)

[The following is inside #if 0]
> + /* From Modules/nismodule.c */
> + CLIENT *clnt_create();
> + 

Thomas, I'm not sure if this particular declaration belongs in
pyport.h, even inside #if 0.

CLIENT is declared in a NIS-specific header file that's not included by
pyport.h, but which *is* included by nismodule.c.

I think you did the right thing to nismodule.c; the pyport.h patch is
redundant in my eyes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan 22 16:12:49 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 16:12:49 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com> <20010122154342.B17295@xs4all.nl>
Message-ID: <3A6C4DF1.F71AA631@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:
> 
> > > I think Andrew is still working on a way to disable modules from the command
> > > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > > --options would be easiest on both developer and user, but that's just me.)
> 
> > This is fairly simple to do: distutils allows great flexibility
> > when it comes to adding user options, e.g. we could have
> >
> > python setup.py --enable-tkinter --disable-readline
> >
> > or more generic
> >
> > python setup.py --enable-package tkinter --disable-package readline
> >
> > The options could then be edited in setup.cfg.
> 
> Note that the 'user' only has 'configure' and 'make' to run, so optimally,
> the options would have to be given to one of those (preferably to
> 'configure', to keep it similar to 90% of the packages out there.)

Hmm, but then you'll have to hack autoconf again... (even if only
to pass the options to setup.py somehow, e.g. via your proposed
setup.cfg.in trick).
 
> > but currently the only way to get Python to compile is
> > to hand-edit setup.py and this is not easy for people with no
> > prior distutils experience.
> 
> You only have to edit the 'disabled_module_list' variable... not too hard
> even if you don't have distutils experience (though you do need some python
> experience.) I don't think it's wrong to expect people who compile alpha
> versions to have at least that much knowledge (though it should be noted in
> the README somewhere.)

Oops, you're right; must have overlooked that one in setup.py.


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan 22 16:14:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:14:02 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:06:06AM -0500
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> <200101221506.KAA29773@cj20424-a.reston1.va.home.com>
Message-ID: <20010122161402.D17295@xs4all.nl>

On Mon, Jan 22, 2001 at 10:06:06AM -0500, Guido van Rossum wrote:
> > Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> > supposed to be declared in system include files (with a proper prototype.)
> > Should be moved to a platform-specific block if anyone finds out which
> > broken platforms need it :-)
> 
> [The following is inside #if 0]
> > + /* From Modules/nismodule.c */
> > + CLIENT *clnt_create();
> > + 
> 
> Thomas, I'm not sure if this particular declaration belongs in
> pyport.h, even inside #if 0.
> 
> CLIENT is declared in a NIS-specific header file that's not included by
> pyport.h, but which *is* included by nismodule.c.
> 
> I think you did the right thing to nismodule.c; the pyport.h patch is
> redundant in my eyes.

The same goes for most prototypes inside that '#if 0'. I see it more as an
easy list to see what prototypes were removed than as proper examples of the
prototype. You're right about CLIENT being defined in system-specific
include files, I just wasn't worried about it because it was inside an '#if 0'
that will never be turned into an '#if 1'. If a specific platform needs that
prototype, we'll figure out how to arrange the prototype then :)

But if you want me to remove it, that's fine.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 22 16:22:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:22:29 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: Your message of "Mon, 22 Jan 2001 03:01:27 EST."
             <20010122030127.C20804@thyrsus.com> 
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>  
            <20010122030127.C20804@thyrsus.com> 
Message-ID: <200101221522.KAA30287@cj20424-a.reston1.va.home.com>

> I've been working a bit on the build process lately.  I came
> across this in the autoconf documentation:
> 
> 
>     If a software package has optional compile-time features, the
>     user can give `configure' command line options to specify
>     whether to compile them. The options have one of these forms:
> 
>         --enable-FEATURE[=ARG]
>         --disable-FEATURE
> 
>     Some packages require, or can optionally use, other software
>     packages which are already installed.  The user can give
>     `configure' command line options to specify which such
>     external software to use.  The options have one of these
>     forms:
> 
>         --with-package[=ARG]
>         --without-package
> 
> 
> Is it worth fixing the Python configure script to comply with
> these definitions?  It looks like with-cycle-gc and maybe
> with-pydebug would have to be changed.

OK, but please add explicit checks for the old --with[out]-cycle-gc
and --with[out]-pydebug flags that cause errors (not just warnings)
when these forms are used.  It's bad enough that configure doesn't
flag typos in such options as errors; if we change the option names,
we really owe users who were using the old forms a clear error.

(Is this stupid autoconf behavior changeable?  Does it also apply to
enable/disable?)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From fdrake at acm.org  Mon Jan 22 16:19:49 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 10:19:49 -0500 (EST)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
	<LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
Message-ID: <14956.20373.104748.573294@cj42289-a.reston1.va.home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
 > Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
 > works for me. I never tested 1.95.x (which is also not available from
 > jclark.com).

Tim Peters writes:
 > If you do and love it, let me know where to get it and I'll ship that
 > instead.

  I'll recommend not updating to 1.95.1; let's wait at least until
1.95.2 is out.  These are really just pre-2.0 releases to shake things
out.  I have been using the current Expat CVS lightly, but need to do
more testing before I can be confident in it and our bindings (not
yet checked in anywhere; should be in PyXML soon).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From jeremy at alum.mit.edu  Mon Jan 22 16:44:41 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:44:41 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.21865.943601.735426@localhost.localdomain>

On Linux, I am also seeing test_cpickle failures.  I have not been
able to reproduce failures in test_extcall or test_sax.

I ran 'regrtest.py -r -x test_thread test_unicodedata test_signal
test_select test_poll' 10 times and test_cpickle failed five times.
(I did the peculiar run because excluding those five tests shaves two
minutes off the running time of the test suite.)

No more time to look into this...

Jeremy



From jeremy at alum.mit.edu  Mon Jan 22 16:26:27 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:26:27 -0500 (EST)
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <14956.20771.447958.389724@localhost.localdomain>

The pyexpat module uses functions named getcode() and
call_with_frame() for handlers of some sort.  I can make this much out
from the code, but the rest is a bit of a mystery.  I was trying to
read this code because of the errors Tim is seeing with test_sax on
Windows.  A few comments to explain this highly stylized and
macro-laden code would be appreciated.

The module appears to be creating empty code objects and calling
them.  I say they appear to be empty, because when they are created 
they don't appear to have anything initialized except name, filename,
and firstlineno.

    getcode(EndNamespaceDecl, 419)
    <code at 0x81b73c0
        co_name = 'EndNamespaceDecl'
        co_filename = 'pyexpat.c'
        co_firstlineno = 419
        co_argcount = 0
        co_nlocals = 0
        co_stacksize = 0
        co_flags = 0
        co_consts = ()
        co_names = ()
        co_varnames = ()
        co_freevars = ()
        co_cellvars = ()
        co_code = ''
    >

(The freevars and cellvars entries are part of the support for nested
scopes.  They can be safely ignored for the moment.) 

I simply don't understand what's going on -- and I'm deeply suspicious
that it is the source of whatever problems Tim is seeing with
test_sax.

Jeremy



From thomas at xs4all.net  Mon Jan 22 16:55:35 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:55:35 +0100
Subject: [Python-Dev] 'make distclean' broken.
Message-ID: <20010122165535.P17392@xs4all.nl>

'make distclean' seems broken, at least on non-GNU make's:

[snip]
clobbering subdirectory Modules
rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
rm -f add2lib hassignal
rm -f *.a tags TAGS config.c Makefile.pre
rm -f *.so *.sl so_locations
make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
"./Makefile.in", line 134: Need an operator
make: fatal errors encountered -- cannot continue
*** Error code 1 (ignored)
rm -f config.status config.log config.cache config.h Makefile
rm -f buildno platform
rm -f Modules/Makefile
[snip]

(This is using FreeBSD's 'make'.)

Looking at line 134, I'm not sure why it works with GNU make other than that
it avoids complaining about syntax errors it doesn't run into (which could
be both bad and good :) or that it avoids complaining about obvious GNU
autoconf tricks. But I don't know enough about make to say for sure, nor to
fix the above problem.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 22 16:55:42 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:55:42 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 17:28:45 EST."
             <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com> 
Message-ID: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>

> Your faith in gcc is as charming as it is naive <wink>:  the most
> interesting cases of undefined behavior can't be checked no-way, no-how at
> compile-time.  That's why Barry keeps talking employers into dumping
> thousands of dollars into a single Insure++ license.  Insure++ actually tags
> every pointer at runtime with its source, and gripes if non-equality
> comparisons are done on a pair not derived from the same array or malloc
> etc.  Since Python type objects are individually allocated (not taken from a
> preallocated contiguous vector), Insure++ should complain about that
> compare.

IMHO, *this* *particular* gripe of Insure++ is just a pain in the
butt, and I wish there was a way to turn it off in Insure++ without
having to fix the code.

IMHO, this was included in the standard to allow segmented-memory
implementations of C.  Think certain DOS or Windows 3.1 memory models
where a pointer is a segment plus an offset.  This is not current
practice even on Palmpilots!

The standard may say that such comparisons are undefined, but I don't
care about this particular undefinedness, and I'm annoyed by the
required patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 22 17:02:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 11:02:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:44:38 EST."
             <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> 
Message-ID: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>

> > My only concern is that under the old scheme, two different numeric
> > extension types that somehow can't be compared will end up being
> > *equal*.  To fix this, I propose that if the names compare equal, as a
> > last resort we compare the type pointers -- this should be consistent
> > too.
> 
> Agreed, and sounds fine!

Checked in now.

While fixing the test_b1 code again, which depends on this behavior, I
thought of a refinement: it wouldn't be hard to make None compare
smaller than *anything* (including numbers).

Is this worth it?

diff -c -r2.113 object.c
*** object.c	2001/01/22 15:59:32	2.113
--- object.c	2001/01/22 16:03:38
***************
*** 550,555 ****
--- 550,561 ----
  		PyErr_Clear();
  	}
  
+ 	/* None is smaller than anything */
+ 	if (v == Py_None)
+ 		return -1;
+ 	if (w == Py_None)
+ 		return 1;
+ 
  	/* different type: compare type names */
  	if (v->ob_type->tp_as_number)
  		vname = "";
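
A minimal Python sketch of the fallback ordering the patch above proposes,
for illustration only.  The name fallback_cmp and the use of id() as a
stand-in for comparing type pointers are assumptions, not anything taken
from object.c, and the code is written for a current Python 3 interpreter.
It models only the last-resort case where the two objects have no
comparison of their own.

```python
def fallback_cmp(v, w):
    """Hypothetical model of the last-resort ordering for unrelated objects.

    Returns -1, 0 or 1.  This is not the C code from object.c, only a
    sketch of the rules discussed in this thread.
    """
    if v is w:
        return 0
    # None is smaller than anything (the refinement proposed above)
    if v is None:
        return -1
    if w is None:
        return 1
    # different type: compare type names; numbers use the empty name,
    # so they sort before every non-numeric type
    vname = "" if isinstance(v, (int, float, complex)) else type(v).__name__
    wname = "" if isinstance(w, (int, float, complex)) else type(w).__name__
    if vname != wname:
        return -1 if vname < wname else 1
    # same name: as a last resort compare the type objects themselves
    # (id() stands in for the type pointer here)
    a, b = id(type(v)), id(type(w))
    return (a > b) - (a < b)
```

Under these rules None sorts before every number, and a number sorts
before a string or a list because its type name is treated as empty.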


--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh21 at cam.ac.uk  Mon Jan 22 17:12:47 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Mon, 22 Jan 2001 16:12:47 +0000 (GMT)
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.21865.943601.735426@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>

On Mon, 22 Jan 2001, Jeremy Hylton wrote:

> On Linux, I am also seeing test_cpickle failures.  I have not been
> able to reproduce failures in test_extcall or test_sax.

Hmm - my machine's done 28 exemplary "make clean; make test" runs this
morning.  I last updated yesterday afternoon my time (~1700 GMT).

Of course, I don't build pyexpat...

> No more time to look into this...

Don't you just love memory corruption bugs?

Cheers,
M.




From akuchlin at mems-exchange.org  Mon Jan 22 17:28:59 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 22 Jan 2001 11:28:59 -0500
Subject: [Python-Dev] Python 2.1 article
Message-ID: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>

I've put together an almost-complete first draft of a "What's New in
2.1" article.  The only missing piece is a section on the Nested
Scopes PEP, which obviously has to wait for the changes to get checked
in.  http://www.amk.ca/python/2.1/ ; as usual, nitpicking comments are
welcomed.

--amk




From nas at arctrix.com  Mon Jan 22 11:00:43 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:00:43 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>; from mwh21@cam.ac.uk on Mon, Jan 22, 2001 at 04:12:47PM +0000
References: <14956.21865.943601.735426@localhost.localdomain> <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <20010122020043.A25687@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:12:47PM +0000, Michael Hudson wrote:
> Don't you just love memory corruption bugs?

Great fun.

I've played around with efence and debauch on the weekend.  I
even went as far as merging an updated fmalloc from the XFree
source tree into debauch and writing a reporting script in
Python.

I probably would have caught the pyexpat overrun if I had
used efence with EF_ALIGNMENT=0 and compiled with -fpack-struct.
I'll have to try it tonight.  Maybe something else will turn up.

  Neil



From guido at digicool.com  Mon Jan 22 18:12:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 12:12:29 -0500
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:55:35 +0100."
             <20010122165535.P17392@xs4all.nl> 
References: <20010122165535.P17392@xs4all.nl> 
Message-ID: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>

> 'make distclean' seems broken, at least on non-GNU make's:
> 
> [snip]
> clobbering subdirectory Modules
> rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
> rm -f add2lib hassignal
> rm -f *.a tags TAGS config.c Makefile.pre
> rm -f *.so *.sl so_locations
> make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
> "./Makefile.in", line 134: Need an operator
> make: fatal errors encountered -- cannot continue
> *** Error code 1 (ignored)
> rm -f config.status config.log config.cache config.h Makefile
> rm -f buildno platform
> rm -f Modules/Makefile
> [snip]
> 
> (This is using FreeBSD's 'make'.)
> 
> Looking at line 134, I'm not sure why it works with GNU make other than that
> it avoids complaining about syntax errors it doesn't run into (which could
> be both bad and good :) or that it avoids complaining about obvious GNU
> autoconf tricks. But I don't know enough about make to say for sure, nor to
> fix the above problem.

There's one line in Makefile.in that trips over Make (mine also
complains about it):

    @SET_DLLLIBRARY@

Looking at the code in configure.in that generates this macro:

    AC_SUBST(SET_DLLLIBRARY)
    LDLIBRARY=''
    SET_DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  SET_DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

I don't see why we couldn't change this so that Makefile.in just
contains

    DLLLIBRARY=		@DLLLIBRARY@

and then configure.in could be changed to

    AC_SUBST(DLLLIBRARY)
    LDLIBRARY=''
    DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

Or am I missing something?

Does this fix the problem?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Mon Jan 22 18:21:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:21:09 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 11:02:15AM -0500
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <20010122122109.A14952@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
> 
> Is this worth it?

I think so, if only for the sake of well-definedness.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.



From thomas at xs4all.net  Mon Jan 22 18:25:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 18:25:30 +0100
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122182530.E17295@xs4all.nl>

On Mon, Jan 22, 2001 at 12:12:29PM -0500, Guido van Rossum wrote:

> and then configure.in could be changed to

>     AC_SUBST(DLLLIBRARY)
>     LDLIBRARY=''
>     DLLLIBRARY=''
>        .
>        . (and later)
>        .
>     cygwin*)
> 	  LDLIBRARY='libpython$(VERSION).dll.a'
> 	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
> 	  ;;

You mean 
 	  DLLLIBRARY='$(basename $(LDLIBRARY))'

But yes, that fixes it.
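
To spell out why the "FOO = @FOO@" form is safer when make is pointed at
the raw Makefile.in, here is a small Python sketch.  It only mimics the
@NAME@ substitution that configure performs; the helper names and the
crude "line contains an = sign" check are assumptions for illustration,
not how make or autoconf are actually implemented.

```python
import re

def substitute(template, values):
    """Replace @NAME@ tokens with configured values; unknown tokens stay as-is."""
    return re.sub(r"@(\w+)@", lambda m: values.get(m.group(1), m.group(0)), template)

def is_assignment(line):
    """Very rough stand-in for make's parser: a macro line needs an '='."""
    return "=" in line

OLD_LINE = "@SET_DLLLIBRARY@"            # line currently in Makefile.in
NEW_LINE = "DLLLIBRARY=\t@DLLLIBRARY@"   # proposed replacement line

# Values configure would set in the cygwin branch, old and new style
cygwin_old = {"SET_DLLLIBRARY": "DLLLIBRARY=\t$(basename $(LDLIBRARY))"}
cygwin_new = {"DLLLIBRARY": "$(basename $(LDLIBRARY))"}
```

Before substitution the old line has no operator at all, which matches
the "Need an operator" complaint, while the new line is already a valid
assignment.  After substitution both styles yield the same text on cygwin.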

> Or am I missing something?

Well, on *that* I'm not sure, that's why I asked :P If things in the Python
source boggle me, they are always there for a good reason. Well, maybe just
'almost always', but practically always :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Mon Jan 22 11:39:59 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:39:59 -0800
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122023959.A25798@glacier.fnational.com>

[Guido on change SET_DLLLIBRARY]
> Or am I missing something?

I don't think so.  My new Makefile uses "FOO = @FOO@" everywhere.
SET_CXX is the same way in the current Makefile.

  Neil



From esr at thyrsus.com  Mon Jan 22 18:41:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:41:59 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
Message-ID: <20010122124159.A14999@thyrsus.com>

\section{\module{set} ---
         Basic set algebra for Python}

\declaremodule{standard}{set}
\modulesynopsis{Basic set algebra operations on sequences.}
\moduleauthor{Eric S. Raymond}{esr at thyrsus.com}
\sectionauthor{Eric S. Raymond}{esr at thyrsus.com}

The \module{set} module defines functions for treating lists and other
sequences as mathematical sets, and defines a set class that uses
these operations natively and overloads Python's standard operator set.

The \module{set} functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
Set or sequence elements may be of any type and may be mutable.
Comparisons and membership tests of elements against sequence objects
are done using \keyword{in}, and so can be customized by supplying a 
suitable \method{__contains__} method for the sequence type.

The running time of these functions is O(n**2) in the worst case
unless otherwise noted.  For cases that can be short-circuited by 
cardinality comparisons, this has been done.

\begin{funcdesc}{setify}{list1}
Returns a list of the argument sequence's elements with duplicates removed.
\end{funcdesc}

\begin{funcdesc}{union}{list1, list2}
Set union.  All elements of both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{intersection}{list1, list2}
Set intersection.  All elements common to both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{difference}{list1, list2}
Set difference.  All elements of the first set or sequence not present
in the second are returned.
\end{funcdesc}

\begin{funcdesc}{symmetric_difference}{list1, list2}
Set symmetric difference.  All elements present in one sequence or the other
but not in both are returned.
\end{funcdesc}

\begin{funcdesc}{cartesian}{list1, list2}
Returns a list of tuples consisting of all possible pairs of elements
from the first and second sequences or sets.
\end{funcdesc}

\begin{funcdesc}{equality}{list1, list2}
Set comparison.  Return 1 if the two sets or sequences contain exactly
the same elements, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{subset}{list1, list2}
Set subset test.  Return 1 if all elements of the first set or
sequence are members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{proper_subset}{list1, list2}
Set subset test, excluding equality.  Return 1 if the arguments fail a
set equality test, and all elements of the first set or sequence are
members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{powerset}{list1}
Return the set of all subsets of the argument set or
sequence. Warning: this produces huge results from small arguments and
is O(2**n) in both running time and space requirements; you can
readily run yourself out of memory using it.
\end{funcdesc}

\subsection{set Objects \label{set-objects}}

A \class{set} instance uses the \module{set} module functions to
implement set semantics on the list it contains, and to support 
a full set of Python list methods and operators.  Thus, the set
methods can take a set or any sequence type as an argument.  

A set object contains a single data member:

\begin{memberdesc}{elements}
List containing the elements of the set.  
\end{memberdesc}

Set objects can be treated as mutable sequences; they support the
special methods 
\method{__len__}, 
\method{__getitem__},
\method{__setitem__}, 
and \method{__delitem__}.  
Through
\method{__getitem__}, they support the membership test via
\keyword{in}. All the standard mutable-sequence methods
\method{list}, 
\method{append}, 
\method{extend}, 
\method{count}, 
\method{index}, 
\method{insert} (the index argument is ignored), 
\method{pop}, 
\method{remove}, 
\method{reverse}, 
and \method{sort}
are also supported.  After method calls that add elements
(\method{__setitem__},
\method{append}, \method{extend}, \method{insert}), the
elements of the data member are re-setified, so it is not possible to
introduce duplicates.

Calling \function{repr()} on a set returns the result of calling
\function{repr} on its element list.  Calling \function{str()} returns
a representation resembling mathematical notation for the set; an
open set bracket, followed by a comma-separated list of \function{str()}
representations of the elements, followed by a close set bracket.

Set objects support the following Python operators:

\begin {tableiii}{l|l|l}{code}{Operator}{Function}{Description}
\lineiii{|,+}{union}{Union}
\lineiii{&}{intersection}{Intersection}
\lineiii{-}{difference}{Difference}
\lineiii{^}{symmetric_difference}{Symmetric difference}
\lineiii{*}{cartesian}{Cartesian product}
\lineiii{==}{equality}{Equality test}
\lineiii{!=,<>}{}{Inequality test}
\lineiii{<}{proper_subset}{Proper-subset test}
\lineiii{<=}{subset}{Subset test}
\lineiii{>}{}{Proper superset test}
\lineiii{>=}{}{Superset test}
\end {tableiii}
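
For readers skimming the LaTeX, a rough Python sketch of the list-based
functions described above.  This is not the module's source, which is
not part of this message; the bodies are assumptions reconstructed from
the descriptions, and the predicates return True/False where the text
says 1/0.  Membership is tested with "in" on plain lists, so elements
may be unhashable, at the O(n**2) cost the text mentions.

```python
def setify(seq):
    """Return a list of seq's elements with duplicates removed, order kept."""
    result = []
    for item in seq:
        if item not in result:
            result.append(item)
    return result

def union(a, b):
    return setify(list(a) + list(b))

def intersection(a, b):
    return [x for x in setify(a) if x in b]

def difference(a, b):
    return [x for x in setify(a) if x not in b]

def symmetric_difference(a, b):
    return difference(a, b) + difference(b, a)

def cartesian(a, b):
    return [(x, y) for x in a for y in b]

def subset(a, b):
    return all(x in b for x in a)

def equality(a, b):
    return subset(a, b) and subset(b, a)

def proper_subset(a, b):
    return subset(a, b) and not equality(a, b)

def powerset(seq):
    """Return all subsets as lists; the result doubles with each element."""
    subsets = [[]]
    for item in setify(seq):
        subsets += [s + [item] for s in subsets]
    return subsets
```

For example, setify([1, 2, 1, [3], [3]]) gives [1, 2, [3]], and
powerset([1, 2]) gives [[], [1], [2], [1, 2]].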

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 



From esr at snark.thyrsus.com  Mon Jan 22 19:28:57 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 13:28:57 -0500
Subject: [Python-Dev] I still can't build HTML in a current CVS tree.
Message-ID: <200101221828.f0MISvH15121@snark.thyrsus.com>

Fred, I still can't build HTML documentation in a current CVS tree -- same
complaint about lib/modindex.html being absent.  Can we get this fixed
before 2.1 ships?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

...Virtually never are murderers the ordinary, law-abiding people
against whom gun bans are aimed.  Almost without exception, murderers
are extreme aberrants with lifelong histories of crime, substance
abuse, psychopathology, mental retardation and/or irrational violence
against those around them, as well as other hazardous behavior, e.g.,
automobile and gun accidents."
        -- Don B. Kates, writing on statistical patterns in gun crime



From fredrik at effbot.org  Mon Jan 22 19:33:56 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 19:33:56 +0100
Subject: [Python-Dev] Python 2.1 article
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>
Message-ID: <059b01c084a1$e431e490$e46940d5@hagrid>

> I've put together an almost-complete first draft of a "What's New in
> 2.1" article.  The only missing piece is a section on the Nested
> Scopes PEP, which obviously has to wait for the changes to get checked
> in.

what's the current 2.1a1 eta?  (pep 226 still
says last friday)

today?  wednesday?  this week?  this month?

Curious /F




From mal at lemburg.com  Mon Jan 22 19:33:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 19:33:24 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <20010122124159.A14999@thyrsus.com>
Message-ID: <3A6C7CF4.F10AA77B@lemburg.com>

[LaTeX file]

Eric, we are all hackers, but plain LaTeX is not really the right
format for a posting to a mailing list... at least not if
you really expect feedback ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From martin at mira.cs.tu-berlin.de  Mon Jan 22 19:36:16 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 22 Jan 2001 19:36:16 +0100
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>

> A few comments to explain this highly stylized and macro-laden code
> would be appreciated.

I probably can't do that before 2.1a1, but I promise to suggest
something right afterwards.

In general, the macro magic is designed to make the many expat
callbacks available to Python. RC_HANDLER (for return code) is the
most general template; VOID_HANDLER and INT_HANDLER are common
specializations. In the core of RC_HANDLER, a tuple is built and
a Python function is called.

The code used to do PyEval_CallObject right inside the macro; the
call_with_frame feature is new compared to 2.0. It solves the specific
problem of incomprehensible tracebacks.

In a typical SAX application, the user code calls
expatreader.ExpatParser.parse, which in turn calls 

            self._parser.Parse(data, isFinal)

Now, in 2.0, a common problem was a traceback

            self._parser.Parse(data, isFinal)
TypeError: not enough arguments; expected 4, got 2

Everybody assumes a problem in the call to Parse; the real problem is
in the call to the callback inside RC_HANDLER, which tried to call a
user's function with two arguments that expected four.

2.1 would improve this slightly on its own, writing

            self._parser.Parse(data, isFinal)
TypeError: characters() takes exactly 4 arguments (2 given)

With the call_with_frame code, you get

  File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 81, in feed
    self._parser.Parse(data, isFinal)
  File "pyexpat.c", line 379, in CharacterData
TypeError: characters() takes exactly 4 arguments (2 given)

So that tells you that it is the CharacterData handler that invokes
characters(). You are right that the frame object is not used
otherwise; it is just there to make a nice traceback.
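
The effect can be imitated in pure Python.  In the sketch below,
character_data is a hypothetical stand-in for the C-level CharacterData
handler and feed for the expatreader method that calls Parse; neither is
the real code.  Because the dispatching step is an ordinary Python
function here, it appears in the traceback, which is roughly what the
extra frame arranges for the C code.

```python
import traceback

class Handler:
    def characters(self, data, start, length):
        """User callback that expects more arguments than it is given."""

def character_data(handler, data):
    # stand-in for the C handler: calls the user's method with too few arguments
    handler.characters(data)

def feed(handler, data):
    # stand-in for the Parse() call made from expatreader
    character_data(handler, data)

def failing_frame_names():
    """Return the function names in the traceback of the failed callback."""
    try:
        feed(Handler(), "some text")
    except TypeError as exc:
        return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
    return []
```

The innermost entry names the dispatcher rather than the outer call, so
the reader is pointed at the callback invocation instead of at Parse.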

> I simply don't understand what's going on -- and I'm deeply
> suspicious that it is the source of whatever problems Tim is seeing
> with test_sax.

I thought so, too, at first; it turned out that the problem was
elsewhere.

Regards,
Martin



From guido at digicool.com  Mon Jan 22 20:04:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:04:02 -0500
Subject: [Python-Dev] Python 2.1 article
In-Reply-To: Your message of "Mon, 22 Jan 2001 19:33:56 +0100."
             <059b01c084a1$e431e490$e46940d5@hagrid> 
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>  
            <059b01c084a1$e431e490$e46940d5@hagrid> 
Message-ID: <200101221904.OAA01170@cj20424-a.reston1.va.home.com>

> what's the current 2.1a1 eta?  (pep 226 still
> says last friday)

You missed my email that I sent out Friday.  Tentatively it's going
out tonight.  No point in updating the PEP each time there's slippage.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 22 20:10:54 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:10:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 12:41:59 EST."
             <20010122124159.A14999@thyrsus.com> 
References: <20010122124159.A14999@thyrsus.com> 
Message-ID: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>

Eric,

There's already a PEP on a set object type, and everybody and their
aunt has already implemented a set datatype.

If *your* set module is ready for prime time, why not publish it in
the Vaults of Parnassus?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Mon Jan 22 20:29:18 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 14:29:18 -0500 (EST)
Subject: [Python-Dev] Re: getcode() function in pyexpat.c
In-Reply-To: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
References: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
Message-ID: <14956.35342.724657.865367@localhost.localdomain>

>>>>> "MvL" == Martin v Loewis <martin at mira.cs.tu-berlin.de> writes:

  >> I simply don't understand what's going on -- and I'm deeply
  >> suspicious that it is the source of whatever problems Tim is
  >> seeing with test_sax.

  MvL> I thought so, too, at first; it turned out that the problem was
  MvL> elsewhere.

What was the cause of that problem?  I didn't see any mail after Tim's
middle-of-the-night message "Worse news."

Jeremy




From tim.one at home.com  Mon Jan 22 21:01:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:01:59 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGIIKAA.tim.one@home.com>

[Guido]
> ...
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
>
> Is this worth it?

First, an attempt to see what Python did in this morning's CVS turned up an
internal error for Jeremy:

>>> [None < x for x in (1, 1L, 1j, 1.0, [1], {}, (1,))]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global

abnormal program termination

A simpler way to provoke that:

>>> [None < 2 for x in "x"]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global


Anyway, I think forcing None to be "the smallest" is cute!  Inexpensive to
do, and while I don't see a compelling *use* for it, I bet it would be least
surprising to newbies.  +1.




From fdrake at acm.org  Mon Jan 22 21:08:54 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 15:08:54 -0500 (EST)
Subject: [Python-Dev] Re: I still can't build HTML in a current CVS tree.
In-Reply-To: <200101221828.f0MISvH15121@snark.thyrsus.com>
References: <200101221828.f0MISvH15121@snark.thyrsus.com>
Message-ID: <14956.37718.968912.189834@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > Fred, I still can't build HTML documentation in a current CVS tree -- same
 > complaint about lib/modindex.html being absent.  Can we get this fixed
 > before 2.1 ships?

  I'm guessing I've lost a previous email on the topic, or it's buried
in my inbox.  If this is still a problem after today's checkins, could
you please file a bug report and assign it to me?
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Mon Jan 22 21:26:15 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:26:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGJIKAA.tim.one@home.com>

[Guido]
> IMHO, *this* *particular* gripe of Insure++ is just a pain in the
> butt, and I wish there was a way to turn it off in Insure++ without
> having to fix the code.

Maybe there is.  Barry?

> IMHO, this was included in the standard to allow segmented-memory
> implementations of C.  Think certain DOS or Windows 3.1 memory models
> where a pointer is a segment plus an offset.  This is not current
> practice even on Palmpilots!

I could ask Tom MacDonald (former X3J11 chair), but don't want to bother
him.  The way these things usually turn out:  the committee debated it 100
times over 10 years, but some committee member steadfastly claimed it was
important.  Since ANSI/ISO committees work via consensus, one implacable
objector is enough.

WRT pointers, I know that while the C committee did worry about segmented
architectures a lot in the past, tagged architectures gave them much
thornier problems (the HW tags each "word" with some manner of metadata
(such as a busy/free or empty/full bit, or read+write permission bits, or a
data type identifier, or a "capability" tag tying into a HW-enforced
security architecture, ...), and checks those on each access, and some of
the metadata can propagate into a pointer, and the HW can raise faults on
pointer comparisons if the metadata doesn't match).  While such machines
aren't in common use, the US Govt does all sorts of things they don't talk
about -- if it's not IBM's representative protecting a 40-year old
architecture, it's someone emphatically not from the NSA <wink> protecting
something they're not at liberty to discuss.  Of course Python wants to run
there too, even if we never hear about it ...

> The standard may say that such comparisons are undefined, but I don't
> care about this particular undefinedness, and I'm annoyed by the
> required patches.

Ya, and I'm annoyed that MS stdio corrupts itself -- but they're just
clinging to the letter of the std too, and I've learned to live with it
gracefully <wink>.

pointer-ordering-comparisons-should-be-very-rare-anyway-ly y'rs  - tim




From tim.one at home.com  Mon Jan 22 21:55:30 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:55:30 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGNIKAA.tim.one@home.com>

[Michael Hudson]
> Hmm - my machine's done 28 exemplary "make clean; make test" runs this
> morning.  I last updated yesterday afternoon my time (~1700 GMT).

So does mine now.  The remaining failures require *unusual* ways of running
the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
under Linux; and in an extremely specialized and seemingly Windows-specific
way to get test_extcall to blow up w/ a bad pointer).




From tim.one at home.com  Mon Jan 22 22:07:27 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:07:27 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.16758.68050.257212@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>

[Jeremy Hylton]
> Funny (strange or haha?) that test_extcall is failing since the two
> pieces of code I've modified most recently are compile.c and the
> section of ceval.c that handles extended call syntax.

Ya, I knew that, but I avoided wagging a Finger of Shame in your direction
because coincidence isn't proof <wink>.

> ...
> As for the test_sax failure,

There is no test_sax failure anywhere anymore that I know of (Martin found a
dead-wrong array decl in contributed pyexpat.c code and repaired it).

And I believe my "rt -x test_sax" failure in test_extcall almost certainly
has nothing to do with test_sax -- far more likely the connection to
test_sax is an accident, and that if I spend umpteen hours trying other
things at random I'll provoke the same memory accident leading to a bad
pointer via excluding some other test.  I just picked test_sax because that
*was* broken and I wanted to get thru the rest of the tests.

BTW, delighted(?) to hear that test_cpickle fails for you too!  I'm sure
test_extcall is going to blow up for other people eventually too -- but it
is sooooo hard to provoke even for me.  I've dropped the effort pending news
from someone running Insure++ or efence or whatever.




From guido at digicool.com  Mon Jan 22 22:18:26 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 16:18:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:07:27 EST."
             <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> 
Message-ID: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>

[Tim]
> So does mine now.  The remaining failures require *unusual* ways of running
> the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
> under Linux;
[and later]
> BTW, delighted(?) to hear that test_cpickle fails for you too!

This (test_cpickle) is a red herring -- it's a shallow failure in the
test suite.  test_cpickle imports test_pickle, but test_pickle first
outputs the test output from testing pickle -- unless test_pickle has
been run before!  This succeeds:

  ./python Lib/test/regrtest.py test_cpickle test_pickle

and this fails:

  ./python Lib/test/regrtest.py test_pickle test_cpickle

Use regrtest.py -v to find out why. :-)
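
The order dependence can be modelled in a few lines of Python.  This is
a sketch under assumptions: module_cache stands in for sys.modules, the
printed strings stand in for the real test output, and the runner is
assumed to unload only the test it is about to run.  None of the names
below come from regrtest.py.

```python
import contextlib
import io

module_cache = {}  # hypothetical stand-in for sys.modules

def import_module(name):
    """Run a module body only the first time it is imported."""
    if name in module_cache:
        return
    module_cache[name] = True
    MODULE_BODIES[name]()

def body_test_pickle():
    print("testing pickle")

def body_test_cpickle():
    import_module("test_pickle")  # prints only if not imported before
    print("testing cPickle")

MODULE_BODIES = {"test_pickle": body_test_pickle,
                 "test_cpickle": body_test_cpickle}

# Expected output files, recorded with each test run on its own
EXPECTED = {"test_pickle": "testing pickle\n",
            "test_cpickle": "testing pickle\ntesting cPickle\n"}

def run_tests(order):
    """Run tests in the given order; map each name to pass (True) or fail."""
    module_cache.clear()
    results = {}
    for name in order:
        module_cache.pop(name, None)  # assumption: runner unloads the test itself
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            import_module(name)
        results[name] = captured.getvalue() == EXPECTED[name]
    return results
```

Running test_cpickle first passes both tests; running test_pickle first
leaves it cached, so test_cpickle prints less than its expected output.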

I'm not sure how to restructure this, but it's not of the same quality
as test_extcall or test_sax failing.  Neither of those has failed for
me on Linux during hours of testing.  However on Windows I get an
occasional appfail dialog box when using rt.bat.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 22 15:44:00 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 06:44:00 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122064400.A26543@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:07:27PM -0500, Tim Peters wrote:
> I've dropped the effort pending news from someone running
> Insure++ or efence or whatever.

efence to the rescue!  I compiled with -fpack-struct and used
EF_ALIGNMENT=0 and now I can trigger a core dump by running
test_extcall.  More news coming...

  Neil



From tim.one at home.com  Mon Jan 22 22:41:08 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:41:08 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: <20010122145733.85E51373C95@snelboot.oratrix.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com>

[Jack Jansen]
> I'm not sure whether this is really a bug, but I had the problem
> that there  was something wrong with the xml package I had
> installed into my Lib/site-python, and this caused test_sax to
> complain.
>
> If the test stuff is expected to test only the core functionality
> maybe sys.path should be edited so that it only contains directories
> that are part of the core distribution?

AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
released.  The wisdom of that decision is debatable with hindsight, but
AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
3rd-party code to work, but part of the core all the same.  The Windows
installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
Mac Python package also should.




From nas at arctrix.com  Mon Jan 22 16:00:57 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 07:00:57 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122070057.A26575@glacier.fnational.com>

Perhaps this will help someone track down the bug:

[running test_extcall...]
unbound method method() must be called with instance as first argument
unbound method method() must be called with instance as first argument

Program received signal SIGSEGV, Segmentation fault.
symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
4330                    if (TYPE(c) == DOUBLESTAR) {
(gdb) l
4325                            symtable_add_def(st, STR(CHILD(n, i)), 
4326                                             DEF_PARAM | DEF_STAR);
4327                            i += 2;
4328                            c = CHILD(n, i);
4329                    }
4330                    if (TYPE(c) == DOUBLESTAR) {
4331                            i++;
4332                            symtable_add_def(st, STR(CHILD(n, i)), 
4333                                             DEF_PARAM | DEF_DOUBLESTAR);
4334                    }
(gdb) p c
$3 = (node *) 0x42a43fff
(gdb) p *c
$4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
(gdb) p n
$5 = (node *) 0x42a3ffd7
(gdb) p *n
$6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
  n_child = 0x42a43fc3}
(gdb) bt 10
#0  symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
#1  0x8060126 in symtable_funcdef (st=0x429bafd0, n=0x42a23feb)
    at Python/compile.c:4245
#2  0x805fd29 in symtable_node (st=0x429bafd0, n=0x429b0fc3)
    at Python/compile.c:4128
#3  0x80600da in symtable_node (st=0x429bafd0, n=0x4290cfeb)
    at Python/compile.c:4232
#4  0x805f443 in symtable_build (c=0xbffff5c8, n=0x4290cfeb)
    at Python/compile.c:3816
#5  0x805f130 in jcompile (n=0x4290cfeb, filename=0x80a040f "<string>", 
    base=0x0) at Python/compile.c:3720
#6  0x805f0c2 in PyNode_Compile (n=0x4290cfeb, filename=0x80a040f "<string>")
    at Python/compile.c:3699
#7  0x8069adf in run_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:915
#8  0x8069ac0 in run_err_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:907
#9  0x8069a30 in PyRun_String (
    str=0x429f9fd1 "def zv(*v): print \"ok zv\", a, b, d, e, v, k", start=257, 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:881
(More stack frames follow...)




From thomas at xs4all.net  Mon Jan 22 23:13:29 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:13:29 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122070057.A26575@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 22, 2001 at 07:00:57AM -0800
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com>
Message-ID: <20010122231329.A27785@xs4all.nl>

On Mon, Jan 22, 2001 at 07:00:57AM -0800, Neil Schemenauer wrote:
> Perhaps this will help someone track down the bug:

> [running test_extcall...]
> unbound method method() must be called with instance as first argument
> unbound method method() must be called with instance as first argument
> 
> Program received signal SIGSEGV, Segmentation fault.
> symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> (gdb) l
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);
> 4327                            i += 2;
> 4328                            c = CHILD(n, i);
> 4329                    }
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4331                            i++;
> 4332                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4333                                             DEF_PARAM | DEF_DOUBLESTAR);
> 4334                    }

> (gdb) p c
> $3 = (node *) 0x42a43fff
> (gdb) p *c
> $4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
> (gdb) p n
> $5 = (node *) 0x42a3ffd7
> (gdb) p *n
> $6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
>   n_child = 0x42a43fc3}

n_child is 0x42a43fc3. That's n_child[0]. 0x42a43fff is the child being
handled now. That would be n_child[3] (0x42a43fff - 0x42a43fc3 == 60, and a
struct node is 20 bytes.) But n_nchildren is 2, so it's an off-by-two error
somewhere -- and look, there's an 'i += 2' right above it! It *looks* like
this code will blow up whenever you use '*eggs' without '**spam' in a
function definition. That's a fairly wild guess, but it's worth a try. Try
this patch:

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.148
diff -c -c -r2.148 compile.c
*** Python/compile.c    2001/01/22 04:35:57     2.148
--- Python/compile.c    2001/01/22 22:12:31
***************
*** 4324,4329 ****
--- 4324,4331 ----
                        i++;
                        symtable_add_def(st, STR(CHILD(n, i)), 
                                         DEF_PARAM | DEF_STAR);
+                       if (NCH(n) <= i+2)
+                               return;
                        i += 2;
                        c = CHILD(n, i);
                }
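
The pointer arithmetic above can be re-checked with a few lines of Python.
This is only a sketch: the addresses are copied from the gdb session, and
NODE_SIZE = 20 is the figure given in the message, which holds for that
particular 32-bit build and is an assumption anywhere else.

```python
# Addresses copied from the gdb session quoted above.
N_CHILD_BASE = 0x42a43fc3   # n->n_child, i.e. the address of n_child[0]
CURRENT_CHILD = 0x42a43fff  # value of c at the time of the crash
NODE_SIZE = 20              # sizeof(struct node) on that build, per the message
N_NCHILDREN = 2             # n->n_nchildren

# Distance in bytes from the start of the child array to c.
byte_offset = CURRENT_CHILD - N_CHILD_BASE

# Which array slot that distance corresponds to.
index = byte_offset // NODE_SIZE

print(byte_offset, index)      # 60 3
print(index < N_NCHILDREN)     # False: slot 3 lies past the end of a 2-child array
```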


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Mon Jan 22 21:13:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 15:13:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 02:10:54PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
Message-ID: <20010122151309.C15236@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> There's already a PEP on a set object type, and everybody and their
> aunt has already implemented a set datatype.

I've just read the PEP.  Greg's proposal has a couple of problems.
The biggest one is that the interface design isn't very Pythonic --
it's formally adequate, but doesn't exploit the extent to which sets
naturally have common semantics with existing Python sequence types.
This is bad; it means that a lot of code that could otherwise ignore
the difference between lists and sets would have to be specialized 
one way or the other for no good reason.

The only other set module I can find in the Vaults or anywhere else is
kjBuckets (which I knew about before).  Looks like a good design, but
complicated -- and requires installation of an extension.

> If *your* set module is ready for prime time, why not publish it in
> the Vaults of Parnassus?

I suppose that's what I'll do if you don't bless it for the standard
library.  But here are the reasons I suggest you should do so:

1. It supports a set of operations that are both often useful and
fiddly to get right, thus enhancing the "batteries are included"
effect.  (I used its ancestor for representing seen-message numbers in
a specialized mailreader, for example.)

2. It's simple for application programmers to use.  No extension module
to integrate.

3. It's unsurprising.  My set objects behave almost exactly like other
mutable sequences, with all the same built-in methods working, except for 
the fact that you can't introduce duplicates with the mutators.

4. It's already completely documented in a form suitable for the library.

5. It's simple enough not to cause you maintenance hassles down the
road, and even if it did the maintainer is unlikely to disappear :-).
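
To illustrate point 3, here is a minimal sketch of a list-like container
whose mutators refuse to introduce duplicates.  It is not the module under
discussion: the class name UniqueList is hypothetical, it is written in
present-day Python (subclassing list), and it covers only append, extend
and insert; slice assignment and += are deliberately left out.

```python
class UniqueList(list):
    """Hypothetical list subclass whose mutators never add a duplicate."""

    def __init__(self, items=()):
        super().__init__()
        self.extend(items)

    def append(self, item):
        # Silently ignore an item that is already present.
        if item not in self:
            super().append(item)

    def extend(self, items):
        for item in items:
            self.append(item)

    def insert(self, index, item):
        if item not in self:
            super().insert(index, item)


seen = UniqueList([3, 1, 3, 2])
seen.append(1)        # already present, ignored
seen.extend([2, 4])   # only 4 is new
print(seen)           # [3, 1, 2, 4]
```

Code that only iterates, indexes or takes len() can treat such an object as
an ordinary list, which is the kind of interchangeability point 3 describes.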
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The United States is in no way founded upon the Christian religion
	-- George Washington & John Adams, in a diplomatic message to Malta.



From guido at digicool.com  Mon Jan 22 23:29:26 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 17:29:26 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:41:08 EST."
             <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com> 
Message-ID: <200101222229.RAA28667@cj20424-a.reston1.va.home.com>

> [Jack Jansen]
> > I'm not sure whether this is really a bug, but I had the problem
> > that there  was something wrong with the xml package I had
> > installed into my Lib/site-python, and this caused test_sax to
> > complain.
> >
> > If the test stuff is expected to test only the core functionality
> > maybe sys.path should be edited so that it only contains directories
> > that are part of the core distribution?
> 
[Tim]
> AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
> released.  The wisdom of that decision is debatable with hindsight, but
> AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
> 3rd-party code to work, but part of the core all the same.  The Windows
> installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
> Mac Python package also should.

Yes, but Jack was talking about a non-std xml package in
site-python...  I agree that this shouldn't be picked up.  But is it
worth taking draconian measures to avoid this?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan 22 23:35:08 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 17:35:08 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHLIKAA.tim.one@home.com>

[Guido]
> This (test_cpickle) is a red herring -- it's a shallow failure in the
> test suite.

Fixed now -- thanks!

Please note that Neil got test_extcall to fail in exactly the same place
(see his recent Python-Dev mail).  That's the only remaining failure I know
of.

> ...
> However on Windows I get an occasional appfail dialog box when
> using rt.bat.

I don't believe I've ever seen one of those ("appfail" rings no bells), and
rt has never acted strangely for me.   Your DOS-box properties may be
screwed up:  use Start -> Find -> Files or Folders ...; set "Look in" to C:;
enter *.pif in the "Named:" box; click Find.  You'll probably get a dozen
hits.  One of them will correspond to the method you use to open a DOS box
(which I don't know).  Right-click on that one and select Properties.  On
the Memory tab of the dialog that pops up, the four dropdown lists should
have "Auto" selected.  "Uses HMA" should be checked.  Hmm ... looks like
"Protected" *should* be checked but mine isn't ... oh, this goes on and on.
I don't even know which version of Windows you're using here!  How about I
look at it next time I'm at your house ...




From greg at cosc.canterbury.ac.nz  Mon Jan 22 23:50:07 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 11:50:07 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
Message-ID: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>

> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);

Shouldn't line 4330 say if (TYPE(c) == STAR) ?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From thomas at xs4all.net  Mon Jan 22 23:56:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:56:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 11:50:07AM +1300
References: <20010122231329.A27785@xs4all.nl> <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>
Message-ID: <20010122235602.B27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:50:07AM +1300, Greg Ewing wrote:
> > 4330                    if (TYPE(c) == DOUBLESTAR) {
> > 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> > 4326                                             DEF_PARAM | DEF_STAR);

> Shouldn't line 4330 say if (TYPE(c) == STAR) ?

No, that's line 4323. You can't have doublestar without having star, and
star should precede doublestar. (Grammar should enforce that.) 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From paulp at ActiveState.com  Tue Jan 23 00:02:07 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 22 Jan 2001 15:02:07 -0800
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com> <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <3A6CBBEF.4732BFF2@ActiveState.com>

Guido van Rossum wrote:
> 
> ....
>
> Yes, wow!
> 
> ....

I apologize but I'm not clear on my responsibilities here, if any. I
wrote a PEP for online help. I submitted a partial implementation. Ping
wrote a full implementation that basically supersedes mine. There are
various ideas for improving it, but I think that we agree that the core
is solid. Several people have said that it should be moved into the core
library. Nobody has said that it shouldn't. Whose move is it? What's
next?

 Paul Prescod



From fredrik at effbot.org  Tue Jan 23 00:08:40 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 00:08:40 +0100
Subject: [Python-Dev] test___all__ fails if bsddb not available
Message-ID: <079a01c084c8$43023e40$e46940d5@hagrid>

test___all__
test test___all__ failed -- dbhash has no __all__ attribute

maybe this test shouldn't depend on optional modules?

</F>




From nas at arctrix.com  Mon Jan 22 17:24:34 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 08:24:34 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 22, 2001 at 11:13:29PM +0100
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com> <20010122231329.A27785@xs4all.nl>
Message-ID: <20010122082433.B26765@glacier.fnational.com>

On Mon, Jan 22, 2001 at 11:13:29PM +0100, Thomas Wouters wrote:
> That's a fairly wild guess, but it's worth a try. Try this
> patch:
[...]

Works for me.

  Neil



From greg at cosc.canterbury.ac.nz  Tue Jan 23 00:21:14 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 12:21:14 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122235602.B27785@xs4all.nl>
Message-ID: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas at xs4all.net>:

> You can't have doublestar without having star

What?!? You could in 1.5.2. Has that changed?

Anyway, it just looked a bit odd that it seemed to be testing
for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
But I guess I should shut up until I've seen all of the code.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From thomas at xs4all.net  Tue Jan 23 00:26:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:26:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 12:21:14PM +1300
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>
Message-ID: <20010123002602.C27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> Thomas Wouters <thomas at xs4all.net>:

> > You can't have doublestar without having star

> What?!? You could in 1.5.2. Has that changed?

Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
so ignore my delusions :)

> Anyway, it just looked a bit odd that it seemed to be testing
> for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
> But I guess I should shut up until I've seen all of the code.

No, it's not doing that. It's adding the symbol name to the symtab, with
DEF_DOUBLESTAR as one of its flags. Not sure what the flag does, but I could
guess. (But see the above mentioned delusions as to why I'm not doing that
out loud anymore :-) The 'if' in front of it adds the symbol to the symtab
with DEF_STAR as a flag, in the case of 'STAR' (rather than DOUBLESTAR).
Really, go check :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 23 00:31:03 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:31:03 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010123002602.C27785@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 23, 2001 at 12:26:02AM +0100
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz> <20010123002602.C27785@xs4all.nl>
Message-ID: <20010123003103.D27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:26:02AM +0100, Thomas Wouters wrote:
> On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> > Thomas Wouters <thomas at xs4all.net>:
> 
> > > You can't have doublestar without having star
> 
> > What?!? You could in 1.5.2. Has that changed?

> Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
> way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
> so ignore my delusions :)

Ah, yeah, what I meant to *think* was: you can't have *spam *after* **eggs:

>>> def foo(x, **kwarg, *arg)
  File "<stdin>", line 1
    def foo(x, **kwarg, *arg)
                      ^
SyntaxError: invalid syntax

So the logic of the latter part of the function seems okay (after the little
patch I posted before.) Jeremy should give his expert opinion before it goes
in, though, since it's his code :)
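
The ordering rule can be checked without typing definitions by hand.  The
sketch below uses the built-in compile() to ask whether a definition is
accepted; the helper name compiles() and the sample definitions are made up
for illustration, and it assumes a present-day interpreter.

```python
def compiles(source):
    """Return True if the compiler accepts `source`, False on SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


star_then_doublestar = "def foo(x, *arg, **kwarg): pass"
doublestar_only = "def foo(x, **kwarg): pass"
doublestar_then_star = "def foo(x, **kwarg, *arg): pass"

print(compiles(star_then_doublestar))   # True
print(compiles(doublestar_only))        # True
print(compiles(doublestar_then_star))   # False
```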

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Tue Jan 23 00:36:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 18:36:17 -0500
Subject: [Python-Dev] test___all__ fails if bsddb not available
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:08:40 +0100."
             <079a01c084c8$43023e40$e46940d5@hagrid> 
References: <079a01c084c8$43023e40$e46940d5@hagrid> 
Message-ID: <200101222336.SAA30480@cj20424-a.reston1.va.home.com>

> test test___all__ failed -- dbhash has no __all__ attribute
> 
> maybe this test shouldn't depend on optional modules?

Fixed -- I just skip dbhash if bsddb can't be imported.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Tue Jan 23 01:38:28 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 19:38:28 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
References: <14956.16758.68050.257212@localhost.localdomain>
	<LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
	<20010122070057.A26575@glacier.fnational.com>
	<20010122231329.A27785@xs4all.nl>
Message-ID: <14956.53892.651549.493268@localhost.localdomain>

Thomas,

Your patch has the right diagnosis, although I would write it a tad
differently.  NCH(n) <= i + 2 should be NCH(n) < i + 2, because
CHILD(n, NCH(n)) is not valid.
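
As a rough model of the bounds check being discussed, the sketch below walks
a flat Python list the way the quoted C loop walks the node's children.  It
is an illustration only: star_names() and the child lists are invented, and
an IndexError stands in for the out-of-bounds read in the C code.  For the
two shapes tried here, the <= and < forms of the guard behave the same.

```python
import operator


def star_names(children, guard=None):
    """Collect the *name and **name entries from a flat child list.

    `guard` is None (no check, like the code before the patch) or a
    comparison such as operator.le, applied as guard(len(children), i + 2).
    """
    names = []
    i = 0
    c = children[i]
    if c == '*':
        i += 1
        names.append(('star', children[i]))
        if guard is not None and guard(len(children), i + 2):
            return names          # nothing follows the *name
        i += 2
        c = children[i]           # IndexError here models the crash
    if c == '**':
        i += 1
        names.append(('doublestar', children[i]))
    return names


only_star = ['*', 'v']                   # like "def zv(*v)"
star_and_doublestar = ['*', 'v', ',', '**', 'k']

for guard in (operator.le, operator.lt):
    print(star_names(only_star, guard), star_names(star_and_doublestar, guard))
```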

I'll check it in.

Jeremy



From jeremy at alum.mit.edu  Tue Jan 23 02:23:56 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:23:56 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments 
In-Reply-To: <20010119232323.70B03116392@oratrix.oratrix.nl>
References: <guido@digicool.com>
	<200101191634.LAA29239@cj20424-a.reston1.va.home.com>
	<20010119232323.70B03116392@oratrix.oratrix.nl>
Message-ID: <14956.56620.706531.647341@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack at oratrix.nl> writes:

  JJ> Recently, Guido van Rossum <guido at digicool.com> said:
  >> > I get the impression that I'm currently seeing a non-NULL third
  >> > argument in my (C) methods even though the method is called
  >> > without keyword arguments.
  >>
  >> > Is this new semantics that I missed the discussion about, or is
  >> > this a bug?
  >>
  >> [...]  Do you really need the NULL?

  JJ> The places that I know I was counting on the NULL now have "if (
  JJ> kw && PyObject_IsTrue(kw))", so I'll just have to hope there
  JJ> aren't any more lingering in there.

Guido,

Does your query ("Do you really need the NULL?") mean that you don't
care whether the argument is NULL or an empty dictionary?  I could
change the code to do either for 2.1a2, if you have a preference.

Jeremy



From guido at digicool.com  Tue Jan 23 02:33:20 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 20:33:20 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:23:56 EST."
             <14956.56620.706531.647341@localhost.localdomain> 
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl>  
            <14956.56620.706531.647341@localhost.localdomain> 
Message-ID: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>

> Guido,
> 
> Does your query ("Do you really need the NULL?") mean that you don't
> care whether the argument is NULL or an empty dictionary?  I could
> change the code to do either for 2.1a2, if you have a preference.
> 
> Jeremy

Robust code IMO should treat NULL and {} the same.  But since
traditionally we passed NULL, it's better to pass NULL rather than {}.
I believe that's the status quo now, right?
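
The same idea in Python terms, as a loose analogy only: None stands in for a
NULL keyword dictionary, and has_keywords() is a made-up helper, not part of
any API.

```python
def has_keywords(kw):
    """True only when at least one keyword argument was actually passed."""
    # None (the stand-in for NULL) and an empty dict are both falsy,
    # so a single truth test treats them identically.
    return bool(kw)


print(has_keywords(None))        # False
print(has_keywords({}))          # False
print(has_keywords({'a': 1}))    # True
```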

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Tue Jan 23 02:54:53 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:54:53 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>
References: <guido@digicool.com>
	<200101191634.LAA29239@cj20424-a.reston1.va.home.com>
	<20010119232323.70B03116392@oratrix.oratrix.nl>
	<14956.56620.706531.647341@localhost.localdomain>
	<200101230133.UAA04378@cj20424-a.reston1.va.home.com>
Message-ID: <14956.58477.874472.190937@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

  [Jeremy wrote:]
  >> Does your query ("Do you really need the NULL?") mean that you
  >> don't care whether the argument is NULL or an empty dictionary?
  >> I could change the code to do either for 2.1a2, if you have a
  >> preference.

  GvR> Robust code IMO should treat NULL and {} the same.  But since
  GvR> traditionally we passed NULL, it's better to pass NULL rather
  GvR> than {}.  I believe that's the status quo now, right?

The current status in CVS is to pass {}, because there appeared to be
some case where a PyCFunction was not expecting NULL.  I assumed,
without checking, that {} was required and changed the implementation
to always pass a dictionary to METH_KEYWORDS functions.  I could
change it back to NULL and see if I can reproduce the error I was
seeing.

Jeremy



From guido at digicool.com  Tue Jan 23 03:01:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:01:12 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:54:53 EST."
             <14956.58477.874472.190937@localhost.localdomain> 
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl> <14956.56620.706531.647341@localhost.localdomain> <200101230133.UAA04378@cj20424-a.reston1.va.home.com>  
            <14956.58477.874472.190937@localhost.localdomain> 
Message-ID: <200101230201.VAA15993@cj20424-a.reston1.va.home.com>

>   [Jeremy wrote:]
>   >> Does your query ("Do you really need the NULL?") mean that you
>   >> don't care whether the argument is NULL or an empty dictionary?
>   >> I could change the code to do either for 2.1a2, if you have a
>   >> preference.
> 
>   GvR> Robust code IMO should treat NULL and {} the same.  But since
>   GvR> traditionally we passed NULL, it's better to pass NULL rather
>   GvR> than {}.  I believe that's the status quo now, right?
> 
> The current status in CVS is to pass {}, because there appeared to be
> some case where a PyCFunction was not expecting NULL.  I assumed,
> without checking, that {} was required and changed the implementation
> to always pass a dictionary to METH_KEYWORDS functions.  I could
> change it back to NULL and see if I can reproduce the error I was
> seeing.

Yes, that's a good idea.  I hope that the {} in alpha 1 won't make
folks think that they will never see a NULL in the future and code
accordingly...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 03:15:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:15:11 -0500
Subject: [Python-Dev] 2.1a1 release tonight -- but no nested scopes or weak refs
Message-ID: <200101230215.VAA16577@cj20424-a.reston1.va.home.com>

We've decided to release 2.1a1 without further ado, but without two
big hopeful patches: Jeremy's nested scopes aren't finished and will
take considerably more time, and Fred's weak references need more
review (I haven't had the time to look at the code).  Rather than wait
longer, I've decided to try and release 2.1a1 tonight -- there's
nothing I'm waiting for now before I can cut a tarball.  There will be
an alpha2 release around February 1.

Please don't make any check-ins until I announce the 2.1a1 release
here.  (PythonLabs: please mail or phone me if you need to check in a
last-minute thing -- I'm tagging the tree now.)

More news as it happens,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan 23 03:36:24 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 22 Jan 2001 20:36:24 -0600 (CST)
Subject: [Python-Dev] test_grammar failing
Message-ID: <14956.60968.363878.643640@beluga.mojam.com>

At the end of this:

    make distclean ; ./configure ; make OPT='-g -pipe' ; make test

I get this:

    rm -f ./Lib/test/*.py[co]
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted

Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
this evening.

Skip



From gvwilson at nevex.com  Tue Jan 23 03:47:33 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Mon, 22 Jan 2001 21:47:33 -0500 (EST)
Subject: [Python-Dev] re: I think my set module is ready for prime time
Message-ID: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>

> > Guido van Rossum:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

> Eric Raymond:
> Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> ...doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

I agree with Eric's point; I put the interface design on hold while I
went off to try to find an efficient implementation capable of
handling mutable values (i.e. one that would allow things like sets of
sets).  I'm still looking :-(, but would appreciate comments from this
list on Eric's interface.

Thanks,
Greg




From guido at digicool.com  Tue Jan 23 04:02:50 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:02:50 -0500
Subject: [Python-Dev] test_grammar failing
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:36:24 CST."
             <14956.60968.363878.643640@beluga.mojam.com> 
References: <14956.60968.363878.643640@beluga.mojam.com> 
Message-ID: <200101230302.WAA27104@cj20424-a.reston1.va.home.com>

> At the end of this:
> 
>     make distclean ; ./configure ; make OPT='-g -pipe' ; make test
> 
> I get this:
> 
>     rm -f ./Lib/test/*.py[co]
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
> 
> Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
> this evening.

Try another cvs update and rebuild.  The test that Jeremy checked in
is supposed to catch a bug in the compiler code that he checked in.
The latest compile.c is 103277 bytes long (in Unix).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 04:33:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Python 2.1 alpha 1 released!
Message-ID: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>

Thanks to the PythonLabs developers and the many hard-working
volunteers, I'm proud to release Python 2.1a1 -- the first alpha
release of Python version 2.1.

The release mechanics are different than for previous releases: we're
only releasing through SourceForge for now.  The official source
tarball is already available from the download page:

  http://sourceforge.net/project/showfiles.php?group_id=5470

Additional files will be released soon: a Windows installer,
Linux RPMs, and documentation.

Please give it a good try!  The only way Python 2.1 can become a
rock-solid product is if people test the alpha releases.  Especially
if you are using Python for demanding applications or on extreme
platforms we are interested in hearing your feedback.  Are you
embedding Python or using threads?  Please test your application using
Python 2.1a1!  Please submit all bug reports through SourceForge:

  http://sourceforge.net/bugs/?group_id=5470

Here's the NEWS file:

What's New in Python 2.1 alpha 1?
=================================

Core language, builtins, and interpreter

- There is a new Unicode companion to the PyObject_Str() API
  called PyObject_Unicode(). It behaves in the same way as the
  former, but ensures that the returned value is a Unicode object
  (applying the usual coercion if necessary).

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reflected argument" versions of
  these; instead, __lt__ and __gt__ are each other's reflection,
  likewise for __le__ and __ge__; __eq__ and __ne__ are their own
  reflection (similar at the C level).  No other implications are
  made; in particular, Python does not assume that == is the Boolean
  inverse of !=, or that < is the Boolean inverse of >=.  This makes
  it possible to define types with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose rich comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.
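
A minimal sketch of the partial-ordering point in the rich comparison
entry above, written for a current Python 3 interpreter rather than
2.1a1.  The Subset class is made up for illustration; it orders sets by
inclusion and defines only __lt__ and __le__, relying on the reflection
rule for > and >=.

```python
class Subset:
    """Hypothetical wrapper that orders frozensets by inclusion."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items == other.items

    def __ne__(self, other):
        # Spelled out because the NEWS text says nothing is inferred from __eq__.
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items != other.items

    def __hash__(self):
        return hash(self.items)

    def __lt__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items < other.items    # proper subset

    def __le__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items <= other.items   # subset or equal

    # No __gt__ or __ge__: a > b is answered by the reflected call b < a.


small, big = Subset([1]), Subset([1, 2])
left, right = Subset([1]), Subset([2])    # neither contains the other

print(small < big, big > small)           # the second uses the reflected __lt__
print(left < right, left >= right)        # both False for incomparable sets
```

The last line is why < cannot be treated as the Boolean inverse of >=
for such a type.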

- Complex numbers use rich comparisons to define == and != but raise
  an exception for <, <=, > and >=.  Unfortunately, this also means
  that cmp() of two complex numbers raises an exception when the two
  numbers differ.  Since it is not mathematically meaningful to compare
  complex numbers except for equality, I hope that this doesn't break
  too much code.

- Functions and methods now support getting and setting arbitrarily
  named attributes (PEP 232).  Functions have a new __dict__
  (a.k.a. func_dict) which holds the function attributes.  Methods get
  and set attributes on their underlying im_func.  It is a TypeError
  to set an attribute on a bound method.

- The xrange() object implementation has been improved so that
  xrange(sys.maxint) can be used on 64-bit platforms.  There's still a
  limitation that in this case len(xrange(sys.maxint)) can't be
  calculated, but the common idiom "for i in xrange(sys.maxint)" will
  work fine as long as the index i doesn't actually reach 2**31.
  (Python uses regular ints for sequence and string indices; fixing
  that is much more work.)

- Two changes to from...import:

  1) "from M import X" now works even if M is not a real module; it's
     basically a getattr() operation with AttributeError exceptions
     changed into ImportError.

  2) "from M import *" now looks for M.__all__ to decide which names to
     import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
     filters out names starting with '_' as before.  Whether or not
     __all__ exists, there's no restriction on the type of M.

- File objects have a new method, xreadlines().  This is the fastest
  way to iterate over all lines in a file:

  for line in file.xreadlines():
      ...do something to line...

  See the xreadlines module (mentioned below) for how to do this for
  other file-like objects.

- Even if you don't use file.xreadlines(), you may expect a speedup on
  line-by-line input.  The file.readline() method has been optimized
  quite a bit in platform-specific ways:  on systems (like Linux) that
  support flockfile(), getc_unlocked(), and funlockfile(), those are
  used by default.  On systems (like Windows) without getc_unlocked(),
  a complicated (but still thread-safe) method using fgets() is used by
  default.

  You can force use of the fgets() method by #define'ing 
  USE_FGETS_IN_GETLINE at build time (it may be faster than 
  getc_unlocked()).

  You can force fgets() not to be used by #define'ing 
  DONT_USE_FGETS_IN_GETLINE (this is the first thing to try if std test 
  test_bufio.py fails -- and let us know if it does!).

- In addition, the fileinput module, while still slower than the other
  methods on most platforms, has been sped up too, by using
  file.readlines(sizehint).

- Support for run-time warnings has been added, including a new
  command line option (-W) to specify the disposition of warnings.
  See the description of the warnings module below.

- Extensive changes have been made to the coercion code.  This mostly
  affects extension modules (which can now implement mixed-type
  numerical operators without having to use coercion), but
  occasionally, in boundary cases the coercion semantics have changed
  subtly.  Since this was a terrible gray area of the language, this
  is considered an improvement.  Also note that __rcmp__ is no longer
  supported -- instead of calling __rcmp__, __cmp__ is called with
  reflected arguments.

- In connection with the coercion changes, a new built-in singleton
  object, NotImplemented is defined.  This can be returned for
  operations that wish to indicate they are not implemented for a
  particular combination of arguments.  From C, this is
  Py_NotImplemented.
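
As a rough illustration of the NotImplemented entry above, here is a
sketch in current Python 3 (so the exact dispatch rules are those of
today's interpreter, not necessarily 2.1a1).  The Metres and Feet
classes are invented for the example.

```python
class Metres:
    """Hypothetical length type that only knows how to add other Metres."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        return NotImplemented      # let the other operand have a try


class Feet:
    """Hypothetical length type that knows how to be added to Metres."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Metres):
            return Metres(other.value + self.value * 0.3048)
        return NotImplemented


# Metres.__add__ declines, so Python tries Feet.__radd__ next.
total = Metres(1.0) + Feet(10.0)
print(type(total).__name__, round(total.value, 3))

# When every candidate declines, the interpreter raises TypeError itself.
try:
    Metres(1.0) + "x"
    declined = False
except TypeError:
    declined = True
print(declined)
```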

- The interpreter now accepts bytecode files on the command line even
  if they do not have a .pyc or .pyo extension. On Linux, after executing

  echo ':pyc:M::\x87\xc6\x0d\x0a::/usr/local/bin/python:' > /proc/sys/fs/binfmt_misc/register

  any byte code file can be used as an executable (i.e. as an argument
  to execve(2)).

- %[xXo] formats of negative Python longs now produce a sign
  character.  In 1.6 and earlier, they never produced a sign,
  and raised an error if the value of the long was too large
  to fit in a Python int.  In 2.0, they produced a sign if and
  only if too large to fit in an int.  This was inconsistent
  across platforms (because the size of an int varies across
  platforms), and inconsistent with hex() and oct().  Example:

  >>> "%x" % -0x42L
  '-42'      # in 2.1
  'ffffffbe' # in 2.0 and before, on 32-bit machines
  >>> hex(-0x42L)
  '-0x42L'   # in all versions of Python

  The behavior of %d formats for negative Python longs remains
  the same as in 2.0 (although in 1.6 and before, they raised
  an error if the long didn't fit in a Python int).

  %u formats don't make sense for Python longs, but are allowed
  and treated the same as %d in 2.1.  In 2.0, a negative long
  formatted via %u produced a sign if and only if too large to
  fit in an int.  In 1.6 and earlier, a negative long formatted
  via %u raised an error if it was too big to fit in an int.

- Dictionary objects have an odd new method, popitem().  This removes
  an arbitrary item from the dictionary and returns it (in the form of
  a (key, value) pair).  This can be useful for algorithms that use a
  dictionary as a bag of "to do" items and repeatedly need to pick one
  item.  Such algorithms normally end up running in quadratic time;
  using popitem() they can usually be made to run in linear time.
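
A sketch of the "bag of to-do items" idiom that the popitem() entry
above describes, written for current Python 3.  The reachable()
function and the sample graph are made up for illustration, and the set
type used for bookkeeping arrived in later Python versions.

```python
def reachable(graph, start):
    """Return the set of nodes reachable from start.

    graph maps each node to a list of neighbours (hypothetical format).
    """
    todo = {start: None}    # dictionary used as a bag of pending nodes
    seen = set()
    while todo:
        node, _ = todo.popitem()       # remove and return one pending item
        seen.add(node)
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                todo[neighbour] = None  # adding a pending node twice is harmless
    return seen


graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["a"], "e": ["a"]}
print(sorted(reachable(graph, "a")))
```

Each pending node is removed in one step, with no search for "some key"
and no copy of the key list.  Note that current Python 3 pops the most
recently inserted item, where the NEWS text promises only an arbitrary
one.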

Standard library

- In the time module, the time argument to the functions strftime,
  localtime, gmtime, asctime and ctime is now optional, defaulting to
  the current time (in the local timezone).

- The ftplib module now defaults to passive mode, which is deemed a
  more useful default given that clients are often inside firewalls
  these days.  Note that this could break if ftplib is used to connect
  to a *server* that is inside a firewall, from outside; this is
  expected to be a very rare situation.  To fix that, you can call
  ftp.set_pasv(0).

- The site module now uses .pth files not only for path configuration
  but also for extending the initialization code:  lines starting with
  import are executed.

- There's a new module, warnings, which implements a mechanism for
  issuing and filtering warnings.  There are some new built-in
  exceptions that serve as warning categories, and a new command line
  option, -W, to control warnings (e.g. -Wi ignores all warnings, -We
  turns warnings into errors).  warnings.warn(message[, category])
  issues a warning message; this can also be called from C as
  PyErr_Warn(category, message).
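
A small sketch of warnings.warn() as described in the entry above, run
under current Python 3.  The old_api() function is hypothetical, and
warnings.catch_warnings() is a helper from later Python versions that
is used here only to keep the filter changes local to the example.

```python
import warnings


def old_api():
    """Hypothetical function that warns its callers and still works."""
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return 42


# Record the warning instead of printing it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_api()

# The in-process counterpart of -We: turn warnings into exceptions.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True

print(result, len(caught), raised)
```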

- A new module xreadlines was added.  This exports a single factory
  function, xreadlines().  The intention is that this code is the
  absolutely fastest way to iterate over all lines in an open
  file(-like) object:

  import xreadlines
  for line in xreadlines.xreadlines(file):
      ...do something to line...

  This is equivalent to the previous speed record holder using
  file.readlines(sizehint).  Note that if file is a real file object
  (as opposed to a file-like object), this is equivalent:

  for line in file.xreadlines():
      ...do something to line...

- The bisect module has new functions bisect_left, insort_left,
  bisect_right and insort_right.  The old names bisect and insort
  are now aliases for bisect_right and insort_right.  XXX_right
  and XXX_left methods differ in what happens when the new element
  compares equal to one or more elements already in the list:  the
  XXX_left methods insert to the left, the XXX_right methods to the
  right.  Code that doesn't care where equal elements end up should
  continue to use the old, short names ("bisect" and "insort").
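
To make the left/right distinction in the bisect entry above concrete,
here is a short sketch in current Python 3.  The sample lists are made
up; 1 and 1.0 compare equal, which makes it visible where the new
element lands.

```python
import bisect

scores = [10, 20, 20, 20, 30]
left = bisect.bisect_left(scores, 20)     # index before the run of equal items
right = bisect.bisect_right(scores, 20)   # index after the run of equal items
alias = bisect.bisect(scores, 20)         # old short name, same as bisect_right
print(left, right, alias)

a = [1.0, 2.0]
bisect.insort_left(a, 1)      # the int goes to the left of the equal float
b = [1.0, 2.0]
bisect.insort_right(b, 1)     # the int goes to the right of the equal float
print(a, b)
```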

- The new curses.panel module wraps the panel library that forms part
  of SYSV curses and ncurses.  Contributed by Thomas Gellekum.

- The SocketServer module now sets the allow_reuse_address flag by
  default in the TCPServer class.

- A new function, sys._getframe(), returns the stack frame pointer of
  the caller.  This is intended only as a building block for
  higher-level mechanisms such as string interpolation.
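
A sketch of the kind of higher-level mechanism the sys._getframe()
entry above has in mind, in current Python 3.  The interpolate() helper
is hypothetical and is not part of any library; it fills a template
from the local variables of whoever called it.

```python
import sys


def interpolate(template):
    """Hypothetical helper: fill %(name)s fields from the caller's locals."""
    caller = sys._getframe(1)            # depth 1 is the frame that called us
    return template % dict(caller.f_locals)


def greet():
    name = "world"
    count = 3
    return interpolate("hello %(name)s, %(count)d times")


print(greet())
```

Reaching into another frame like this is fragile, which is presumably
why the function name starts with an underscore.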

Build issues

- For Unix (and Unix-compatible) builds, configuration and building of
  extension modules is now greatly automated.  Rather than having to
  edit the Modules/Setup file to indicate which modules should be
  built and where their include files and libraries are, a
  distutils-based setup.py script now takes care of building most
  extension modules.  All extension modules built this way are built
  as shared libraries.  Only a few modules that must be linked
  statically are still listed in the Setup file; you won't need to
  edit their configuration.

- Python should now build out of the box on Cygwin.  If it doesn't,
  mail to Jason Tishler (jlt63 at users.sourceforge.net).

- Python now always uses its own (renamed) implementation of getopt()
  -- there's too much variation among C library getopt()
  implementations.

- C++ compilers are better supported; the CXX macro is always set to a
  C++ compiler if one is found.

Windows changes

- select module:  By default under Windows, a select() call
  can specify no more than 64 sockets.  Python now boosts
  this Microsoft default to 512.  If you need even more than
  that, see the MS docs (you'll need to #define FD_SETSIZE
  and recompile Python from source).

- Support for Windows 3.1, DOS and OS/2 is gone.  The Lib/dos-8x3
  subdirectory is no more!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping at lfw.org  Tue Jan 23 05:11:09 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:11:09 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>

Guido van Rossum wrote:
> Yes, wow!

Paul Prescod wrote:
> I apologize but I'm not clear on my responsibilities here, if any. I
> wrote a PEP for online help. I submitted a partial implementation.

Hi, guys.  Sorry i haven't been sending updates on what i'm doing.
Here's the current picture as i see it.

> Ping wrote a full implementation that basically supersedes mine.

My implementation is "full" in that it deploys and seems to work on
arbitrary modules as it stands, but it doesn't really supersede Paul's
because it leaves out the big piece of Paul's work that did conversion
from packaged HTML docs to plain text.

It also has the deficiency that it imports modules live; for untrusted
modules, this is a security risk.  I know Paul has been working on
stuff to compile a module into a kind of skeleton object that has all
the same name bindings but no live contents, and if that works reliably,
we should definitely try plugging that in.

> There are various ideas for improving it, but I think that we agree
> that the core is solid.

Yes.  I believe that as it stands, pydoc is useful enough to be a net
positive addition to the core.  inspect.py alone has been stable and
alpha-ready for some time, i believe.

Here is a summary of its status and work that remains.  pydoc has:

    inspecting live objects
    generating text docs from live objects
    generating HTML docs from live objects
    serving HTML docs from a little web server
    showing docs from the command line
    showing docs from within the interactive interpreter
    apropos-style module listing

It's missing the following, and Paul had stuff for this:

    inspecting unsafe modules
    generating text docs from packaged HTML (e.g. language reference)

It also needs these:

    generating docs from a file given on the command line (easy)
    more Windows and Mac testing and decisions
    various small bugfixes

This past week i've been messing around with Windows and Mac stuff,
trying to see whether it's possible to reliably spawn a webserver
and launch a web browser at the same time (this would seem to be a
good default action to do on GUI platforms).

In trying to do the latter i've found the webbrowser module pretty
unreliable, by the way.  For example, it relies on a constant delay
of 4 seconds to launch a new browser that can't be expected on all
platforms, and fails to launch Netscape 3 because it supplies an
illegal command-line option.  When i've found good cross-platform
ways to make this work i'll suggest some patches.

I've so far considered this project blocked only on cross-platform
testing -- do you agree?  While i know that inspecting unsafe modules
and processing packaged HTML are important features, i don't consider
them essential.


-- ?!ng




From ping at lfw.org  Tue Jan 23 05:14:50 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:14:50 -0800 (PST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> In trying to do the latter i've found the webbrowser module pretty
> unreliable, by the way.  For example, it relies on a constant delay
> of 4 seconds to launch a new browser that can't be expected on all
> platforms, and fails to launch Netscape 3 because it supplies an
> illegal command-line option.  When i've found good cross-platform
> ways to make this work i'll suggest some patches.

Oh, and i forgot to mention... i was pretty disappointed that:

    setenv BROWSER my_browser_program
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

doesn't execute "my_browser_program http://python.org/" as i would
have hoped.  Even for a known browser type:

    setenv BROWSER lynx
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

does not work as expected, either.  (Red Hat Linux here.)


-- ?!ng




From ping at lfw.org  Tue Jan 23 05:22:56 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:22:56 -0800 (PST)
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>

We can implement abstract interfaces (sequence, mapping, number) in
Python with the appropriate __special__ methods, but i don't see an
easy way to test if something supports one of these abstract interfaces
in Python.

At the moment, to see if something is a sequence i believe i have to
say something like

    try:
        x[0]
    except:
        pass  # not a sequence
    else:
        pass  # okay, it's a sequence

or

    if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
        ...

Is there, or should there be, a better way to do this?



-- ?!ng




From greg at cosc.canterbury.ac.nz  Tue Jan 23 05:46:26 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 17:46:26 +1300 (NZDT)
Subject: [Python-Dev] re: I think my set module is ready for prime time
In-Reply-To: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>
Message-ID: <200101230446.RAA01992@s454.cosc.canterbury.ac.nz>

Greg Wilson <gvwilson at nevex.com>:

> an efficient implementation capable of
> handling mutable values (i.e. one that would allow things like sets of
> sets)

I suspect that such a thing is impossible. To avoid a
linear search you have to take advantage of some kind
of hashing or ordering, which you can't do if your
objects can change their values out from under you.

Also, there's nothing to stop someone from mutating
two previously unequal elements so that they're equal.
Then you have a "set" with two identical elements,
which isn't a set any more, it's just a collection.

So, I submit that the very concept of a set only
makes sense for immutable values.
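
The argument above can be tried out with the built-in set type that
later versions of Python grew (it did not exist when this was written).
This is only a sketch in current Python 3; the Box class is made up and
deliberately breaks the rule that a hash must not change while the
object is in a set.

```python
class Box:
    """Hypothetical mutable value whose equality and hash follow its contents."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


a, b = Box(1), Box(2)
members = {a, b}        # two unequal elements, so far so good

b.value = 1             # mutate b so that it now equals a

print(a == b, len(members))        # equal elements, yet the set still holds both
print(Box(2) in members)           # the entry filed under hash(2) no longer matches
print(len(set(list(members))))     # rehashing from scratch merges the duplicates
```

The hash-based container only stays a set as long as nobody changes the
values it was built from.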

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Tue Jan 23 06:03:18 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 00:03:18 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJBIKAA.tim.one@home.com>

[?!ng]
> ...
> At the moment, to see if something is a sequence i believe i have to
> say something like
>
>     try:
>         x[0]
>     except:
>         # not a sequence
>     else:
>         # okay, it's a sequence
>
> or
>
>     if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
>         ...
>
> Is there, or should there be, a better way to do this?

Dunno.  What's a sequence?  If you want to know whether x[0] will blow up,
trying x[0] is the most obvious way.  BTW, I expect trying x[:0] is a better
idea:  doesn't succeed for dicts, and doesn't blow up for an irrelevant
reason if x is an empty sequence.  BTW2, your second method suggests an
uncomfortable truth:  many contexts that want "a sequence" don't want
strings to pass the test, despite that strings are as much sequences as
lists in Python, no matter how "a sequence" is defined.
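
A minimal sketch of the x[0] versus x[:0] probes discussed above,
written for a current Python 3 interpreter.  The two helper names are
made up for illustration.

```python
def supports_index_zero(x):
    """Probe with x[0]: fooled by dicts, fails on empty sequences."""
    try:
        x[0]
    except Exception:
        return False
    return True


def supports_slicing(x):
    """Probe with x[:0]: rejects dicts, accepts empty sequences."""
    try:
        x[:0]
    except Exception:
        return False
    return True


for candidate in ([], (), "abc", {0: "a"}, 5):
    print(repr(candidate), supports_index_zero(candidate), supports_slicing(candidate))
```

Strings pass both probes, so code that wants "a sequence but not a
string" still has to say so explicitly.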

afraid that-what-you-want-to-do-with-it-is-more-important-than-what-
    python-calls-it-ly y'rs  - tim




From ping at lfw.org  Tue Jan 23 06:27:30 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 21:27:30 -0800 (PST)
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: <20010122124159.A14999@thyrsus.com>
Message-ID: <Pine.LNX.4.10.10101222125340.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Eric S. Raymond wrote:
> \section{\module{set} ---
>          Basic set algebra for Python}

I'd like to look at the module.  Did you actually show us the code
for this, or am i a blind doofus?

(Please, no answers to the unasked question of whether i am a doofus.)


-- ?!ng




From tim.one at home.com  Tue Jan 23 07:05:26 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 01:05:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122064400.A26543@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEJEIKAA.tim.one@home.com>

In finding and repairing the test_extcall bug, Neil and Thomas have once
again contributed beyond the call of duty.  Thank you!  It took some doing
to convince Guido to release his Dutch Death Grip on the PythonLabs coffers,
but in the end he was overcome by the moral necessity of rewarding you
sterling fellows for your golden deeds:  you're both entitled to free(*)--
yes, FREE(*)! --copies of all Python 2.1 alpha, *and* beta, releases(*)!

you-wouldn't-believe-how-much-he-charges-us-ly y'rs  - tim


(*) Does not apply to Jython releases.  All applicable taxes are the
responsibility of the recipient.  No warranty is expressed or implied.  This
offer has not been reviewed or approved by CWI, CNRI, BeOpen.com, or Digital
Creations 2.  Export restrictions may apply.  By acceptance of this offer,
recipient grants perpetual license to use their name, image and likeness in
Python promotional materials without compensation.  Packaging, handling,
shipping and insurance costs to be borne by recipient, but in no case to
exceed 1 (one) US$/byte.  This offer may be withdrawn at any time, including
but not limited to retroactively, at the sole discretion of Guido van
Rossum, or such of his heirs and successors as he may designate from time to
time.




From martin at mira.cs.tu-berlin.de  Tue Jan 23 09:14:32 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 23 Jan 2001 09:14:32 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>

> i don't see an easy way to test if something supports one of these
> abstract interfaces in Python.

Why do you want to test for that? If you have an algorithm that only
operates on integer-indexed things, what can you do if the test fails?

So it is always better to just use the object in the algorithm, and
let it break with an exception if somebody passes a bad object.

Regards,
Martin



From mal at lemburg.com  Tue Jan 23 10:08:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:08:24 +0100
Subject: [Python-Dev] webbrowser.py
References: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A6D4A08.B3806984@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> > In trying to do the latter i've found the webbrowser module pretty
> > unreliable, by the way.  For example, it relies on a constant delay
> > of 4 seconds to launch a new browser that can't be expected on all
> > platforms, and fails to launch Netscape 3 because it supplies an
> > illegal command-line option.  When i've found good cross-platform
> > ways to make this work i'll suggest some patches.
> 
> Oh, and i forgot to mention... i was pretty disappointed that:
> 
>     setenv BROWSER my_browser_program
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> doesn't execute "my_browser_program http://python.org/" as i would
> have hoped.  Even for a known browser type:
> 
>     setenv BROWSER lynx
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> does not work as expected, either.  (Red Hat Linux here.)

Hmm, lynx should work (the module has explicit support for it)
and yes, I agree, webbrowser should trust BROWSER and use a
generic calling mechanism (program <url>) for opening the
URL.
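
For what it's worth, the generic "program <url>" mechanism could look
roughly like this in current Python 3.  This is only a sketch:
browser_command() is a hypothetical helper, not something webbrowser.py
provides, and it only builds the command without running it.

```python
import os
import shlex


def browser_command(url, environ=None):
    """Hypothetical helper: build [program, ..., url] from $BROWSER, or None."""
    environ = os.environ if environ is None else environ
    program = environ.get("BROWSER")
    if not program:
        return None                      # caller falls back to its own search
    return shlex.split(program) + [url]  # BROWSER may carry its own options


print(browser_command("http://python.org/", {"BROWSER": "my_browser_program"}))
print(browser_command("http://python.org/", {"BROWSER": "lynx -accept_all_cookies"}))
print(browser_command("http://python.org/", {}))
```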

Too late for 2.1a1, but maybe for a2 ?!

BTW, I think that the second line here is causing the problem:

class CommandLineBrowser:
    _browsers = [] # <- this overrides the global of the same name
    if os.environ.get("DISPLAY"):
        _browsers.extend([
            ("netscape", "netscape %s >/dev/null &"),
            ("mosaic", "mosaic %s >/dev/null &"),
            ])
    _browsers.extend([
        ("lynx", "lynx %s"),
        ("w3m", "w3m %s"),
        ])


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Tue Jan 23 10:15:11 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:15:11 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>
Message-ID: <3A6D4B9F.38B17046@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > i don't see an easy way to test if something supports one of these
> > abstract interfaces in Python.
> 
> Why do you want to test for that? If you have an algorithm that only
> operates on integer-indexed things, what can you do if the test fails?
> 
> So it is always better to just use the object in the algorithm, and
> let it break with an exception if somebody passes a bad object.

Right. 

Polymorphic code will usually get you more out of an
algorithm than type-safe or interface-safe code.

BTW, there are Python interfaces to PySequence_Check() and
PyMapping_Check() buried in the builtin operator module in case
you really do care ;) ...

	operator.isSequenceType()
	operator.isMappingType()
	+ some other C style _Check() APIs

These only look at the type slots though, so Python instances
will appear to support everything but when used fail with
an exception if they don't provide the proper __xxx__ hooks.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 10:17:30 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 04:17:30 -0500
Subject: [Python-Dev] webbrowser.py
Message-ID: <20010123041730.A25165@thyrsus.com>

Ping's complaints are justified -- I've been looking at and testing
webbrowser.py and it's a mess.  Among other things:

1. The BROWSER variable is not interpreted properly.

2. The code is stupid about loading platform support it doesn't need.

3. It's not possible to specify lynx as a browser under Unix, because the
   computation of available browsers is split in two and partly done inside
   the CommandLineBrowser class.

4. The module code is excessively hard to read, obscuring these bugs.

Our mistake was hurriedly merging the launcher code from IDLE with the
browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
code is a bad, overcomplicated architecture with a nasty seam in it.

As co-designer/implementor I should have caught this sooner, but I was
in a hurry to get a CML2 prototype out the door and didn't test
anything but the case I needed.  My apologies to all.

I'm rewriting to fix these problems now.  Documented semantics of entry
points will be preserved.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat



From mal at lemburg.com  Tue Jan 23 11:26:16 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 11:26:16 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com>
Message-ID: <3A6D5C48.A076DA0@lemburg.com>

"Eric S. Raymond" wrote:
> 
> Guido van Rossum <guido at digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.
> 
> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized
> one way or the other for no good reason.
> 
> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.

There's also a kjSet.py available at Aaron's site:

	http://www.chordate.com/kwParsing/index.html

which is a pure Python version of the C extension's kjSet type.
 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)
> 
> 2. It's simple for application programmers to use.  No extension module
> to integrate.
> 
> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for
> the fact that you can't introduce duplicates with the mutators.
> 
> 4. It's already completely documented in a form suitable for the library.
> 
> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

All very well, but are sets really that essential to every
day Python programming ? If we include sets then we ought to
also include graphs, tries, btrees and all those other goodies
we have in computer science. All of these types are available
out there, but I believe the audience who really cares for these
types is also capable of downloading the extensions and installing
them.

It would be nice if all of these extensions could go into a SUMO
edition of Python though... together with your set module.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 12:08:06 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 06:08:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D5C48.A076DA0@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 11:26:16AM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com>
Message-ID: <20010123060806.A25436@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> All very well, but are sets really that essential to every
> day Python programming ? If we include sets then we ought to
> also include graphs, tries, btrees and all those other goodies
> we have in computer science.

I use sets a lot.  And there was enough demand to generate a PEP.

But the wider question here is how seriously we take "batteries are
included" as a design principle.  Does a facility have to be useful
*every day* to be worth being in the standard library?  And if so,
what are things like the POP3 and IMAP libraries (or, for that matter,
my own shlex and netrc modules) doing there?

I don't think so.  I think there are at least four different
possible reasons for something to be in the standard library:

1. It's useful every day.

2. It's useful less frequently than every day, but is a stable
cross-platform implementation of a wheel that would otherwise have to
be reinvented frequently.  That is, you can solve it *once* and have a
zero-maintenance increment to the power of the language.

3. It's a technique that's not often used, and not necessarily stable 
in the face of platform variations, but nothing else will do
when you need it and it's notably difficult to get right.  (popen2 and
BaseHTTPServer would be good examples of this.)

4. It's a developer checklist feature that improves Python's competitive
position against Perl, Tcl, and other contenders for the same ecological
niche.

IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
even if not under 1 and 3.  

This question keeps coming up in different guises.  I'm often the one to
raise it, because I favor an aggressive interpretation of "batteries
are included" that would pull in a lot of stuff.  Yes, this makes more
work for us -- but I think it's work we should be doing.  

While minimalism is an excellent design heuristic for the core language,
I think it's a bad one for the libraries.  Python is a high-level language
and programmers using it both expect and deserve high-level libraries --
yes, including graphs/tries/btrees and all that computer science stuff.

Just as much to the point, Python is competing against languages like
Perl that frequently get design wins against it because of the
richness of the environment *they* are willing to carry around.

Guido and Tim and others are more conservative than I, which would be
OK -- but it seems to me that the conservatives do not have consistent
or well-thought-out criteria for what to include, which is *not* OK.
We need to solve this problem.

Some time back I initiated a library guidelines PEP, then dropped it
due to press of overwork.  But the general question is going to keep
coming up and we ought to have policy guidelines that potential 
library developers can understand.  

Should I pick this up again?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

I do not find in orthodox Christianity one redeeming feature.
	-- Thomas Jefferson



From mal at lemburg.com  Tue Jan 23 12:50:39 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 12:50:39 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com>
Message-ID: <3A6D700F.7A9E2509@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > All very well, but are sets really that essential to every
> > day Python programming ? If we include sets then we ought to
> > also include graphs, tries, btrees and all those other goodies
> > we have in computer science.
> 
> I use sets a lot.  And there was enough demand to generate a PEP.

Sure, but sets are fairly easy to implement using Python dictionaries
-- at least at the level normally needed by Python programs. Sets, queues
and graphs are examples of data types which can have many
different faces; it is hard to design APIs for these which meet 
everyone's needs.
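
A minimal sketch of the dictionary-based approach mentioned above, written
for present-day Python rather than the 2.0 interpreter under discussion.
The helper names (make_set, set_union, set_intersection) are hypothetical
and only illustrate the idea of using dictionary keys as set elements:

```python
# Sketch only: a "set" is a dict whose keys are the elements.
# The values are unused placeholders.

def make_set(items):
    """Return a dict whose keys are the distinct items."""
    result = {}
    for item in items:
        result[item] = None
    return result

def set_union(a, b):
    """Keys present in either dict."""
    result = a.copy()
    result.update(b)
    return result

def set_intersection(a, b):
    """Keys present in both dicts."""
    result = {}
    for key in a:
        if key in b:
            result[key] = None
    return result

s = make_set([1, 2, 2, 3])
t = make_set([3, 4])
print(sorted(set_union(s, t)))
print(sorted(set_intersection(s, t)))
```

Note that this only works for hashable elements, which is one of the
ways such an API can fail to meet everyone's needs.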
 
> But the wider question here is how seriously we take "batteries are
> included" as a design principle.  Does a facility have to be useful
> *every day* to be worth being in the standard library?  And if so,
> what are things like the POP3 and IMAP libraries (or, for that matter,
> my own shlex and netrc modules) doing there?

You can argue the same way for all kinds of extensions and
packages you find in the Vaults. That's why there's demand for
a different packaging of Python and this is what Moshe's
PEP 206 addresses:

	http://python.sourceforge.net/peps/pep-0206.html

> I don't think so. I think there are at least four different
> possible reasons for something to be in the standard library:
> 
> 1. It's useful every day.
> 
> 2. It's useful less frequently than every day, but is a stable
> cross-platform implementation of a wheel that would otherwise have to
> be reinvented frequently.  That is, you can solve it *once* and have a
> zero-maintenance increment to the power of the language.
> 
> 3. It's a technique that's not often used, and not necessarily stable
> in the face of platform variations, but nothing else will do
> when you need it and it's notably difficult to get right.  (popen2 and
> BaseHTTPServer would be good examples of this.)
> 
> 4. It's a developer checklist feature that improves Python's competitive
> position against Perl, Tcl, and other contenders for the same ecological
> niche.
> 
> IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
> even if not under 1 and 3.
> 
> This question keeps coming up in different guises.  I'm often the one to
> raise it, because I favor an aggressive interpretation of "batteries
> are included" that would pull in a lot of stuff.  Yes, this makes more
> work for us -- but I think it's work we should be doing.
> 
> While minimalism is an excellent design heuristic for the core language,
> I think it's a bad one for the libraries.  Python is a high-level language
> and programmers using it both expect and deserve high-level libraries --
> yes, including graphs/tries/btrees and all that computer science stuff.
> 
> Just as much to the point, Python is competing against languages like
> Perl that frequently get design wins against it because of the
> richness of the environment *they* are willing to carry around.
> 
> Guido and Tim and others are more conservative than I, which would be
> OK -- but it seems to me that the conservatives do not have consistent
> or well-thought-out criteria for what to include, which is *not* OK.
> We need to solve this problem.
> 
> Some time back I initiated a library guidelines PEP, then dropped it
> due to press of overwork.  But the general question is going to keep
> coming up and we ought to have policy guidelines that potential
> library developers can understand.
> 
> Should I pick this up again?

Hmm, we already have the PEP 206 which focusses on the topic.
Perhaps you could work with Moshe to sort out the "which
batteries do we need" sub-topic ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 13:20:46 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 07:20:46 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D700F.7A9E2509@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 12:50:39PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com>
Message-ID: <20010123072046.A25593@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> > But the wider question here is how seriously we take "batteries are
> > included" as a design principle.  Does a facility have to be useful
> > *every day* to be worth being in the standard library?  And if so,
> > what are things like the POP3 and IMAP libraries (or, for that matter,
> > my own shlex and netrc modules) doing there?
> 
> You can argue the same way for all kinds of extensions and
> packages you find in the Vaults. That's why there's demand for
> a different packaging of Python and this is what Moshe's
> PEP 206 addresses:
> 
> 	http://python.sourceforge.net/peps/pep-0206.html

Muttering "PEP 206" evades the fundamental problem rather than solving it.

Not that I'm saying Moshe hasn't made a valiant effort, within the political
constraint that the BDFL and others seem unwilling to confront the deeper 
issue.  But PEP 206 is not enough.  Here is why:

1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
Guido issues will quickly become of mostly theoretical interest -- because
Red Hat and everybody else will move to Sumo instantly, figuring they have
nothing to lose by including more features.

2. If by some chance I'm wrong about 1, the outcome will be worse;
we'll in effect have fragmented the language, because there won't be
consistency in what library stuff is available between Sumo and
non-Sumo builds on the same platform.

3. There are documentation issues as well.  It's already a blot on
Python that the standard documentation set doesn't cover Tkinter.  In
the Sumo distribution, the gap between what's installed and what's
documented is likely to widen further.  Developers will see this as
pointlessly irritating -- and they'll be right.

The stock distribution should *be* the Sumo distribution.  If we're really
so terrified of the extra maintenance load, then the right fix is to
mark some modules and documentation as "externally maintained" with 
prominent pointers back to the responsible people.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The day will come when the mystical generation of Jesus by the Supreme
Being as his father, in the womb of a virgin, will be classed with the
fable of the generation of Minerva in the brain of Jupiter.
	-- Thomas Jefferson, 1823



From mal at lemburg.com  Tue Jan 23 13:48:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 13:48:09 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com>
Message-ID: <3A6D7D89.A6BE1B74@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > > But the wider question here is how seriously we take "batteries are
> > > included" as a design principle.  Does a facility have to be useful
> > > *every day* to be worth being in the standard library?  And if so,
> > > what are things like the POP3 and IMAP libraries (or, for that matter,
> > > my own shlex and netrc modules) doing there?
> >
> > You can argue the same way for all kinds of extensions and
> > packages you find in the Vaults. That's why there's demand for
> > a different packaging of Python and this is what Moshe's
> > PEP 206 addresses:
> >
> >       http://python.sourceforge.net/peps/pep-0206.html
> 
> Muttering "PEP 206" evades the fundamental problem rather than solving it.
> 
> Not that I'm saying Moshe hasn't made a valiant effort, within the political
> constraint that the BDFL and others seem unwilling to confront the deeper
> issue.  But PEP 206 is not enough.  Here is why:
> 
> 1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
> Guido issues will quickly become of mostly theoretical interest -- because
> Red Hat and everybody else will move to Sumo instantly, figuring they have
> nothing to lose by including more features.
> 
> 2. If by some chance I'm wrong about 1, the outcome will be worse;
> we'll in effect have fragmented the language, because there won't be
> consistency in what library stuff is available between Sumo and
> non-Sumo builds on the same platform.
> 
> 3. There are documentation issues as well.  It's already a blot on
> Python that the standard documentation set doesn't cover Tkinter.  In
> the Sumo distribution, the gap between what's installed and what's
> documented is likely to widen further.  Developers will see this as
> pointlessly irritating -- and they'll be right.
> 
> The stock distribution should *be* the Sumo distribution.  If we're really
> so terrified of the extra maintenance load, then the right fix is to
> mark some modules and documentation as "externally maintained" with
> prominent pointers back to the responsible people.

That's your POV; others think differently, and since this is not
a democracy, the Sumo distribution is a feasible way of satisfying
both needs.

There are a few other issues to consider as well:

* licensing is a problem (and this is also mentioned in the PEP 206)
  since some of the nicer additions are GPLed and thus not
  in the spirit of Python's closed-source friendliness which
  has provided it with a large user base in the commercial field

* package authors are not all the same and some may not want
  to split their distribution due to the integration of their
  package in a Sumo-distribution

* the packages mentioned in PEP 206 are very complex and usually
  largish; maintaining them will cause much more effort compared
  to the standard lib modules and extensions

* the build process varies widely between packages; even though
  we have distutils, some of the packages extend it to fit
  their specific needs (which is OK, but causes extra efforts
  in getting the build process combined)

I'm not objecting to the Sumo-distribution project; to the 
contrary -- I tried a similar project a few years ago:
the Python PowerTools distribution which you can download
from:

	http://www.lemburg.com/python/PowerTools-0.2.zip

The project died quickly though, as I wasn't able to keep
up with the maintenance effort.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From akuchlin at cnri.reston.va.us  Tue Jan 23 14:40:06 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Tue, 23 Jan 2001 08:40:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D7D89.A6BE1B74@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 01:48:09PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com>
Message-ID: <20010123084006.A23485@newcnri.cnri.reston.va.us>

On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
>There are a few other issues to consider as well:
>   <good list deleted>

To add a few:

* The larger the amount of code in the distribution, the more effort it is
  to maintain it all.

* Minor fixes aren't available until the next Python release.  For example,
  to drag out the XML code again: there have been two PyXML releases since
  Python 2.0 fixing various bugs, but someone who sticks to installing just 
  Python will not be able to get at those bugfixes until April (when 2.1
  is supposed to get finalized). 

If there were a core Python distribution and a sumo distribution, and the
sumo distribution was the one that most people downloaded and used, that
would be perfectly OK.  Practically no one assembles their own Linux
distribution, and that's not considered a problem.  To some degree, if
you're using a well-packaged Linux distribution such as Debian, you also
have a Python distribution mechanism with intermodule dependencies; we just
have to reinvent the wheel for people on other platforms.

>The project died quickly though, as I wasn't able to keep
>up with the maintenance effort.

Interesting.  Did you get much feedback indicating that people used it much?
Perhaps when you were doing that effort the Python community was composed
more of self-reliant early adopter types; there are probably more newbies
around now.

--amk



From mal at lemburg.com  Tue Jan 23 15:05:13 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 15:05:13 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com> <20010123084006.A23485@newcnri.cnri.reston.va.us>
Message-ID: <3A6D8F99.53A0F411@lemburg.com>

Andrew Kuchling wrote:
> 
> On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
> >There are a few other issues to consider as well:
> >   <good list deleted>
> 
> To add a few:
> 
> * The larger the amount of code in the distribution, the more effort it is
>   to maintain it all.
> 
> * Minor fixes aren't available until the next Python release.  For example,
>   to drag out the XML code again: there have been two PyXML releases since
>   Python 2.0 fixing various bugs, but someone who sticks to installing just
>   Python will not be able to get at those bugfixes until April (when 2.1
>   is supposed to get finalized).
> 
> If there were a core Python distribution and a sumo distribution, and the
> sumo distribution was the one that most people downloaded and used, that
> would be perfectly OK.  Practically no one assembles their own Linux
> distribution, and that's not considered a problem.  To some degree, if
> you're using a well-packaged Linux distribution such as Debian, you also
> have a Python distribution mechanism with intermodule dependencies; we just
> have to reinvent the wheel for people on other platforms.
> 
> >The project died quickly though, as I wasn't able to keep
> >up with the maintenance effort.
> 
> Interesting.  Did you get much feedback indicating that people used it much?

Not much -- the interested parties were mostly Python experts (the
lib started out as a project called expert-lib).

> Perhaps when you were doing that effort the Python community was composed
> more of self-reliant early adopter types; there are probably more newbies
> around now.

True. The included packages are dated 1997-1998 -- at that time
Starship was just starting to get off the ground (things are moving
at a much faster pace now).

The PowerTools package still uses the Makefile.pre.in mechanism
(with much success though) as distutils wasn't even considered
at the time. Perhaps Moshe could pick this up to have a head
start for Sumo-Python ?!

Some of the included packages are not available elsewhere, AFAIK,
so it may well be worthwhile having a look (e.g. the LGPLed trie and
btree implementations donated by John W. M. Stevens).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Tue Jan 23 15:06:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 09:06:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 04:17:30 EST."
             <20010123041730.A25165@thyrsus.com> 
References: <20010123041730.A25165@thyrsus.com> 
Message-ID: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>

> Ping's complaints are justified -- I've been looking at and testing
> webbrowser.py and it's a mess.  Among other things:
> 
> 1. The BROWSER variable is not interpreted properly.
> 
> 2. The code is stupid about loading platform support it doesn't need.
> 
> 3. It's not possible to specify lynx as a browser under Unix, because the
>    computation of available browsers is split in two and partly done inside
>    the CommandLineBrowser class.
> 
> 4. The module code is excessively hard to read, obscuring these bugs.
> 
> Our mistake was hurriedly merging the launcher code from IDLE with the
> browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
> code is a bad, overcomplicated architecture with a nasty seam in it.
> 
> As co-designer/implementor I should have caught this sooner, but I was
> in a hurry to get a CML2 prototype out the door and didn't test
> anything but the case I needed.  My apologies to all.
> 
> I'm rewriting to fix these problems now.  Documented semantics of entry
> points will be preserved.

Excellent, Eric!  That's the spirit.

Can you point me to docs explaining the meaning of the BROWSER
environment variable?  I've never heard of it...  The last new
environment variables I learned were PAGER and EDITOR, probably 15
years ago when 4.1BSD was released... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 23 15:22:26 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 09:22:26 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:06:47AM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
Message-ID: <20010123092226.A25968@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

You've never heard of BROWSER because I invented it and have not
widely popularized it yet :-).  Ping knew about it either because he
read the module code and saw that it was supposed to work, or because
he remembered the design discussion when webbrowser.py was first
implemented.

I've had conversations with some key Perl and Tcl people (Larry Wall,
Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
agree it's a good idea.  I'll probably hack support for it into Perl's
browser launcher next.

It's documented in the version of libwebbrowser.tex now in the CVS tree.
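
A rough sketch of how a BROWSER-style value can be interpreted, assuming
the convention as the webbrowser documentation describes it: a list of
entries separated by os.pathsep, tried in order, where an entry
containing %s is a command-line template and any other entry is a
browser name.  The helper names and the "mybrowser" command are
hypothetical; this is not the code in webbrowser.py:

```python
import os
import shlex

def browser_candidates(value):
    """Split a BROWSER-style value into its non-empty entries, in order."""
    return [entry for entry in value.split(os.pathsep) if entry.strip()]

def command_for(entry, url):
    """Build an argument list for one entry (hypothetical helper).

    An entry containing %s is treated as a command-line template;
    otherwise it is taken to be a browser name that receives the URL
    as its only argument.
    """
    if "%s" in entry:
        return [arg.replace("%s", url) for arg in shlex.split(entry)]
    return [entry, url]

# "mybrowser --open %s" is a made-up command used only for illustration.
value = os.pathsep.join(["lynx", "mybrowser --open %s"])
for entry in browser_candidates(value):
    print(command_for(entry, "http://www.python.org/"))
```

A launcher would then try each resulting command in turn until one
succeeds.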
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857



From nas at arctrix.com  Tue Jan 23 09:30:56 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 00:30:56 -0800
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
Message-ID: <20010123003056.A28309@glacier.fnational.com>

Why is the configure.in file set to always use "install-sh"?
There is a comment that says:

    # Install just never works :-(

I don't think that statement is accurate.  /usr/bin/install works
quite well on my machine.  The only comments I can find in the
changelog are:

    revision 1.16
    date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
    add INSTALL_PROGRAM and INSTALL_DATA; check for getopt

and:

    revision 1.5
    date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
    Simplify value of INSTALL (always 'cp').

Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
documentation seems to indicate that it does what we want.

 Neil



From guido at digicool.com  Tue Jan 23 16:31:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:31:39 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: Your message of "Tue, 23 Jan 2001 10:15:11 +0100."
             <3A6D4B9F.38B17046@lemburg.com> 
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>  
            <3A6D4B9F.38B17046@lemburg.com> 
Message-ID: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>

> Polymorphic code will usually get you more out of an 
> algorithm, than type-safe or interface-safe code.

Right.

But there are times when people want to write methods that take
e.g. either a sequence or a mapping, and need to distinguish between
the two.  That's not easy in Python!  Java and C++ support it very
well though, and thus we'll always keep seeing this kind of
complaint.  Not sure what to do, except to recommend "find out which
methods you expect in one case but not in the other (e.g. keys()) and
do a hasattr() test for that."
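
A small sketch of that recommendation, assuming keys() is the method
that tells the two cases apart.  The function names are hypothetical,
and an object that defines keys() without behaving like a mapping would
fool the test:

```python
def looks_like_mapping(obj):
    """Guess 'mapping' when the object offers a keys() method."""
    return hasattr(obj, "keys")

def values_of(container):
    """Return the stored values of either a mapping or a sequence."""
    if looks_like_mapping(container):
        return [container[key] for key in container.keys()]
    return list(container)

print(values_of({"a": 1, "b": 2}))
print(values_of((1, 2)))
```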

> BTW, there are Python interfaces to PySequence_Check() and
> PyMapping_Check() buried in the builtin operator module in case
> you really do care ;) ...
> 
> 	operator.isSequenceType()
> 	operator.isMappingType()
> 	+ some other C style _Check() APIs
> 
> These only look at the type slots though, so Python instances
> will appear to support everything but when used fail with
> an exception if they don't provide the proper __xxx__ hooks.

Yes, these should probably be deprecated.  I certainly have never used
them!  (The operator module doesn't seem to get much use in
general...  Was it a bad idea?)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 16:49:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:49:23 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 15:13:09 EST."
             <20010122151309.C15236@thyrsus.com> 
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>  
            <20010122151309.C15236@thyrsus.com> 
Message-ID: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

Actually, I thought that Greg's proposal has some charm: it seems to
be using a natural extension of the existing dictionary syntax, where
a set is a dictionary without the values.  I haven't thought about
this deeply enough, but I see a lot of potential here.

I understand that you have probably given this more thought than I
have recently, so I'd like to see your more detailed analysis of what
you do and don't like about Greg's proposal!

> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.
> 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)

I haven't read your docs yet (and no time because Digital Creations is
requiring my attention all of today), but I expect that designing a
universal set type, one that is good enough to be used in all sorts of
applications, is very difficult.  

> 2. It's simple for application programmers to use.  No extension module
> to integrate.

This is a silly argument for wanting something to be added to the
core.  If it's part of the core, the need for an extension is
immaterial because that extension will always be available.  So
I conclude that your module is set up perfectly for a popular module
in the Vaults. :-)

> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for 
> the fact that you can't introduce duplicates with the mutators.

Ah, so you see a set as an extension of a sequence.  That may be the
big rift between your version and Greg's PEP: are sets more like
sequences or more like dictionaries?

> 4. It's already completely documented in a form suitable for the library.

Much appreciated.

> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

I'll be the judge of that, and since you prefer not to show your
source code (why is that?), I can't tell yet.

[...time flows...]

Having just skimmed your docs, I'm disappointed that you chose lists
as your fundamental representation type -- this makes it slow to test
for membership and hence makes intersection and union slow.  I suppose
that you have evidence from using this that those operations aren't
used much, or not for large sets?  This is one of the problems with
coming up with a set type for the core: it has to work for (nearly)
everybody.  It's no big deal if the Vaults contain three or more set
modules -- perfect even, people can choose the best one for their
purpose.  But in the core, there's only room for one set type or
module.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 23 17:30:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 11:30:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:49:23AM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <20010123113050.A26162@thyrsus.com>

Guido van Rossum <guido at digicool.com>: 
> I understand that you have probably given this more thought than I
> have recently, so I'd like to see your more detailed analysis of what
> you do and don't like about Greg's proposal!

I've already covered my big objection, the fact that it doesn't
support the degree of polymorphic crossover one might expect with
sequence types (and Greg has agreed that I have a point there).
Another problem is the lack of support for mutable elements (and yes,
I'm quite aware of the problems with this.)

One thing I do like is the proposal for an actual set input syntax.  Of course
this would require that the set type become one of the builtins, with 
compiler support.

> I haven't read your docs yet (and no time because Digital Creations is
> requiring my attention all of today), but I expect that designing a
> universal set type, one that is good enough to be used in all sorts of
> applications, is very difficult.  

For "difficult" read "can't be done".  This is one of those cases where
no matter what implementation you choose, some of the operations you want
to be cheap will be worst-case quadratic.  Life is like that.  So I chose
a dead-simple representation and accepted quadratic times for 
union/intersection.

> > 2. It's simple for application programmers to use.  No extension module
> > to integrate.
> 
> This is a silly argument for wanting something to be added to the
> core.  If it's part of the core, the need for an extension is
> immaterial because that extension will always be available.  So
> I conclude that your module is set up perfectly for a popular module
> in the Vaults. :-)

Reasonable point.
 
> > 3. It's unsurprising.  My set objects behave almost exactly like other
> > mutable sequences, with all the same built-in methods working, except for 
> > the fact that you can't introduce duplicates with the mutators.
> 
> Ah, so you see a set as an extension of a sequence.  That may be the
> big rift between your version and Greg's PEP: are sets more like
> sequences or more like dictionaries?

Indeed it is.  

> > 5. It's simple enough not to cause you maintenance hassles down the
> > road, and even if it did the maintainer is unlikely to disappear :-).
> 
> I'll be the judge of that, and since you prefer not to show your
> source code (why is that?), I can't tell yet.

No nefarious concealment going on here :-), I've sent versions of
the code to Greg and Ping already.  I'll shoot you a copy too.
 
> Having just skimmed your docs, I'm disappointed that you chose lists
> as your fundamental representation type -- this makes it slow to test
> for membership and hence makes intersection and union slow.

Not quite.  Membership test is still linear-time; so is adding and deleting
elements.  It's true that union and intersection are quadratic, but see below.

>                                                      I suppose
> that you have evidence from using this that those operations aren't
> used much, or not for large sets?

Exactly!  In my experience the usage pattern of a class like this runs
heavily to small sets (usually < 64 elements); membership tests
dominate usage, with addition and deletion of elements running second
and the "classical" boolean operations like union and intersection
being uncommon.

What you get by going with a dictionary representation is that
membership test becomes close to constant-time, while insertion and
deletion become sometimes cheap and sometimes quite expensive
(depending of course on whether you have to allocate a new 
hash bucket).  Given the usage pattern I described, the overall
difference in performance is marginal.
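
The membership-cost half of that tradeoff can be made concrete with a
small sketch.  It is written for a current Python 3 interpreter rather
than the 2.x of this thread, and the Counted wrapper and the helper
name are assumptions invented for the illustration; neither is part of
the posted module.  It counts how many equality comparisons one
membership test costs in a 64-element list versus a dictionary keyed on
the same elements.

```python
class Counted:
    """Element wrapper that counts == evaluations (illustrative only)."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.eq_calls += 1
        return isinstance(other, Counted) and self.value == other.value

    def __hash__(self):
        # Dictionary keys must be hashable; list elements need not be.
        return hash(self.value)


def comparisons_for_lookup(container, probe):
    """Return (found, number of == calls) for one membership test."""
    Counted.eq_calls = 0
    found = probe in container
    return found, Counted.eq_calls


items = [Counted(i) for i in range(64)]
as_list = list(items)            # list representation: linear scan
as_dict = dict.fromkeys(items)   # dictionary representation: hashed lookup

# Worst case for the list: the probe equals the last element.
list_found, list_calls = comparisons_for_lookup(as_list, Counted(63))
dict_found, dict_calls = comparisons_for_lookup(as_dict, Counted(63))
```

The list scan compares against every element until it hits a match,
while the dictionary hashes the probe and compares only against entries
whose hash matches.  The price is the __hash__ requirement, which is
the reason the module's docstring gives for using lists.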

>                              This is one of the problems with
> coming up with a set type for the core: it has to work for (nearly)
> everybody.

As I pointed out above (and someone else on the list had made the same point
earlier), "works for everybody" isn't really possible here.  So my solution
does the next best thing -- pick a choice of tradeoffs that isn't obviously
worse than the alternatives and keeps things bog-simple.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!
-------------- next part --------------
"""
A set-algebra module for Python.

The functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
They are insensitive to the types of the elements.

Lists are used rather than dictionaries so the elements can be mutable.

"""
# Design and implementation by ESR, January 2001.

def setify(list1):		# Used by set constructor
    "Remove duplicates in sequence."
    res = []
    for i in range(len(list1)):
	duplicate = 0
        for j in range(i):
	    if list1[i] == list1[j]:
		duplicate = 1
		break
	if not duplicate:
	    res.append(list1[i])
    return res

def union(list1, list2):		# Used for |
    "Compute set union of sequences."
    res = list1[:]
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def intersection(list1, list2):		# Used for &
    "Compute set intersection of sequences."
    res = []
    for x in list1:
	if x in list2:
	    res.append(x)
    return res

def difference(list1, list2):		# Used for -
    "Compute set difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    return res

def symmetric_difference(list1, list2):	# Used for ^
    "Compute set symmetric-difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def cartesian(list1, list2):		# Used for *
    "Cartesian product of sequences considered as sets."
    res = []
    for x in list1:
	for y in list2:
	    res.append((x,y))
    return res

def equality(list1, list2):
    "Test sequences considered as sets for equality."
    if len(list1) != len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    for x in list2:
        if not x in list1:
            return 0
    return 1

def proper_subset(list1, list2):
    "Return 1 if first argument is a proper subset of second, 0 otherwise."
    if not len(list1) < len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def subset(list1, list2):
    "Return 1 if first argument is a subset of second, 0 otherwise."
    if not len(list1) <= len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def powerset(base):
    "Compute the set of all subsets of a set."
    powerset = []
    for n in xrange(2 ** len(base)):
	subset = []
	for e in xrange(len(base)):
	     if n & 2 ** e:
		subset.append(base[e])
	powerset.append(subset)
    return powerset

class set:
    "Lists with set-theoretic operations."

    def __init__(self, value):
        self.elements = setify(value)

    def __len__(self):
	return len(self.elements)

    def __getitem__(self, ind):
	return self.elements[ind]

    def __setitem__(self, ind, val):
        if val not in self.elements:
            self.elements[ind] = val

    def __delitem__(self, ind):
	del self.elements[ind]

    def list(self):
        return self.elements

    def append(self, new):
        if new not in self.elements:
            self.elements.append(new)

    def extend(self, new):
	self.elements.extend(new)
        self.elements = setify(self.elements)

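    def count(self, x):
        # Return the result; the list method's value was being dropped.
        return self.elements.count(x)

    def index(self, x):
        return self.elements.index(x)

    def insert(self, i, x):
        if x not in self.elements:
            # list.insert, not list.index, is the mutator wanted here.
            self.elements.insert(i, x)

    def pop(self, i=-1):
        # Default to the last element, as list.pop does, and return it.
        return self.elements.pop(i)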

    def remove(self, x):
	self.elements.remove(x)

    def reverse(self):
	self.elements.reverse()

    def sort(self, cmp=None):
        if cmp is None:
            self.elements.sort()
        else:
            self.elements.sort(cmp)

    def __or__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(union(self.elements, other))

    __add__ = __or__

    def __and__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(intersection(self.elements, other))

    def __sub__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(difference(self.elements, other))

    def __xor__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(symmetric_difference(self.elements, other))

    def __mul__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(cartesian(self.elements, other))

    def __eq__(self, other):
        if type(other) == type(self):
            other = other.elements
        # Compare as sets (order-insensitive), not as ordered lists.
        return equality(self.elements, other)

    def __ne__(self, other):
        if type(other) == type(self):
            other = other.elements
        return not equality(self.elements, other)

    def __lt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(self.elements, other)

    def __le__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(self.elements, other)

    def __gt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(other, self.elements)

    def __ge__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(other, self.elements)

    def __str__(self):
        # Joining avoids mangling the braces when the set is empty.
        return "{" + ", ".join(map(str, self.elements)) + "}"

    def __repr__(self):
        return repr(self.elements)

if __name__ == '__main__':
    a = set([1, 2, 3, 4])
    b = set([1, 4])
    c = set([5, 6])
    d = [1, 1, 2, 1]
    print `d`, "setifies to", set(d)
    print `a`, "|", `b`, "is", `a | b`
    print `a`, "^", `b`, "is", `a ^ b`
    print `a`, "&", `b`, "is", `a & b`
    print `b`, "*", `c`, "is", `b * c`
    print `a`, '<', `b`, "is", `a < b`
    print `a`, '>', `b`, "is", `a > b`
    print `b`, '<', `c`, "is", `b < c`
    print `b`, '>', `c`, "is", `b > c`
    print "Power set of", `c`, "is", powerset(c)

# end

From sdm7g at virginia.edu  Tue Jan 23 18:12:22 2001
From: sdm7g at virginia.edu (Steven D. Majewski)
Date: Tue, 23 Jan 2001 12:12:22 -0500 (EST)
Subject: [Python-Dev] libraries=['m'] in config.py [Re: Python 2.1 alpha 1 released!]
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.NXT.4.21.0101231204010.227-100000@localhost.virginia.edu>


Is there a simple way (other than editing config.py) to remove the
effect of all of the "libraries=['m']" options from config.py ? 

This breaks the MacOSX build as there's no libm -- that functionality
is built into the System.framework .

Shouldn't these types of flags be acquired from configure or the
make environment somehow ? 

-- Steve Majewski 


( BTW: OSX build also needs a "-traditional-cpp" flag to get thru 
  compiling classobject.c without error. ) 







From uche.ogbuji at fourthought.com  Tue Jan 23 18:28:18 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 10:28:18 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: Message from Martin von Loewis <loewis@informatik.hu-berlin.de> 
   of "Mon, 22 Jan 2001 15:46:39 +0100." <200101221446.PAA05164@pandora.informatik.hu-berlin.de> 
Message-ID: <200101231728.KAA03408@localhost.localdomain>

> > This has nothing to do with Python. UTF-8 marks the codes 
> > from 128-191 as illegal prefix. 
> [...]
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> Right on both accounts. If no encoding is specified, and if the
> document appears not to be UTF-16 in any endianness, an XML processor
> shall assume it is UTF-8. As Marc-Andre explains, your document is not
> proper UTF-8, hence the error.
> 
> The confusing thing is that expat itself does not care about it not
> being UTF-8; that is only detected when the callback is invoked in
> pyexpat, and therefore conversion to a Unicode object is attempted.

Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover" 
from well-formedness errors.  And I would classify blithely reporting the 
character data as "recovery".

However, I'm amazed that this wouldn't have come up before, considering the 
pedigree of expat.

I'll poke around, and raise a bug on the expat site if need be.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tismer at tismer.com  Tue Jan 23 18:35:08 2001
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 23 Jan 2001 18:35:08 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <200101231728.KAA03408@localhost.localdomain>
Message-ID: <3A6DC0CC.C4FF83DF@tismer.com>


uche.ogbuji at fourthought.com wrote:
> 
> > > This has nothing to do with Python. UTF-8 marks the codes
> > > from 128-191 as illegal prefix.
> > [...]
> > > Perhaps the parser should catch the UnicodeError and
> > > instead return a not-wellformed exception ?!
> >
> > Right on both accounts. If no encoding is specified, and if the
> > document appears not to be UTF-16 in any endianness, an XML processor
> > shall assume it is UTF-8. As Marc-Andre explains, your document is not
> > proper UTF-8, hence the error.
> >
> > The confusing thing is that expat itself does not care about it not
> > being UTF-8; that is only detected when the callback is invoked in
> > pyexpat, and therefore conversion to a Unicode object is attempted.
> 
> Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover"
> from well-formedness errors.  And I would classify blithely reporting the
> character data as "recovery".
> 
> However, I'm amazed that this wouldn't have come up before, considering the
> pedigree of expat.

Well, I had to write a preprocessor which turns some "xml-like"
but not well-formed stuff into something useable. This was a
bulk of 100 MB of data, partially hand-written, partially
machine-generated, but not really well-formed. Some
special characters appeared very late in the data set, raising
an error in Python 2.0, but not in 1.5.2, so I perceived
it as an error in the parser first, not the data. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From uche.ogbuji at fourthought.com  Tue Jan 23 18:55:12 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 10:55:12 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: Message from Christian Tismer <tismer@tismer.com> 
   of "Mon, 22 Jan 2001 16:05:24 +0100." <3A6C4C34.4D1252C9@tismer.com> 
Message-ID: <200101231755.KAA03471@localhost.localdomain>

> "M.-A. Lemburg" wrote:
> ...
> > > The codes from 192 to 236, 238-243 produce
> > > "UTF-8 decoding error: invalid data",
> > > the rest gives "not well-formed".
> > >
> > > I would like to know if this happens with your (Tim) modified
> > > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> > 
> > This has nothing to do with Python. UTF-8 marks the codes
> > from 128-191 as illegal prefix. See Objects/unicodeobject.c:
> ...
> 
> Schade.
> 
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> I believe it would be better.

Yes, and given there is not much time before the 2.1 release, doing so is an 
acceptable stop-gap.  However, I think the real fix has to lie in expat.
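
As a rough illustration of that stop-gap, here is a minimal sketch for a
current Python 3 interpreter (not the 2.x under discussion).  It is not
pyexpat's code: NotWellFormedError, decode_character_data and
is_rejected are hypothetical names made up for the example.  The idea
is simply that a UTF-8 decoding failure is reported as a
well-formedness error instead of leaking out as a UnicodeError.

```python
class NotWellFormedError(Exception):
    """Hypothetical stand-in for a parser's not-well-formed error."""


def decode_character_data(raw):
    """Decode UTF-8 bytes, reporting bad input as a well-formedness error."""
    try:
        return raw.decode("utf-8")
    except UnicodeError as exc:
        raise NotWellFormedError(
            "not well-formed (invalid UTF-8): %s" % exc) from exc


def is_rejected(raw):
    """Return True if the bytes are refused as not well-formed."""
    try:
        decode_character_data(raw)
    except NotWellFormedError:
        return True
    return False


# Bytes 128-191 are continuation bytes, so none of them may start a
# UTF-8 sequence on its own.
bad_lead_bytes = [b for b in range(128, 192) if is_rejected(bytes([b]))]
```

A caller then has a single exception type to handle for malformed input,
whichever layer noticed the problem.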

I just had a *very* quick and dirty perusal of expat 1.2 and 1.95.1, and not 
only do the UTF-8 validity checks (at the top of xmltok.c) seem wrong, but it 
doesn't look as if they're ever invoked.

I'll try to find some time to look into this more closely, or perhaps someone will 
straighten me out if I'm on the wrong trail.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From fredrik at effbot.org  Tue Jan 23 19:03:42 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 19:03:42 +0100
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <013901c08566$d2a8f360$e46940d5@hagrid>

It's probably just me, but the names of the two unicode
modules tend to irritate me:

> ls u*.pyd
ucnhash.pyd      unicodedata.pyd

(the former contains names, the latter data)

I've been meaning to rename the former, but I just realized
that it might be better to get rid of it completely, and move
its functionality into the unicodedata module.

The result is a single 200k unicodedata module, which
contains the name database as well as two new functions:

    name(character [, default]) => map unicode
    character to name.  if the name doesn't exist,
    return the default object, or raise ValueError.

    lookup(name) => unicode character
    (or raise KeyError if it doesn't exist)
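
For reference, a short sketch of how functions with these names behave
in the unicodedata module of a current Python 3 interpreter.  Whether
the patch described in this message matched that behaviour in every
detail is an assumption; name_or_error and lookup_or_error are helpers
invented for the example.

```python
import unicodedata

# Map a character to its name, and a name back to its character.
letter_name = unicodedata.name("a")
found = unicodedata.lookup("LATIN SMALL LETTER A")

# Control characters have no name, so the default object is returned.
fallback = unicodedata.name("\x00", "<unnamed>")


def name_or_error(ch):
    """Without a default, a nameless character raises ValueError."""
    try:
        return unicodedata.name(ch)
    except ValueError:
        return "ValueError"


def lookup_or_error(name):
    """An unknown name raises KeyError."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return "KeyError"
```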

Should I check it in now, change the names/semantics and check
it in, or post it to sourceforge?

Cheers /F





From uche.ogbuji at fourthought.com  Tue Jan 23 19:00:19 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:00:19 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Mon, 22 Jan 2001 12:41:59 EST." <20010122124159.A14999@thyrsus.com> 
Message-ID: <200101231800.LAA03515@localhost.localdomain>

> \section{\module{set} ---
>          Basic set algebra for Python}

Looks good.  Are you making this available for download?  I could put this to 
experimental use right away (experimental since, IIRC, you are using the new 
rich comparisons).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From uche.ogbuji at fourthought.com  Tue Jan 23 19:16:27 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:16:27 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Mon, 22 Jan 2001 15:13:09 EST." <20010122151309.C15236@thyrsus.com> 
Message-ID: <200101231816.LAA03551@localhost.localdomain>

> Guido van Rossum <guido at digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

Tim mentioned that he had one, and he also claimed that every other dodder had 
a set class, but the only one listed in the vaults is kjBuckets, which I'm not 
sure is maintained any more.  (Is Aaron Watters hereabouts?)

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

IMO, Eric's Set interface is close to perfect.

PEP 218 is interesting, but I'm not sure it's worth slogging through the 
inevitable uproar over an entirely new syntactic construct (the "{}" notation) 
before getting something as useful as a set class into the standard library.


> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:

For what it's worth, I'm +1 on adding this to the standard library.  I've seen 
so many set hacks with dictionaries (memory ouch) and list hacks (speed ouch) 
in Python code out there, that I'm convinced it would meet much more common 
usage than, say zlib, xdr, or even expat.

On this hacker list everyone's aunt might whip up set extensions on boring 
weekends, but I doubt this describes the overall Python populace.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From uche.ogbuji at fourthought.com  Tue Jan 23 19:29:36 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:29:36 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "M.-A. Lemburg" <mal@lemburg.com> 
   of "Tue, 23 Jan 2001 11:26:16 +0100." <3A6D5C48.A076DA0@lemburg.com> 
Message-ID: <200101231829.LAA03575@localhost.localdomain>

> All very well, but are sets really that essential to every
> day Python programming ?

Not everyday, but as I said, the standard library has zlib, expat, tkinter, 
colorsys, and a whole lot of other stuff that is undoubtedly less useful than 
a set class.

> If we include sets then we ought to
> also include graphs, tries, btrees

I see all of these as far less commonly useful than sets (at least in 
situations where implementations using existing data structures won't suffice).

I run into needs for sets all the time.  I don't have as much trouble with 
your other examples, though I've always considered tries as a possible 
performance boost in XPath.  Oddly enough another data structure I often wish 
I had is a splay tree, and I hope to wrap my old C++ splay tree implementation 
for Python one of these days.

> and all those other goodies
> we have in computer science. All of these types are available
> out there, but I believe the audience who really cares for these
> types is also capable of downloading the extensions and installing
> them.
> 
> It would be nice if all of these extension could go into a SUMO
> edition of Python though... together with your set module.

Considering "batteries included", it's worth considering these very important 
"batteries".


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From skip at mojam.com  Tue Jan 23 19:35:04 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:35:04 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14957.52952.48739.53360@beluga.mojam.com>

    Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
    Guido>   locals now don't have to start with underscore.

Thanks.  I have just been incredibly short on time lately.

    Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
    Guido>   are more like this?)

Alpha testing should pick those up, yes? ;-)

    Guido> ! try:
    Guido> !     import bsddb
    Guido> ! except ImportError:
    Guido> !     if verbose:
    Guido> !         print "can't import bsddb, so skipping dbhash"
    Guido> ! else:
    Guido> !     check_all("dbhash")

Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
the module that's imported here?
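
One way to read that suggestion, sketched for a current Python 3
interpreter: try to import the module under test itself and skip it if
that fails, so the test needs no knowledge of its dependencies.
check_if_importable is a hypothetical helper, and its check argument
stands in for the test's check_all function; neither is taken from
test___all__.py.

```python
import importlib


def check_if_importable(modname, check, verbose=False):
    """Run check(modname) only if the module itself can be imported."""
    try:
        importlib.import_module(modname)
    except ImportError:
        if verbose:
            print("can't import %s, so skipping it" % modname)
        return False
    check(modname)
    return True


checked = []
ran_json = check_if_importable("json", checked.append)
ran_missing = check_if_importable("no_such_module_for_this_sketch",
                                  checked.append)
```

This only works if importing the module fails cleanly with ImportError
when its dependency is absent, which is the assumption behind the
question.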

Skip



From uche.ogbuji at fourthought.com  Tue Jan 23 19:36:59 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:36:59 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Tue, 23 Jan 2001 11:30:50 EST." <20010123113050.A26162@thyrsus.com> 
Message-ID: <200101231836.LAA03655@localhost.localdomain>

> """
> A set-algebra module for Python.
> 
> The functions work on any sequence type and return lists.
> The set methods can take a set or any sequence type as an argument.
> They are insensitive to the types of the elements.
> 
> Lists are used rather than dictionaries so the elements can be mutable.
> 
> """

Hmm.  I was hoping this was actually a C extension for the performance boost, 
esp. given the number of __foo__ methods in the set class.

Implementation in Python makes my interest in adding it to the standard lib 
more tepid (not to cast the least bit of aspersion on your work).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From skip at mojam.com  Tue Jan 23 19:37:44 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:37:44 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
	<200101142055.PAA13041@cj20424-a.reston1.va.home.com>
	<3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <14957.53112.119272.797494@beluga.mojam.com>

    Paul> I apologize but I'm not clear on my responsibilities here, if
    Paul> any. I wrote a PEP for online help. I submitted a partial
    Paul> implementation. 

Perhaps I am the one who should apologize.  I started the thread.  I tried
Ping's code and was simply amazed at how useful it was.  I didn't bother
checking the list of PEPs to see if it overlapped with something there, and
I suspect any discussion of this stuff has taken place in the doc sig, where
I don't hang out.

Skip



From esr at thyrsus.com  Tue Jan 23 19:39:04 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:39:04 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231816.LAA03551@localhost.localdomain>; from uche.ogbuji@fourthought.com on Tue, Jan 23, 2001 at 11:16:27AM -0700
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain>
Message-ID: <20010123133904.B26487@thyrsus.com>

uche.ogbuji at fourthought.com <uche.ogbuji at fourthought.com>:
> I've seen so many set hacks with dictionaries (memory ouch) and list
> hacks (speed ouch) in Python code out there, that I'm convinced it
> would meet much more common usage than, say zlib, xdr, or even
> expat.

Uche brings up a point I meant to make in my reply to Guido.  The dict-
vs.-list choice in set representation is indeed a choice between 
memory ouch and speed ouch.  

I believe most uses of sets are small sets.  That reduces the speed ouch
of using a list representation and increases the proportional memory
ouch of a dictionary implementation.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Question with boldness even the existence of a God; because, if there
be one, he must more approve the homage of reason, than that of
blindfolded fear.... Do not be frightened from this inquiry from any
fear of its consequences. If it ends in the belief that there is no
God, you will find incitements to virtue in the comfort and
pleasantness you feel in its exercise...
	-- Thomas Jefferson, in a 1787 letter to his nephew



From jeremy at alum.mit.edu  Tue Jan 23 19:41:23 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 13:41:23 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
Message-ID: <14957.53331.342827.462297@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Guido van Rossum <guido at digicool.com>:
  >> Having just skimmed your docs, I'm disappointed that you choose
  >> lists as your fundamental representation type -- this makes it
  >> slow to test for membership and hence makes intersection and
  >> union slow.

  ESR> Not quite.  Membership test is still linear-time; so is adding
  ESR> and deleting elements.  It's true that union and intersection
  ESR> are quadratic, but see below.

  >> I suppose that you have evidence from using this that those
  >> operations aren't used much, or not for large sets?

  ESR> Exactly!  In my experience the usage pattern of a class like
  ESR> this runs heavily to small sets (usually < 64 elements);
  ESR> membership tests dominate usage, with addition and deletion of
  ESR> elements running second and the "classical" boolean operations
  ESR> like union and intersection being uncommon.

I use a Set type in the compiler package (Tools/compiler/compiler) to
collect the names for a code block.  I implemented a trivial Set type
using a dictionary, because it supported the operations I was most
interested in: addition, membership tests, intersection, and get
elements as sequence (in arbitrary order).  Those are the only
operations the compiler uses.

I think I use sets for this purpose frequently, although I can't think
of any other good examples at the moment.  I usually just use a
dictionary explicitly.  In the compiler, I chose an explicit Set class
with unique method names (add, has_elt, elements) to make it obvious
for readers that I was using a set.
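
A minimal sketch of what such a dictionary-backed class might look
like, written for a current Python 3 interpreter.  Only the method
names add, has_elt and elements come from the description above; the
intersection method, the _items attribute and the bodies are
assumptions, not the compiler package's actual code.

```python
class Set:
    """Dictionary-backed set with deliberately set-specific method names."""

    def __init__(self):
        self._items = {}          # element -> element; keys must be hashable

    def add(self, elt):
        self._items[elt] = elt    # re-adding an element is a no-op

    def has_elt(self, elt):
        return elt in self._items

    def elements(self):
        """Return the members as a list, in no promised order."""
        return list(self._items.keys())

    def intersection(self, other):
        result = Set()
        for elt in self._items:
            if other.has_elt(elt):
                result.add(elt)
        return result


# Usage: names assigned in a block versus names declared global.
assigned = Set()
for name in ("x", "y", "x", "z"):
    assigned.add(name)

declared_global = Set()
declared_global.add("y")
declared_global.add("w")

both = assigned.intersection(declared_global)
```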

  ESR> What you get by going with a dictionary representation is that
  ESR> membership test becomes close to constant-time, while insertion
  ESR> and deletion become sometimes cheap and sometimes quite
  ESR> expensive (depending of course on whether you have to allocate
  ESR> a new hash bucket).  Given the usage pattern I described, the
  ESR> overall difference in performance is marginal.

The cost of insertion would presumably be dominated by the frequency
of dictionary resizes.  I don't know how often they occur, but I
assume the dictionary type is designed to accommodate efficient
insert.

I did a quick and dirty performance comparison of dictionary-based and
list-based sets.  (I'll include the code below.)  It uses sample data
collected from running the compiler; so it is measuring actual usage.

The tests showed that dictionary-based sets were always faster.  For
small tests (3 operations), the difference was about 10 percent.  For
larger tests (88 operations), the difference ranged from 180 to almost
700 percent.
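
The attachment with the actual benchmark did not survive in the
archive, so the following is only a sketch of how such a comparison
could be set up in a current Python 3 interpreter.  The function names,
the synthetic "nameN" data and the repeat count are all assumptions; it
does not use the compiler's sample data and makes no claim to reproduce
the percentages above.

```python
import timeit


def run_list_based(names, probes):
    """Collect names in a list, then count how many probes are members."""
    seen = []
    for n in names:
        if n not in seen:        # linear scan on every insert
            seen.append(n)
    return sum(1 for p in probes if p in seen)


def run_dict_based(names, probes):
    """Same workload with a dictionary as the set representation."""
    seen = {}
    for n in names:
        seen[n] = None           # hashed insert; duplicates collapse
    return sum(1 for p in probes if p in seen)


def compare(size, repeat=200):
    """Return (list_seconds, dict_seconds) for a workload of the given size."""
    names = ["name%d" % (i % size) for i in range(size * 2)]
    probes = ["name%d" % i for i in range(0, size * 2, 2)]
    if run_list_based(names, probes) != run_dict_based(names, probes):
        raise ValueError("the two representations disagree")
    t_list = timeit.timeit(lambda: run_list_based(names, probes),
                           number=repeat)
    t_dict = timeit.timeit(lambda: run_dict_based(names, probes),
                           number=repeat)
    return t_list, t_dict
```

Calling compare(3) and compare(88) gives timings for a small and a
larger workload; the absolute numbers depend on the machine and the
interpreter.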

  >> This is one of the problems with coming up with a set type for
  >> the core: it has to work for (nearly) everybody.

  ESR> As I pointed out above (and someone else on the list had made
  ESR> the same point earlier), "works for everybody" isn't really
  ESR> possible here.  So my solution does the next best thing -- pick
  ESR> a choice of tradeoffs that isn't obviously worse than the
  ESR> alternatives and keeps things bog-simple.

For my applications, the dictionary-based approach is faster and
offers a natural interface.  If a set implementation were included in
the standard library, I would like to see either (1) the
implementation that favors my needs <wink> or (2) multiple
implementations tuned for different uses.  I think it would be just as
easy to make set implementations available separately, though.

Jeremy

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sets.tar
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010123/99167672/attachment.txt>

From loewis at informatik.hu-berlin.de  Tue Jan 23 19:51:37 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 19:51:37 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: <200101231755.KAA03471@localhost.localdomain>
	(uche.ogbuji@fourthought.com)
References: <200101231755.KAA03471@localhost.localdomain>
Message-ID: <200101231851.TAA19488@pandora.informatik.hu-berlin.de>

> I'll try to find some time to look into this more closely, or perhaps
> someone will straighten me out if I'm on the wrong trail.

Spending only a little time myself, either, I'd agree with your
conclusions.

Regards,
Martin



From esr at thyrsus.com  Tue Jan 23 19:55:30 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:55:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>; from jeremy@alum.mit.edu on Tue, Jan 23, 2001 at 01:41:23PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com> <20010123113050.A26162@thyrsus.com> <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <20010123135530.A26565@thyrsus.com>

Jeremy Hylton <jeremy at alum.mit.edu>:
> The tests showed that dictionary-based sets were always faster.  For
> small tests (3 operations), the difference was about 10 percent.  For
> larger tests (88 operations), the difference ranged from 180 to almost
> 700 percent.

Not surprising.  88 elements is getting pretty large.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Hoplophobia (n.): The irrational fear of weapons, correctly described by 
Freud as "a sign of emotional and sexual immaturity".  Hoplophobia, like
homophobia, is a displacement symptom; hoplophobes fear their own
"forbidden" feelings and urges to commit violence.  This would be
harmless, except that they project these feelings onto others.  The
sequelae of this neurosis include irrational and dangerous behaviors
such as passing "gun-control" laws and trashing the Constitution.



From petrilli at amber.org  Tue Jan 23 20:06:05 2001
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 14:06:05 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123133904.B26487@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 01:39:04PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com>
Message-ID: <20010123140604.E18796@trump.amber.org>

Eric S. Raymond [esr at thyrsus.com] wrote:
> I believe most uses of sets are small sets.  That reduces the speed ouch
> of using a list representation and increases the proportional memory
> ouch of a dictionary implementation.

The problem is that there are a lot of uses for large sets, especially
when you begin to introduce intersections and unions.  If an
implementation is only useful for a few dozen (or a hundred) items in
the set, that rules out a lot of places where set types are really
useful---optimizing large-scale manipulations.

Zope, for example, manipulates sets with 10,000 items in them on a
regular basis when doing text index manipulation.  The data structures
are heavily optimized for this kind of behaviour, without a major
sacrifice in space.  I think Jim perhaps can speak to this.

Unfortunately, for me, a Python implementation of Sets is only
interesting academically.  Any time I've needed to work with them at a
large scale, I've needed them *much* faster than Python could achieve
without a C extension.

Perhaps the difference is in problem domain.  In the "scripting"
problem domain, I would agree that Sets would rarely reach large sizes,
and so an algorithm which performed in quadratic time might be fine,
because the actual resultant time is small.  However, in more
full-blown applications, this would be counterproductive, and the
user would be forced to implement their own (or use Aaron's excellent
kjBuckets).

Just my opinion, of course.
Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From ping at lfw.org  Tue Jan 23 20:27:38 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:27:38 -0800 (PST)
Subject: [Python-Dev] Sets: elt in dict, lst.include
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>

On Tue, 23 Jan 2001, Jeremy Hylton wrote:
> For my applications, the dictionary-based approach is faster and
> offers a natural interface.

The only change that needs to be made to support sets of immutable
elements is to provide "in" on dictionaries.  The rest is then all
quite natural:

    dict[key] = 1
    if key in dict: ...
    for key in dict: ...

(Then we can also get rid of the ugly has_key method.)

For those that need mutable set elements badly enough to sacrifice
a little speed, we can add two methods to lists:

    lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
    lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

(These are generally useful methods to have anyway.)
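
A minimal sketch of the two proposed list operations, written in
present-day Python.  The names include and exclude come from the
proposal; lists have no such methods, so they appear here as
hypothetical free functions rather than as anything in the core.

```python
def include(lst, elt):
    """Append elt only if it is not already present (the proposed lst.include)."""
    if elt not in lst:
        lst.append(elt)


def exclude(lst, elt):
    """Remove every occurrence of elt (the proposed lst.exclude)."""
    while elt in lst:
        lst.remove(elt)


# Lists are unhashable, so these elements could not be dictionary keys.
members = [[1, 2], [3]]
include(members, [3])       # already present: nothing happens
include(members, [4, 5])    # new element: appended
exclude(members, [1, 2])    # every matching element is removed
print(members)
```

Both helpers scan the list, so each call costs time proportional to
its length; that is the "little speed" being traded for mutable
elements.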


This proposal has the following advantages:

    1. You still get to choose which implementation best suits your needs.

    2. No new types are introduced; lists and dicts are well understood.

    3. Both features are extremely simple to understand and explain.

    4. Both features are useful in their own right, and could stand as
       independent proposals to improve lists and dicts respectively.
       (For instance, i spotted about 10 places in the std library where
       the 'include' method could be used, and i know i would use it
       myself -- certainly more often than pop or reverse!)

    5. In all cases this is faster than a new Python class.  (For instance,
       Jeremy's implementation even contained a commented-out optimization
       that stored self.elts.has_key as self.has_elt to speed things up a
       bit.  Using straight dicts would see this optimization and raise it
       one, with no effort at all.)

    6. Either feature can be independently approved or rejected without
       affecting the other.


-- ?!ng




From loewis at informatik.hu-berlin.de  Tue Jan 23 20:33:00 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 20:33:00 +0100 (MET)
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>

> Should I check it in now, change the names/semantics and check it
> in, or post it to sourceforge?

Is that two or three options? If three, what change in semantics did
you propose?

Anyway, I feel it could go in right now; the only breakage would be to
applications that use ucnhash.ucnhashAPI, right?

Regards,
Martin



From fredrik at effbot.org  Tue Jan 23 20:49:09 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 20:49:09 +0100
Subject: [Python-Dev] Re:  getting rid of ucnhash
References: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>
Message-ID: <01e801c08575$8f71c680$e46940d5@hagrid>

martin wrote:

> > Should I check it in now, change the names/semantics and check it
> > in, or post it to sourceforge?
> 
> Is that two or three options?

three, I think.

> If three, what change in semantics did you propose?

none -- but maybe someone else has a better name for "lookup"?

(the "name" function behaves like the existing property methods
in 2.0's unicodedata)

> Anyway, I feel it could go in right now; the only breakage would be to
> applications that use ucnhash.ucnhashAPI, right?

yup -- and those applications are already broken, since the CObject
was renamed in 2.1a1.

(well, any code using 2.1a1's new ucnhash.getcode/getname functions
will of course also break.  but I think we can live with that ;-)

Cheers /F




From ping at lfw.org  Tue Jan 23 20:43:50 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:43:50 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101231135570.1568-100000@skuld.kingmanhall.org>

Christopher Petrilli wrote:
> The problem is that there are a lot of uses for large sets, especially 
> when you begin to introduce intersections and unions.
[...]
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

On Tue, 23 Jan 2001, Ka-Ping Yee wrote:
> This proposal has the following advantages:
[six nice things about 'in dict' and 'lst.include']

I forgot to mention an important seventh advantage:

    7. The list and dictionary data structures are implemented
       in the C core, so we leave open the possibility of a
       wizard going and optimizing the snot out of them later.

Just as there's e.g. a boundary on recursion levels before Python
invokes the cycle detection algorithm during comparison, if we
decide we need more speed for big sets, Python could notice when
a list or dictionary gets very big and invoke more powerful
optimizations.  We don't have to do this now, but the important
thing is that we will always have the option to make Christopher's
dream come true.  (A wizard can do this once, and every Python
script on the planet benefits.)

In general i support Python deciding on the Right Thing to do
under the hood, performance-wise, so that the programmer doesn't
have to think too hard about what data structure to choose.


-- ?!ng




From nas at arctrix.com  Tue Jan 23 14:08:07 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 05:08:07 -0800
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>; from petrilli@amber.org on Tue, Jan 23, 2001 at 02:06:05PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org>
Message-ID: <20010123050807.A29115@glacier.fnational.com>

On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

I think this argues that if sets are added to the core they
should be implemented as an extension type with the speed of
dictionaries and the memory usage of lists.  Basically, we would
use the implementation of PyDict but drop the values.

  Neil



From jeremy at alum.mit.edu  Tue Jan 23 20:48:18 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:48:18 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
	<14957.53331.342827.462297@localhost.localdomain>
Message-ID: <14957.57346.248852.656387@localhost.localdomain>

Sorry about the garbled attachment on the previous message; I think I
got the content-type wrong.  Here's a second try.

Jeremy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sets.tar
Type: application/octet-stream
Size: 20480 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010123/9dda92b6/attachment.obj>

From petrilli at amber.org  Tue Jan 23 21:06:16 2001
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 15:06:16 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>; from nas@arctrix.com on Tue, Jan 23, 2001 at 05:08:07AM -0800
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com>
Message-ID: <20010123150616.F18796@trump.amber.org>

Neil Schemenauer [nas at arctrix.com] wrote:
> On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > Unfortunately, for me, a Python implementation of Sets is only
> > interesting academically.  Any time I've needed to work with them at a
> > large scale, I've needed them *much* faster than Python could achieve
> > without a C extension.
> 
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> > dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

This is effectively the implementation that Zope has for Sets.  In
addition we have "buckets" that have scores on them (which are
implemented as a modified BTree).  

Unfortunately Jim Fulton (who wrote all the code for that level) is in 
a meeting, but I hope he'll comment on the implementation that was
chosen for our software.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From jeremy at alum.mit.edu  Tue Jan 23 20:56:05 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:56:05 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123135530.A26565@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
	<14957.53331.342827.462297@localhost.localdomain>
	<20010123135530.A26565@thyrsus.com>
Message-ID: <14957.57813.23072.723418@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy at alum.mit.edu>:
  >> The tests showed that dictionary-based sets were always faster.
  >> For small tests (3 operations), the difference was about 10
  >> percent.  For larger tests (88 operations), the difference ranged
  >> from 180 to almost 700 percent.

  ESR> Not surprising.  88 elements is getting pretty large.

Large for what?  I've got directories with that many files and modules
with that many names defined at the top-level :-).  I'm just reporting
the range of set sizes I've encountered for a real application.  In
general, I expect a few hundred elements should be handled without
trouble by most Python containers.

Jeremy



From gvwilson at nevex.com  Tue Jan 23 21:26:22 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Tue, 23 Jan 2001 15:26:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123200601.87817EF68@mail.python.org>
Message-ID: <001101c0857a$c0dce420$770a0a0a@nevex.com>

Greg Wilson:
Meta-question: do people want to continue to discuss sets on the
general python-dev list, or take it out-of-line (e.g. to an egroups
list)?  I'm finding all of the discussion very useful, but I realize
that many readers might prefer to concentrate on the 2.1 release...

> Jeremy Hylton <jeremy at alum.mit.edu>:
> > The tests showed that dictionary-based sets were always faster.  For
> > small tests (3 operations), the difference was about 10 percent.  For
> > larger tests (88 operations), the difference ranged from
> > 180 to almost 700 percent.

> Eric Raymond <esr at thyrsus.com>:
> Not surprising.  88 elements is getting pretty large.

Greg Wilson:
Really?  I was testing my implementation with sets of email addresses
grep'd out of old mail folders --- typical sizes were several thousand
elements.

> From: Christopher Petrilli <petrilli at amber.org>
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

Greg Wilson:
I had been expecting to implement this in C, not in pure Python, for
performance.

> From: Christopher Petrilli <petrilli at amber.org>
> In the "scripting" problem domain, I would agree that Sets would
> rarely reach large sizes,
> and so an algorithm which performed in quadratic time might be fine,

Greg Wilson:
I strongly disagree (see the email address example above --- it was
the first thing that occurred to me to try).  I am still hoping to
find a sub-quadratic (preferably sub-linear) implementation.  I can
do it in C++ with observer/observable (contained items notify containers
of changes in value, sets store all equivalent items in the same bucket),
but that doesn't really help...

> From: Ka-Ping Yee <ping at lfw.org>
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries...

and:

> From: Neil Schemenauer <nas at arctrix.com>
> ...if sets are added to the core...we would
> use the implementation of PyDict but drop the values.

Unfortunately, if values are required to be immutable, then sets of
sets aren't possible... :-(

Thanks, everyone,
Greg




From esr at thyrsus.com  Tue Jan 23 21:38:39 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 15:38:39 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Tue, Jan 23, 2001 at 11:27:38AM -0800
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010123153839.B26676@thyrsus.com>

Ka-Ping Yee <ping at lfw.org>:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.  The rest is then all
> quite natural:
> 
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

Independently of implementation issues about sets, I think this is a
damn fine idea. +1.

> (Then we can also get rid of the ugly has_key method.)
> 
> For those that need mutable set elements badly enough to sacrifice
> a little speed, we can add two methods to lists:
> 
>     lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
>     lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

+1 on the concept, -0 on the names.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

[The disarming of citizens] has a double effect, it palsies the hand
and brutalizes the mind: a habitual disuse of physical forces totally
destroys the moral [force]; and men lose at once the power of
protecting themselves, and of discerning the cause of their
oppression.
        -- Joel Barlow, "Advice to the Privileged Orders", 1792-93



From tim.one at home.com  Tue Jan 23 23:02:41 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 17:02:41 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELFIKAA.tim.one@home.com>

>> 	operator.isMappingType()
>> 	+ some other C style _Check() APIs

[Guido]
> Yes, these should probably be deprecated.  I certainly have never
> used them!  (The operator module doesn't seem to get much use in
> general...

It's used heavily by test_operator.py <wink>.  Outside of that, it's used
maybe three times in the std distribution, nowhere essential; the

    return map(operator.__div__, rgbtuple, _maxtuple)

in Pynche's ColorDB.py is typical.  2.0's

    return [x / 256. for x in rgbtuple]

does the same thing more clearly (_maxtuple is a module constant).

It appeals to functional-language fans and extreme micro-optimizers, so they
don't have to type "lambda" in the simplest cases.  At least
operator.truth(x) is *clearer* than "not not x".
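
A small sketch of the comparison above, in present-day Python.
operator.__div__ no longer exists there, so operator.truediv stands in
for it, and the value given to _maxtuple is an assumption made for
illustration rather than a quote from ColorDB.py.

```python
import operator

rgbtuple = (255, 128, 0)
_maxtuple = (256.0,) * 3    # assumed value of the module constant

# Functional style: one operator function applied pairwise by map().
via_operator = list(map(operator.truediv, rgbtuple, _maxtuple))

# List-comprehension style: the divisor is written out in place.
via_listcomp = [x / 256.0 for x in rgbtuple]

print(via_operator == via_listcomp)

# operator.truth(x) names the conversion that "not not x" only implies.
print(operator.truth([]), not not [])
```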

> Was it a bad idea?)

Mixed, but I'd say more bad than good overall.




From thomas at xs4all.net  Wed Jan 24 00:38:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 00:38:14 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010123153839.B26676@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 03:38:39PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>
Message-ID: <20010124003814.F27785@xs4all.nl>

On Tue, Jan 23, 2001 at 03:38:39PM -0500, Eric S. Raymond wrote:

> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:

> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> Independently of implementation issues about sets, I think this is a
> damn fine idea. +1.

It's come up before. The problem with it is that it's not quite obvious
whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
example it's obvious what you *expect*, but I suspect that 'for x in dict'
will result in a 40/60 split in expectations, and like American voters, the
20% middle section will change their vote each recount :-)

Now, if only there was a terribly obvious way to spell it... so that it's
immediately obvious which of the two you wanted.... something like, oh, I
donno, this, maybe:

  if key in dict.keys: ...
  if value in dict.values: ...

Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Wed Jan 24 01:13:20 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 01:13:20 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>
Message-ID: <02f401c0859a$765d07c0$e46940d5@hagrid>

> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

you forgot "if (key, value) in dict"

on the other hand, it's not quite obvious that "list.sort"
doesn't return the sorted list, "print >>None" prints to
standard output, "except KeyError, ValueError" doesn't
catch a ValueError exception, etc, etc, etc.

(nor that it's "has_key" and "hasattr", and not "has_key"
and "has_attr" or "haskey" and "hasattr" ;-)

let's just say that "in" is the same thing as "has_key",
and be done with it.
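
For comparison, a short sketch of how the three readings can be
spelled in present-day Python, where "in" on a dictionary tests keys
and the other two readings go through explicit views.  The sample
dictionary is made up for illustration.

```python
d = {"spam": 1, "eggs": 2}

key_hit = "spam" in d                  # same question as has_key("spam")
value_hit = 1 in d.values()            # values must be asked for explicitly
item_hit = ("spam", 1) in d.items()    # so must (key, value) pairs

keys_seen = sorted(k for k in d)       # iterating a dict yields its keys

print(key_hit, value_hit, item_hit, keys_seen)
```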

Cheers /F




From tim.one at home.com  Wed Jan 24 02:51:22 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 20:51:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>

[Christopher Petrilli]
> ....
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

How do you know that?  I've used large sets in Python happily without
resorting to C or kjbuckets (which is really aiming at fast operations on
*graphs*, in which area it has no equal).

Everyone (except Eric <wink>) uses dicts to implement sets in Python, and
"most" set operations can work at full C speed then; e.g., assuming both
sets have N elements:

    membership testing
        O(1) -- it's just dict.has_key()
    element insertion
        O(1) -- dict[element] = 1
    element removal
        O(1) -- del dict[element]
    union
        O(N), but at full C speed -- dict1.update(dict2)
    intersection
        O(N), but at Python speed (the only 2.1 dog in the bunch!)
    choose some element and remove it
        took O(N) time and additional space in 2.0, but
        is O(1) in both since dict.popitem() was introduced
    iteration
        O(N), with O(N) additional space using dict.keys(),
        or O(1) additional space using dict.popitem() repeatedly
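
The operations in this list can be sketched as follows, in present-day
Python (so "in" replaces has_key()).  make_set is a hypothetical helper
introduced only for this example.

```python
def make_set(elements):
    """Build a dict-based set: each element is a key, the value 1 is a dummy."""
    return dict.fromkeys(elements, 1)


a = make_set("abcd")
b = make_set("cdef")

has_c = "c" in a        # membership testing, O(1)
a["z"] = 1              # element insertion, O(1)
del a["z"]              # element removal, O(1)

union = dict(a)         # copy one operand, then merge the other at C speed
union.update(b)

# Intersection needs a Python-level loop over one of the operands.
intersection = {elt: 1 for elt in a if elt in b}

# Choose some element and remove it: popitem() removes one
# (key, value) pair from the dict and returns it.
scratch = dict(a)
chosen, _ = scratch.popitem()

print(sorted(union), sorted(intersection), chosen)
```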

What are you going to do in C that's faster than using a Python dict for
this purpose?  Most key set operations are straightforward Python dict
1-liners then, and Python dicts are very fast.  kjbuckets sets were slower
last time I timed them (several years ago, but Python dicts have gotten
faster since then while kjbuckets has been stagnant).

There's a long tradition in the Lisp world of using unordered lists to
represent sets (when the only tool you have is a hammer ... <0.5 wink>), but
it's been easy to do much better than that in Python almost since the start.
Even in the Python list world, enormous improvements for large sets can be
gotten by maintaining lists in sorted order (then most O(N) operations drop
to O(log2(N)), and O(N**2) to O(N)).  Curiously, though, in 2.1 we can still
use a dict-set for complex numbers, but no longer a sorted-list-set!
Requiring a total ordering can get in the way more than requiring
hashability (and vice versa -- that's a tough one).
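
A rough sketch of the sorted-list representation, using the standard
bisect module.  The function names are hypothetical.  Membership drops
to O(log N) comparisons and intersection to a linear merge; insertion
still shifts list elements, so only its search step is logarithmic.

```python
import bisect


def contains(sorted_lst, elt):
    """Binary-search membership test on a sorted list."""
    i = bisect.bisect_left(sorted_lst, elt)
    return i < len(sorted_lst) and sorted_lst[i] == elt


def add(sorted_lst, elt):
    """Insert elt at its sorted position unless it is already present."""
    i = bisect.bisect_left(sorted_lst, elt)
    if i == len(sorted_lst) or sorted_lst[i] != elt:
        sorted_lst.insert(i, elt)


def intersection(a, b):
    """Merge two sorted lists in O(len(a) + len(b)) steps."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif b[j] < a[i]:
            j += 1
        else:
            result.append(a[i])
            i += 1
            j += 1
    return result


s = [1, 3, 5]
add(s, 4)
add(s, 3)       # duplicate: ignored

# Complex numbers hash but do not order, so only the dict form works.
complex_set = {1 + 2j: 1, 3j: 1}
try:
    sorted([1 + 2j, 3j])
    orderable = True
except TypeError:
    orderable = False
```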

measurement-is-the-measure-of-all-measurable-things-ly y'rs  - tim




From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:45:01 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:45:01 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <200101240245.PAA02098@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas at xs4all.net>:

> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted...

Well, in the case of

  for key in d:

or

  for value in d:

it's immediately obvious to a *human* reader what is meant,
so all we need to do is make the compiler a bit smarter. This
can easily be done by the use of a small table, containing
the equivalents of the words 'key' and 'value' in all known
natural languages, against which the target variable name is
matched using some suitable fuzzy matching algorithm.
Soundex could be used for this, if we can decide on which
version to use...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 24 03:46:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:46:37 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: Your message of "Tue, 23 Jan 2001 19:03:42 +0100."
             <013901c08566$d2a8f360$e46940d5@hagrid> 
References: <013901c08566$d2a8f360$e46940d5@hagrid> 
Message-ID: <200101240246.VAA06336@cj20424-a.reston1.va.home.com>

> It's probably just me, but the names of the two unicode
> modules tend to irritate me:
> 
> > ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
> 
> (the former contains names, the latter data)
> 
> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
> 
> The result is a single 200k unicodedata module, which con-
> tains the name database as well as two new functions:
> 
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
> 
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
> 
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

To me, both of these are irrelevant details of the Unicode
implementation. :-)   IOW, feel free to check it in.
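
For reference, a short sketch of the two functions as they can be
called in present-day Python's unicodedata module.  The sample
characters and the "<unnamed>" default are arbitrary choices for the
example.

```python
import unicodedata

# name(): character -> name, with an optional default for unnamed characters.
letter_name = unicodedata.name("A")
fallback = unicodedata.name("\x00", "<unnamed>")   # control characters have no name

# lookup(): name -> character.
letter = unicodedata.lookup("LATIN SMALL LETTER A")

# Without a default, name() raises ValueError; lookup() raises KeyError.
try:
    unicodedata.name("\x00")
    name_error = None
except ValueError as exc:
    name_error = type(exc).__name__

try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
    lookup_error = None
except KeyError as exc:
    lookup_error = type(exc).__name__

print(letter_name, fallback, letter, name_error, lookup_error)
```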

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:49:21 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:49:21 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>
Message-ID: <200101240249.PAA02101@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> Requiring a total ordering can get in the way more than requiring
> hashability

Often it's useful to have *some* total ordering, and
you don't really care what it is as long as it's consistent.

Maybe all types should be required to support cmp(x,y) even 
if doing x < y via the rich comparison route raises a
NotOrderable exception.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:52:43 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:52:43 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>

Neil Schemenauer <nas at arctrix.com>:

> Basically, we would
> use the implementation of PyDict but drop the values.

This could be incorporated into PyDict. Instead of storing keys and
values in the same array, keep them in separate arrays and only
allocate the values array the first time someone stores a value other
than 1.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 24 03:58:59 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 01:13:20 +0100."
             <02f401c0859a$765d07c0$e46940d5@hagrid> 
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>  
            <02f401c0859a$765d07c0$e46940d5@hagrid> 
Message-ID: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>

> let's just say that "in" is the same thing as "has_key",
> and be done with it.

You know, I've long resisted this, but I agree now -- this is the
right thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:11:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:11:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Tue, 23 Jan 2001 12:35:04 CST."
             <14957.52952.48739.53360@beluga.mojam.com> 
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>  
            <14957.52952.48739.53360@beluga.mojam.com> 
Message-ID: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>

>     Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
>     Guido>   locals now don't have to start with underscore.
> 
> Thanks.  I have just been incredibly short on time lately.

You're welcome.

>     Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
>     Guido>   are more like this?)
> 
> Alpha testing should pick those up, yes? ;-)

Yes. :-)

>     Guido> ! try:
>     Guido> !     import bsddb
>     Guido> ! except ImportError:
>     Guido> !     if verbose:
>     Guido> !         print "can't import bsddb, so skipping dbhash"
>     Guido> ! else:
>     Guido> !     check_all("dbhash")
> 
> Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
> the module that's imported here?

I think I saw a complaint about this that specifically said that when
dbhash is imported when bsddb can't be imported, an incomplete dbhash
is left behind in sys.modules, and then a second import of dbhash will
succeed -- but of course it will define no objects.  Since dbhash may
be imported elsewhere, testing for bsddb is safer.
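
The guard pattern described here can be sketched generically, in
present-day Python.  The helper names and the modules used in the
demonstration are hypothetical stand-ins, not the code from
test___all__.py.

```python
import importlib


def can_import(name):
    """Return True if the named module imports cleanly, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def check_if_available(prerequisite, module_name, check):
    """Run check(module_name) only when its prerequisite is importable."""
    if not can_import(prerequisite):
        print("can't import %s, so skipping %s" % (prerequisite, module_name))
        return False
    check(module_name)
    return True


checked = []
ran = check_if_available("json", "json.decoder", checked.append)
skipped = check_if_available("no_such_module_for_this_sketch", "whatever",
                             checked.append)
```

Testing the prerequisite rather than the dependent module means the
dependent module is never imported in a state where it might be left
half-initialized.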

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:22:14 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:22:14 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
In-Reply-To: Your message of "Tue, 23 Jan 2001 08:24:38 PST."
             <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net> 
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101240322.WAA06671@cj20424-a.reston1.va.home.com>

> A few miscellaneous helpers.
> 
> PyObject_Dump(): New function that is useful when debugging Python's C
> runtime.  In something like gdb it can be a pain to get some useful
> information out of PyObject*'s.  This function prints the str() of the
> object to stderr, along with the object's refcount and hex address.
> 
> PyGC_Dump(): Similar to PyObject_Dump() but knows how to cast from the
> garbage collector prefix back to the PyObject* structure.
> 
> [See Misc/gdbinit for some useful gdb hooks]
> 
> none_dealloc(): Rather than SEGV if we accidentally decref None out of
> existence, we assign None's and NotImplemented's destructor slot to
> this function, which just calls abort().

Barry, since these are only gdb helpers, would it perhaps be better if
their names started with "_Py" to indicate that they aren't part of
the regular API?  They violate an important rule: you shouldn't write
to stderr directly, but always to sys.stderr.  (There's a helper
routine to write to sys.stderr: PySys_WriteStderr().)  I understand that
for the gdb helper it's important to use the real stderr, and I don't
object to having these functions present at all times (they're so
small), but I do think that we should make it clear (by a _Py name,
and also by a comment) that they should not be called!
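
The rule above is stated for C code, where PySys_WriteStderr() is the
helper.  A rough Python-level analogue, as a minimal sketch: writing
through sys.stderr honours whatever object the program has installed
there.  The report() helper and the StringIO replacement are assumptions
made for illustration only.

```python
import io
import sys


def report(message):
    """Write through sys.stderr, so any redirection by the program is honoured."""
    sys.stderr.write(message + "\n")


saved = sys.stderr
sys.stderr = io.StringIO()          # e.g. a GUI or test harness capturing errors
try:
    report("goes to the replacement object")
    captured = sys.stderr.getvalue()
finally:
    sys.stderr = saved              # always restore the original stream
```

A debugger helper that writes to the C-level stderr bypasses this
replacement entirely, which is why it is useful under gdb and
inappropriate as a regular API.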

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping at lfw.org  Wed Jan 24 04:29:24 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 19:29:24 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101231921030.1568-100000@skuld.kingmanhall.org>

I wrote:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.

Thomas Wouters wrote:
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

Yes, and i've seen this objection before, and i think it's silly.

> Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations,

No way... it's at least 90/10.

How often do you write 'dict.has_key(x)'?          (std lib says: 206)
How often do you write 'for x in dict.keys()'?     (std lib says: 49)

How often do you write 'x in dict.values()'?       (std lib says: 0)
How often do you write 'for x in dict.values()'?   (std lib says: 3)

I rest my case.
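
For reference, a minimal sketch of the semantics being argued for,
written against present-day Python, where "in" and "for" on a dictionary
do refer to its keys.  The dictionary contents are made up for the
example.

```python
# Membership and iteration on a dict both refer to its keys.
colors = {"red": 1, "green": 1}

has_red = "red" in colors          # key lookup, like the old has_key()
has_one = 1 in colors              # values are not searched
value_hit = 1 in colors.values()   # searching values stays explicit
keys_seen = sorted(key for key in colors)
```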


-- ?!ng




From barry at digicool.com  Wed Jan 24 04:44:31 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 23 Jan 2001 22:44:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net>
	<200101240322.WAA06671@cj20424-a.reston1.va.home.com>
Message-ID: <14958.20383.795064.832967@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Barry, since these are only gdb helpers, would it perhaps be
    GvR> better if their names started with "_Py" to indicate that
    GvR> they aren't part of the regular API?  They violate an
    GvR> important rule: you shouldn't write to stderr directly, but
    GvR> always to sys.stderr.  (There's a helper routine to write to
    GvR> sys.stderr: PySys_WriteStderr().)  I understand that for the gdb
    GvR> helper it's important to use the real stderr, and I don't
    GvR> object to having these functions present at all times
    GvR> (they're so small), but I do think that we should make it
    GvR> clear (by a _Py name, and also by a comment) that they should
    GvR> not be called!

I thought about it, couldn't decide and figured I'd check it in
anyway, knowing that you'd let me know.  See how wise I was?  :)

I will rename them as _Py* and fix the gdbinit file accordingly.  One
note: these functions /ought/ to be useful for dbx or any other
command line debugger.  I just haven't used anything but gdb for
years.  If anybody's got a dbxinit equivalent I could add that to Misc
too.

nothing-an-adjacent-office-wouldn't-have-solved-much-more-quick-ly y'rs,
-Barry



From guido at digicool.com  Wed Jan 24 04:46:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:46:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 09:22:26 EST."
             <20010123092226.A25968@thyrsus.com> 
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>  
            <20010123092226.A25968@thyrsus.com> 
Message-ID: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido at digicool.com>:
> > Can you point me to docs explaining the meaning of the BROWSER
> > environment variable?  I've never heard of it...  The last new
> > environment variables I learned were PAGER and EDITOR, probably 15
> > years ago when 4.1BSD was released... :-)

ESR replies:
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).  Ping knew about it either because he
> read the module code and saw that it was supposed to work, or because
> he remembered the design discussion when webbrowser.py was first
> implemented.
> 
> I've had conversations with some key Perl and Tcl people (Larry Wall,
> Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
> agree it's a good idea.  I'll probably hack support for it into Perl's
> browser launcher next.
> 
> It's documented in the version of libwebbrowser.tex now in the CVS
> tree.

Grumble.  That wasn't the kind of answer I expected.  I don't like it
if Python is used as a wedge to get a particular thing introduced to
the rest of the world, no matter how useful it may seem at the time.
If something is already a popular convention, I'll happily adopt it,
but I'm not comfortable being put in front of somebody else's cart.
There just are too many carts that would like to be pulled by a horse
as strong as Python, and I don't want to take sides if I can avoid it.
BROWSER seems unlikely to take the world by storm and I don't feel I
need to be involved in the effort to get it accepted.

(And yes, I know there are enough cases where I *did* take sides.
There were some cases where I *do* want to take a side, and there were
some mistakes -- which is one of the reasons why I'm shy about taking
sides now.)

Anyway, shouldn't you also talk to the developers of packages like KDE
and Gnome?  Surely their users would like to be able to configure the
default webbrowser.  Talking just to the scripting language people
seems like you're thinking too small.  There must be lots of C apps
with the desire to invoke a browser.  Also Emacs, which has an
extensive list of browse-url-* functions (you might even learn a few
tricks from it about how to invoke various external browsers) but
AFAIK no default browser selection.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:54:25 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:54:25 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 15:52:43 +1300."
             <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> 
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> 
Message-ID: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>

> Neil Schemenauer <nas at arctrix.com>:
> 
> > Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Not a bad idea!  (But shouldn't the default value be something else,
like None?)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 05:20:56 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 23:20:56 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:38:14 +0100."
             <20010124003814.F27785@xs4all.nl> 
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>  
            <20010124003814.F27785@xs4all.nl> 
Message-ID: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>

> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...
> 
> > Independently of implementation issues about sets, I think this is a
> > damn fine idea. +1.
> 
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations, and like American voters, the
> 20% middle section will change their vote each recount :-)
> 
> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted.... something like, oh, I
> donno, this, maybe:
> 
>   if key in dict.keys: ...
>   if value in dict.values: ...
> 
> Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,

No chance of a time-machine escape, but I *can* say that I agree that
Ping's proposal makes a lot of sense.  This is a reversal of my
previous opinion on this matter.  (Take note -- those don't happen
very often! :-)

First to submit a working patch gets a free copy of 2.1a2 and
subsequent releases,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 24 05:50:49 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 23:50:49 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: <013901c08566$d2a8f360$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMCIKAA.tim.one@home.com>

[/F]
> It's probably just me, but the names of the two unicode
> modules tend to irritate me:

I don't care much about the names, but having two Unicode subprojects in the
MS build seems overkill <wink>.

> ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
>
> (the former contains names, the latter data)

Maybe that's the reason:  the names don't get loaded at all unless you *use*
one of the name APIs?  Hard to say whether that's worth the bother; now that
everything has been nicely compressed, it's sure not as compelling as it may
have been earlier.

> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
>
> The result is a single 200k unicodedata module, which con-
> tains the name database as well as two new functions:
>
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
>
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
>
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

I have no opinion on what's best:  you're working with it, you're the best
judge of that.  I only vote for checking in whatever you decide sooner
rather than later; I'll fiddle the MS project files and readmes accordingly
ASAP after that.
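
The two functions /F describes above can be exercised as follows in
present-day Python, where unicodedata.name() and unicodedata.lookup()
exist with these semantics.  The particular characters and names are
chosen only for illustration.

```python
import unicodedata

# name(): character -> name, with an optional default for unnamed characters.
letter_name = unicodedata.name("A")
fallback = unicodedata.name("\x00", "<unnamed>")

# lookup(): name -> character; unknown names raise KeyError.
letter = unicodedata.lookup("LATIN SMALL LETTER A")

try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Without a default, a character that has no name raises ValueError.
try:
    unicodedata.name("\x00")
    name_failed = False
except ValueError:
    name_failed = True
```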




From moshez at zadka.site.co.il  Wed Jan 24 15:07:08 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:07:08 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001, Greg Ewing <greg at cosc.canterbury.ac.nz> wrote:

> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Cool idea, but even cooler (would catch more idioms, that is) is
"the first time someone stores a value that is not (by 'is') the value
already in the dict, allocate the values array". This would catch small
numbers, None and identifier-looking strings, for the measly cost of one
pointer per dict object.
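
A toy sketch of that idea, assuming a Python-level class rather than the
real C layout of PyDict: one shared value is remembered, and the
per-key values list is only allocated when a value that is not that
same object is stored.  LazyValueDict and its attributes are
hypothetical names, and the inner dict used for key lookup is a
shortcut to keep the sketch short.

```python
class LazyValueDict:
    """Toy mapping that stores one shared value until values diverge."""

    def __init__(self):
        self._keys = {}           # key -> slot index (shortcut for the key array)
        self._shared = None       # the single value object used so far
        self._has_shared = False
        self._values = None       # per-key values, allocated lazily

    def __setitem__(self, key, value):
        if not self._has_shared:
            self._shared, self._has_shared = value, True
        if self._values is None and value is not self._shared:
            # First value that is not the shared object: allocate the
            # values array and back-fill it for the existing keys.
            self._values = [self._shared] * len(self._keys)
        index = self._keys.setdefault(key, len(self._keys))
        if self._values is not None:
            if index == len(self._values):
                self._values.append(value)      # brand-new key
            else:
                self._values[index] = value     # overwrite existing key

    def __getitem__(self, key):
        index = self._keys[key]
        return self._shared if self._values is None else self._values[index]

    def __contains__(self, key):
        return key in self._keys

    @property
    def values_allocated(self):
        return self._values is not None


d = LazyValueDict()
d["a"] = None
d["b"] = None
allocated_before = d.values_allocated   # still using the shared value
d["c"] = 5
allocated_after = d.values_allocated    # a different object forced allocation
```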

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From moshez at zadka.site.co.il  Wed Jan 24 15:15:39 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:15:39 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>  
            <20010123092226.A25968@thyrsus.com>
Message-ID: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>

On Tue, 23 Jan 2001 22:46:47 -0500, Guido van Rossum <guido at digicool.com> wrote:

[ESR]
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).

[Guido v. Rossum]
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Guido, I think you're being over-dramatic. BROWSER is right in the
tradition of PAGER and EDITOR, and a lot of other programs need it.
I know Eric uses RH and mutt, so probably RH's urlview program (which
mutt uses to jump to URLs) uses BROWSER. I was just about to submit
a bug report to Debian that their urlview doesn't respect it.

And if you really don't want to be a horse in front of a cart...

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.

Yes -- via GNOME/KDE specific mechanisms. I have 0 experience with KDE,
but I'm guessing the GNOME guys would do it via the GNOME "registry".
KDE probably has something similar. I'm sure you wouldn't want Python
to depend on GNOME, though it would be nice to make the browser-choosing
part pluggable so when "import gnome" is done, it automatically tries
to choose the user's browser.

On UNIX (as opposed to GNOME/KDE, which are pretty much operating systems
themselves), these things are done via environment variable. And $BROWSER
doesn't seem like that much of an innovation.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From skip at mojam.com  Wed Jan 24 07:28:21 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:28:21 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
	<14957.52952.48739.53360@beluga.mojam.com>
	<200101240311.WAA06582@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30213.325584.373062@beluga.mojam.com>

    Guido> I think I saw a complaint about this that specifically said that
    Guido> when dbhash is imported when bsddb can't be imported, an
    Guido> incomplete dbhash is left behind in sys.modules, and then a
    Guido> second import of dbhash will succeed -- but of course it will
    Guido> define no objects.

So it does:

    % ./python
    Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
    [GCC 2.95.3 19991030 (prerelease)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> import dbhash
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
        import bsddb
    ImportError: No module named bsddb
    >>> import dbhash
    >>>

Can that be construed as a bug?  If import fails, shouldn't the stub module
that was inserted in sys.modules be removed?
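
A minimal sketch of the cleanup being asked about, done by hand at the
call site: if the import fails, drop any entry left in sys.modules so
that a second import fails the same way.  import_or_cleanup() and the
module name used below are hypothetical, chosen only for the example.

```python
import importlib
import sys


def import_or_cleanup(name):
    """Import `name`; on failure make sure no partial module is left cached."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Drop any half-initialised entry so a retry fails the same way
        # instead of silently returning an empty module.
        sys.modules.pop(name, None)
        return None


missing = import_or_cleanup("no_such_module_for_demo")
present = import_or_cleanup("json")
```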

Skip



From skip at mojam.com  Wed Jan 24 07:31:08 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:31:08 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <20010123041730.A25165@thyrsus.com>
	<200101231406.JAA04765@cj20424-a.reston1.va.home.com>
	<20010123092226.A25968@thyrsus.com>
	<200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30380.851599.764535@beluga.mojam.com>

    Guido> BROWSER seems unlikely to take the world by storm and I don't
    Guido> feel I need to be involved in the effort to get it accepted.

Editors and web browsers are classes of tools which (one would hope) will
always come in several varieties.  Users have to have some way to specify
what to launch.  BROWSER seems analogous to the EDITOR environment variable
which is commonly used in Unix environments for just that purpose.

Skip



From thomas at xs4all.net  Wed Jan 24 08:03:09 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 08:03:09 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 11:20:56PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <200101240420.XAA07153@cj20424-a.reston1.va.home.com>
Message-ID: <20010124080308.G27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:20:56PM -0500, Guido van Rossum wrote:

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Patch submitted. It only implements 'if key in dict', not 'for key in dict'.
The latter is kind of hard until we have a separate iteration protocol.
(PEP, anyone ?) Once we have it, we could consider 'for key, value in dict',
which is now easily explained with 'dict.popitem()'.

Does this mean I get a legally sound and thus empty legal statement with
every Python release for the rest of your, its or my life, Guido, or will
you just make me 'Free Python Release Receiver For Life' ? :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From pf at artcom-gmbh.de  Wed Jan 24 08:31:30 2001
From: pf at artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 08:31:30 +0100 (MET)
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Jan 23, 2001 11:20:56 pm"
Message-ID: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
[...]
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)

It gives a warm and fuzzy feeling to see that happen sometimes at all. ;-)

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

This repeated offer of free copies of Python becomes increasingly
boring.  For quite a while I myself have not contributed anything useful 
and I am nevertheless hoarding free copies of Python here. ;-)

What about offering another immaterial reward to potential contributors
instead?  What about "fame points"?  Anybody contributing something
useful to Python receives a certain number of "fame points":  These
fame points will be added and placed in front of the name of
the contributor into the ACKS file and the file will be sorted
accordingly turning the ACKS file effectively into some kind of
"Python contribution high score" ...   ;-)

Just kidding, Peter




From tim.one at home.com  Wed Jan 24 09:08:50 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:08:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMPIKAA.tim.one@home.com>

[Neil Schemenauer]
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

They'll be slower than dicts and take more memory than lists then.  WRT
memory, dicts cache the hash code with each entry for speed (so double the
memory of a list even without the value field), and are never more than 2/3
full anyway.  The dict implementation also gets low-level speed benefits out
of using both the key and value fields to characterize the nature of a slot
(the key field is NULL iff the slot is virgin; the value field is NULL iff
the slot is available (virgin or dummy)).
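
A toy sketch of those slot states, assuming linear probing and a fixed
table size (both simplifications, not the real probe sequence).  None
stands in for C NULL, so None cannot be stored as a key or value here;
OpenAddressingTable and _DUMMY are hypothetical names.

```python
_DUMMY = object()  # stands in for the dummy key left behind by a deletion


class OpenAddressingTable:
    """Toy open-addressing table: key None <=> virgin, value None <=> available."""

    def __init__(self, size=8):
        self._hashes = [0] * size      # cached hash code per slot
        self._keys = [None] * size
        self._values = [None] * size

    def _find(self, key, key_hash):
        """Return (index of key or None, first available index or None)."""
        size = len(self._keys)
        free = None
        for step in range(size):
            i = (key_hash + step) % size
            if self._keys[i] is None:        # virgin: key cannot be further on
                return None, (i if free is None else free)
            if self._values[i] is None:      # dummy: reusable, but keep probing
                if free is None:
                    free = i
            elif self._hashes[i] == key_hash and self._keys[i] == key:
                return i, free
        return None, free

    def set(self, key, value):
        key_hash = hash(key)
        found, free = self._find(key, key_hash)
        i = found if found is not None else free
        if i is None:
            raise RuntimeError("table is full")
        self._hashes[i], self._keys[i], self._values[i] = key_hash, key, value

    def get(self, key, default=None):
        found, _ = self._find(key, hash(key))
        return default if found is None else self._values[found]

    def delete(self, key):
        found, _ = self._find(key, hash(key))
        if found is None:
            raise KeyError(key)
        self._keys[found] = _DUMMY     # key stays non-None: slot is not virgin
        self._values[found] = None     # value None: slot is available again
```

The dummy marker is what keeps a lookup probing past a deleted slot to
reach keys that collided with it earlier.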

Dummy slots can be avoided (and so also the need for runtime code to
distinguish them from active slots) by using a hash table of pointers to
linked lists-- or flex vectors, or linked lists of small vectors --instead,
and in most ways that leads to much simpler code (no more fiddling with
dummies, no more probe-sequence hassles, no more boosting the size before
the table is full).  But without fine control over the internals of malloc,
that takes even more memory in the end.
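
A minimal sketch of the chained alternative, using one Python list per
bucket in place of linked lists or flex vectors.  ChainedSet and its
fixed bucket count are assumptions for illustration; a real
implementation would also resize.

```python
class ChainedSet:
    """Toy hash set using one Python list per bucket (separate chaining)."""

    def __init__(self, buckets=8):
        self._buckets = [[] for _ in range(buckets)]
        self._count = 0

    def _bucket(self, item):
        return self._buckets[hash(item) % len(self._buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:
            bucket.append(item)
            self._count += 1

    def discard(self, item):
        bucket = self._bucket(item)
        if item in bucket:
            bucket.remove(item)   # no dummy marker needed: the chain just shrinks
            self._count -= 1

    def __contains__(self, item):
        return item in self._bucket(item)

    def __len__(self):
        return self._count
```

Each bucket is a separate allocation, which is where the extra memory
mentioned above comes from.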

Interesting twist:  "a dict" *is* "a set", but a set of (key, value) pairs
further constrained so that no two elements have the same key.  So any set
implementation can be used as-is to implement a dict as a set of 2-tuples,
customizing the hash and "is equal" functions to look at just the tuples'
first elements.  That was the view taken by SETL in 1969, although their
"map" (dict) type was eventually optimized to get away from actually
constructing 2-tuples.  Indeed, SETL eventually grew an elaborate optional
type declaration sublanguage, allowing the user to influence many details of
its many internal set-storage schemes; e.g., from pg 399 of "Programming
With Sets:  An Introduction to SETL":

    For example, we can declare [I'm putting their keywords in UPPERCASE
    for, umm, clarity]

        successors: LOCAL MMAP(ELMT b) REMOTE SET(ELMT b);

    This declaration specifies that for each x in b the image set
    successors{x} is stored in the element block of x, and that this
    image set is always to be represented as a bit vector.  Similarly,
    the declaration

        successors: LOCAL MMAP(ELMT b) SPARSE SET(ELMT b);

    specifies that for each x in b the image set successors{x} is to
    be stored as a hash table containing pointers to elements of b.
    Note that the attribute LOCAL cannot be used for image sets of
    multivalued maps.  This follows from the remarks in section 10.4.3
    on the awkwardness of making local objects into subparts of
    composite objects.

Clear?  Snort.  Here are some citations lifted from the web for their
experience in trying to make these kinds of decisions by magic:

@article{dewar:79,
title="Programming by Refinement, as Exemplified by the {SETL}
Representation Sublanguage",
author="Robert B. K. Dewar and Arthur Grand and Ssu-Cheng Liu and
Jacob T. Schwartz and Edmond Schonberg",
journal=toplas,
year=1979,
month=jul,
volume=1,
number=1,
pages="27--49"
}

@article{schonberg:81,
title="An Automatic Technique for Selection of Data Structures in
{SETL} Programs",
author="Edmond Schonberg and Jacob T. Schwartz and Micha Sharir",
journal=toplas,
year=1981,
month=apr,
volume=3,
number=2,
pages="126--143"
}

@article{freudenberger:83,
title="Experience with the {SETL} Optimizer",
author="Stefan M. Freudenberger and Jacob T. Schwartz and Micha Sharir",
pages="26--45",
journal=toplas,
year=1983,
month=jan,
volume=5,
number=1
}

If someone wanted to take sets seriously today, a better approach would be
to define a minimal "set interface" ("abstract base class" in C++ terms),
then supply multiple implementations of that interface, letting the user
choose directly which implementation strategy they want for each of their
sets.  And people are doing just that in the C++ and Java worlds; e.g.,

http://developer.java.sun.com/developer/onlineTraining/
    collections/Collection.html#SetInterface

Curiously, the newer Java Collections Framework (covering multiple
implementations of list, set, and dict interfaces) gave up on thread-safety
by default, because it cost too much at runtime.  Just another thing to
argue about <wink>.
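
A minimal sketch of that approach in Python: one small abstract
interface, two interchangeable implementations, and client code that
lets the caller pick.  The class and method names are illustrative
assumptions, not a proposed API.

```python
from abc import ABC, abstractmethod


class AbstractSet(ABC):
    """Minimal set interface; the method names are illustrative only."""

    @abstractmethod
    def add(self, item): ...

    @abstractmethod
    def __contains__(self, item): ...

    @abstractmethod
    def __len__(self): ...


class ListSet(AbstractSet):
    """Compact, linear-time membership; items need only support ==."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


class HashSet(AbstractSet):
    """Larger, but expected constant-time membership; items must be hashable."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item] = None

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


def unique_count(values, set_type):
    """Same client code regardless of which implementation the user picks."""
    seen = set_type()
    for value in values:
        seen.add(value)
    return len(seen)
```

The list-backed version also accepts unhashable items such as lists,
which is one reason a user might choose it despite the slower lookups.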

we're-not-exactly-pioneers-here-ly y'rs  - tim




From fredrik at effbot.org  Wed Jan 24 09:29:30 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 09:29:30 +0100
Subject: [Python-Dev] getting rid of ucnhash
References: <013901c08566$d2a8f360$e46940d5@hagrid>  <200101240246.VAA06336@cj20424-a.reston1.va.home.com>
Message-ID: <019801c085df$c7ee0540$e46940d5@hagrid>

guido wrote:
> > It's probably just me, but the names of the two unicode
> > modules tend to irritate me:
> > 
> > > ls u*.pyd
> > ucnhash.pyd      unicodedata.pyd
> 
> To me, both of these are irrelevant details of the Unicode
> implementation. :-)   IOW, feel free to check it in.

Done.

Note that Include/ucnhash.h is still there; it declares the
"ucnhash_CAPI" structure used to access names from the
unicodeobject module.

(and all name-related tests are still kept in test_ucn)

I'll leave it to Tim to update the MSVC build files.

Cheers /F




From tim.one at home.com  Wed Jan 24 09:28:34 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:28:34 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>

[Guido]
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

I gotta say, politics aside, BROWSER is a screamingly natural answer to the
question "what comes next in this sequence?":

    PAGER, EDITOR, ...

Dear Lord, even *I* use a browser almost every week <wink>.

explicit-is-better-than-implicit-ly y'rs  - tim




From esr at thyrsus.com  Wed Jan 24 10:02:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:02:59 -0500
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>; from pf@artcom-gmbh.de on Wed, Jan 24, 2001 at 08:31:30AM +0100
References: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <20010124040259.A28086@thyrsus.com>

Peter Funk <pf at artcom-gmbh.de>:
> What about offering another immaterial reward to potential contributors
> instead?  What about "fame points"?  Anybody contributing something
> useful to Python receives a certain number of "fame points":  These
> fame points will be added and placed in front of the name of
> the contributor into the ACKS file and the file will be sorted
> accordingly turning the ACKS file effectively into some kind of
> "Python contribution high score" ...   ;-)
> 
> Just kidding, Peter

You may be joking, but as an observer of how gift cultures work I say this
isn't a bad idea.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"One of the ordinary modes, by which tyrants accomplish their purposes
without resistance, is, by disarming the people, and making it an
offense to keep arms."
        -- Constitutional scholar and Supreme Court Justice Joseph Story, 1840



From esr at thyrsus.com  Wed Jan 24 10:09:18 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:09:18 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:58:59PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid> <200101240258.VAA06479@cj20424-a.reston1.va.home.com>
Message-ID: <20010124040918.B28086@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> > let's just say that "in" is the same thing as "has_key",
> > and be done with it.
> 
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

I think we've just justified the time and energy that went into this 
discussion.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What is a magician but a practicing theorist?
	-- Obi-Wan Kenobi, 'Return of the Jedi'



From esr at thyrsus.com  Wed Jan 24 10:14:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:14:27 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:46:47PM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <20010124041427.D28086@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Oh, stop!  I'm not using Python as an argument for other people to adopt
the BROWSER convention.  The idea sells itself quite nicely by analogy to
EDITOR and PAGER the second people hear it.

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.  Talking just to the scripting language people
> seems like you're thinking too small.  There must be lots of C apps
> with the desire to invoke a browser.  Also Emacs, which has an
> extensive list of browse-url-* functions (you might even learn a few
> tricks from it about how to invoke various external browsers) but
> AFAIK no default browser selection.

All on my TO-DO list.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is proper to take alarm at the first experiment on our
liberties. We hold this prudent jealousy to be the first duty of
citizens and one of the noblest characteristics of the late
Revolution. The freemen of America did not wait till usurped power had
strengthened itself by exercise and entangled the question in
precedents. They saw all the consequences in the principle, and they
avoided the consequences by denying the principle. We revere this
lesson too much ... to forget it
	-- James Madison.



From esr at thyrsus.com  Wed Jan 24 10:16:12 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:16:12 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124041612.E28086@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I gotta say, politics aside, BROWSER is a screamingly natural answer to the
> question "what comes next in this sequence?":
> 
>     PAGER, EDITOR, ...

That's exactly what I thought when I was struck by the obvious.  Everybody
I spread this meme to seems to agree.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 



From esr at thyrsus.com  Wed Jan 24 10:21:56 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:21:56 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Wed, Jan 24, 2001 at 04:15:39PM +0200
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>
Message-ID: <20010124042156.F28086@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> I know Eric uses RH and mutt, so probably RH's urlview program (which
> mutt uses to jump to URLs) uses BROWSER. I was just about to submit
> a bug report to Debian that their urlview doesn't respect it.

Oh, *do* that!  Note: BROWSER may consist of a colon-separated series
of parts, browser commands to be tried in order (this is useful so you
can put an X browser first, then a console browser, and have the right
thing happen).  If a part contains %s, the URL is substituted there;
otherwise, the URL is concatenated to the command after a space.
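
A rough sketch of the rule just described, in Python.  The function name
browser_commands is made up for illustration, and this is not claimed to be
how webbrowser.py does it; it only expands a BROWSER-style value into the
command lines that would be tried in order:

```python
import os


def browser_commands(url, browser_env=None):
    """Expand a BROWSER-style value into candidate command lines for url."""
    if browser_env is None:
        browser_env = os.environ.get("BROWSER", "")
    commands = []
    # Parts are separated by colons and are meant to be tried in order.
    for part in browser_env.split(":"):
        part = part.strip()
        if not part:
            continue
        if "%s" in part:
            # The URL is substituted where %s appears.
            commands.append(part.replace("%s", url))
        else:
            # Otherwise the URL is appended after a space.
            commands.append(part + " " + url)
    return commands
```

With BROWSER set to "netscape %s:lynx", the X browser command comes first
and the console browser second.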
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Gun Control: The theory that a woman found dead in an alley, raped and
strangled with her panty hose, is somehow morally superior to a
woman explaining to police how her attacker got that fatal bullet wound.
	-- L. Neil Smith



From tim.one at home.com  Wed Jan 24 10:24:26 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 04:24:26 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENIIKAA.tim.one@home.com>

[Greg Ewing]
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

[Guido]
> Not a bad idea!

In theory, but if Vladimir were here he'd bust a gut over the possibly bad
cache effects on "real dicts" (by keeping everything together, simply
accessing the cached hash code brings both the key and value pointers into
L1 cache too).  We would need to quantify the effect of breaking that
connection.

> (But shouldn't the default value be something else,
> like none?)

Bleech.  I hate the idiom of using a false value to mean "present".

    d = {}
    for x in seq:
        d[x] = 1

runs faster too (None needs a LOAD_GLOBAL now).




From tim.one at home.com  Wed Jan 24 11:01:36 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:01:36 -0500
Subject: [Python-Dev] test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>

> python  ../lib/test/regrtest.py test___all__

test___all__
test test___all__ crashed -- exceptions.AttributeError:
     'locale' module has no attribute 'LC_MESSAGES'

And indeed it does not:

> python
Python 2.1a1 (#9, Jan 24 2001, 04:40:55) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import locale
>>> dir(locale)
['CHAR_MAX', 'Error', 'LC_ALL', 'LC_COLLATE', 'LC_CTYPE',
 'LC_MONETARY', 'LC_NUMERIC', 'LC_TIME', '__all__', '__builtins__',
 '__doc__', '__file__', '__name__', '_build_localename', '_group',
 '_parse_localename', '_print_locale', '_setlocale', '_test', 'atof',
 'atoi', 'encoding_alias', 'format', 'getdefaultlocale', 'getlocale',
 'locale_alias', 'localeconv', 'normalize', 'resetlocale', 'setlocale',
 'str', 'strcoll', 'string', 'strxfrm', 'sys', 'windows_locale']
>>>

Nor is LC_MESSAGES std C (the other LC_XXX guys are).

I pin the blame on

    from _locale import *

in locale.py -- who knows what that's supposed to export?  Certainly not
Skip <wink>.




From tim.one at home.com  Wed Jan 24 11:17:47 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:17:47 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>

Nevermind; checked in a hack to stop the error on Windows.




From mal at lemburg.com  Wed Jan 24 14:00:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 14:00:28 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid>
Message-ID: <3A6ED1EC.237B5B1D@lemburg.com>

Fredrik Lundh wrote:
> 
> > It's come up before. The problem with it is that it's not quite obvious
> > whether it is 'if key in dict' or 'if value in dict'.
> 
> you forgot "if (key, value) in dict"
> 
> on the other hand, it's not quite obvious that "list.sort"
> doesn't return the sorted list, "print >>None" prints to
> standard output, "except KeyError, ValueError" doesn't
> catch a ValueError exception, etc, etc, etc.
> 
> (nor that it's "has_key" and "hasattr", and not "has_key"
> and "has_attr" or "haskey" and "hasattr" ;-)
> 
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

+1 all the way :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 24 15:01:33 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:01:33 +0100
Subject: [Python-Dev] Interfaces (Is X a (sequence|mapping)?)
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>  
	            <3A6D4B9F.38B17046@lemburg.com> <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <3A6EE03D.4D5DFD17@lemburg.com>

Guido van Rossum wrote:
> 
> > Polymorphic code will usually get you more out of an
> > algorithm, than type-safe or interface-safe code.
> 
> Right.
> 
> But there are times when people want to write methods that take
> e.g. either a sequence or a mapping, and need to distinguish between
> the two.  That's not easy in Python!  Java and C++ support it very
> well though, and thus we'll always keep seeing this kind of
> complaint.  Not sure what to do, except to recommend "find out which
> methods you expect in one case but not in the other (e.g. keys()) and
> do a hasattr() test for that."

Perhaps we should provide simple means for testing a set of
available methods and slots ?!

E.g. hasinterface(obj, ('keys', 'items', '__len__'))

Objects could provide an __interface__ special attribute for this
purpose (since not all slots can be auto-detected and -verified
without side-effects).
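
A minimal sketch of what such a check might look like.  Both hasinterface()
and the __interface__ attribute are only proposals from this message, not
existing Python features, so every name here is hypothetical:

```python
def hasinterface(obj, names):
    """Return true if obj appears to provide every attribute in names."""
    # Hypothetical: an object may declare its interface explicitly,
    # which avoids probing slots that cannot be checked safely.
    declared = getattr(obj, "__interface__", None)
    if declared is not None:
        return all(name in declared for name in names)
    # Fall back to the hasattr() test recommended above.
    return all(hasattr(obj, name) for name in names)
```

A dictionary passes hasinterface(obj, ('keys', 'items', '__len__')) while a
list does not, which is the sequence-versus-mapping distinction under
discussion.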

> > BTW, there are Python interfaces to PySequence_Check() and
> > PyMapping_Check() buried in the builtin operator module in case
> > you really do care ;) ...
> >
> >       operator.isSequenceType()
> >       operator.isMappingType()
> >       + some other C style _Check() APIs
> >
> > These only look at the type slots though, so Python instances
> > will appear to support everything but when used fail with
> > an exception if they don't provide the proper __xxx__ hooks.
> 
> Yes, these should probably be deprecated.  I certainly have never used
> them!  (The operator module doesn't seem to get much use in
> general...  Was it a bad idea?)

Some of these are nice to have and provide some good performance
boost (e.g. the numeric slot access APIs). The type slot checking 
APIs are not too useful though.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at digicool.com  Wed Jan 24 10:05:44 2001
From: jim at digicool.com (Jim Fulton)
Date: Wed, 24 Jan 2001 04:05:44 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org>
Message-ID: <3A6E9AE8.6C2D3CF0@digicool.com>

Christopher Petrilli wrote:
> 
> Neil Schemenauer [nas at arctrix.com] wrote:
> > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > Unfortunately, for me, a Python implementation of Sets is only
> > > interesting academically.  Any time I've needed to work with them at a
> > > large scale, I've needed them *much* faster than Python could achieve
> > > without a C extension.
> >
> > I think this argues that if sets are added to the core they
> > should be implemented as an extension type with the speed of
> > dictionaries and the memory usage of lists.  Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This is effectively the implementation that Zope has for Sets. 

Except we use sorted collections with binary search for sets.

I think that a simple hash-based set would make a lot of sense.

> In
> addition we have "buckets" that have scores on them (which are
> implemented as a modified BTree).
> 
> Unfortunately Jim Fulton (who wrote all the code for that level) is in
> a meeting, but I hope he'll comment on the implementation that was
> chosen for our software.

We have a number of special needs:

  - Scalability is critical. We make some special optimizations,
    like sets of integers and mapping objects with integer keys
    and values. In these cases, data are stored using C int arrays, 
    allowing very efficient data storage and manipulation, especially
    when using integer keys.

  - We need to spread data over multiple database records. Our data
    structures may be hundreds of megabytes in size. We have ZODB-aware
    structures that use multiple independently stored database objects.

  - Range searches are very common, and under some circumstances,
    sorted collections and BTrees can have very little overhead
    compared to dictionaries. For this reason, our mapping objects
    and sets have been based on BTrees and sorted collections.
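
To illustrate the last point, here is a small sketch of a set kept as a
sorted collection with binary search.  It is not the Zope implementation;
the class name SortedSet and its methods are invented for this example, and
it uses a plain Python list rather than C arrays or BTrees:

```python
import bisect


class SortedSet:
    """A set stored as a sorted list, searched by bisection."""

    def __init__(self, items=()):
        self._items = sorted(set(items))

    def add(self, item):
        i = bisect.bisect_left(self._items, item)
        # Insert only if the item is not already present at position i.
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)

    def __contains__(self, item):
        i = bisect.bisect_left(self._items, item)
        return i < len(self._items) and self._items[i] == item

    def range(self, low, high):
        """Return the members m with low <= m < high, in order."""
        start = bisect.bisect_left(self._items, low)
        stop = bisect.bisect_left(self._items, high)
        return self._items[start:stop]
```

A range search is two bisections plus a slice, which a hash-based set
cannot offer without scanning every member.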

Unfortunately, our current BTree implementation has a flaw that
causes an excessive number of objects to be updated when items are
added and removed. (Each BTree internal node keeps track of the number
of objects contained in it.)  Also, our current sets are limited
to integers and cannot be spread over multiple database records.

We are completing a new BTree implementation that overcomes these
limitations.  In this implementation, we will provide sets as
value-less BTrees.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org



From gvwilson at nevex.com  Wed Jan 24 15:10:41 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Wed, 24 Jan 2001 09:10:41 -0500
Subject: [Python-Dev] re: sets
In-Reply-To: <20010124032401.EB329F199@mail.python.org>
Message-ID: <000301c0860f$6fa29010$770a0a0a@nevex.com>

1. I did a poll overnight by email of 22 friends and colleagues,
none of whom are regular Python users (yet).  My question was,

   "Would you expect the interface of a set class to be like
    the interface of a vector or list, or like the interface
    of a map or hash?"

15 people have replied; all 15 have said, "map or hash".
Several respondents are Perl hackers, so I'm sure the answer
is influenced by previous exposure to the set-as-valueless-hash
idiom.  Still, I think 15-0 is a pretty convincing score...

Four, unprompted, said that they thought the STL's hierarchy of
containers was as good as it gets, and that other languages
should mirror it.  (One of those added that this makes teaching
much simpler --- students can transfer instincts from one language
to another.)

2. Is there enough interest in sets for a BOF at IPC9?  Please
reply to me point-to-point if you're interested; I'll summarize
and post the result.  I volunteer to bring the donuts...

> > Ka-Ping Yee:
> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> > various:
> > [but what about 'value in dict' or '(key, value) in dict'?]

> Fredrik Lundh:
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

> Guido van Rossum:
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

Greg Wilson:
Woo hoo!  Now, on a related note, what is the status of the 'indices()'
proposal, as in:

    for i in indices(someList):

instead of:

    for i in range(len(someList)):

Would 'indices(dict)' be the same as 'dict.keys()', to allow
uniform iteration?  Or would it be more economical to introduce
a 'keys()' method on lists and tuples, so that:

    for i in collection.keys():

would work on dicts, lists, and tuples?  I know that 'keys()'
is the wrong name for lists and tuples, but dicts are already
using it, and it's completely unambiguous...
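
For concreteness, a sketch of how a uniform 'indices()' could behave.  This
is only a guess at the proposal's semantics; the dispatch on a 'keys'
attribute is an assumption made for this example:

```python
def indices(collection):
    """Return the valid subscripts of a mapping or sequence as a list."""
    # Assumption: anything with a keys() method is treated as a mapping.
    if hasattr(collection, "keys"):
        return list(collection.keys())
    # Sequences are indexed by 0 .. len-1.
    return list(range(len(collection)))
```

With this, 'for i in indices(x): use(x[i])' works unchanged on dicts,
lists, and tuples.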

Thanks,
Greg



From mal at lemburg.com  Wed Jan 24 15:46:10 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:46:10 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org> <3A6E9AE8.6C2D3CF0@digicool.com>
Message-ID: <3A6EEAB2.5E6A4E83@lemburg.com>

Jim Fulton wrote:
> 
> Christopher Petrilli wrote:
> >
> > Neil Schemenauer [nas at arctrix.com] wrote:
> > > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > > Unfortunately, for me, a Python implementation of Sets is only
> > > > interesting academically.  Any time I've needed to work with them at a
> > > > large scale, I've needed them *much* faster than Python could achieve
> > > > without a C extension.
> > >
> > > I think this argues that if sets are added to the core they
> > > should be implemented as an extension type with the speed of
> > > dictionaries and the memory usage of lists.  Basically, we would
> > > use the implementation of PyDict but drop the values.
> >
> > This is effectively the implementation that Zope has for Sets.
> 
> Except we use sorted collections with binary search for sets.
> 
> I think that a simple hash-based set would make a lot of sense.
> 
> > In
> > addition we have "buckets" that have scores on them (which are
> > implemented as a modified BTree).
> >
> > Unfortunately Jim Fulton (who wrote all the code for that level) is in
> > a meeting, but I hope he'll comment on the implementation that was
> > chosen for our software.
> 
> We have a number of special needs:
> 
>   - Scalability is critical. We make some special optimizations,
>     like sets of integers and mapping objects with integer keys
>     and values. In these cases, data are stored using C int arrays,
>     allowing very efficient data storage and manipulation, especially
>     when using integer keys.
> 
>   - We need to spread data over multiple database records. Our data
>     structures may be hundreds of megabytes in size. We have ZODB-aware
>     structures that use multiple independently stored database objects.
> 
>   - Range searches are very common, and under some circumstances,
>     sorted collections and BTrees can have very little overhead
>     compared to dictionaries. For this reason, our mapping objects
>     and sets have been based on BTrees and sorted collections.
> 
> Unfortunately, our current BTree implementation has a flaw that
> causes an excessive number of objects to be updated when items are
> added and removed. (Each BTree internal node keeps track of the number
> of objects contained in it.)  Also, our current sets are limited
> to integers and cannot be spread over multiple database records.
> 
> We are completing a new BTree implementation that overcomes these
> limitations.  In this implementation, we will provide sets as
> value-less BTrees.

You may want to check out a soon to be released new mx
package: mxBeeBase. This is an on-disk b+tree implementation
which supports data files up to 2GB on 32-bit platforms.

Here's a preview:

	http://www.lemburg.com/python/mxBeeBase.html

(The links on that page are not functional.)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Wed Jan 24 15:42:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:42:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <14958.59855.4855.52638@beluga.mojam.com>

    Tim> Nor is LC_MESSAGES std C (the other LC_XXX guys are).

    Tim> I pin the blame on

    Tim>     from _locale import *

    Tim> in locale.py -- who knows what that's supposed to export?
    Tim> Certainly not Skip <wink>.

Was that a roundabout way of complimenting me for having found a bug? ;-)

Skip






From skip at mojam.com  Wed Jan 24 15:50:02 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:50:02 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
Message-ID: <14958.60314.482226.825611@beluga.mojam.com>

    Tim> Nevermind; checked in a hack to stop the error on Windows.

Probably should file a bug report (if you haven't already) so the root
problem isn't forgotten because the hack obscures it.  I see this code in
localemodule.c:

    #ifdef LC_MESSAGES
	x = PyInt_FromLong(LC_MESSAGES);
	PyDict_SetItemString(d, "LC_MESSAGES", x);
	Py_XDECREF(x);
    #endif /* LC_MESSAGES */

Martin, looks like this module is your baby.  Care to hazard a guess about
whether LC_MESSAGES should always or never be there?

Skip




From fredrik at effbot.org  Wed Jan 24 16:11:33 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 16:11:33 +0100
Subject: [Python-Dev] test___all__ failing; Windows
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <04de01c08617$f56216f0$e46940d5@hagrid>

Skip wrote:

> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
>         x = PyInt_FromLong(LC_MESSAGES);
>         PyDict_SetItemString(d, "LC_MESSAGES", x);
>         Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

I think the correct answer is "sometimes":

    ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    LC_MONETARY, LC_NUMERIC, and LC_TIME

    Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
    LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    LC_TIME

in other words, if it's supported, it should be exposed by
the Python bindings.
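
On the Python side, "exposed if supported" might be sketched like this.
The helper name exported_categories and the two tuples are made up for
illustration; only the LC_* names themselves come from the lists above:

```python
import locale

# Categories mandated by ANSI C, per the list above.
STANDARD_C = ("LC_ALL", "LC_COLLATE", "LC_CTYPE",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME")
# Categories that exist only where the platform supports them.
OPTIONAL = ("LC_MESSAGES",)


def exported_categories(module=locale):
    """List the LC_* names that module can safely export."""
    names = list(STANDARD_C)
    names.extend(name for name in OPTIONAL if hasattr(module, name))
    return names
```

A test that walks the exported names would then pass on both Unix and
Windows, since LC_MESSAGES is listed only where it actually exists.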

Cheers /F




From tismer at tismer.com  Wed Jan 24 15:40:04 2001
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 24 Jan 2001 16:40:04 +0200
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <3A6EE944.C8CC6EF7@tismer.com>


Greg Ewing wrote:
> 
> Neil Schemenauer <nas at arctrix.com>:
> 
> > Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Very good idea. It fits also in my view of how dicts should be
implemented: Keep keys and values apart, since this information
has different access patterns.
I think (or at least hope) that dictionaries become faster,
when hashes, keys and values are in separate areas, giving more
cache hits. Not sure if hashes and keys should be apart, but
sure for values.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Wed Jan 24 16:37:03 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:37:03 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:28:21 CST."
             <14958.30213.325584.373062@beluga.mojam.com> 
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net> <14957.52952.48739.53360@beluga.mojam.com> <200101240311.WAA06582@cj20424-a.reston1.va.home.com>  
            <14958.30213.325584.373062@beluga.mojam.com> 
Message-ID: <200101241537.KAA27039@cj20424-a.reston1.va.home.com>

>     Guido> I think I saw a complaint about this that specifically said that
>     Guido> when dbhash is imported when bsddb can't be imported, an
>     Guido> incomplete dbhash is left behind in sys.modules, and then a
>     Guido> second import of dbhash will succeed -- but of course it will
>     Guido> define no objects.
> 
> So it does:
> 
>     % ./python
>     Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
>     [GCC 2.95.3 19991030 (prerelease)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> import dbhash
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
> 	import bsddb
>     ImportError: No module named bsddb
>     >>> import dbhash
>     >>>
> 
> Can that be construed as a bug?  If import fails, shouldn't the stub module
> that was inserted in sys.modules be removed?

Yep, but not a very important bug -- typically this isn't caught.
Feel free to check in a change; I think you should be able to insert
something like

    import sys
    try:
        import bsddb
    except ImportError:
        del sys.modules[__name__]
        raise

into dbhash.

If this works for you in testing, forget the patch manager, just check
it in.  (I'm too busy to do much myself, the company needs me. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pf at artcom-gmbh.de  Wed Jan 24 16:32:55 2001
From: pf at artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 16:32:55 +0100 (MET)
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <14958.60314.482226.825611@beluga.mojam.com> from Skip Montanaro at "Jan 24, 2001  8:50: 2 am"
Message-ID: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Skip Montanaro:
> 
>     Tim> Nevermind; checked in a hack to stop the error on Windows.
> 
> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
> 	x = PyInt_FromLong(LC_MESSAGES);
> 	PyDict_SetItemString(d, "LC_MESSAGES", x);
> 	Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

As far as I could find out, LC_MESSAGES was added to the POSIX "standard"
in POSIX.2.  Systems that are not POSIX.2 compatible probably lack the
proper functionality behind 'setlocale()'.  So the best solution would be
to add a clever emulation/approximation of this feature if the underlying
platform (here Windows) doesn't provide it.  This would require wrapping
'setlocale()'.  But I'm not sure how to emulate, for example,
'setlocale(LC_MESSAGES, 'DE_de')' on a Windows box.  Maybe it is
impossible to achieve.

What I would love to see is that the typical query
'setlocale(LC_MESSAGES)' would return 'DE_de' on a box running, for example,
the German version of Windows or MacOS.  This would eliminate the need for
ugly language selection menus on these platforms in a portable fashion.

Regards, Peter




From guido at digicool.com  Wed Jan 24 16:41:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:41:07 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 16:07:08 +0200."
             <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> 
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> 
Message-ID: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>

> > This could be incorporated into PyDict. Instead of storing keys and
> > values in the same array, keep them in separate arrays and only
> > allocate the values array the first time someone stores a value other
> > than 1.
> 
> Cool idea, but even cooler (would catch more idioms, that is) is
> "the first time someone stores something not 'is'  something in the
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> dict, allocate the values array". This would catch small numbers,
> None and identifier-looking strings, for the measly cost of one
> pointer/dict object.

Sorry, but I don't understand what you mean by the ^^^ marked phrase.
Can you please elaborate?

Regarding storing one for "present", that's all well and fine, but it
suggests to me that storing a false value could mean "not present".
Do we really want that?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Jan 25 01:50:13 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:50:13 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010125005013.58C12A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 10:41:07 -0500, Guido van Rossum <guido at digicool.com> wrote:

> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> > dict, allocate the values array". This would catch small numbers,
> > None and identifier-looking strings, for the measly cost of one
> > pointer/dict object.
> 
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I should really stop writing incomprehensible bits like that. Heck,
I can't even understand it on second reading.

I meant that the dictionary would keep a slot for "the one and only
value". The first time someone puts a value in the dict, it puts it
in the "one and only value" slot, and doesn't initialize the value
array. The second time someone puts a value, it checks for pointer
equality with that "one and only value". If it is the same, it
still doesn't initialize the value array. The only time when
the dictionary initializes the value array is when two pointer-different
values are put in.

This would let me code

a[key] = None

For my sets (but consistent in the same set!)

a[key] = 1

When the timbot codes (again, consistent in the same set)

and

a[key] = 'present'

If you're really weird.

(identifier-like strings get interned)

That's not *semantics*, that's *optimization* for a commonly
used (I think) idiom with dictionaries -- you can't predict
the value, but it will probably remain the same.
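
A toy Python model of the idea, for anyone who wants to poke at it.  This
is not PyDict and says nothing about the real C layout; the class name
SharedValueDict, its attributes, and the linear key list are all stand-ins
invented for this sketch:

```python
_UNSET = object()  # sentinel: no value has been stored yet


class SharedValueDict:
    """Mapping that allocates a values array only when values differ."""

    def __init__(self):
        self._keys = []         # linear list standing in for the hash table
        self._shared = _UNSET   # the "one and only value" slot
        self._values = None     # per-key values, allocated only when needed

    def __setitem__(self, key, value):
        if key in self._keys:
            index = self._keys.index(key)
        else:
            index = len(self._keys)
            self._keys.append(key)
            if self._values is not None:
                self._values.append(value)
        if self._values is not None:
            self._values[index] = value
        elif self._shared is _UNSET:
            self._shared = value
        elif value is not self._shared:
            # Second distinct object seen: materialize the values array.
            self._values = [self._shared] * len(self._keys)
            self._values[index] = value

    def __getitem__(self, key):
        if key not in self._keys:
            raise KeyError(key)
        if self._values is None:
            return self._shared
        return self._values[self._keys.index(key)]

    def uses_values_array(self):
        return self._values is not None
```

As long as every store uses the identical object (None, 1, or an interned
string), uses_values_array() stays false and lookups return the shared
value.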

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From skip at mojam.com  Wed Jan 24 17:44:17 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:44:17 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
	<14958.60314.482226.825611@beluga.mojam.com>
	<04de01c08617$f56216f0$e46940d5@hagrid>
Message-ID: <14959.1633.163407.779930@beluga.mojam.com>

    Fredrik> I think the correct answer is "sometimes":

    Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME

    Fredrik>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
    Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    Fredrik>     LC_TIME

    Fredrik> in other words, if it's supported, it should be exposed by
    Fredrik> the Python bindings.

Then this suggests that either Tim's hack is the correct fix (leave it out
because we can't rely on it always being there) or I should add it to
__all__ at the bottom of the file if and only if it's present in the
module's namespace.

Skip






From moshez at zadka.site.co.il  Thu Jan 25 01:57:22 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:57:22 +0200 (IST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <04de01c08617$f56216f0$e46940d5@hagrid>, <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 16:11:33 +0100, "Fredrik Lundh" <fredrik at effbot.org> wrote:

> I think the correct answer is "sometimes":
> 
>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     LC_TIME
> 
> in other words, if it's supported, it should be exposed by
> the Python bindings.

In that case, the __all__ attribute in the module has to be calculated
dynamically. Say, adding code like

try:
    LC_MESSAGES
except NameError:
    pass
else:
    __all__.append('LC_MESSAGES')

Ditto for anything else.

Should I check in a patch?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From trentm at ActiveState.com  Wed Jan 24 17:49:17 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 24 Jan 2001 08:49:17 -0800
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124084917.C29977@ActiveState.com>

How will the expected adherence of apps to BROWSER jibe with the current (and
poorly understood by me) Windows convention of specifying the "default"
browser somewhere in the registry? 

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From skip at mojam.com  Wed Jan 24 17:49:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:49:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>
References: <04de01c08617$f56216f0$e46940d5@hagrid>
	<LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
	<14958.60314.482226.825611@beluga.mojam.com>
	<20010125005722.D2229A840@darjeeling.zadka.site.co.il>
Message-ID: <14959.1939.398029.896891@beluga.mojam.com>

    Moshe> In that case, the __all__ attribute in the module has to be
    Moshe> calculated dynamically. Say, adding code like

No need.  I've already got this exact change in my local copy and I'll be
adding a few more __all__ lists later today.

Skip



From paulp at ActiveState.com  Wed Jan 24 17:56:26 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 24 Jan 2001 08:56:26 -0800
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
	            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
Message-ID: <3A6F093A.A311C71E@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I wasn't clear about that either. The idea is:

def add(self, new_value):
    if not self.values_array:
        if self.magic_value is NULL:
            self.magic_value = new_value
        elif new_value is not self.magic_value:
            self.values_array=[self.magic_value, new_value, ... ]
        else:
            # new_value is self.magic_value: do nothing
            pass

I am neutral on this proposal myself. I think that even if we optimize
any code where you pass the same thing over and over again, we should
document a convention for consistency. So I'm not sure there is much
advantage.
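
The pseudocode above can be turned into a small runnable model. This is only
a toy written in pure Python, assuming a wrapper class rather than any change
to the real dict implementation; the class name SharedValueDict and the _UNSET
sentinel (standing in for NULL) are invented for the example.

```python
_UNSET = object()  # plays the role of NULL in the pseudocode above


class SharedValueDict:
    """Toy mapping that keeps one shared value until a different one shows up."""

    def __init__(self):
        self._keys = {}        # stand-in for the key half of the hash table
        self._shared = _UNSET  # the "one and only value" slot
        self._values = None    # per-key values, created only when needed

    def __setitem__(self, key, value):
        if self._values is not None:
            self._values[key] = value
        elif self._shared is _UNSET or value is self._shared:
            self._shared = value
            self._keys[key] = None
        else:
            # Second pointer-different value: fall back to a full value array.
            self._values = dict.fromkeys(self._keys, self._shared)
            self._values[key] = value

    def __getitem__(self, key):
        if self._values is not None:
            return self._values[key]
        if key in self._keys:
            return self._shared
        raise KeyError(key)

    def uses_shared_slot(self):
        return self._values is None


d = SharedValueDict()
d['a'] = None
d['b'] = None
print(d.uses_shared_slot())   # still sharing one value
d['c'] = 1
print(d.uses_shared_slot())   # a different object forced a value array
```

The comparison uses "is", not "==", to match the pointer-equality check in
the proposal.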

 Paul Prescod



From esr at thyrsus.com  Wed Jan 24 17:53:31 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 11:53:31 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>; from trentm@ActiveState.com on Wed, Jan 24, 2001 at 08:49:17AM -0800
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com>
Message-ID: <20010124115331.A15059@thyrsus.com>

Trent Mick <trentm at ActiveState.com>:
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

BROWSER overrides the registry setting.  Which is OK; under Windows, only
wizards are going to muck with it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke



From guido at digicool.com  Wed Jan 24 17:59:00 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 11:59:00 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: Your message of "Wed, 24 Jan 2001 10:44:17 CST."
             <14959.1633.163407.779930@beluga.mojam.com> 
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com> <04de01c08617$f56216f0$e46940d5@hagrid>  
            <14959.1633.163407.779930@beluga.mojam.com> 
Message-ID: <200101241659.LAA27650@cj20424-a.reston1.va.home.com>

>     Fredrik> I think the correct answer is "sometimes":
> 
>     Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Fredrik>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     Fredrik>     LC_TIME
> 
>     Fredrik> in other words, if it's supported, it should be exposed by
>     Fredrik> the Python bindings.
> 
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

The latter.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Jan 25 18:05:44 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 19:05:44 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm at ActiveState.com> wrote:
 
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

The "webbrowser" module should prefer to take the setting from the
registry on windows.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Wed Jan 24 18:17:09 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 12:17:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Thu, 25 Jan 2001 02:50:13 +0200."
             <20010125005013.58C12A840@darjeeling.zadka.site.co.il> 
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>  
            <20010125005013.58C12A840@darjeeling.zadka.site.co.il> 
Message-ID: <200101241717.MAA27852@cj20424-a.reston1.va.home.com>

> I meant that the dictionary would keep a slot for "the one and only
> value". First time someone puts a value in the dict, it puts it
> in the "one and only value" slot, and doesn't initialize the value
> array. The second time someone puts a value, it checks for pointer
> equality with that "one and only value". If it is the same, it
> still doesn't initialize the value array. The only time when
> the dictionary initializes the value array is when two pointer-different
> values are put in.
> 
> This would let me code
> 
> a[key] = None
> 
> For my sets (but consistent in the same set!)
> 
> a[key] = 1
> 
> When the timbot codes (again, consistent in the same set)
> 
> and
> 
> a[key] = 'present'
> 
> If you're really weird.
> 
> (identifier-like strings get interned)
> 
> That's not *semantics*, that's *optimization* for a commonly
> used (I think) idiom with dictionaries -- you can't predict
> the value, but it will probably remain the same.

This I like!

But note that a dict currently uses 12 bytes per slot in the hash
table (on a 32-bit platform: long me_hash; PyObject *me_key,
*me_value).  The hash table's fill factor is typically between 50 and
67%.

I think removing the hashes would slow down lookups too much, so
optimizing identical values out would only save 6-8 bytes per existing
key on average.  Not clear if it's worth enough.  I think I have to
agree with Tim's expectation that two (or three) separate parallel
arrays will reduce the cache locality and thus slow things down.  Once
you start probing, you jump through the hashtable at large random
strides, causing bad cache performance (for largeish hash tables); but
since often enough the first slot tried is right, you have the hash,
key and value right next to each other, typically on the same cache line.
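
The 6-8 byte figure can be reproduced with a little arithmetic. The sketch
below assumes the saving is exactly one 4-byte me_value pointer per slot on a
32-bit platform; that pointer size and the function name bytes_per_key are
assumptions made for the example, while the 12-byte slot and the 50-67% fill
factor come from the paragraph above.

```python
SLOT_BYTES = 12        # long me_hash + me_key + me_value on a 32-bit platform
VALUE_PTR_BYTES = 4    # assumed size of the me_value pointer alone


def bytes_per_key(bytes_per_slot, fill_factor):
    """Bytes attributable to each stored key, given the table's fill factor."""
    return bytes_per_slot / fill_factor


for fill in (0.5, 2.0 / 3.0):
    total = bytes_per_key(SLOT_BYTES, fill)
    saved = bytes_per_key(VALUE_PTR_BYTES, fill)
    print("fill %.2f: about %d bytes per key, about %d saved"
          % (fill, round(total), round(saved)))
```

At a fill factor of one half each key accounts for two slots, so dropping
the value pointer saves 8 bytes; at two thirds it saves about 6.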

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Wed Jan 24 18:31:55 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:31:55 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124123155.A15203@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> > How will the expected adherence of apps to BROWSER jive with the
> > current (and poorly understood by me) Windows convention of
> > specifying the "default" browser somewhere in the registry?
> 
> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Um, that's not the way it works right now. The windows-default browser choice 
launches the registered default browser, but BROWSER may have something else
in its search list first.
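
As a rough illustration of that search order (not the actual webbrowser.py
code), the sketch below tries each colon-separated entry in a BROWSER-style
string first and only then falls back to a platform default such as
windows-default. The function name first_usable_browser and the is_usable
callback are hypothetical.

```python
def first_usable_browser(browser_env, is_usable,
                         fallback="windows-default", sep=":"):
    """Return the first usable name from BROWSER, else the platform fallback."""
    candidates = [name for name in browser_env.split(sep) if name]
    candidates.append(fallback)   # the registered default comes last
    for name in candidates:
        if is_usable(name):
            return name
    return None


# Pretend only these two launchers exist on this machine.
installed = {"lynx", "windows-default"}
print(first_usable_browser("netscape:lynx", installed.__contains__))
print(first_usable_browser("", installed.__contains__))
```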
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The real point of audits is to instill fear, not to extract revenue;
the IRS aims at winning through intimidation and (thereby) getting
maximum voluntary compliance
	-- Paul Strassel, former IRS Headquarters Agent Wall St. Journal 1980



From esr at thyrsus.com  Wed Jan 24 18:52:11 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:52:11 -0500
Subject: [Python-Dev] BROWSER status
Message-ID: <20010124125211.A15276@thyrsus.com>

I spent the morning writing and testing patches to make urlview and GNU Emacs
BROWSER-aware, and have sent them off to the relevant maintainers.  I've
also sent a patch to Andries Brouwer for the environ(5) man page.  

Those of you interested in my latest bit of social engineering can
take a look at

	http://www.tuxedo.org/~esr/BROWSER/

A bow in Guido's direction -- if he hadn't been grouchy about this I
probably wouldn't have gotten to shipping those patches for a while.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A right is not what someone gives you; it's what no one can take from you. 
	-- Ramsey Clark



From thomas at xs4all.net  Wed Jan 24 19:33:27 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 19:33:27 +0100
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124193326.B962@xs4all.nl>

On Thu, Jan 25, 2001 at 07:05:44PM +0200, Moshe Zadka wrote:
> On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm at ActiveState.com> wrote:
>  
> > How will the expected adherence of apps to BROWSER jive with the current (and
> > poorly understood by me) Windows convention of specifying the "default"
> > browser somewhere in the registry? 

> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Why ? That's a lot harder to change, and not settable per
'shell'/'thread'/'process'.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Wed Jan 24 20:54:47 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:54:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124115331.A15059@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPDIKAA.tim.one@home.com>

Guys, while I like BROWSER, don't think it has anything to do with Windows!
Windows is not Unix; doesn't have PAGER or EDITOR either; and, in general,
use of envars is an abomination under Windows.  The old webbrowser.py uses
the Windows-specific os.startfile(url) because that's the *right* way to do
it on Windows, wizard or not.  And you would have to be a Windows wizard to
succeed in launching a browser under Windows in any other way anyway.  You
may as well try to sell the notion that, on Unix, Python should maintain a
dict mapping file extensions to the user's preferred ways of opening such
files <0.9 wink>.
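
A minimal sketch of that platform split, assuming a wrapper function (the
name open_url and its fallback parameter are invented here): use
os.startfile where the os module provides it, which is only on Windows, and
hand the URL to some other launcher elsewhere.

```python
import os


def open_url(url, os_module=os, fallback=None):
    """Open url via os.startfile when available, else via the fallback."""
    startfile = getattr(os_module, "startfile", None)
    if startfile is not None:
        startfile(url)        # Windows: let the registry pick the application
        return "startfile"
    if fallback is not None:
        fallback(url)         # elsewhere: e.g. a BROWSER-driven launcher
        return "fallback"
    return "unhandled"
```

The os_module parameter exists only so the choice can be exercised with a
stand-in object on a machine that is not running Windows.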




From tim.one at home.com  Wed Jan 24 20:56:32 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:56:32 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124193326.B962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPEIKAA.tim.one@home.com>

>> The "webbrowser" module should prefer to take the setting from the
>> registry on windows.

> Why ? That's a lot harder to change, and not settable per
> 'shell'/'thread'/'process'.

A Windows user has a legitimate expectation that *every* time an .html file
is opened, it will come up in their browser of choice.  That choice is made
via the registry, and this is how *all* apps work under Windows.  Ditto for
.htm files (and that may be a different browser than is used for .html
files, but again the user has set up their registry to do what *they* want
done with it).  It's not supposed to be easy to change; it is supposed to be
consistent.  Using a different browser per shell/thread/process is a foreign
concept; it's also a useless concept on Windows <0.5 wink>.




From tim.one at home.com  Wed Jan 24 21:32:35 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:32:35 -0500
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPGIKAA.tim.one@home.com>

[Peter Funk]
> ...
> AFAI found out, LC_MESSAGES was added to the POSIX "standard" in Posix.2.

FYI, it appears that C99 declined to adopt this extension to C89, but don't
know why (the C99 Rationale doesn't mention it).  That means the vendors who
don't already support it can (well, *will*) use the new C99 std as "a
reason" to continue leaving it out.




From tim.one at home.com  Wed Jan 24 21:15:28 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:15:28 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <14959.1633.163407.779930@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPFIKAA.tim.one@home.com>

[Skip]
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

What you suggest at the end *is* the hack I checked in.  That is, it's
already done.  The existence of LC_MESSAGES is clearly platform-specific; if
anyone can say for sure a priori *which* platforms it's available on, tell
Fred Drake so he can update the docs accordingly.




From skip at mojam.com  Wed Jan 24 22:25:45 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 15:25:45 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124123155.A15203@thyrsus.com>
References: <20010124084917.C29977@ActiveState.com>
	<200101240346.WAA06790@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
	<20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
	<20010124123155.A15203@thyrsus.com>
Message-ID: <14959.18521.648454.488731@beluga.mojam.com>

>>>>> "Eric" == Eric S Raymond <esr at thyrsus.com> writes:

    Moshe Zadka <moshez at zadka.site.co.il>:

    >> The "webbrowser" module should prefer to take the setting from the
    >> registry on windows.

    Eric> Um, that's not the way it works right now. The windows-default
    Eric> browser choice launches the registered default browser, but
    Eric> BROWSER may have something else in its search list first.

Why not have a special REGISTRY token you can place in the BROWSER path to
tell it when to consult the registry?  On non-Windows platforms it can
simply be ignored:

    BROWSER=netscape:REGISTRY:explorer

Skip




From esr at thyrsus.com  Wed Jan 24 22:30:44 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 16:30:44 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <14959.18521.648454.488731@beluga.mojam.com>; from skip@mojam.com on Wed, Jan 24, 2001 at 03:25:45PM -0600
References: <20010124084917.C29977@ActiveState.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il> <20010124123155.A15203@thyrsus.com> <14959.18521.648454.488731@beluga.mojam.com>
Message-ID: <20010124163044.A15877@thyrsus.com>

Skip Montanaro <skip at mojam.com>:
> Why not have a special REGISTRY token you can place in the BROWSER path to
> tell it when to consult the registry?  On non-Windows platforms it can
> simply be ignored:

In effect, windows-default is that special token.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln



From martin at mira.cs.tu-berlin.de  Wed Jan 24 22:41:11 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 24 Jan 2001 22:41:11 +0100
Subject: [Python-Dev] Tkinter documentation (Was:  What does "batteries are included" mean?)
Message-ID: <200101242141.f0OLfBT01812@mira.informatik.hu-berlin.de>

> It's already a blot on Python that the standard documentation set
> doesn't cover Tkinter.

Just point your friendly web browser to Ping's HTML generator and ask
for Tkinter, or invoke "pydoc.py Tkinter".

[I wouldn't have brought this up if it hadn't been the contribution of
my friend Nils Fischbeck:-]

Regards,
Martin



From nas at arctrix.com  Wed Jan 24 16:31:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 24 Jan 2001 07:31:55 -0800
Subject: [Python-Dev] Makefile changes
Message-ID: <20010124073155.B32266@glacier.fnational.com>

I've checked in my new makefile.  Hopefully everything goes well.
The following files are no longer used so please don't patch
them:

    Grammar/Makefile.in
    Include/Makefile
    Lib/Makefile
    Modules/Makefile.pre.in
    Objects/Makefile.in
    Parser/Makefile.in
    Python/Makefile.in
    Makefile.in

They will be removed in a few days assuming all goes well.  You
should re-run configure to use the new makefile.

I would appreciate it if people using platforms other than Linux
and GNU make could give me some feedback on the build process.
Do configure and make work okay?  Do "make test" and "make
install" work?  Thanks.

  Neil



From greg at cosc.canterbury.ac.nz  Wed Jan 24 23:55:00 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 25 Jan 2001 11:55:00 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <200101242255.LAA02208@s454.cosc.canterbury.ac.nz>

Guido:

> But shouldn't the default value be something else,
> like none?

It should really be whatever is the first value that gets
stored after the dict is created. That way people can
use whatever they want for their dummy value and it will
Just Work. And it will probably catch most existing uses
of a dict as a set as well.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From ping at lfw.org  Wed Jan 24 21:33:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 24 Jan 2001 12:33:43 -0800 (PST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
Message-ID: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>

Hi -- after updating my CVS tree today with Python 2.1a1, i ran
the tests and test_inspect failed.  This revealed that the format
of code.co_varnames has changed.  At first i tried to update the
inspect.py module to check the Python version number and track the
change, but now i believe this is actually symptomatic of a real
interpreter problem.

Consider the function:

    def f(a, (b, c), *d):
        x = 1
        print a, b, c, d, x

Whereas in Python 1.5.2:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('x', 'a', 'b', 'c', 'd')
    f.func_code.co_varnames = ('a', '.2', 'd', 'b', 'c', 'x')

In Python 2.1a1:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('b', 'c', 'x', 'a', 'd')
    f.func_code.co_varnames = ('a', '.2', 'b', 'c', 'd', 'x')

Notice how the ordering of the variable names has changed.
I went and looked at the CO_VARARGS clause in eval_code2 to
see if it put the varargs and kwdict arguments in different
slots, but it appears unchanged!  It still puts varargs at
locals[co_argcount] and kwdict at locals[co_argcount + 1].
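
The slot layout described above can be checked from Python itself. The
sketch below is written for a current Python 3, where tuple parameters such
as (b, c) no longer exist and func_code is spelled __code__, so it only
covers the plain *varargs case; the helper name varargs_name is made up.

```python
import inspect


def varargs_name(func):
    """Return the name stored in the *varargs slot of func, or None."""
    code = func.__code__
    if not code.co_flags & inspect.CO_VARARGS:
        return None
    # Positional and keyword-only arguments come first in co_varnames;
    # the varargs tuple occupies the next slot.
    index = code.co_argcount + code.co_kwonlyargcount
    return code.co_varnames[index]


def f(a, b, *d):
    x = 1
    return a, b, d, x


print(f.__code__.co_varnames, varargs_name(f))
```

If the compiler and the interpreter ever disagreed about that index, a
check like this would report the wrong name, which is the kind of mismatch
the report above is describing.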

Please try:

    >>> def f(a, (b, c), *d):
    ...     x = 1
    ...     print a, b, c, d, x
    ...
    >>> f(1, (2, 3), 4)
    1 2 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in f
    UnboundLocalError: local variable 'd' referenced before assignment
    >>> 

In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

I only have 1.5.2 and 2.1a1 to test.  I hope this problem
isn't present in 2.0...


Note that test_inspect was the only test to fail!  It might be the
only test that checks anonymous and *varargs at the same time.
(Yet another reason to put inspect in the core...)

I did recently check in additions to test_extcall that made the
test much beefier -- but that only tested combinations of regular,
keyword, varargs, and kwdict arguments; it neglected to test
anonymous (tuple) arguments as well.


-- ?!ng




From tim.one at home.com  Thu Jan 25 00:56:25 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 18:56:25 -0500
Subject: [Python-Dev] Re: test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPMIKAA.tim.one@home.com>

> In that case, the __all__ attribute in the module has to be calculated
> dynamically. Say, adding code like
>
> try:
>    LC_MESSAGES
> except NameError:
>    pass
> else:
>    __all__.append('LC_MESSAGES')
>
> Ditto for anything else.
>
> Should I check in a patch?

SourceForge CVS doesn't appear to be broken, so I can only conclude everyone
decided this was a bad day to stop taking drugs <0.9 wink>.




From tim.one at home.com  Thu Jan 25 01:04:50 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 19:04:50 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>

[Skip]
> Why not have a special REGISTRY token you can place in the BROWSER
> path to tell it when to consult the registry?  On non-Windows
> platforms it can simply be ignored:
>
>    BROWSER=netscape:REGISTRY:explorer

Because non-Windows platforms shouldn't be bothered with Windows silliness
any more than Windows users should be bothered with Unix silliness.  BROWSER
isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
may still *think* BROWSER is of use on Windows, but if so that's not really
a technical problem <wink>.




From thomas at xs4all.net  Thu Jan 25 01:25:54 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 01:25:54 +0100
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>; from nas@arctrix.com on Wed, Jan 24, 2001 at 07:31:55AM -0800
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <20010125012554.F962@xs4all.nl>

On Wed, Jan 24, 2001 at 07:31:55AM -0800, Neil Schemenauer wrote:

> I would appreciate it if people using platforms other than Linux
> and GNU make could give me some feedback on the build process.
> Do configure and make work okay?  Do "make test" and "make
> install" work?  Thanks.

Only have time for a quick check now, and no time whatsoever tomorrow, but
at first glance, it looks okay (read: it compiles Python) on BSDI 4.0.1,
BSDI 4.1 and FreeBSD 4.2.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Thu Jan 25 01:15:10 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 19:15:10 -0500
Subject: [Python-Dev] (no subject)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 07:04:50PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>
Message-ID: <20010124191510.A17782@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Because non-Windows platforms shouldn't be bothered with Windows silliness
> any more than Windows users should be bothered with Unix silliness.  BROWSER
> isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
> may still *think* BROWSER is of use on Windows, but if so that's not really
> a technical problem <wink>.

Actually that's not something I have an opinion on.  I addressed the
original question because I know it would be technically possible to set
a BROWSER variable under Windows.  Yes, an unlikely move, but possible.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862



From tim.one at home.com  Thu Jan 25 05:38:54 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 23:38:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time;  comments?
In-Reply-To: <3A6EE944.C8CC6EF7@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEADILAA.tim.one@home.com>

[Christian Tismer]
> ...
> Not sure if hashes and keys should be apart, but
> sure for values.

How so?  That is, under what assumptions?  Any savings from separation would
appear to require that I look up keys a lot more than I access the
associated values; while trivially true for dicts used as sets, it seems
dubious to me for use of dicts as mappings (count[word] += 1, etc).




From Jason.Tishler at dothill.com  Thu Jan 25 07:09:47 2001
From: Jason.Tishler at dothill.com (Jason Tishler)
Date: Thu, 25 Jan 2001 01:09:47 -0500
Subject: [Python-Dev] Re: Python 2.1 alpha 1 released!
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:33:02PM -0500
References: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <20010125010947.M1256@dothill.com>

On Mon, Jan 22, 2001 at 10:33:02PM -0500, Guido van Rossum wrote:
> - Python should now build out of the box on Cygwin.  If it doesn't,
>   mail to Jason Tishler (jlt63 at users.sourceforge.net).

Although Python CVS built OOTB under Cygwin until 2001/01/17 18:54:54,
Python 2.1a1 needs a small patch in order to build cleanly under Cygwin.
If interested, please see the following for details:

    http://www.cygwin.com/ml/cygwin-apps/2001-01/msg00019.html

Thanks,
Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler at dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com



From tim.one at home.com  Thu Jan 25 08:29:19 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:29:19 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBFILAA.tim.one@home.com>

[Guido]
> ...
> It's no big deal if the Vaults contain three or more set modules --
> perfect even, people can choose the best one for their purpose.

They really can't, not realistically, unless all the modules in question
conform to the same interface (which users can't control), and users
restrict themselves to methods defined only in the interface (which users
can control).  The problem is that "their purpose" changes over time, and in
some cases the effects of representation on performance simply can't be
out-guessed in advance of actual measurement.  If people need to change any
more than just the import statement, *then* a single implementation has to
be all things to all people.

I hate to say this (bet <wink>?), but I suspect the fact that Python's basic
types are all builtin and not classes has kept us from fully appreciating
the class-based "1 interface, N implementations" approach that C++ and Java
hackers are having so much fun with.  They're not all that easy to find, but
people who have climbed the steep STL learning curve often end up in the
same ecstatic trance I used to see only among fellow Pythoneers.

> But in the core, there's only room for one set type or module.

I don't like the conclusion:  it implies there's no room in the core for
more than one implementation of anything, yet one-size-fits-all doesn't.  I
have no problem with the idea that there's only room for one Set *interface*
in the core.  Then you only need Pronounce on a reasonable set of abstract
operations, and leave the implementation tradeoffs to be made by different
people in different ways (I've really got no use for Eric's list-based sets;
he's really got no use for my sets-of-sets).
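
A sketch of what "one interface, N implementations" might look like in
Python, assuming a hypothetical abstract base class named BaseSet with a
list-backed and a dict-backed implementation. None of these names come from
any of the set modules under discussion.

```python
from abc import ABC, abstractmethod


class BaseSet(ABC):
    """The shared interface: callers rely only on these operations."""

    @abstractmethod
    def add(self, item):
        pass

    @abstractmethod
    def __contains__(self, item):
        pass

    @abstractmethod
    def __iter__(self):
        pass

    def union(self, other):
        # Written once against the interface, so every implementation gets it.
        result = type(self)()
        for item in self:
            result.add(item)
        for item in other:
            result.add(item)
        return result


class ListSet(BaseSet):
    """Small and simple; membership tests are linear."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)


class DictSet(BaseSet):
    """Hash based; needs hashable elements but scales to large sets."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item] = None

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)
```

Code that sticks to add, membership tests, iteration and union can switch
between the two by changing only which class it instantiates.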

That said, if there can be at most one, and must be at least one, a
hashtable based set is the best compromise there is, and mutable objects as
elements should not be supported (they add great implementation complexity
for the benefit of relatively few applications).

jeremy's-set-class-couldn't-be-accused-of-overkill<wink>-ly y'rs  - tim




From tim.one at home.com  Thu Jan 25 08:57:18 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:57:18 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBLILAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> What you get by going with a dictionary representation is that
> membership test becomes close to constant-time, while insertion and
> deletion become sometimes cheap and sometimes quite expensive
> (depending of course on whether you have to allocate a new
> hash bucket).

Note that Python's dicts aren't vulnerable to that:  they use open
addressing in a contiguous, preallocated vector.  There are no mallocs() or
free()s going on for lookups, deletes, or inserts, unless an insert happens
to hit a "time to double the size of the vector" boundary.  Deletes never
cost more than a lookup; inserts never more unless the table-size boundary
is hit (one in 2**N unique inserts, at which point N goes up too).
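
To make the cost model concrete, here is a toy open-addressing table. It is
a simplification under stated assumptions: it uses linear probing rather
than the probe sequence CPython actually uses, it grows at a two-thirds
fill, and it leaves out deletion entirely. The class name OpenAddressingSet
is invented for the example.

```python
_EMPTY = object()  # marks a slot that has never been used


class OpenAddressingSet:
    """Toy hash set stored in one preallocated list of slots."""

    def __init__(self, size=8):
        self._slots = [_EMPTY] * size   # size must be a power of two
        self._used = 0
        self.resizes = 0

    def _probe(self, key, slots):
        # Walk from the hashed position until we find the key or a free slot.
        mask = len(slots) - 1
        i = hash(key) & mask
        while slots[i] is not _EMPTY and slots[i] != key:
            i = (i + 1) & mask
        return i

    def add(self, key):
        i = self._probe(key, self._slots)
        if self._slots[i] is _EMPTY:
            self._slots[i] = key        # no allocation on the common path
            self._used += 1
            if self._used * 3 >= len(self._slots) * 2:
                self._grow()

    def __contains__(self, key):
        return self._slots[self._probe(key, self._slots)] is not _EMPTY

    def _grow(self):
        # The only expensive step: double the vector and reinsert every key.
        old = self._slots
        self._slots = [_EMPTY] * (len(old) * 2)
        for key in old:
            if key is not _EMPTY:
                self._slots[self._probe(key, self._slots)] = key
        self.resizes += 1


s = OpenAddressingSet()
for n in range(100):
    s.add(n)
print(s.resizes)
```

Most inserts touch a single slot; only the occasional insert that crosses
the fill threshold pays for a doubling.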

> ...
> "works for everybody" isn't really possible here.  So my solution
> does the next best thing -- pick a choice of tradeoffs that isn't
> obviously worse than the alternatives and keeps things bog-simple.

I agree that this shouldn't be an either/or choice, but if it's going to be
forced into that mold I have to protest that the performance of unordered
lists would kill most of the set applications I've ever had.  I typically
have a small number of very large sets (and I'm talking not 100s, but often
100s of 1000s of elements).  The relatively large memory burden of a dict
representation wouldn't bother me unless I instead had 100s of 1000s of very
small sets.

which-may-happen-in-my-next-life-but-not-in-this-one-ly y'rs  - tim




From tim.one at home.com  Thu Jan 25 09:08:30 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 03:08:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <001101c0857a$c0dce420$770a0a0a@nevex.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBMILAA.tim.one@home.com>

[Greg Wilson]
> ...
> Unfortunately, if values are required to be immutable, then sets of
> sets aren't possible... :-(

Sure they are.  I wrote about how before, and Moshe put up a simple
implementation as a SourceForge patch.  Not bulletproof, though:  "consenting
adults".  No matter *what* you implement, I'll find *some* way to trick it
into believing my sets are immutable <wink>, so don't worry about that.
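
For illustration only: in a modern Python the same effect can be had with the
built-in frozenset type, which postdates this thread and is not the patch
referred to above.  The set_of_sets helper is a made-up name.  The sketch
takes an immutable snapshot of each inner collection so that it can be hashed
and stored as an element.

```python
def set_of_sets(groups):
    """Build a set whose elements are immutable snapshots of each group."""
    return {frozenset(group) for group in groups}


families = set_of_sets([[1, 2], [2, 1], [3]])

# A plain mutable set is rejected as an element because it is unhashable.
try:
    {set([1, 2])}
    mutable_allowed = True
except TypeError:
    mutable_allowed = False
```

Here [1, 2] and [2, 1] collapse into one element because they are equal as
sets.  Since each element is a snapshot, later changes to the original lists
do not affect the outer set.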

Bulletproof is very hard, and is a minority distraction at best.  IIRC, SETL
had "by value" semantics when inserting a set into another set as an
element, and had some exceedingly hairy copy-on-write scheme under the
covers to make that bearably quick.  That may be wrong, though.  Herman
Venter's Slim (Sets, Lists and Maps) language does work that way (Guido,
Herman was a friend of the departed Stoffel Erasmus, who you may recall
fondly from Python's very early days -- if *that* doesn't make sets
attractive to you, nothing will <wink>).

Ah!  Meant to post this before:

    http://birch.eecs.lehigh.edu/~bacon/setlprog.ps.gz

That's a readable and very good intro to SETL Classic.  People pondering
computerized sets should at least catch up with what was common knowledge 30
years ago <wink>.




From thomas at xs4all.net  Thu Jan 25 10:24:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 10:24:24 +0100
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>; from ping@lfw.org on Wed, Jan 24, 2001 at 12:33:43PM -0800
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
Message-ID: <20010125102424.G962@xs4all.nl>

On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:

> Please try:

>     >>> def f(a, (b, c), *d):
>     ...     x = 1
>     ...     print a, b, c, d, x
>     ...
>     >>> f(1, (2, 3), 4)
>     1 2 3
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "<stdin>", line 3, in f
>     UnboundLocalError: local variable 'd' referenced before assignment
>     >>> 

> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

> I only have 1.5.2 and 2.1a1 to test.  I hope this problem
> isn't present in 2.0...

It isn't present in 2.0. This is probably related to Jeremy's changes
in the call mechanism or the compiler track, though Jeremy himself is the
best person to claim that for sure :)

> Note that test_inspect was the only test to fail!  It might be the
> only test that checks anonymous and *varargs at the same time.
> (Yet another reason to put inspect in the core...)

Well, this is not an inspect-specific test, so it shouldn't *be* in
test_inspect, it should be in test_extcall :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Thu Jan 25 10:45:31 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 10:45:31 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
References: <E14Limt-0002Rf-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <003801c086b3$8ff41560$e46940d5@hagrid>

tim accidentally wrote:

>     \versionadded{1.5.3} % XXX fix this version number when release is scheduled!

1.5.3?  time for a 1.5.3 => 1.6 query replace?

> fgrep 1.5.3 doc/*/*.tex
doc/lib/libcmp.tex:\deprecated{1.5.3}{Use the \module{filecmp} module instead.}
doc/lib/libcmpcache.tex:\deprecated{1.5.3}{Use the \module{filecmp} module
doc/lib/libwinsound.tex:  \versionadded{1.5.3} % XXX fix this version number

or am I missing something?

Cheers /F




From tim.one at home.com  Thu Jan 25 12:20:18 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
In-Reply-To: <003801c086b3$8ff41560$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECFILAA.tim.one@home.com>

Gotta ask Fred about this one!

> or am I missing something?

Yes, the Python 1.5.3 release.  I use it all the time <wink>.




From tismer at tismer.com  Thu Jan 25 13:22:32 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 25 Jan 2001 14:22:32 +0200
Subject: [Python-Dev] Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
Message-ID: <3A701A88.F2C68635@tismer.com>

In a function like this:

def f(x):
  return eval("x")

, eval uses the local function namespace, and the above works.
This is according to chapter 2.3 of the Python library ref.

Now on to my problem: When eval() is used with map, the same
mechanism takes place:

def f(x):
  return map(eval,["x"])

It works the same as the above, because map is a builtin function
that does not modify the frame chain, so eval finds the local
namespace.
Not so with Stackless Python (at the moment), since Stackless map
assigns its own frame to map without passing the correct namespaces
to it. (Reported by Bernd Rinn)

Question: Is this by chance, or is eval() *meant* to function with
the local namespace, even if it is executed in the context of
a function like map() ?

The description of map() does not state whether it has to pass
its surrounding namespace to the mapped function, and if one
simulates map() by writing one's own python implementation,
it will fail exactly like Stackless does today. The same
applies to apply().

I think I should fix Stackless here, anyway?

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Thu Jan 25 14:35:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 08:35:12 -0500
Subject: [Python-Dev] Re: Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
In-Reply-To: Your message of "Thu, 25 Jan 2001 14:22:32 +0200."
             <3A701A88.F2C68635@tismer.com> 
References: <3A701A88.F2C68635@tismer.com> 
Message-ID: <200101251335.IAA16713@cj20424-a.reston1.va.home.com>

> In a function like this:
> 
> def f(x):
>   return eval("x")
> 
> , eval uses the local function namespace, and the above works.
> This is according to chapter 2.3 of the Python library ref.
> 
> Now on to my problem: When eval() is used with map, the same
> mechanism takes place:
> 
> def f(x):
>   return map(eval,["x"])
> 
> It works the same as the above, because map is a builtin function
> that does not modify the frame chain, so eval finds the local
> namespace.
> Not so with Stackless Python (at the moment), since Stackless map
> assigns its own frame to map without passing the correct namespaces
> to it. (Reported by Bernd Rinn)
> 
> Question: Is this by chance, or is eval() *meant* to function with
> the local namespace, even if it is executed in the context of
> a function like map() ?

Map, being a built-in, is transparent to namespaces.

> The description of map() does not state whether it has to pass
> its surrounding namespace to the mapped function, and if one
> simulates map() by writing one's own python implementation,
> it will fail exactly like Stackless does today. The same
> applies to apply().

So you can't simulate a built-in.
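
A small sketch of the difference, written for a modern CPython rather than
the interpreters discussed in this thread.  The function names are made up
for the illustration.  The built-in map adds no Python-level frame, so eval()
still sees the caller's locals; a map written in Python runs in its own
frame, so eval() looks there instead.

```python
def via_builtin_map(secret):
    # No Python frame sits between this function and eval(), so the
    # name "secret" is resolved in this function's local namespace.
    return list(map(eval, ["secret"]))


def python_map(func, seq):
    # A pure-Python stand-in for map(); calls to func happen in this frame.
    result = []
    for item in seq:
        result.append(func(item))
    return result


def via_python_map(secret):
    # eval() now runs with python_map's locals (func, seq, item, result),
    # where "secret" does not exist.
    try:
        return python_map(eval, ["secret"])
    except NameError as exc:
        return "NameError: %s" % exc
```

The second variant only fails when no global of the same name happens to
exist.  That is one more reason not to rely on eval() picking up an implicit
namespace.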

> I think I should fix Stackless here, anyway?

Yes.

Note: beware of Jeremy's nested scopes.  That adds a whole slew of
namespaces!  (But eval() is more crippled there.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Jan 25 16:20:45 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 10:20:45 -0500 (EST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <20010125102424.G962@xs4all.nl>
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
	<20010125102424.G962@xs4all.nl>
Message-ID: <14960.17485.549337.5476@localhost.localdomain>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

  TW> On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:
  >> Please try:

  >> >>> def f(a, (b, c), *d):
  >> ...     x = 1
  >> ...     print a, b, c, d, x
  >> ...
  >> >>> f(1, (2, 3), 4)
  >> 1 2 3
  >> Traceback (most recent call last):
  >>   File "<stdin>", line 1, in ?
  >>   File "<stdin>", line 3, in f
  >> UnboundLocalError: local variable 'd' referenced before assignment
  >> >>>

  >> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

  >> I only have 1.5.2 and 2.1a1 to test.  I hope this problem isn't
  >> present in 2.0...

  TW> It isn't present in 2.0. This is probably related to Jeremy's
  TW> changes in the call mechanism or the compiler track, though
  TW> Jeremy himself is the best person to claim that for sure :)

The bug is in the compiler.  It creates varnames while it is parsing
the argument list.  While I got the handling of the anonymous tuples
right, I forgot to insert *varargs or **kwargs in varnames *before*
the names defined in the tuple.
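
The ordering requirement can be inspected from Python itself.  The sketch
below uses a modern CPython, where tuple parameters no longer exist, so it
cannot reproduce the bug.  It only shows the layout the fix has to restore:
the *varargs and **kwargs slots come directly after the named arguments and
before any other local names.

```python
def f(a, b, *d, **kw):
    x = 1
    return a, b, d, kw, x


code = f.__code__
named_args = code.co_varnames[:code.co_argcount]   # the positional parameters
all_slots = code.co_varnames                       # parameters first, then locals
```

If a local name were placed ahead of d in that tuple, the call machinery
would store the extra positional arguments into the wrong slot.  The name d
would then be unbound when the body runs, which matches the
UnboundLocalError reported above.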

I will fix it real soon now.

  >> Note that test_inspect was the only test to fail!  It might be
  >> the only test that checks anonymous and *varargs at the same
  >> time.  (Yet another reason to put inspect in the core...)

  TW> Well, this is not an inspect-specific test, so it shouldn't *be*
  TW> in test_inspect, it should be in test_extcall :)

It should probably be in test_grammar.  The ext call mechanism is only
invoked when the caller uses a form like 'f(*arg)'.  Perhaps the name
"ext call" isn't very clear.

Jeremy



From esr at thyrsus.com  Thu Jan 25 17:19:36 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:19:36 -0500
Subject: [Python-Dev] Waiting method for file objects
Message-ID: <20010125111936.A23512@thyrsus.com>

I have been researching the question of how to ask a file descriptor how much
data it has waiting for the next sequential read, with a view to discovering
what cross-platform behavior we could count on for a hypothetical `waiting'
method in Python's built-in file class.

1:  Why bother?

I have these main applications in mind:

1. Detecting EOF on a static plain file.
2. Non-blocking poll of a socket opened in non-blocking mode.
3. Non-blocking poll of a FIFO opened in non-blocking mode.
4. Non-blocking poll of a terminal device opened in non-blocking mode.

These are all frequently requested capabilities on C newsgroups -- how
often have *you* seen the "how do I detect an individual keypress"
question from beginning programmers?  I believe having these
capabilities would substantially enhance Python's appeal.

2: What would be under the hood?

Summary: We can do this portably, and we can do it with only one (1)
new #ifdef.  Our tools for this purpose will be the fstat(2) st_size
field and the FIONREAD ioctl(2) call.  They are complementary.

In all supposedly POSIX-conformant environments I know of, the st_size
field has a documented meaning for plain files (S_IFREG) and may or
may not give a meaningful number for FIFOs, sockets, and tty devices.
The Single Unix Specification is silent on the meaning of st_size for
file types other than regular files (S_IFREG).  I have filed a defect
report about this with OpenGroup and am discussing appropriate language
with them.

(The last sentence of the Inferno operating system's language on
stat(2) is interesting: "If the file resides on permanent storage and
is not a directory, the length returned by stat is the number of bytes
in the file. For directories, the length returned is zero. Some
devices report a length that is the number of bytes that may be read
from the device without blocking.")

The FIONREAD ioctl(2) call, on the other hand, returns bytes waiting
on character devices such as FIFOs, sockets, or ttys -- but does not
return a useful value for files or directories or block devices. The
FIONREAD ioctl was supported in both SVr4 and 4.2BSD.  It's present in
all the open-source Unixes, SunOS, Solaris, and AIX.  Via Google
search I have discovered that it's also supported in the Windows
Sockets API and the GUSI POSIX libraries for the Macintosh.  Thus, it
can be considered portable for Python's purposes even though it's
rather sparsely documented.

I was able to obtain confirming information on Linux from Linus
Torvalds himself. My information on Windows and the Mac is from
Gavriel State, formerly a lead developer on Corel's WINE team and a
programmer with extensive cross-platform experience.  Gavriel reported
on the MSCRT POSIX environment, on the Metrowerks Standard Library
POSIX implementation for the Mac, and on the GUSI POSIX implementation
for the Mac.

2.1: Plain files

Torvalds and State confirm that for plain files (S_IFREG) the st_size
field is reliable on all three platforms.  On the Mac it gives the
file's data fork size.

One apparent difficulty with the plain-file case is that POSIX does
not guarantee anything about off_t quantities such as lseek(2)
returns and the st_size field except that they can be compared for
equality.  Thus, under the strict letter of POSIX law, `waiting' can
be used to detect EOF but not to get a reliable read-size return in
any other file position.

Fortunately, this is less an issue than it appears.  The weakness of
the POSIX language was a 1980s-era concession to a generation of
mainframe operating systems with record-oriented file structures --
all of which are now either thoroughly obsolete or (in the case of IBM
VM/CMS) have become Linux emulators :-).  On modern operating systems
under which files have character granularity, stat(2) emulations can
be and are written to give the right result.

2.2: Directories and block devices

The directory case (S_IFDIR) is a complete loss.  Under Unixes,
including Linux, the fstat(2) size field gives the allocated size of
the directory as if it were a plain file.  Under MSCRT POSIX the
meaning is undocumented and unclear.  Metrowerks returns garbage.
GUSI POSIX returns the number of files in the directory!  FIONREAD
cannot be used on directories.

Block devices (S_IFBLK) are a mess again.  Linus points out that a
system with removable or unmountable volumes *cannot* return a useful
st_size field -- what happens when the device is dismounted?

2.3: Character devices

Pipes and FIFOs (S_IFIFO) look better.  On MSCRT the fstat(2) size
field returns the number of bytes waiting to be read.  This is also
true under current Linuxes, though Torvalds says it is "an
implementation detail" and recommends polling with the FIONREAD ioctl
instead.  Fortunately, FIONREAD is available under Unix, Windows, and
the Mac.

Sockets (S_IFSOCK) look better too.  Under Linux, the fstat(2) size
field gives number of bytes waiting.  Torvalds again says this is "an
implementation detail" and recommends polling with the FIONREAD ioctl.
Neither MSCRT POSIX nor Metrowerks has direct support for sockets.
GUSI POSIX returns 1 (!) in the st_size field. But FIONREAD is
available under Unix, Windows, and the GUSI POSIX libraries on the
Mac.

Character devices (S_IFCHR) can be polled with FIONREAD.  This technique
has a long history of use with tty devices under Unix.  I don't know whether
it will work with the equivalents of terminal devices for Windows and the Mac.
Fortunately this is not a very important question, as those are GUI
environments in which terminal devices are rarely if ever used.

3: How does this turn into Python?

The upshot of our portability analysis is that by using FIONREAD and
fstat(2), we can get useful results for plain files, pipes, and
sockets on all three platforms.  Directories and block devices are a
complete loss.  Character devices (in particular, ttys) we can poll
reliably under Unix.  What we'll get polling the equivalents of tty or
character devices under Windows and the Mac is presently unknown, but
also unimportant.

My proposed semantics for a Python `waiting' method is that it reports
the amount of data that would be returned by a read() call at the time
of the waiting-method invocation.  The interpreter throws OSError if
such a report is impossible or forbidden.
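
The same semantics can be sketched at the Python level.  This is not the
attached C patch: the function name bytes_waiting is made up, it works on a
raw descriptor rather than a file object, and it assumes a Unix platform
where the fcntl and termios modules (and termios.FIONREAD) are available.

```python
import fcntl
import os
import stat
import struct
import termios


def bytes_waiting(fd):
    """Return how many bytes a read on descriptor fd could return right now.

    Unix-only sketch; raises OSError where no meaningful answer exists.
    """
    st = os.fstat(fd)
    mode = st.st_mode
    if stat.S_ISDIR(mode) or stat.S_ISBLK(mode):
        raise OSError("can't poll a block device or directory")
    if stat.S_ISREG(mode):
        # Plain file: size minus the current offset; zero means EOF.
        return st.st_size - os.lseek(fd, 0, os.SEEK_CUR)
    if stat.S_ISFIFO(mode) or stat.S_ISSOCK(mode) or stat.S_ISCHR(mode):
        # Stream: ask the kernel how much is queued, via the FIONREAD ioctl.
        raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
        return struct.unpack("i", raw)[0]
    raise OSError("unknown file type")
```

Working on the descriptor sidesteps stdio and Python-level buffering.  A
buffered file object may already have pulled data into its own buffer, in
which case the kernel's count and the object's idea of "unread" differ.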

I have enclosed a patch against the current CVS sources, including
documentation.  This patch is tested and working against plain files,
sockets, and FIFOs under Linux.  I have also attached the
Python test program I used under Linux.

I would appreciate it if those of you on Windows and Macintosh
machines would test the waiting method. The test program will take
some porting, because it needs to write to a FIFO in background.
Under Linux I do it this way:

	(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &

I don't know how to do the equivalent under Windows or Mac.

When you run this program, it will try to mail me your test results.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address
-------------- next part --------------
Index: fileobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v
retrieving revision 2.108
diff -c -r2.108 fileobject.c
*** fileobject.c	2001/01/18 03:03:16	2.108
--- fileobject.c	2001/01/25 16:16:10
***************
*** 35,40 ****
--- 35,44 ----
  #include <errno.h>
  #endif
  
+ #ifndef DONT_HAVE_IOCTL_H
+ #include <sys/ioctl.h>
+ #endif
+ 
  
  typedef struct {
  	PyObject_HEAD
***************
*** 423,428 ****
--- 427,513 ----
  }
  
  static PyObject *
+ file_waiting(PyFileObject *f, PyObject *args)
+ {
+ 	struct stat stbuf;
+ #ifdef HAVE_FSTAT
+ 	int ret;
+ #endif
+ 
+ 	if (f->f_fp == NULL)
+ 		return err_closed();
+ 	if (!PyArg_NoArgs(args))
+ 		return NULL;
+ #ifndef HAVE_FSTAT
+ 	PyErr_SetString(PyExc_OSError, "fstat(2) is not available.");
+ 	clearerr(f->f_fp);
+ 	return NULL;
+ #else
+ 	Py_BEGIN_ALLOW_THREADS
+ 	errno = 0;
+ 	ret = fstat(fileno(f->f_fp), &stbuf);
+ 	Py_END_ALLOW_THREADS
+ 	    if (ret == -1) {			/* the fstat failed */
+ 		PyErr_SetFromErrno(PyExc_IOError);
+ 		clearerr(f->f_fp);
+ 		return NULL;
+        	} else if (S_ISDIR(stbuf.st_mode) || S_ISBLK(stbuf.st_mode)) {
+ 		PyErr_SetString(PyExc_IOError, 
+ 				"Can't poll a block device or directory.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	} else if (S_ISREG(stbuf.st_mode)) {	/* plain file */
+ #if defined(HAVE_LARGEFILE_SUPPORT) && SIZEOF_OFF_T < 8 && SIZEOF_FPOS_T >= 8
+ 		fpos_t pos;
+ #else
+ 		off_t pos;
+ #endif
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		pos = _portable_ftell(f->f_fp);
+ 		Py_END_ALLOW_THREADS
+ 		if (pos == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ #if !defined(HAVE_LARGEFILE_SUPPORT)
+ 		return PyInt_FromLong(stbuf.st_size - pos);
+ #else
+ 		return PyLong_FromLongLong(stbuf.st_size - pos);
+ #endif
+ 	} else if (S_ISFIFO(stbuf.st_mode) 
+ 		    || S_ISSOCK(stbuf.st_mode) 
+ 		    || S_ISCHR(stbuf.st_mode)) {	/* stream device */
+ #ifndef FIONREAD
+ 		PyErr_SetString(PyExc_OSError, 
+ 				"FIONREAD is not available.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ #else
+ 		int waiting;
+ 
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		ret = ioctl(fileno(f->f_fp), FIONREAD, &waiting);
+ 		Py_END_ALLOW_THREADS
+ 		if (ret == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ 
+ 		return Py_BuildValue("i", waiting);
+ #endif /* FIONREAD */
+ 	} else {				/* should never happen! */
+ 		PyErr_SetString(PyExc_OSError, "Unknown file type.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	}
+ #endif /* HAVE_FSTAT */
+ }
+ 
+ static PyObject *
  file_fileno(PyFileObject *f, PyObject *args)
  {
  	if (f->f_fp == NULL)
***************
*** 1263,1268 ****
--- 1348,1354 ----
  	{"truncate",	(PyCFunction)file_truncate, 1},
  #endif
  	{"tell",	(PyCFunction)file_tell, 0},
+ 	{"waiting",	(PyCFunction)file_waiting, 0},
  	{"readinto",	(PyCFunction)file_readinto, 0},
  	{"readlines",	(PyCFunction)file_readlines, 1},
  	{"xreadlines",	(PyCFunction)file_xreadlines, 1},
-------------- next part --------------
#!/usr/bin/env python
import sys, os, random, string, time, socket, smtplib, readline

print "This program tests the `waiting' method of file objects."

fp = open("waiting_test.py")
if hasattr(fp, "waiting"):
    print "Good, you're running a patched Python with `waiting' available."
else:
    print "You haven't installed the `waiting' patch yet.  This won't work."
    sys.exit(1)

successes = ""
failures = ""
nogo = ""

print ""
print "First, plain files:"

filesize = fp.waiting()
print "There are %d bytes waiting to be read in this file." % filesize
if os.name == 'posix':
    os.system("ls -l waiting_test.py")
    print "That should match the number in the ls listing above."
else:
    print "Please check this with your OS's directory tools."

get = random.randrange(fp.waiting())
print "I'll now read a random number (%d) of bytes." % get
fp.read(get)
print "The waiting method sees %d bytes left." % fp.waiting()
if get + fp.waiting() == filesize:
    print  "%d + %d = %d.  That's consistent.  Test passed." % \
          (get, fp.waiting(), filesize)
    successes += "Plain file random-read test passed.\n"
else:
    print "That's not consistent. Test failed."
    failures += "Plain file random-read test failed\n"

print "Now let's see if we can detect EOF reliably."
fp.read()
left = fp.waiting()
print "I'll do a read()...the waiting method now returns %d" % left
if left == 0:
    print "That looks like EOF."
    successes += "Plain file EOF test passed.\n"
else:
    print "%d bytes left. Test failed." % left
    failures += "Plain file EOF test failed\n"
fp.close()

print ""
print "Now sockets:"
print "Connecting to imap.netaxs.com's IMAP server now..."
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
file = sock.makefile('rb')
sock.connect(("imap.netaxs.com", 143))
print "Waiting a few seconds to avoid a race condition..."
time.sleep(3)
greetsize = file.waiting()
print "There appear to be %d bytes waiting..." % greetsize
greeting = file.readline()
print "I just read the greeting line..."
sys.stdout.write(greeting)
if len(greeting) == greetsize:
    print "...and the size matches.  Test passed."
    successes += "Socket test passed.\n"
else:
    print "That's not right.  Test failed."
    failures += "Socket test failed.\n"
sock.close()

print ""
if not hasattr(os, "mkfifo"):
    print "Your platform doesn't have FIFOs (mkfifo() is absent), so I can't test them."
    nogo = "FIFO test could not be performed."
else:
    print "Now FIFOs:"
    print "I'm making a FIFO named testfifo."; os.mkfifo("testfifo")
    str = string.letters[:random.randrange(len(string.letters))]
    print "I'm going to send it the following string '%s' of random length %d:" \
          % (str, len(str),)
    # Note: Unix dependency here!
    os.system("(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &" % str)
    fp = open("testfifo", "r")
    print "Waiting a few seconds to avoid a race condition..."
    time.sleep(3)
    ready = fp.waiting()
    print "I see %d bytes waiting in the FIFO." % ready
    if ready == len(str):
        print "That's consistent.  Test passed."
        successes += "FIFO test passed.\n"
    else:
        print "That's not consistent. Test failed."
        failures += "FIFO test failed\n"
    os.remove("testfifo")

print "\nSummary:"
report = "Platform is: %s, version is %s\n" % (sys.platform, sys.version)
if successes:
    report += "The following tests succeeded:\n" + successes
if failures:
    report += "The following tests failed:\n" + failures
if nogo:
    report += "The following tests could not be performed:\n" + nogo
if not nogo:
    report += "No tests were skipped.\n"
if not failures:
    report += "All tests succeeded.\n"
print report

if os.name == 'posix':
    me = os.environ["USER"] + "@" + socket.getfqdn()
else:
    me = raw_input("Enter your email address, please?")

try:
    server = smtplib.SMTP('localhost')
    report = ("From: %s\nTo: esr@thyrsus.com\nSubject: waiting_test\n\n" % me) + report
    server.sendmail(me, ["esr@thyrsus.com"], report)
    server.quit()
except:
    print "The attempt to mail your test result failed.\n"

From esr at snark.thyrsus.com  Thu Jan 25 17:46:20 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:46:20 -0500
Subject: [Python-Dev] Documentation patch for waiting method.
Message-ID: <200101251646.f0PGkKM23567@snark.thyrsus.com>

Index: libstdtypes.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libstdtypes.tex,v
retrieving revision 1.50
diff -u -r1.50 libstdtypes.tex
--- libstdtypes.tex	2001/01/17 01:18:00	1.50
+++ libstdtypes.tex	2001/01/25 16:46:40
@@ -1142,6 +1142,24 @@
   \UNIX{} versions support this operation).
 \end{methoddesc}
 
+\begin{methoddesc}[file]{waiting}{}
+  Return the number of bytes waiting to be read from this file object.
+  For regular files, this returns the size of the file in bytes minus
+  the current seek address, as would be returned by \method{tell()}; a
+  zero return can be used to detect EOF.  For streams such as FIFOs,
+  sockets, Unix ttys, and other Unix character devices, this method
+  returns the number of bytes currently buffered up and waiting to be
+  read.  Attempts to call this method on Unix block devices or
+  on directories will raise an error.
+	\footnote{The \method{waiting()} method uses
+  	\cfunction{fstat(2)} and \cfunction{lseek(2)} on plain files;
+  	these should be reliable on all of Unix, Windows, and MacOS.
+  	It uses the FIONREAD ioctl(2) call to query FIFOs, sockets,
+  	Unix ttys, and other POSIX character devices; FIFO and socket
+  	behavior should be consistent across all three platforms, but
+  	the results from querying other character devices may vary.}
+\end{methoddesc}
+
 \begin{methoddesc}[file]{write}{str}
   Write a string to the file.  There is no return value.  Note: Due to
   buffering, the string may not actually show up in the file until

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"To disarm the people... was the best and most effectual way to enslave them."
        -- George Mason, speech of June 14, 1788



From fredrik at effbot.org  Thu Jan 25 20:23:50 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 20:23:50 +0100
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution)
Message-ID: <00f701c08704$59bde510$e46940d5@hagrid>

I'm pretty sure Tim's seen this already, but just
in case...

----- Original Message ----- 
From: "Ivan Frohne" <frohne at gci.net>
Newsgroups: comp.lang.python
Sent: Thursday, January 25, 2001 5:20 PM
Subject: Re: random.py gives wrong results (+ a solution)


> 
> "Janne Sinkkonen" <janne at oops.nnets.fi> wrote in message
> news:m3u26oy1rw.fsf at kinos.nnets.fi...
> >
> > At least in Python 2.0 and earlier, the samples returned by the
> > function betavariate() of random.py are not from a beta distribution
> > although the function name misleadingly suggests so.
> >
> > The following would give beta-distributed samples:
> >
> > def betavariate(alpha, beta):
> >      y = gammavariate(alpha,1)
> >      if y==0: return 0.0
> >      else: return  y/(y+gammavariate(beta,1))
> >
> > This is from matlab. A comment in the original matlab code refers to
> > Devroye, L. (1986) Non-Uniform Random Variate Generation, theorem 4.1A
> > (p. 430). Another reference would be Gelman, A. et al. (1995) Bayesian
> > data analysis, p. 481, which I have checked and found to agree with
> > the code above.
> 
> 
> I'm convinced that Janne Sinkkonen is right:  The beta distribution
> generator in module random.py does not return Beta-distributed
> random numbers.  Janne's suggested fix should work just fine.
> 
> Here's my guess on how and why this bug bit -- it won't be of interest to
> most, but this subject is so obscure sometimes that there needs to be a
> detailed analysis.
> 
> The probability density function of the gamma distribution with (positive)
> parameters A and B is usually written
> 
>     g(x; A, B) = (x**(A-1) * exp(-x/B)) / (Gamma(A) * B**A), where x, A, B > 0.
> 
> Here Gamma(A) is the gamma function -- for A a positive integer, Gamma(A) is
> the factorial of A - 1, Gamma(A) = (A-1)!.  In fact, this is the definition
> used by the authors of random.py in defining gammavariate(alpha, beta), the
> gamma distribution random number generator.
> 
> Now it happens that a gamma-distributed random variable with parameters
> A = 1 and B has the (much simpler) exponential distribution with density
> function
> 
>     g(x; 1, B) = exp(-x/B) / B.
> 
> Keep that in mind.
> 
> The reference "Discrete Event Simulation in ," by Kevin Watkins
> (McGraw-Hill, 1993) was consulted by the random.py authors.  But this
> reference defines the gamma probability distribution a little differently, as
> 
>     g1(x; A, B) = (B**A * x**(A-1) * exp(-B*x)) / Gamma(A), where x, A, B > 0.
> 
> (See p. 85).  On page 87, Watkins states (incorrectly) that if grv(A, B) is
> a function which returns a gamma random variable with parameters A and B
> (using his definition on p. 85), then the function
> 
>     brv(A, B) = grv(1, 1/B) / ( grv(1, 1/B) + grv(1, A) )      [not true!]
> 
> will return a random variable which has the beta distribution with
> parameters A and B.
> 
> Believing Watkins to be correct, the random.py authors remembered that a
> gamma random variable with parameter A = 1 is just an exponential random
> variable and further simplified their beta generator to
> 
>     brv(A, B) = erv(1/B) / (erv(1/B) + erv(A)),
> 
> where erv(K) is a random variable having the exponential distribution with
> parameter K.
> 
> The corrected equation for a beta random variable, using Watkins' definition
> of the gamma density, is
> 
>     brv(A, B) = grv(A, 1) / ( grv(A, 1) + grv(1/B, 1) ),
> 
> which translates to
> 
>     brv(A, B) = grv(A, 1) / ( grv(A, 1) + grv(B, 1) )
> 
> using the more common gamma density definition (the one used in random.py).
> Many standard statistical references give this equation -- two are
> "Non-Uniform Random Variate Generation," by Luc Devroye, Springer-Verlag,
> 1986, p. 432, and "Monte Carlo Concepts, Algorithms and Applications," by
> George S. Fishman, Springer, 1996, p. 200.
> 
> --Ivan Frohne
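
A quick sanity check of the gamma-ratio construction quoted above, sketched in
modern Python.  The rng parameter and the sample_mean helper are additions for
this illustration only.  The check compares the sample mean with the known
mean of a Beta(alpha, beta) distribution, alpha / (alpha + beta).  That is a
necessary condition for correctness, not a proof of it.

```python
import random


def betavariate(alpha, beta, rng=random):
    """Draw from Beta(alpha, beta) as a ratio of unit-scale gamma variates."""
    y = rng.gammavariate(alpha, 1.0)
    if y == 0:
        return 0.0
    return y / (y + rng.gammavariate(beta, 1.0))


def sample_mean(alpha, beta, n=20000, seed=12345):
    """Average n draws from a privately seeded generator (hypothetical helper)."""
    rng = random.Random(seed)
    return sum(betavariate(alpha, beta, rng) for _ in range(n)) / n
```

For alpha = 2 and beta = 5 the sample mean should land close to 2/7.  A
generator whose output is not beta-distributed would generally miss that
target for at least some parameter choices.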




From jeremy at alum.mit.edu  Thu Jan 25 18:13:03 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 12:13:03 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <14960.24223.599357.388059@localhost.localdomain>

Neil,

What would it take to add useful dependency information to the
Makefile?  Or does it already exist?

When I was working the nested scopes, building was tedious at times
because a change to funcobject.h meant that, e.g., newmodule.c needed
to be recompiled.  The Makefiles didn't capture that information, so I
had been adding it to the individual Makefiles, e.g.

newmodule.o: newmodule.c ../Include/funcobject.h

(I think this worked.)

It would be great if the Makefile captured all the dependencies.
Could we just use makedepend?

Jeremy



From MarkH at ActiveState.com  Thu Jan 25 20:43:35 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 25 Jan 2001 11:43:35 -0800
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <20010125111936.A23512@thyrsus.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEBFDAAA.MarkH@ActiveState.com>

> I would appreciate it if those of you on Windows and Macintosh
> machines would test the waiting method. The test program will take
> some porting, because it needs to write to a FIFO in background.

This didn't compile under Windows.  I have a patch (against CVS) that
compiles, but doesn't appear to work (and will be forwarded to Eric under
separate cover) [news flash :-)  Changing the open call to add "rb" as the
mode makes it work - text v binary bites again]
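
A minimal sketch of why the mode matters, written in present-day Python
rather than against the patched interpreter under test.  In text mode,
line-ending translation makes the amount of data returned by read() differ
from the on-disk size, so "size minus position" arithmetic only adds up in
binary mode.  The helper name bytes_waiting is hypothetical; it is not the
`waiting' method from Eric's patch, only an illustration of the idea for
plain files.

```python
import os
import tempfile


def bytes_waiting(f):
    """Hypothetical helper: bytes between the current position and EOF.

    Only meaningful for a plain file opened in binary mode.
    """
    return os.fstat(f.fileno()).st_size - f.tell()


def demo():
    # Write nine raw bytes, including three CR LF pairs.
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as out:
        out.write(b"a\r\nb\r\nc\r\n")
    try:
        # Binary mode: what we read plus what is waiting equals the file size.
        with open(path, "rb") as f:
            first = f.read(2)
            left = bytes_waiting(f)
            rest = f.read()
        # Text mode: "\r\n" is translated to "\n" on input, so the number
        # of characters read no longer matches the size on disk.
        with open(path, "r", newline=None) as f:
            text = f.read()
        return len(first), left, len(rest), len(text), os.path.getsize(path)
    finally:
        os.remove(path)
```

In binary mode the three numbers are consistent (2 read + 7 waiting = 9 on
disk), while the text-mode read sees fewer characters than the file holds.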

I didn't try any sort of fifo test.

The sockets test failed with a socket error, but would certainly have failed
had the socket connected, as my patch includes:

#ifndef S_ISSOCK
#	define S_ISSOCK(mode) (0)
#endif

I have no idea if it managed to mail the results, but I guess not, so the
output is below.  The test file (after some small mods, including the "rb"
param) is indeed 4252 bytes long.

Hope this is useful!

Mark.

This program tests the `waiting' method of file objects.
Good, you're running a patched Python with `waiting' available.

First, plain files:
There are 4252 bytes waiting to be read in this file.
Please check this with your OS's directory tools.
I'll now read a random number (3091) of bytes.
The waiting method sees 1161 bytes left.
3091 + 1161 = 4252.  That's consistent.  Test passed.
Now let's see if we can detect EOF reliably.
I'll do a read()...the waiting method now returns 0
That looks like EOF.

Now sockets:
Connecting to imap.netaxs.com's IMAP server now...
Traceback (most recent call last):
  File "c:\temp\waiting_test.py", line 57, in ?
    sock.connect(("imap.netaxs.com", 143))
  File "<string>", line 1, in connect
socket.error: (10060, 'Operation timed out')




From nas at arctrix.com  Thu Jan 25 14:07:53 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 05:07:53 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14960.24223.599357.388059@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 25, 2001 at 12:13:03PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain>
Message-ID: <20010125050753.A1573@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:13:03PM -0500, Jeremy Hylton wrote:
> What would it take to add useful dependency information to the
> Makefile?  Or does it already exist?

Some of it exists but I don't think it's complete.

> When I was working on the nested scopes, building was tedious at times
> because a change to funcobject.h meant that, e.g., newmodule.c needed
> to be recompiled.  The Makefiles didn't capture that information, so I
> had been adding it to the individual Makefiles, e.g.
> 
> newmodule.o: newmodule.c ../Include/funcobject.h
> 
> (I think this worked.)


Hmm, I don't think so.  Which makefile did you add this to?  Are
you using the new makefile?  The Makefile.pre.in file contains a
line like:

    $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

but newmodule.o is not in LIBRARY_OBJS.  By default it's not compiled
by make but by distutils.  If you add newmodule to Setup then a
line like:

    Modules/newmodule.o: $(PYTHON_HEADERS)

would do the trick.  I think I will add a line like:

    $(MODOBJS): $(PYTHON_HEADERS)

to fix the problem.

I could easily restore the mkdep target but my feeling right now
is that explicitly including the header dependencies is better.
What do you think?  

  Neil



From jeremy at alum.mit.edu  Thu Jan 25 21:02:46 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:02:46 -0500 (EST)
Subject: [Python-Dev] PEP 227 checkins to follow
Message-ID: <14960.34406.342961.834827@localhost.localdomain>

I am about to check in the changes that implement PEP 227.  There
are many changes, which I will make via separate commits.  You might
want to wait until the checkins are done to do an update.  I'll send a
note when I'm done.

I also wanted to mention that the PEP has fallen a little out of
date.  There are a few wrinkles that it doesn't deal with, e.g.
    def f(x):
        def g(y):
            return x + y
        del x
        return g

For now, this raises a SyntaxError.
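
For reference, a small sketch of the ordinary nested-scope case that PEP 227
is about, leaving out the `del x` wrinkle above.  It is written for a
present-day Python, where the captured variable is visible through the
function's __closure__ attribute; the names make_adder and add are only for
illustration.

```python
def make_adder(x):
    # x is a local of make_adder that the inner function refers to,
    # so it is kept alive in a cell after make_adder returns.
    def add(y):
        return x + y
    return add


add3 = make_adder(3)
total = add3(4)

# The captured binding lives in a cell object attached to the function.
captured = add3.__closure__[0].cell_contents
```

Here add3 keeps using the x it was created with, which is the behaviour the
nested-scopes change introduces.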

I'll flesh out the PEP to reflect the current implementation and spec
out some of the less obvious cases.

I'd welcome any comments on the code itself.  I know there are a
number of rough edges and also, most likely, a bunch of memory leaks.
I'll be working to clean things up before 2.1a2, but wanted to get the
code into CVS ASAP.

Jeremy



From jeremy at alum.mit.edu  Thu Jan 25 21:15:01 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:15:01 -0500 (EST)
Subject: [Python-Dev] checkins done for PEP 227
Message-ID: <14960.35141.237252.468467@localhost.localdomain>

It looks like python-dev is very slow, so you'll see my original
warning well after the checkins occurred.  Oh, well.  They're done.

Jeremy




From tim.one at home.com  Thu Jan 25 21:58:03 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 15:58:03 -0500
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution) 
Message-ID: <LNBBLJKPBEHFEDALKOLCCEDDILAA.tim.one@home.com>

[/F, fwds a c.l.py claim that random.betavariate is dead wrong]

Not to worry; I had already entered that into the SF bug database and
assigned it to me (hmm:  why would you send it to Python-Dev instead of
putting it in the database?).  I suspect he's correct, and, more
importantly, so does Ivan Frohne.  We'll settle it before 2.1a2, but perhaps
not today.  Alas, I have no idea where the original code came from ("Guido"
isn't a useful answer -- he was just converting somebody else's C++ code to
Python).
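
For concreteness, here is a sketch of the corrected formula from the
forwarded report, brv(A, B) = grv(A, 1) / (grv(A, 1) + grv(B, 1)), built on
the standard library's random.gammavariate.  The function name
beta_from_gammas and the rng parameter are illustrative assumptions; this is
not necessarily the code that will end up in random.py.

```python
import random


def beta_from_gammas(alpha, beta, rng=random):
    """Beta variate as a ratio of two gamma variates with scale 1.

    Sketch only: alpha and beta must both be > 0.
    """
    y = rng.gammavariate(alpha, 1.0)
    if y == 0:
        # Avoid 0/0 in the unlikely case that both draws underflow to zero.
        return 0.0
    return y / (y + rng.gammavariate(beta, 1.0))


def sample_mean(alpha, beta, n, seed):
    # Use a private, seeded generator so the check is repeatable.
    rng = random.Random(seed)
    draws = [beta_from_gammas(alpha, beta, rng) for _ in range(n)]
    return sum(draws) / n, min(draws), max(draws)
```

A quick sanity check is that the sample mean approaches A / (A + B) and that
every draw lies between 0 and 1.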




From fredrik at effbot.org  Thu Jan 25 21:42:05 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 21:42:05 +0100
Subject: [Python-Dev] Waiting method for file objects
References: <20010125111936.A23512@thyrsus.com>
Message-ID: <01fb01c0870f$48517110$e46940d5@hagrid>

eric wrote:

> Fortunately, this is less an issue than it appears.

only if you ignore Windows...

-1 on making this a file method

+0 on adding it as an optional support function to
the os module.

</F>




From martin at mira.cs.tu-berlin.de  Thu Jan 25 21:42:39 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 25 Jan 2001 21:42:39 +0100
Subject: [Python-Dev] jeremy@alum.mit.edu
Message-ID: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>

> It would be great if the Makefile captured all the dependencies.

That would be great, yes. However, setup.py should probably also
consider dependencies.

> Could we just use makedepend?

Not sure. Certainly not in the build process. I dislike distributions
which, as the first thing, perform dependency generation. Dependencies
change less often than the actual source, so it should be
sufficient to update them manually. Furthermore, generated files as
part of the CVS repository fail to work properly unless everybody uses
the exact same generator. For autoconf alone, that's a problem because
of multiple autoconf versions. I don't know how many different
makedepend versions are in use.

Regards,
Martin




From tim.one at home.com  Thu Jan 25 22:02:11 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 16:02:11 -0500
Subject: [Python-Dev] Windows compile broken
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>

Linking...
   Creating library ./python21.lib and object ./python21.exp
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
./python21.dll : fatal error LNK1120: 3 unresolved externals
Error executing link.exe.


Sorry if this has already been discussed.  I don't see mention of it in the
Python-Dev archive, and my email is almost worse than useless (random delays
of minutes to days, due to what appears to be the simultaneous worldwide
wedging of every email server servicing every email account I have).




From esr at thyrsus.com  Thu Jan 25 22:12:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:12:25 -0500
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <01fb01c0870f$48517110$e46940d5@hagrid>; from fredrik@effbot.org on Thu, Jan 25, 2001 at 09:42:05PM +0100
References: <20010125111936.A23512@thyrsus.com> <01fb01c0870f$48517110$e46940d5@hagrid>
Message-ID: <20010125161225.A24305@thyrsus.com>

Fredrik Lundh <fredrik at effbot.org>:
> > Fortunately, this is less an issue than it appears.
> 
> only if you ignore Windows...

I don't understand this.  Explain?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"



From esr at thyrsus.com  Thu Jan 25 22:13:31 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:13:31 -0500
Subject: [Python-Dev] jeremy@alum.mit.edu
In-Reply-To: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 25, 2001 at 09:42:39PM +0100
References: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>
Message-ID: <20010125161331.B24305@thyrsus.com>

Martin v. Loewis <martin at mira.cs.tu-berlin.de>:
> Not sure. Certainly not in the build process. I dislike distributions
> which, as the first thing, perform dependency generation. Dependencies
> change less often than the actual source, so it should be
> sufficient to update them manually. Furthermore, generated files as
> part of the CVS repository fail to work properly unless everybody uses
> the exact same generator. For autoconf alone, that's a problem because
> of multiple autoconf versions. I don't know how many different
> makedepend versions are in use.

Easily solved -- there are script versions of makedepend we can just ship
with the distribution.
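
To give an idea of how little such a script needs to do, here is a toy
sketch in Python.  It is not makedepend and not any script shipped with
Python: the function names and the include_dir default are assumptions for
illustration, it only looks at quoted #include lines, and it does not follow
nested includes or conditional compilation.

```python
import re

# Matches lines such as:  #include "funcobject.h"
# Angle-bracket system headers are deliberately ignored.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def local_includes(source_text):
    """Return the quoted header names found in a C source string."""
    return INCLUDE_RE.findall(source_text)


def make_rule(obj_name, src_name, source_text, include_dir="../Include"):
    """Build one Makefile dependency line for an object file."""
    headers = [include_dir + "/" + h for h in local_includes(source_text)]
    return "%s: %s" % (obj_name, " ".join([src_name] + headers))
```

Given a source file that includes "Python.h" and "funcobject.h", make_rule
produces a line of the same shape as the hand-written newmodule.o rule
quoted earlier in this thread.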
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Morality is always the product of terror; its chains and
strait-waistcoats are fashioned by those who dare not trust others,
because they dare not trust themselves, to walk in liberty.
	-- Aldous Huxley 



From mal at lemburg.com  Thu Jan 25 22:26:04 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 25 Jan 2001 22:26:04 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
Message-ID: <3A7099EC.81689EA5@lemburg.com>

Tim Peters wrote:
> 
> Linking...
>    Creating library ./python21.lib and object ./python21.exp
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
> frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
> ./python21.dll : fatal error LNK1120: 3 unresolved externals
> Error executing link.exe.
> 
> Sorry if this has already been discussed.  I don't see mention of it in the
> Python-Dev archive, and my email is almost worse than useless (random delays
> of minutes to days, due to what appears to be the simultaneous worldwide
> wedging of every email server servicing every email account I have).

These must be related to checkins by Jeremy and his nested
scopes... (I knew these would get us into trouble ;-)

I think Jeremy forgot to check in the needed change for 
Objects/Makefile.in and probably the Windows project file is
missing the new object type too (Objects/cellobject.c).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Thu Jan 25 22:14:52 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:14:52 -0500 (EST)
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
	<3A7099EC.81689EA5@lemburg.com>
Message-ID: <14960.38732.773129.793360@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Tim Peters wrote:
  >>
  >> Linking...
  >>    Creating library ./python21.lib and object ./python21.exp
  >> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
  >> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
  >> frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
  >> ./python21.dll : fatal error LNK1120: 3 unresolved externals
  >> Error executing link.exe.
  >>
  >> Sorry if this has already been discussed.  I don't see mention of
  >> it in the Python-Dev archive, and my email is almost worse than
  >> useless (random delays of minutes to days, due to what appears to
  >> be the simultaneous worldwide wedging of every email server
  >> servicing every email account I have).

  MAL> These must be related to checkins by Jeremy and his nested
  MAL> scopes... (I knew these would get us into trouble ;-)

Just you wait and see!

  MAL> I think Jeremy forgot to check in the needed change for
  MAL> Objects/Makefile.in and probably the Windows project file is
  MAL> missing the new object type too (Objects/cellobject.c).

That's right.  I didn't change the Makefile in Objects or do anything
with Windows.  Don't know how to do the latter, but perhaps Tim will
stop by my desk next week and show me.  As for the Makefile, I thought
I saw a message from Neil saying not to update those anymore.

Jeremy



From nas at arctrix.com  Thu Jan 25 16:10:56 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:10:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Thu, Jan 25, 2001 at 12:04:16PM -0800
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125071056.A2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
> A cell contains a reference to a single PyObject.  It could be
> implemented as a mutable, one-element sequence, but the separate type
> has less overhead.

Can this object be involved in reference cycles?  If so, it
should probably have the GC methods added to it.

  Neil



From jeremy at alum.mit.edu  Thu Jan 25 22:42:04 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:42:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <20010125071056.A2390@glacier.fnational.com>
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
	<20010125071056.A2390@glacier.fnational.com>
Message-ID: <14960.40364.594582.353511@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

  NS> On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
  >> A cell contains a reference to a single PyObject.  It could be
  >> implemented as a mutable, one-element sequence, but the separate
  >> type has less overhead.

  NS> Can this object be involved in reference cycles?  If so, it
  NS> should probably have the GC methods added to it.

It's already there.  (Last five lines of cellobject.c quoted as
proof.) 

>	Py_TPFLAGS_DEFAULT | Py_TPFLAGS_GC,	/* tp_flags */
> 	0,					/* tp_doc */
> 	(traverseproc)cell_traverse,		/* tp_traverse */
> 	(inquiry)cell_clear,			/* tp_clear */
>};
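
A sketch of the kind of cycle Neil is asking about, seen from the Python
side in a present-day interpreter: a nested function that refers to itself
is reachable from its own closure cell, so reference counting alone never
frees it and the cycle collector has to traverse the cell.  The function
names here are made up for the example.

```python
import gc
import weakref


def cycle_is_collected():
    """Return (alive before collection, gone after collection)."""
    def make_self_referencing():
        def inner():
            # inner -> closure tuple -> cell -> inner: a reference cycle.
            return inner
        return inner

    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collection from running mid-check
    try:
        f = make_self_referencing()
        ref = weakref.ref(f)
        del f
        # The cell still holds a reference, so the function is alive.
        alive_before = ref() is not None
        gc.collect()
        # The collector can see through the cell and break the cycle.
        return alive_before, ref() is None
    finally:
        if was_enabled:
            gc.enable()
```

Without traverse support on the cell type, the second value would stay
False and the function would leak.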



From nas at arctrix.com  Thu Jan 25 16:19:22 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:19:22 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>; from mal@lemburg.com on Thu, Jan 25, 2001 at 10:26:04PM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com>
Message-ID: <20010125071922.B2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> I think Jeremy forgot to check in the needed change for 
> Objects/Makefile.in

That file is dead.  Should I remove it now?  I haven't heard any
major complaints about Makefile.pre.in yet.  Maybe the messages
are all sitting in the python.org mail spool.  Barry, what the
hell is going on?  You need to drop that Postfix crap and get
qmail. :-)

  Neil



From thomas at xs4all.net  Thu Jan 25 23:19:37 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 23:19:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>; from fdrake@users.sourceforge.net on Thu, Jan 25, 2001 at 02:13:36PM -0800
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125231937.I962@xs4all.nl>

On Thu, Jan 25, 2001 at 02:13:36PM -0800, Fred L. Drake wrote:

> The addition of new parameters to functions in the Python/C API requires
> that PYTHON_API_VERSION be incremented.

When we update the API version, isn't it time to clean up the TP_HASFEATURE
stuff ? Since we updated the API, all the current slots should be there,
right ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Thu Jan 25 23:32:32 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 17:32:32 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: Your message of "Thu, 25 Jan 2001 23:19:37 +0100."
             <20010125231937.I962@xs4all.nl> 
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>  
            <20010125231937.I962@xs4all.nl> 
Message-ID: <200101252232.RAA20013@cj20424-a.reston1.va.home.com>

> > The addition of new parameters to functions in the Python/C API requires
> > that PYTHON_API_VERSION be incremented.
> 
> When we update the API version, isn't it time to clean up the TP_HASFEATURE
> stuff ? Since we updated the API, all the current slots should be there,
> right ?

No, we're issuing a warning about old API versions but still try to
work with them.  After all most extensions don't create frame or code
objects.

I added the flags for the tp_richcompare field when I tried 2.1a1 with
Zope's ExtensionClasses and Acquisition modules.  Turns out I got a
core dump, while 2.1 ran flawlessly.  The reason: they have their own
type struct which has the same lay-out as the Python 1.5.2 (or even
older) type struct, followed by fields of their own.  They have the
tp_flags field set to 0, so up to 2.0, it was compatible.  I expect
that 2.1a2 will work with the unchanged Zope code because of the flag
I added.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 26 00:04:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 00:04:54 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>
Message-ID: <3A70B116.12BF756B@lemburg.com>

Neil Schemenauer wrote:
> 
> On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> > I think Jeremy forgot to check in the needed change for
> > Objects/Makefile.in
> 
> That file is dead.  Should I remove it now?  I haven't heard any
> major complaints about Makefile.pre.in yet.

What about that file ? Are you saying that Makefile.pre.in
will no longer work in 2.1 ??? 

Please don't remove that mechanism -- it has been in use for
quite a while and is much more stable than distutils. We should
at least wait a few more distutils releases for the dust to
settle before removing the old fallback solution.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 26 00:06:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 18:06:40 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: Your message of "Fri, 26 Jan 2001 00:04:54 +0100."
             <3A70B116.12BF756B@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>  
            <3A70B116.12BF756B@lemburg.com> 
Message-ID: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>

> > That file is dead.  Should I remove it now?  I haven't heard any
> > major complaints about Makefile.pre.in yet.
> 
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 
> 
> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

Let's at least mark it clearly as obsolete though -- it's a pain to
maintain.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Thu Jan 25 17:31:28 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:31:28 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A70B116.12BF756B@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 12:04:54AM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com>
Message-ID: <20010125083128.A2699@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 

I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
you talking about?  Modules/Makefile.pre.in is dead too.  There
is a Makefile.pre.in in the toplevel directory which does the
same thing.  There is also Misc/Makefile.pre.in.  That file gets
installed into lib and still works as it always did.  The toplevel
Makefile.pre.in can use Modules/Setup* just like the old
Modules/Makefile.pre.in could.  Does this address your concerns?

> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

No doubt.

  Neil



From nas at arctrix.com  Thu Jan 25 17:33:48 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:33:48 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 25, 2001 at 06:06:40PM -0500
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <200101252306.SAA20173@cj20424-a.reston1.va.home.com>
Message-ID: <20010125083348.B2699@glacier.fnational.com>

On Thu, Jan 25, 2001 at 06:06:40PM -0500, Guido van Rossum wrote:
> Let's at least mark it clearly as obsolete though -- it's a pain to
> maintain.

Are you talking about Misc/Makefile.pre.in?  If so, how do you
suggest we mark it?

I don't think Modules/Setup should go away any time soon.  I
often like to build lots of modules statically into the
interpreter.  setup.py has no support for building static
modules.

  Neil



From tim.one at home.com  Fri Jan 26 00:27:52 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 18:27:52 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <14960.38732.773129.793360@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBILAA.tim.one@home.com>

Thanks for the clues, everyone!  I'll fix it for Windows.  Note that I'm
getting email in wild bursts, and most often delayed.  So I'm generally not
seeing any checkin msgs, or SF bug email, or Python-Dev email, ..., anywhere
near the time (or, alas, sometimes even day) they're generated.  So I simply
didn't see the checkin msg introducing cellobject.c.

all's-well-that-looks-like-it-may-end-ly y'rs  - tim




From mal at lemburg.com  Fri Jan 26 10:32:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:32:14 +0100
Subject: [Python-Dev] Makefile.pre.in (Windows compile broken)
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <20010125083128.A2699@glacier.fnational.com>
Message-ID: <3A71441E.4584A5C8@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> > What about that file ? Are you saying that Makefile.pre.in
> > will no longer work in 2.1 ???
> 
> I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
> you talking about?  Modules/Makefile.pre.in is dead too.  There
> is a Makefile.pre.in in the toplevel directory which does the
> same thing.  There is also Misc/Makefile.pre.in.  That file gets
> installed into lib and still works as it always did.  The toplevel
> Makefile.pre.in can use Modules/Setup* just like the old
> Modules/Makefile.pre.in could.  Does this address your concerns?

Yes. Thanks. I was talking about the Misc/Makefile.pre.in mechanism
which was used in the past by many Python C extensions to provide
a portable way of compiling the extension into a shared module or
statically into the Python interpreter.
 
I have been using that mechanism for years now and with much
success. Even though I am currently moving to distutils I have
no idea how stable distutils is on exotic platforms or ones which
have special needs (like e.g. AIX).

> > Please don't remove that mechanism -- it has been in use for
> > quite a while and is much more stable than distutils. We should
> > at least wait a few more distutils releases for the dust to
> > settle before removing the old fallback solution.
> 
> No doubt.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Jan 26 10:37:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:37:12 +0100
Subject: [Python-Dev] setup.py
Message-ID: <3A714548.C487DCC9@lemburg.com>

I have posted two messages here regarding the new setup.py
mechanism for building Modules/ but have received no comments
on them so far. Here's another go:

1. I think that setup.py should output warnings about modules 
   which cannot be built for some reason rather than aborting
   the build process completely.

2. I suggest adding -L/usr/lib/termcap to the readline extension.
   This doesn't hurt anywhere and will get this extension to compile
   on SuSE Linux too.
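
A sketch of what point 1 could look like in outline.  This is not the
actual setup.py code: build_extensions and the build_one callback are
hypothetical names, and the real script goes through distutils rather than
a plain loop.

```python
import warnings


def build_extensions(names, build_one):
    """Try each module in turn; warn on failure instead of aborting.

    build_one is a hypothetical callable that builds a single module and
    raises an exception if it cannot.
    """
    built, skipped = [], []
    for name in names:
        try:
            build_one(name)
        except Exception as exc:
            # Report the problem and carry on with the remaining modules.
            warnings.warn("skipping module %r: %s" % (name, exc))
            skipped.append(name)
        else:
            built.append(name)
    return built, skipped
```

With this shape, a missing library for one extension costs only that
extension, and the list of skipped modules can be printed at the end.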

Thoughts ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Fri Jan 26 13:27:56 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 07:27:56 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A714548.C487DCC9@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 10:37:12AM +0100
References: <3A714548.C487DCC9@lemburg.com>
Message-ID: <20010126072756.A5013@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> 1. I think that setup.py should output warnings about modules 
>    which cannot be built for some reason rather than aborting
>    the build process completely.
> 
> 2. I suggest adding -L/usr/lib/termcap to the readline extension.
>    This doesn't hurt anywhere and will get this extension to compile
>    on SuSE Linux too.

Both good ideas.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.



From mal at lemburg.com  Fri Jan 26 15:13:45 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:13:45 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com>
Message-ID: <3A718619.6278AF41@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > 1. I think that setup.py should output warnings about modules
> >    which cannot be built for some reason rather than aborting
> >    the build process completely.
> >
> > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> >    This doesn't hurt anywhere and will get this extension to compile
> >    on SuSE Linux too.
> 
> Both good ideas.

Should I implement the two and check these in ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Fri Jan 26 15:25:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 09:25:59 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A718619.6278AF41@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 03:13:45PM +0100
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com>
Message-ID: <20010126092559.A5623@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> "Eric S. Raymond" wrote:
> > 
> > M.-A. Lemburg <mal at lemburg.com>:
> > > 1. I think that setup.py should output warnings about modules
> > >    which cannot be built for some reason rather than aborting
> > >    the build process completely.
> > >
> > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > >    This doesn't hurt anywhere and will get this extension to compile
> > >    on SuSE Linux too.
> > 
> > Both good ideas.
> 
> Should I implement the two and check these in ?

I may not channel Guido the way Tim does, but I suspect he gave you
developer privileges because he trusts you to do routine stuff like this.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The saddest life is that of a political aspirant under democracy. His
failure is ignominious and his success is disgraceful.
        -- H.L. Mencken



From mal at lemburg.com  Fri Jan 26 15:29:18 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:29:18 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com> <20010126092559.A5623@thyrsus.com>
Message-ID: <3A7189BE.C6C2806E@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > "Eric S. Raymond" wrote:
> > >
> > > M.-A. Lemburg <mal at lemburg.com>:
> > > > 1. I think that setup.py should output warnings about modules
> > > >    which cannot be built for some reason rather than
> > > >    aborting the build process completely.
> > > >
> > > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > > >    This doesn't hurt anywhere and will get this extension to compile
> > > >    on SuSE Linux too.
> > >
> > > Both good ideas.
> >
> > Should I implement the two and check these in ?
> 
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Just asking because setup.py is Andrew's baby. I'll add the above
two later today.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mwh21 at cam.ac.uk  Fri Jan 26 17:40:47 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 26 Jan 2001 16:40:47 +0000
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
Message-ID: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>

Following discussion on c.l.py I've just submitted:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103441&group_id=5470

which implements a syntax for adding function attributes inline:

>>> def f(a) having (publish=1):
...  print 1
... 
>>> f.publish
1

It uses an "import-as" like strategy to avoid making "having" a
keyword (which interacts a bit badly with error reporting, as it
happens).  Obviously, it would be easy to change "having" to a
different word.

Another idea I had was:

>>> def f(a) having (.publish=1):
...  print 1
... 
>>> f.publish
1

to emphasize the attributeness of what's going on, but I didn't like
this as much in practice (I always forgot the period!).

Emile van Sebille also suggested

>>> d = {'a':1}
>>> def f(a) having (**d):
...  print 1
... 
>>> f.a
1

which I haven't implemented, because I didn't really like it, but I
thought I'd mention.

I'll do test suites and documentation in time, but I thought I'd call
in here to check the idea wasn't DOA.  What do you all think?

Cheers,
M.

-- 
  surely, somewhere, somehow, in the history of computing, at least
  one manual has been written that you could at least remotely
  attempt to consider possibly glancing at.              -- Adam Rixey





From nas at arctrix.com  Fri Jan 26 10:55:57 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 01:55:57 -0800
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
In-Reply-To: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>; from mwh21@cam.ac.uk on Fri, Jan 26, 2001 at 04:40:47PM +0000
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <20010126015556.A4215@glacier.fnational.com>

I don't see what's wrong with:

    def f(a):
        print 1
    f.publish = 1

It's perfectly clear to me.  As a bonus it works already.  I'm -1
on inventing more syntax.
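
If the repeated name is the objection, a plain helper function gets close
to the proposed spelling without any new grammar.  A minimal sketch; the
helper name having() is borrowed from the strawman keyword and is purely
hypothetical, not part of the patch or of the standard library:

```python
def having(func, **attrs):
    """Attach each keyword argument to func as an attribute and return func."""
    for name, value in attrs.items():
        setattr(func, name, value)
    return func


def f(a):
    return a + 1


# Same effect as writing "f.publish = 1" after the def.
f = having(f, publish=1)
```

The function still has to be named twice, which is the one thing the
proposed syntax would avoid.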

  Neil



From evan at digicool.com  Fri Jan 26 18:12:43 2001
From: evan at digicool.com (Evan Simpson)
Date: Fri, 26 Jan 2001 12:12:43 -0500
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <00c001c087bb$322a9720$3e48a4d8@digicool.com>

From: Michael Hudson <mwh21 at cam.ac.uk>
> >>> def f(a) having (publish=1):
> ...  print 1

This doesn't really need special syntax.  I would much rather have this (or
something like it) as a way of spelling initialized local variables.  That
is, when I want static local variables, instead of corrupting the function
signature by writing:

def f(x, marker=[], foo=foo)

...I could write:

def f(x) having (marker=[], foo)

Cheers,

Evan @ digicool




From jeremy at alum.mit.edu  Fri Jan 26 18:58:24 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 12:58:24 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010125050753.A1573@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
	<14960.24223.599357.388059@localhost.localdomain>
	<20010125050753.A1573@glacier.fnational.com>
Message-ID: <14961.47808.315324.734238@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

  >> When I was working on the nested scopes, building was tedious at
  >> times because a change to funcobject.h meant that, e.g.,
  >> newmodule.c needed to be recompiled.  The Makefiles didn't
  >> capture that information, so I had been adding it to the
  >> individual Makefiles, e.g.
  >>
  >> newmodule.o: newmodule.c ../Include/funcobject.h
  >>
  >> (I think this worked.)

  NS> Hmm, I don't think so.  Which makefile did you add this to?

Just to clarify: I added this line to the old Makefile before you
checked the new one in.

  NS> Hmm, I don't think so.  Which makefile did you add this to?  Are
  NS> you using the new makefile?  The Makefile.pre.in file contains a
  NS> line like:

  NS>     $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

  NS> but newmodule.o is not in LIBRARY_OBJS.  By default it's not
  NS> compiled by make but with distutils.  If you add newmodule to
  NS> Setup then a line like:

  NS>     Modules/newmodule.o: $(PYTHON_HEADERS)

  NS> would do the trick.  I think I will add a line like:

  NS>     $(MODOBJS): $(PYTHON_HEADERS)

  NS> to fix the problem.

  NS> I could easily restore the mkdep target but my feeling right now
  NS> that explicitly including the header dependencies is better.
  NS> What do you think?

Isn't it overkill to have every .o file depend on all the .h files?
If I change cobject.h, there are very few .o files that depend on this
change.  I suppose, however, it's not worth the effort to get it right
at a finer granularity, e.g. that the only files that depend on
cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
and unicodeobject.
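
For what it's worth, a finer-grained list could in principle be computed
rather than maintained by hand.  A rough sketch, assuming only quoted
#include lines matter and ignoring conditional compilation; the function
names and the sample file contents are made up for illustration:

```python
import re

# Matches lines such as:  #include "cobject.h"   (angle-bracket includes are ignored)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"', re.MULTILINE)


def direct_includes(source_text):
    """Return the set of headers named in quoted #include lines."""
    return set(INCLUDE_RE.findall(source_text))


def reaches(name, header, sources, seen=None):
    """True if file `name` includes `header`, directly or through other headers."""
    seen = set() if seen is None else seen
    if name in seen or name not in sources:
        return False
    seen.add(name)
    for inc in direct_includes(sources[name]):
        if inc == header or reaches(inc, header, sources, seen):
            return True
    return False


def dependents(sources, header):
    """Return the sorted .c files in `sources` that depend on `header`."""
    return sorted(n for n in sources
                  if n.endswith('.c') and reaches(n, header, sources))


# Hypothetical miniature source tree: file name -> file contents.
sample = {
    'cobject.c': '#include "Python.h"\n',
    'Python.h': '#include "object.h"\n#include "cobject.h"\n',
    'object.h': '',
    'cobject.h': '',
    'standalone.c': '#include <stdio.h>\n#include "object.h"\n',
}
```

Note that anything reaching the header through an umbrella include such
as Python.h counts as a dependent, so the computed list can be much
longer than a hand-written one.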

Jeremy






From fdrake at acm.org  Fri Jan 26 21:36:18 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Jan 2001 15:36:18 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>
References: <20010124073155.B32266@glacier.fnational.com>
	<14960.24223.599357.388059@localhost.localdomain>
	<20010125050753.A1573@glacier.fnational.com>
	<14961.47808.315324.734238@localhost.localdomain>
Message-ID: <14961.57282.880552.358709@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 > Isn't it overkill to have every .o file depend on all the .h files?
 > If I change cobject.h, there are very few .o files that depend on this
 > change.  I suppose, however, it's not worth the effort to get it right

  Perhaps.  It's definitely easier to maintain than tracking it more
specifically and better than what we had, so I'll live with it.  ;)

 > at a finer granularity, e.g. that the only files that depend on
 > cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
 > and unicodeobject.

  And py_curses.h, which is also used in _curses_panel.c.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Fri Jan 26 14:58:50 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 05:58:50 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>; from jeremy@alum.mit.edu on Fri, Jan 26, 2001 at 12:58:24PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain> <20010125050753.A1573@glacier.fnational.com> <14961.47808.315324.734238@localhost.localdomain>
Message-ID: <20010126055850.C4918@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:58:24PM -0500, Jeremy Hylton wrote:
> Isn't it overkill to have every .o file depend on all the .h files?

Maybe, but Python compiles pretty fast anyhow.  I'd rather err
on the safe side (ie. compiling too much).  Trying to figure out
which of the subheaders a .c file uses when it imports Python.h
would be a lot of work and error prone.  More power to you if you
want to do it.  ;-)

  Neil



From dgoodger at atsautomation.com  Fri Jan 26 22:46:13 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 16:46:13 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>

[CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]

I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
rusty (long live Python!), I don't know my way around configure, and am not
familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
tweaks), but I'm getting caught by the new way of building things. Please
help if you can! Many thanks in advance.

Here's an excerpt of my efforts:

    # cd /tmp/py
    # gunzip -c < python-2.1a1.tgz | tar -rf -
    # cd Python-2.1a1
    # ./configure 2>&1 | tee ../configure.1
    # make 2>&1 | tee ../make.1
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    'import site' failed; use -v for traceback
    Traceback (most recent call last):
      File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
        import sys, os, string, getopt
    ImportError: No module named string

Running ./python results in stack overflow. The old QNX instructions in
README recommend editing Modules/Makefile:
    LDFLAGS=    -N 64k

    # make 2>&1 | tee ../make.2

Same error as first make. But now the stack doesn't overflow.

    # python
    'import site' failed; use -v for traceback
    Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
    Type "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.path
    ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
    '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
    '/tmp/py/Python-2.1a1/Modules']
    >>> ^D

    # fullpath .
    . is //5/tmp/py/Python-2.1a1

The QNX node number prefix '//5' (machine or host number, equivalent to a
'hostname:' prefix for network paths) is being reduced somehow (path
normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
required at the head of the path. Is this something that can be fixed?

I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
to correct this:

    # prefix -A /5=//5

Now /5 points to //5, similar to a link.

    # make 2>&1 | tee ../make.3
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    unable to execute ld: No such file or directory
    running build
    running build_ext
    building 'struct' extension
    creating build
    creating build/temp.qnx-J-PCI-2.1
    cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
-I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
build/temp.qnx-J-PCI-2.1/structmodule.o
    creating build/lib.qnx-J-PCI-2.1
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

QNX doesn't have an 'ld' command. Is configure not getting its info to
setup.py? (Is it supposed to?)

What should I check? I have logs of each of the configure & make runs.
Should I submit this as a bug on SourceForge?

Hope to hear from somebody soon.


David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger at atsautomation.com



From guido at digicool.com  Fri Jan 26 22:52:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 26 Jan 2001 16:52:47 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 16:46:13 EST."
             <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> 
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> 
Message-ID: <200101262152.QAA26624@cj20424-a.reston1.va.home.com>

> [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> 
> I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> rusty (long live Python!), I don't know my way around configure, and am not
> familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> tweaks), but I'm getting caught by the new way of building things. Please
> help if you can! Many thanks in advance.
> 
> Here's an excerpt of my efforts:
> 
>     # cd /tmp/py
>     # gunzip -c < python-2.1a1.tgz | tar -rf -
>     # cd Python-2.1a1
>     # ./configure 2>&1 | tee ../configure.1
>     # make 2>&1 | tee ../make.1
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     'import site' failed; use -v for traceback
>     Traceback (most recent call last):
>       File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
>         import sys, os, string, getopt
>     ImportError: No module named string
> 
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2
> 
> Same error as first make. But now the stack doesn't overflow.
> 
>     # python
>     'import site' failed; use -v for traceback
>     Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
>     Type "copyright", "credits" or "license" for more information.
>     >>> import sys
>     >>> sys.path
>     ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
>     '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
>     '/tmp/py/Python-2.1a1/Modules']
>     >>> ^D
> 
>     # fullpath .
>     . is //5/tmp/py/Python-2.1a1
> 
> The QNX node number prefix '//5' (machine or host number, equivalent to a
> 'hostname:' prefix for network paths) is being reduced somehow (path
> normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> required at the head of the path. Is this something that can be fixed?

Aha -- you may need QNX-specific path manipulation functions.  What's
going on is that site.py normalizes the entries in sys.path, using
this function:

    def makepath(*paths):
	dir = os.path.join(*paths)
	return os.path.normcase(os.path.abspath(dir))

I've got a feeling that os.path.abspath(dir) here is the culprit in
posixpath.py:

def abspath(path):
    """Return an absolute path."""
    if not isabs(path):
        path = join(os.getcwd(), path)
    return normpath(path)

And here I think that normpath(path) is the routine that actually gets
rid of the double leading /.

Feel free to submit a patch that leaves double leading slashes in if
on QNX.
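
A minimal sketch of what such a patch might do, written as a wrapper so
it can be tried without touching posixpath.py.  The function name is
hypothetical, and keeping exactly two leading slashes (rather than any
number) is an assumption about what QNX needs:

```python
import posixpath


def normpath_keep_node_prefix(path, normpath=posixpath.normpath):
    """Normalize path, but keep a leading '//' if the input had exactly two slashes."""
    result = normpath(path)
    has_double = path.startswith('//') and not path.startswith('///')
    if has_double and not result.startswith('//'):
        # The underlying normpath collapsed '//' to '/'; put one slash back.
        result = '/' + result
    return result


def collapsing_normpath(path):
    """Stand-in that collapses every leading slash to one, as described above."""
    return '/' + '/'.join(c for c in path.split('/') if c not in ('', '.'))
```

Passing collapsing_normpath as the second argument shows the wrapper
restoring the prefix: '//5/foo/bar' comes back unchanged, while
'/5/foo/bar' is left with its single slash.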

> I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
> to correct this:
> 
>     # prefix -A /5=//5
> 
> Now /5 points to //5, similar to a link.
> 
>     # make 2>&1 | tee ../make.3
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     unable to execute ld: No such file or directory
>     running build
>     running build_ext
>     building 'struct' extension
>     creating build
>     creating build/temp.qnx-J-PCI-2.1
>     cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
> -I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
> build/temp.qnx-J-PCI-2.1/structmodule.o
>     creating build/lib.qnx-J-PCI-2.1
>     ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
> build/lib.qnx-J-PCI-2.1/struct.so
>     error: command 'ld' failed with exit status 1
>     make: *** [sharedmods] Error 1
> 
> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)
> 
> What should I check? I have logs of each of the configure & make runs.
> Should I submit this as a bug on SourceForge?
> 
> Hope to hear from somebody soon.

This is probably in the realm of the distutils.  I have no idea how to
teach it to build on QNX, sorry!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at cnri.reston.va.us  Fri Jan 26 23:01:01 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:01:01 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126170101.B2762@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
>    ImportError: No module named string

The 'import string' in setup.py actually seems to be redundant now,
since nothing seems to actually refer to the string module.  I've
removed it from CVS.

>The QNX node number prefix '//5' (machine or host number, equivalent to a
>'hostname:' prefix for network paths) is being reduced somehow (path
>normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
>required at the head of the path. Is this something that can be fixed?

Ooh, very likely:
>>> os.path.normpath('//5/foo/bar')
'/5/foo/bar'

Isn't // at the root a Unix convention of some sort for some 
network filesystems?  Probably normpath() should just leave it alone.

>QNX doesn't have an 'ld' command. Is configure not getting its info to
>setup.py? (Is it supposed to?)

setup.py should be parsing the Makefile.  The old QNX instructions say
Modules/Makefile should be edited, but with Neil's non-recursive
Makefile patch (committed after alpha1's release), editing
Modules/Makefile will have no effect.  Try editing just the top-level
Makefile, which should affect setup.py.

--amk
 



From mal at lemburg.com  Fri Jan 26 23:15:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:15:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us>
Message-ID: <3A71F6ED.D6D642A7@lemburg.com>

"Andrew M. Kuchling" wrote:
> >The QNX node number prefix '//5' (machine or host number, equivalent to a
> >'hostname:' prefix for network paths) is being reduced somehow (path
> >normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> >required at the head of the path. Is this something that can be fixed?
> 
> Ooh, very likely:
> >>> os.path.normpath('//5/foo/bar')
> '/5/foo/bar'
> 
> Isn't // at the root a Unix convention of some sort for some
> network filesystems?  Probably normpath() should just leave it alone.

Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
should probably leave the leading '//' untouched (having too
many of those in the path doesn't do any harm, AFAIK).
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From nas at arctrix.com  Fri Jan 26 16:26:12 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:26:12 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126072611.A5345@glacier.fnational.com>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2

The README should be changed to say edit the toplevel Makefile.
Should those flags be the default?  If you can give me the
MACHDEP from your Makefile I can add it to configure.in.

> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)

I'm not sure how distutils figures out what to use for ld.  It
doesn't appear in the Makefile.  I think this is probably some
distutils thing.  Andrew?

  Neil



From fredrik at effbot.org  Fri Jan 26 23:25:34 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 26 Jan 2001 23:25:34 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com>
Message-ID: <001a01c087e6$ec3b9710$e46940d5@hagrid>

mal wrote:
> > Ooh, very likely:
> > >>> os.path.normpath('//5/foo/bar')
> > '/5/foo/bar'
> > 
> > Isn't // at the root a Unix convention of some sort for some
> > network filesystems?  Probably normpath() should just leave it alone.
> 
> Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> should probably leave the leading '//' untouched (having too
> many of those in the path doesn't do any harm, AFAIK).

from 1.5.2's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    import string
    # Treat initial slashes specially
    slashes = ''
    while path[:1] == '/':
        slashes = slashes + '/'
        path = path[1:]
    ...
    return slashes + string.joinfields(comps, '/')

from 2.0's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    if path == '':
        return '.'
    import string
    initial_slash = (path[0] == '/')
    ...
    if initial_slash:
        path = '/' + path
    return path or '.'

interesting...

Cheers /F




From akuchlin at cnri.reston.va.us  Fri Jan 26 23:28:03 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:28:03 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126072611.A5345@glacier.fnational.com>; from nas@arctrix.com on Fri, Jan 26, 2001 at 07:26:12AM -0800
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com>
Message-ID: <20010126172803.A2817@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
>I'm not sure how distutils figures out what to use for ld.  It
>doesn't appear in the Makefile.  I think this is probably some
>distutils thing.  Andrew?

It looks at LDSHARED.  See customize_compiler in
Lib/distutils/sysconfig.py.  Looking in Modules/Makefile, LDFLAGS is
only used for the final link to produce a Python executable, so I
think this is up to the Makefile, not setup.py.
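
A quick way to see which value a given build recorded.  This is only a
sketch: it uses the standalone sysconfig module found in later Python
releases, which is an assumption relative to the 2.1 tree discussed here
(there the lookup lives in distutils), and the function name is made up:

```python
import sysconfig


def shared_link_command():
    """Return the LDSHARED value recorded at build time, or None if undefined."""
    return sysconfig.get_config_var("LDSHARED")


if __name__ == "__main__":
    # On a platform without an 'ld' command this is the value to inspect.
    print(shared_link_command())
```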

--amk



From nas at arctrix.com  Fri Jan 26 16:56:41 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:56:41 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126172803.A2817@amarok.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Fri, Jan 26, 2001 at 05:28:03PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com> <20010126172803.A2817@amarok.cnri.reston.va.us>
Message-ID: <20010126075641.A5534@glacier.fnational.com>

On Fri, Jan 26, 2001 at 05:28:03PM -0500, Andrew M. Kuchling wrote:
> On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
> >I'm not sure how distutils figures out what to use for ld.
> 
> It looks at LDSHARED.

Okay.  David, what should LDSHARED say for QNX?  I can add the
magic to configure.in.

  Neil



From mal at lemburg.com  Fri Jan 26 23:51:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:51:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>
Message-ID: <3A71FF5D.DC609775@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > > Ooh, very likely:
> > > >>> os.path.normpath('//5/foo/bar')
> > > '/5/foo/bar'
> > >
> > > Isn't // at the root a Unix convention of some sort for some
> > > network filesystems?  Probably normpath() should just leave it alone.
> >
> > Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> > should probably leave the leading '//' untouched (having too
> > many of those in the path doesn't do any harm, AFAIK).
> 
> from 1.5.2's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     import string
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     ...
>     return slashes + string.joinfields(comps, '/')
> 
> from 2.0's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     if path == '':
>         return '.'
>     import string
>     initial_slash = (path[0] == '/')
>     ...
>     if initial_slash:
>         path = '/' + path
>     return path or '.'
> 
> interesting...

Here's the log message:

revision 1.34
date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
added rewritten normpath from Moshe Zadka that does the right thing with
paths containing ..

and the diff:

diff -r1.34 -r1.33
349,350d348
<     if path == '':
<         return '.'
352,367c350,372
<     initial_slash = (path[0] == '/')
<     comps = string.split(path, '/')
<     new_comps = []
<     for comp in comps:
<         if comp in ('', '.'):
<             continue
<         if (comp != '..' or (not initial_slash and not new_comps) or 
<              (new_comps and new_comps[-1] == '..')):
<             new_comps.append(comp)
<         elif new_comps:
<             new_comps.pop()
<     comps = new_comps
<     path = string.join(comps, '/')
<     if initial_slash:
<         path = '/' + path
<     return path or '.'
---
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     comps = string.splitfields(path, '/')
>     i = 0
>     while i < len(comps):
>         if comps[i] == '.':
>             del comps[i]
>             while i < len(comps) and comps[i] == '':
>                 del comps[i]
>         elif comps[i] == '..' and i > 0 and comps[i-1] not in ('', '..'):
>             del comps[i-1:i+1]
>             i = i-1
>         elif comps[i] == '' and i > 0 and comps[i-1] <> '':
>             del comps[i]
>         else:
>             i = i+1
>     # If the path is now empty, substitute '.'
>     if not comps and not slashes:
>         comps.append('.')
>     return slashes + string.joinfields(comps, '/')

Revision 1.33 clearly leaves initial slashes untouched.
I guess we should restore this...
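
To make the difference concrete, here is a sketch that ports both sides
of the diff above to present-day Python (str methods instead of the
string module, != instead of <>).  The names normpath_r133 and
normpath_r134 are made up for this comparison:

```python
def normpath_r133(path):
    """Port of revision 1.33: leading slashes are set aside and put back as-is."""
    slashes = ''
    while path[:1] == '/':
        slashes = slashes + '/'
        path = path[1:]
    comps = path.split('/')
    i = 0
    while i < len(comps):
        if comps[i] == '.':
            del comps[i]
            while i < len(comps) and comps[i] == '':
                del comps[i]
        elif comps[i] == '..' and i > 0 and comps[i-1] not in ('', '..'):
            del comps[i-1:i+1]
            i = i - 1
        elif comps[i] == '' and i > 0 and comps[i-1] != '':
            del comps[i]
        else:
            i = i + 1
    # If the path is now empty, substitute '.'
    if not comps and not slashes:
        comps.append('.')
    return slashes + '/'.join(comps)


def normpath_r134(path):
    """Port of revision 1.34: only remembers whether the path began with a slash."""
    if path == '':
        return '.'
    initial_slash = (path[0] == '/')
    new_comps = []
    for comp in path.split('/'):
        if comp in ('', '.'):
            continue
        if (comp != '..' or (not initial_slash and not new_comps) or
                (new_comps and new_comps[-1] == '..')):
            new_comps.append(comp)
        elif new_comps:
            new_comps.pop()
    path = '/'.join(new_comps)
    if initial_slash:
        path = '/' + path
    return path or '.'
```

Fed '//5/foo/bar', the 1.33 port keeps both slashes and the 1.34 port
returns '/5/foo/bar'.  Fed '/../foo', the 1.33 port leaves the '..' in
place while 1.34 drops it, so a straight revert would also lose the
'..' handling that 1.34 added.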

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From nas at arctrix.com  Fri Jan 26 17:12:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 08:12:15 -0800
Subject: [Python-Dev] LINKCC defaults to CXX
Message-ID: <20010126081215.B5534@glacier.fnational.com>

Dear lord why?  So people can develop extensions using C++?  It's
not worth the pain inflicted on everyone else.  Let them
recompile with LINKCC=CXX.

Linking with CXX opens a huge can of stinky worms.  First of all,
just because configure found a value for CXX doesn't mean it
works.  Even if it does that doesn't mean that using it is a good
idea.  Linking with CXX will bring in the C++ runtime.  There are
a large number of platforms where the C++ ABI has not been
standardized; for example, anything that used g++.

Can we please leave LINKCC default to CXX?  It's easy enough for
the crazies to override if they like.  I'll even create a
configure option for them.

  Neil



From barry at digicool.com  Sat Jan 27 00:09:57 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 26 Jan 2001 18:09:57 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
References: <20010126081215.B5534@glacier.fnational.com>
Message-ID: <14962.965.464326.794431@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

    NS> Can we please leave LINKCC default to CXX?

I think you mean default it to CC, eh?  +1



From mal at lemburg.com  Sat Jan 27 01:16:01 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 01:16:01 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <3A721341.3F348E51@lemburg.com>

I just got a request from someone who wants to test the latest
CVS version but unfortunately can't because he's behind a 
firewall.

Is there any chance of reactivating the nightly tarball generation
that was once in place ?

	http://www.python.org/download/cvs.html

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From dgoodger at atsautomation.com  Sat Jan 27 01:30:21 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 19:30:21 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>

Thank you all for your prompt replies. (Guido's was within seconds! Well,
minutes, certainly.)

I'll give it another go on Monday. I've got renovations to fill my weekend.

/David



From thomas at xs4all.net  Sat Jan 27 01:35:41 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 27 Jan 2001 01:35:41 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 07:30:21PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>
Message-ID: <20010127013541.N962@xs4all.nl>

On Fri, Jan 26, 2001 at 07:30:21PM -0500, Goodger, David wrote:

> Thank you all for your prompt replies. (Guido's was within seconds! Well,
> minutes, certainly.)

Oh, the wonderful things one can do with a time machine....

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From jeremy at alum.mit.edu  Fri Jan 26 23:14:26 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 17:14:26 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A721341.3F348E51@lemburg.com>
References: <3A721341.3F348E51@lemburg.com>
Message-ID: <14961.63170.394043.790610@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> I just got a request from someone who wants to test the latest
  MAL> CVS version but unfortunately can't because he's behind a
  MAL> firewall.

  MAL> Is there any chance of reactivating the nightly tarball
  MAL> generation that was once in place ?

  MAL> 	http://www.python.org/download/cvs.html

I plan to set up nightly cvs snapshots soon.  We should be moving into
our new office next week; I hope to have a machine that is on the net
24x7 shortly after that.

Jeremy



From bckfnn at worldonline.dk  Sat Jan 27 08:58:38 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sat, 27 Jan 2001 07:58:38 GMT
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <14961.63170.394043.790610@localhost.localdomain>
References: <3A721341.3F348E51@lemburg.com> <14961.63170.394043.790610@localhost.localdomain>
Message-ID: <3a727e79.835771@smtp.worldonline.dk>

>>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
>
>  MAL> I just got a request from someone who wants to test the latest
>  MAL> CVS version but unfortunately can't because he's behind a
>  MAL> firewall.
>
>  MAL> Is there any chance of reactivating the nightly tarball
>  MAL> generation that was once in place ?
>
>  MAL> 	http://www.python.org/download/cvs.html

[Jeremy]

>I plan to set up nightly cvs snapshots soon.  We should be moving into
>our new office next week; I hope to have a machine that is on the net
>24x7 shortly after that.

FWIW, I have been using this cron and shell script running on
shell.sourceforge.net. This way I don't need 24x7 in order to make a cvs
tarball (and .zip) available.


22 2 * * * $HOME/bin/jython-snap



SHOTLABEL=`date +%Y%m%d`
LOGLABEL=log.`date +%Y%m%d`
cd /home/groups/jython/htdocs/cvssnaps
(cvs -Qd :pserver:anonymous@cvs1:/cvsroot/jython checkout -d \
  jython-$SHOTLABEL jython && \
  tar zcf jython-nightly.tar.gz jython-$SHOTLABEL && \
  rm -fr jython-nightly.zip && \
  zip -qr9 jython-nightly.zip jython-$SHOTLABEL && \
  rm -fr jython-$SHOTLABEL) >$LOGLABEL 2>&1
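
The packaging half of the script above could be sketched in Python roughly as follows. This is only an illustration: the helper names `make_snapshot` and `snapshot_label` are made up for the example, the cvs checkout step is left out, and the tree is assumed to be checked out already.

```python
import os
import tarfile
import time
import zipfile


def snapshot_label(now=None):
    """Return a date label such as 20010127, like `date +%Y%m%d`."""
    return time.strftime("%Y%m%d", time.localtime(now))


def make_snapshot(checkout_dir, out_dir, name="jython-nightly"):
    """Pack an already checked-out tree into NAME.tar.gz and NAME.zip.

    Hypothetical helper: it mirrors only the tar/zip steps of the
    shell script, not the cvs checkout or the cleanup.
    """
    checkout_dir = os.path.abspath(checkout_dir)
    parent = os.path.dirname(checkout_dir)
    label = os.path.basename(checkout_dir)
    tar_path = os.path.join(out_dir, name + ".tar.gz")
    zip_path = os.path.join(out_dir, name + ".zip")

    # Roughly: tar zcf jython-nightly.tar.gz jython-$SHOTLABEL
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(checkout_dir, arcname=label)

    # Roughly: rm -fr jython-nightly.zip; zip -qr9 jython-nightly.zip ...
    # Opening with mode "w" replaces any previous archive.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(checkout_dir):
            for filename in files:
                full = os.path.join(root, filename)
                zf.write(full, os.path.relpath(full, parent))
    return tar_path, zip_path
```

Both archives keep the dated directory name as their top-level entry, as the shell version does.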


regards,
finn



From tim.one at home.com  Sat Jan 27 10:35:14 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 27 Jan 2001 04:35:14 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <20010126092559.A5623@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>

[Eric S. Raymond]
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Excellent, Eric!  You're batting 1%.  Here's how to boost it to 93%:
whenever a new idea comes up, just grumble "no".  You'll be right 92% of the
time <wink>.

Reminds me of a friend who got sucked into working at a neural-net startup
trying to build a black box to predict whether the daily close of the S&P
500 would be above or below the previous day's.  He was greatly impressed by
the research they had done, showing that the prototype got the right answer
more than half the time when fed historical data, and at a very high
significance level (i.e., it almost certainly did better than flipping a
coin).  What he didn't realize at the time is that if they had written the
prototype in Python:

    # S&P close daily direction predictor
    print "higher"

it would have been right about 2/3rds the time <0.33 wink>.
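
The point about baselines can be made concrete with a small sketch in present-day Python. The data are invented for the illustration (they are not S&P figures), and `constant_baseline_accuracy` is a made-up name; the idea is only that a model's hit rate means little until it is compared with the hit rate of the best constant guess.

```python
def constant_baseline_accuracy(outcomes):
    """Return (best_guess, accuracy) for always predicting one label."""
    counts = {}
    for outcome in outcomes:
        counts[outcome] = counts.get(outcome, 0) + 1
    best = max(counts, key=counts.get)
    return best, counts[best] / len(outcomes)


# Invented sample: 6 "higher" days and 3 "lower" days.
days = ["higher", "higher", "lower", "higher", "lower",
        "higher", "higher", "lower", "higher"]
guess, accuracy = constant_baseline_accuracy(days)
print(guess, round(accuracy, 2))
```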

never-ascribe-to-insight-what-can-be-explained-by-idiocy-ly y'rs  - tim




From martin at mira.cs.tu-berlin.de  Sat Jan 27 10:38:41 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 27 Jan 2001 10:38:41 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>

> Is there any chance of reactivating the nightly tarball generation
> that was once in place ?

What's wrong with

http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

?

Regards,
Martin



From fredrik at effbot.org  Sat Jan 27 11:43:50 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sat, 27 Jan 2001 11:43:50 +0100
Subject: [Python-Dev] setup.py
References: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>
Message-ID: <008c01c0884e$09bd2030$e46940d5@hagrid>

tim wrote:
> Reminds me of a friend who got sucked into working at a neural-net startup
> trying to build a black box to predict whether the daily close of the S&P
> 500 would be above or below the previous day's.  /.../
> 
>     # S&P close daily direction predictor
>     print "higher"

replace "higher" with "same", and you have a pretty
decent weather predictor.

Cheers /F




From mal at lemburg.com  Sat Jan 27 13:01:30 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 13:01:30 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
Message-ID: <3A72B89A.E03C1912@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Is there any chance of reactivating the nightly tarball generation
> > that was once in place ?
> 
> What's wrong with
> 
> http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> ?

I didn't realize that SF does this automagically. Could someone
please redirect the link on the python.org cvs page to the
above address (David Ascher's tarball generation stopped in
February 2000 !).

Thanks for the hint, Martin.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Sat Jan 27 14:16:01 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 27 Jan 2001 08:16:01 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A72B89A.E03C1912@lemburg.com>
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
	<3A72B89A.E03C1912@lemburg.com>
Message-ID: <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>

"Martin v. Loewis" wrote:
 > What's wrong with
 > 
 > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

M.-A. Lemburg writes:
 > I didn't realize that SF does this automagically. Could someone
 > please redirect the link on the python.org cvs page to the
 > above address (David Ascher's tarball generation stopped in
 > February 2000 !).

  Did you want a "snapshot" or a copy of the repository?  What SF
produces is a tarball of the repository, not a snapshot.  We still
need to do something to create snapshots.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Sat Jan 27 14:28:40 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 14:28:40 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
		<3A72B89A.E03C1912@lemburg.com> <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>
Message-ID: <3A72CD08.F47DAA69@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> "Martin v. Loewis" wrote:
>  > What's wrong with
>  >
>  > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> M.-A. Lemburg writes:
>  > I didn't realize that SF does this automagically. Could someone
>  > please redirect the link on the python.org cvs page to the
>  > above address (David Ascher's tarball generation stopped in
>  > February 2000 !).
> 
>   Did you want a "snapshot" or a copy of the repository?  What SF
> produces is a tarball of the repository, not a snapshot. 

I meant a copy of what you get when you check out the Python
CVS tree wrapped into a .tar.gz file. The size of the above
archive (16MB) suggests that a lot more is going into the
.tar.gz file. A .tar.gz of the CVS checkout is around 4MB in
size. Looks like we still need to do something after all ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From armin at steinhoff.de  Sat Jan 27 17:24:57 2001
From: armin at steinhoff.de (Armin Steinhoff)
Date: Sat, 27 Jan 2001 17:24:57 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <4.3.2.7.2.20010127170125.00b2ee80@mail.secureweb.de>

Hello Guido,

nice to see the first 2.1 version :)

At 16:52 26.01.01 -0500, you wrote:
> > [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> >
> > I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> > rusty (long live Python!), I don't know my way around configure, and am not
> > familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> > tweaks), but I'm getting caught by the new way of building things. Please
> > help if you can! Many thanks in advance.
> >
> > Here's an excerpt of my efforts:
> >
> >     # cd /tmp/py
> >     # gunzip -c < python-2.1a1.tgz | tar -rf -
> >     # cd Python-2.1a1
> >     # ./configure 2>&1 | tee ../configure.1

I did a fast hack with the new 2.1 version:

CC=cc LINKCC=cc configure --without-gcc --shared=no --without-threads

(Hope '--shared=no' works ... QNX4 doesn't support dynamic loading)
Please replace all references to g++ with cc in the main Makefile and in
Modules/Makefile.
In Modules/Makefile, set LDFLAGS=250K ... the default stack size of 32K
seems to be too small.

> >     # make 2>&1 | tee ../make.1
> >     ...
> >     ./python //5/tmp/py/Python-2.1a1/setup.py build
> >     'import site' failed; use -v for traceback

'python -v' shows that the module 'distutils.util' isn't there ....  it
does not seem to be included in the source distribution.

'import site' failed; traceback:
Traceback (most recent call last):
  File "//1/Python-2.1a1/Lib/site.py", line 85, in ?
    from distutils.util import get_platform
ImportError: No module named distutils.util
                             ^^^^^^^^^^^^^^
[ clip ..]

>This is probably in the realm of the distutils.  I have no idea how to
>teach it to build on QNX, sorry!

IMHO ... it is not a path problem.

At the moment I have no time to go into these details. A
clean port will happen in a few weeks. Please check out PyQNX for news
regarding QNX4.25 and QNX6.0  (aka  QNX Neutrino).

Greetings

Armin Steinhoff

Life-Demo of PyDACHS
http://www.dachs.net/PyDACHS_python-tilcon.htm
in our booth at
Embedded Systems 2001, Nuremberg, GER
http://www.embedded-systems-messe.de
Febr. 14-16, 2001            Hall 11, Booth P 04






From guido at digicool.com  Sat Jan 27 17:50:50 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:50:50 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
In-Reply-To: Your message of "Fri, 26 Jan 2001 08:12:15 PST."
             <20010126081215.B5534@glacier.fnational.com> 
References: <20010126081215.B5534@glacier.fnational.com> 
Message-ID: <200101271650.LAA30720@cj20424-a.reston1.va.home.com>

> Dear lord why?  So people can develop extensions using C++?  It's
> not worth the pain inflicted on everyone else.  Let them
> recompile with LINKCC=CXX.
> 
> Linking with CXX opens a huge can of stinky worms.  First of all,
> just because configure found a value for CXX doesn't mean it
> works.  Even if it does that doesn't mean that using it is a good
> idea.  Linking with CXX will bring in the C++ runtime.  There are
> a large number of platforms where the C++ ABI has not been
> standardized; for example, anything that used g++.
> 
> Can we please leave LINKCC default to CC?  It's easy enough for
> the crazies to override if they like.  I'll even create a
> configure option for them.

Arg.  My bad.  I did this as an experiment; it didn't break on my
machine, but I didn't intend this to become the standard!  Thanks for
changing it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 27 17:52:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:52:23 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 23:51:09 +0100."
             <3A71FF5D.DC609775@lemburg.com> 
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>  
            <3A71FF5D.DC609775@lemburg.com> 
Message-ID: <200101271652.LAA30750@cj20424-a.reston1.va.home.com>

> revision 1.34
> date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> added rewritten normpath from Moshe Zadka that does the right thing with
> paths containing ..
[...]
> Revision 1.33 clearly leaves initial slashes untouched.
> I guess we should restore this...

Yes, please!  (Just the "leading extra slashes stay" behavior.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 27 17:57:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:57:40 -0500
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: Your message of "Fri, 26 Jan 2001 17:02:09 EST."
             <list-760656@digicool.com> 
References: <list-760656@digicool.com> 
Message-ID: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>

Barry noticed:

> Anyway, did you know that you can use functions as keys to a
> dictionary, but that you can mutate them to "lose" the element?
> 
> -------------------- snip snip --------------------
> Python 2.0 (#13, Jan 10 2001, 13:06:39) 
> [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> d = {}
> >>> def foo(): pass
> ... 
> >>> def bar(): pass
> ... 
> >>> d[foo] = 1
> >>> d[foo]
> 1
> >>> foocode = foo.func_code
> >>> foo.func_code = bar.func_code
> >>> d[foo]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: <function foo at 0x81ef474>
> >>> d[bar] = 2
> >>> d[bar]
> 2
> >>> d[foo]
> 2
> >>> foo.func_code = foocode
> >>> d[foo]
> 1
> -------------------- snip snip --------------------
> 
> It's because a function's func_code attribute is used in its hash
> calculation, but func_code is writable!

Clearly, something changed.  I'm pretty sure it's the function
attributes.  Either the function attributes shouldn't be used in
comparing function objects, or hash() on functions should be
unimplemented, or comparison on functions should use simple pointer
compares.

What's the right solution?  Do people use functions as dict keys?  If
not, we can remove the hash() implementation.  But I suspect they
*are* used as dict keys.  Not using the __dict__ on comparisons
appears ugly, so probably the best solution is to change function
comparisons to use simple pointer compares.  That removes the
possibility to see whether two different functions implement the same
code -- but does anybody really use that?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Sat Jan 27 18:17:50 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 19:17:50 +0200 (IST)
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
References: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>, <list-760656@digicool.com>
Message-ID: <20010127171750.91412A840@darjeeling.zadka.site.co.il>

On Sat, 27 Jan 2001 11:57:40 -0500, Guido van Rossum <guido at digicool.com> wrote:

(about function hash doing the wrong thing)
> What's the right solution?

I have no idea...

>  Do people use functions as dict keys?  If
> not, we can remove the hash() implementation.

...but this ain't it.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From gvwilson at ca.baltimore.com  Sat Jan 27 18:23:42 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Sat, 27 Jan 2001 12:23:42 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1119 - 17 msgs
In-Reply-To: <20010127170103.DA6DEEA44@mail.python.org>
Message-ID: <000001c08885$e5418c40$770a0a0a@nevex.com>

> Guido wrote:
> What's the right solution?  Do people use functions as dict keys?

Yup --- even use this as an example in the course (part of drumming
home to students that functions are just a special kind of data).

Greg



From barry at digicool.com  Sat Jan 27 18:43:43 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:43:43 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
	<200101271657.LAA30782@cj20424-a.reston1.va.home.com>
Message-ID: <14963.2255.268933.615456@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Clearly, something changed.  I'm pretty sure it's the
    GvR> function attributes.

Actually no.  func_code is used in func_hash() but somewhere in the
Python 1.6 cycle, func_code was made assignable.
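
The same failure can be reproduced with any object whose hash depends on state that can change after insertion. A minimal sketch in present-day Python, using a made-up `Key` class instead of function objects (small integer payloads are used so that the two hashes are known to differ):

```python
class Key:
    """Hypothetical class whose hash and equality follow a mutable field."""

    def __init__(self, payload):
        self.payload = payload

    def __hash__(self):
        return hash(self.payload)

    def __eq__(self, other):
        return isinstance(other, Key) and self.payload == other.payload


k = Key(1)
d = {k: "value"}

k.payload = 2          # mutate after insertion: hash(k) is now different
lost = k not in d      # the lookup probes with the new hash and misses

k.payload = 1          # restore the original state
found = d[k]           # the old entry is reachable again
```

This is the same shape as the func_code session: the entry never leaves the dictionary, it just cannot be located while the key's hash disagrees with the one stored at insertion time.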
    
    GvR> Either the function attributes shouldn't be used in comparing
    GvR> function objects, or hash() on functions should be
    GvR> unimplemented, or comparison on functions should use simple
    GvR> pointer compares.

    GvR> What's the right solution?

We should definitely continue to allow functions as keys to
dictionaries, but probably just remove func_code as an input to the
function's hash.
    
-Barry



From barry at digicool.com  Sat Jan 27 18:48:33 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:48:33 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
	<200101271657.LAA30782@cj20424-a.reston1.va.home.com>
	<14963.2255.268933.615456@anthem.wooz.org>
Message-ID: <14963.2545.14600.667505@anthem.wooz.org>

    Me> We should definitely continue to allow functions as keys to
    Me> dictionaries, but probably just remove func_code as an input
    Me> to the function's hash.
    
But of course, func_globals won't be sufficient as a hash for
functions.  Probably changing the hash to a pointer compare is the
best thing after all.

-Barry



From guido at digicool.com  Sat Jan 27 18:49:16 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 12:49:16 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
In-Reply-To: Your message of "Sat, 27 Jan 2001 12:43:43 EST."
             <14963.2255.268933.615456@anthem.wooz.org> 
References: <list-760656@digicool.com> <200101271657.LAA30782@cj20424-a.reston1.va.home.com>  
            <14963.2255.268933.615456@anthem.wooz.org> 
Message-ID: <200101271749.MAA32025@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:
> 
>     GvR> Clearly, something changed.  I'm pretty sure it's the
>     GvR> function attributes.
> 
> Actually no.  func_code is used in func_hash() but somewhere in the
> Python 1.6 cycle, func_code was made assignable.

Argh!  You're right.

>     GvR> Either the function attributes shouldn't be used in comparing
>     GvR> function objects, or hash() on functions should be
>     GvR> unimplemented, or comparison on functions should use simple
>     GvR> pointer compares.
> 
>     GvR> What's the right solution?
> 
> We should definitely continue to allow functions as keys to
> dictionaries, but probably just remove func_code as an input to the
> function's hash.

OK, that settles it.  There's not much point in having a function
compare do anything besides a pointer comparison when the code objects
aren't compared.  (Two completely different functions could compare
equal e.g. if they have the same attribute dict.)  So we should just
punt, and compare functions by object pointer.

The proper way to do this is to *delete* func_hash and func_compare
from funcobject.c -- the default comparison will take care of this.
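
As a sketch of what identity-based comparison buys, here is the earlier experiment redone in present-day Python, where (as far as I know) functions hash and compare by identity. Note that `__code__` is the modern spelling of `func_code`; the function names are just for the example.

```python
def foo():
    return "foo"


def bar():
    return "bar"


d = {foo: 1}
before = hash(foo)

# Swap in another function's code object; the function object itself
# (and therefore its identity) stays the same.
foo.__code__ = bar.__code__

assert hash(foo) == before   # hash does not depend on the code object
assert d[foo] == 1           # the dictionary entry is still reachable
assert foo() == "bar"        # although the behavior really did change
assert foo != bar            # same code, different objects: not equal
```

The price is the one mentioned above: two distinct functions never compare equal, even when they share a code object.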

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Sat Jan 27 19:58:30 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Sat, 27 Jan 2001 13:58:30 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org>
Message-ID: <200101271858.NAA04898@mira.erols.com>

On Sat, 27 Jan 2001 18:28:02 +0100, 
	Andreas Jung <andreas at andreas-jung.com> wrote:
>Is there a reason why 2.1 runs significantly slower ?
>Both Python versions were compiled with -g -O2 only.

[CC'ing to python-dev]  Confirmed:

[amk at mira Python-2.0]$ ./python Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.14
This machine benchmarks at 3184.71 pystones/second
[amk at mira Python-2.0]$ python2.1 Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.81
This machine benchmarks at 2624.67 pystones/second

The ceval.c changes seem a likely candidate to have caused this.
Anyone want to run Marc-Andre's microbenchmarks and see how the
numbers have changed?
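
The pystone figures above are just passes divided by elapsed time, so the size of the slowdown can be read off directly. A quick check of the arithmetic, in present-day Python (the helper name is made up):

```python
PASSES = 10000


def pystones_per_second(elapsed, passes=PASSES):
    """Benchmark rate as pystone reports it: passes / elapsed seconds."""
    return passes / elapsed


rate_20 = pystones_per_second(3.14)   # the Python 2.0 run
rate_21 = pystones_per_second(3.81)   # the Python 2.1 run
slowdown = 3.81 / 3.14 - 1            # extra time for the same work

print(round(rate_20, 2), round(rate_21, 2), round(slowdown * 100, 1))
```

On these two runs, 2.1 needs a bit over a fifth more time than 2.0 for the same 10000 passes.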

--amk




From moshez at zadka.site.co.il  Sat Jan 27 20:14:28 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 21:14:28 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
Message-ID: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>

Attached is an example Python session after I patched the interpreter.
The test-suite passes all right.

I want an OK to check this in.

Here is the patch:
Index: Objects/funcobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/funcobject.c,v
retrieving revision 2.33
diff -c -r2.33 funcobject.c
*** Objects/funcobject.c        2001/01/25 20:06:59     2.33
--- Objects/funcobject.c        2001/01/27 19:13:08
***************
*** 347,358 ****
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       (cmpfunc)func_compare, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       (hashfunc)func_hash, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/
--- 347,358 ----
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       0, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       0, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/

Python 2.1a1 (#1, Jan 27 2001, 21:01:24)
[GCC 2.95.3 20010111 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> def foo():
...     pass
...
>>> def bar():
...     pass
...
>>> hash(foo)
135484636
>>> hash(bar)
135481676
>>> foo == bar
0
>>> d = {}
>>> d[foo] =1
>>> def temp():
...     print "baz"
...
>>> foo.func_code = temp.func_code
>>> d[foo]
1

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Sat Jan 27 21:06:20 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 27 Jan 2001 15:06:20 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <200101271858.NAA04898@mira.erols.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGELGILAA.tim.one@home.com>

[A.M. Kuchling]
> [CC'ing to python-dev]  Confirmed:
>
> [amk at mira Python-2.0]$ ./python Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.14
> This machine benchmarks at 3184.71 pystones/second
> [amk at mira Python-2.0]$ python2.1 Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.81
> This machine benchmarks at 2624.67 pystones/second
>
> The ceval.c changes seem a likely candidate to have caused this.
> Anyone want to run Marc-Andre's microbenchmarks and see how the
> numbers have changed?

Want to, yes, but it looks hopeless on my box:

**** 2.0

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.851013
This machine benchmarks at 11750.7 pystones/second

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.24279
This machine benchmarks at 8046.41 pystones/second

**** 2.1a1

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.823313
This machine benchmarks at 12146 pystones/second

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.27046
This machine benchmarks at 7871.15 pystones/second

**** CVS

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.836391
This machine benchmarks at 11956.1 pystones/second

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 1.3055
This machine benchmarks at 7659.9 pystones/second


That's after a reboot:  no matter which Python I use, it gets about 12000 on
the first run with a given python.exe, and about 8000 on the second.  Not
shown is that it *stays* at about 8000 until the next reboot.

So there's a Windows (W98SE) Mystery, but also no evidence that timings have
changed worth spit under the MS compiler.  The eval loop is very touchy, and
I suspect you won't track this down on your box until staring at the code
gcc (I presume you're using gcc) generates.  May be sensitive to which
release of gcc you're using too.
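
Given run-to-run swings like these, a single timing says little. One common way to reduce the noise, sketched here in present-day Python with the standard `timeit` module, is to repeat the measurement and compare the best of each batch. The `workload` function is an invented stand-in, not pystone, and this would not explain a first-run-after-reboot effect like the one above.

```python
import timeit


def workload():
    """Stand-in for a benchmark body; any deterministic function will do."""
    total = 0
    for i in range(1000):
        total += i * i
    return total


def best_of(func, repeat=5, number=100):
    """Return (best, all_times) in seconds for `number` calls of func."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times), times


best, times = best_of(workload)
print("best of %d runs: %.6f s" % (len(times), best))
```

Taking the minimum rather than the mean tends to filter out interference from other processes, since those can only make a run slower.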

switch-to-windows-and-you'll-have-easier-things-to-worry-about<wink>-ly
    y'rs  - tim




From fredrik at pythonware.com  Sun Jan 28 10:37:45 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 28 Jan 2001 10:37:45 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>              <3A71FF5D.DC609775@lemburg.com>  <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>

guido wrote:

> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

just looked this up in the specs, and POSIX seems to
require that leading slashes are preserved only if there
are exactly two of them:

    A pathname that begins with two successive slashes
    may be interpreted in an implementation-dependent
    manner, although more than two leading slashes are
    treated as a single slash.
    (from susv2)

maybe we should add a if len(slashes) > 2: slashes = "/"
test to the patch?
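
For comparison, this is the rule `posixpath.normpath` follows in present-day Python: exactly two leading slashes are kept, three or more collapse to one. A quick check (the sample paths are arbitrary):

```python
import posixpath

samples = ["/usr//lib/../bin", "//5/tmp/py", "///5/tmp/py", "////a/./b"]
results = {path: posixpath.normpath(path) for path in samples}

for path in samples:
    print("%-20r -> %r" % (path, results[path]))
```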

Cheers /F




From thomas at xs4all.net  Sun Jan 28 18:39:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 28 Jan 2001 18:39:58 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>; from fredrik@pythonware.com on Sun, Jan 28, 2001 at 10:37:45AM +0100
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid> <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com> <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>
Message-ID: <20010128183958.Q962@xs4all.nl>

On Sun, Jan 28, 2001 at 10:37:45AM +0100, Fredrik Lundh wrote:
> guido wrote:

> > > Revision 1.33 clearly leaves initial slashes untouched.
> > > I guess we should restore this...
> > 
> > Yes, please!  (Just the "leading extra slashes stay" behavior.)

> just looked this up in the specs, and POSIX seems to
> require that leading slashes are preserved only if there
> are exactly two of them:

>     A pathname that begins with two successive slashes
>     may be interpreted in an implementation-dependent
>     manner, although more than two leading slashes are
>     treated as a single slash.
>     (from susv2)

> maybe we should add a if len(slashes) > 2: slashes = "/"
> test to the patch?

How strictly do we need (or want, for that matter) to follow POSIX here ?
I'm aware the module is called 'posixpath', but it's used in a bit more than
just POSIX environments (or POSIX behaviours) so it might make sense to
ignore this particular tidbit. What if there is a system that attaches a
special meaning to ///, should we create a new path module for it ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at mira.cs.tu-berlin.de  Sun Jan 28 21:50:35 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 28 Jan 2001 21:50:35 +0100
Subject: [Python-Dev] XSLT parser interface
Message-ID: <200101282050.f0SKoZr08809@mira.informatik.hu-berlin.de>

Based on my previous IDL interface for XPath parsers, I've defined an
API for a parser that parses XSLT pattern expressions. It is an
extension to the XPath API, so I attach only the additional functions.

Any comments are appreciated.

Martin

module XPath{
  // XSLT exprType values
  const unsigned short PATTERN = 17;
  const unsigned short LOCATION_PATTERN = 18;
  const unsigned short RELATIVE_PATH_PATTERN = 19;
  const unsigned short STEP_PATTERN = 20;

  interface Pattern;
  interface LocationPathPattern;
  interface RelativePathPattern;
  interface StepPattern;

  interface PatternFactory:ExprFactory{
    Pattern createPattern(in LocationPathPattern first);
    // idkey may be null, represents IdKeyPattern
    // if parent is true, it is '/', else '//'
    // rel may be null
    LocationPathPattern createLocationPathPattern(in FunctionCall idkey,
						  boolean parent,
						  in RelativePathPattern rel);
    // if parent is true, it is /, else //
    RelativePathPattern createRelativePathPattern(in RelativePathPattern rel,
						  boolean parent,
						  in StepPattern step);
    StepPattern createStepPattern(in AxisSpecifier axis,
				  in NodeTest test,
				  in PredicateList predicates);
  };

  typedef sequence<LocationPathPattern> LocationPathPatterns;
  interface Pattern:Expr{
    readonly attribute LocationPathPatterns patterns;
    void append(in LocationPathPattern pattern);
  };

  interface LocationPathPattern:Expr{
    readonly attribute FunctionCall idkey;
    readonly attribute boolean parent;
    readonly attribute RelativePathPattern relative_pattern;
  };

  interface RelativePathPattern:Expr{
    readonly attribute RelativePathPattern relative;
    readonly attribute boolean parent;
    readonly attribute StepPattern step;
  };

  interface StepPattern:Expr{
    readonly attribute AxisSpecifier axis;
    readonly attribute NodeTest test;
    readonly attribute PredicateList predicates;
  };

  interface XSLTParser:Parser{
    Pattern parsePattern(in DOMString pattern);
  };
};
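
To give a feel for the objects such a factory would build, here is a rough sketch in present-day Python. Everything in it is an assumption made for illustration: the class names simply mirror the IDL, node tests are plain strings, idkey, axes and predicates are left out, and the `render` helpers are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepPattern:
    test: str                                  # stands in for NodeTest


@dataclass
class RelativePathPattern:
    relative: Optional["RelativePathPattern"]  # None for the first step
    parent: bool                               # True means '/', False '//'
    step: StepPattern


@dataclass
class LocationPathPattern:
    parent: bool                               # True means '/', False '//'
    relative_pattern: Optional[RelativePathPattern]


def render_relative(rel):
    """Turn a RelativePathPattern chain back into pattern text."""
    if rel.relative is None:
        return rel.step.test
    sep = "/" if rel.parent else "//"
    return render_relative(rel.relative) + sep + rel.step.test


def render(loc):
    """Render a LocationPathPattern, prefix first."""
    prefix = "/" if loc.parent else "//"
    if loc.relative_pattern is None:
        return prefix
    return prefix + render_relative(loc.relative_pattern)


# Build the pattern //chapter/title by hand, innermost step first.
chapter = RelativePathPattern(None, True, StepPattern("chapter"))
title = RelativePathPattern(chapter, True, StepPattern("title"))
pattern = LocationPathPattern(parent=False, relative_pattern=title)
print(render(pattern))
```

The left-recursive `relative` link follows the shape of createRelativePathPattern, which takes the preceding relative pattern plus one more step.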



From skip at mojam.com  Sun Jan 28 22:40:28 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 28 Jan 2001 15:40:28 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
Message-ID: <14964.37324.642566.602319@beluga.mojam.com>

I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
been zeroed out.  I then copied a version of it over from another machine
and reran make a couple times.  Makesetup ran but nothing mentioned in
Setup.local got built.

I don't think 2.1 can be released without providing a way for users to
recover from this change.  I didn't see anything obvious in setup.py.  Am I
missing something?

Skip




From thomas at xs4all.net  Mon Jan 29 01:39:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 01:39:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20001104001415.A2093@53b.hoffleit.de>; from gregor@hoffleit.de on Sat, Nov 04, 2000 at 12:14:15AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de>
Message-ID: <20010129013904.R962@xs4all.nl>

On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> has been fixed in glibc 2.96.

Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
versioning for glibc that I was unaware of ? :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at digicool.com  Mon Jan 29 06:03:45 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:03:45 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63921.966960.445548@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> Attached is an example Python session after I patched the
    MZ> interpreter.  The test-suite passes all right.

    MZ> I want an OK to check this in.

Moshe, please remove the func_hash() and func_compare() functions, and
if the patch passes the test suite, go ahead and check it all in.
Please also check in a test case.

Thanks,
-Barry



From barry at digicool.com  Mon Jan 29 06:04:12 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:04:12 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63948.492662.775413@anthem.wooz.org>

Oh yeah, please also add an entry to the NEWS file.

Thanks,
-Barry



From moshez at zadka.site.co.il  Mon Jan 29 07:26:25 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 08:26:25 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <14964.63948.492662.775413@anthem.wooz.org>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 00:04:12 -0500, barry at digicool.com (Barry A. Warsaw) wrote:
 
> Oh yeah, please also add an entry to the NEWS file.

Done. The checkin to the NEWS file will be done in about a million years,
when my antique of a modem finishes sending the data.
I had to change test_opcodes since it tested that functions with the
same code compare equal.
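
A minimal sketch of what identity-based function comparison looks like
in practice, assuming a current CPython 3 interpreter rather than the
2.1 patch itself. The names make_func, first, second and registry are
hypothetical and exist only for illustration:

```python
def make_func():
    # Each call builds a new function object from the same source code.
    def inner():
        return 1
    return inner

first = make_func()
second = make_func()

# Functions compare by identity, so two distinct objects are unequal
# even though they were created from identical code.
same_object_equal = (first == first)
distinct_equal = (first == second)

# Identity-based hashing lets functions serve as dictionary keys.
registry = {first: "first", second: "second"}
print(same_object_equal, distinct_equal, len(registry))
```

A test that expects first == second would be checking an implementation
detail, not something every implementation has to provide.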
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From gregor at mediasupervision.de  Mon Jan 29 12:13:39 2001
From: gregor at mediasupervision.de (Gregor Hoffleit)
Date: Mon, 29 Jan 2001 12:13:39 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20010129013904.R962@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 29, 2001 at 01:39:04AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de> <20010129013904.R962@xs4all.nl>
Message-ID: <20010129121339.A1166@mediasupervision.de>

On Mon, Jan 29, 2001 at 01:39:04AM +0100, Thomas Wouters wrote:
> On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> > FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> > has been fixed in glibc 2.96.
> 
> Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
> versioning for glibc that I was unaware of ? :)

Sorry, it was fixed in glibc 2.1.96.

    Gregor
    



From mal at lemburg.com  Mon Jan 29 12:31:11 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 12:31:11 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>  
	            <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <3A75547F.A601E219@lemburg.com>

Guido van Rossum wrote:
> 
> > revision 1.34
> > date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> > added rewritten normpath from Moshe Zadka that does the right thing with
> > paths containing ..
> [...]
> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

Checked in a patch which preserves '/' and '//' but converts
three or more initial slashes into one (see Fredrik's note about
the POSIX standard on this).
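
A short sketch of the leading-slash handling, assuming the
posixpath.normpath shipped with a current CPython 3 (the sample paths
are made up for illustration):

```python
import posixpath

# One or two leading slashes are preserved as written.
single = posixpath.normpath("/usr/lib")
double = posixpath.normpath("//usr/lib")

# Three or more leading slashes collapse to one; ".." and empty
# components are resolved as usual.
triple = posixpath.normpath("///usr//lib/../bin")

print(single, double, triple)
```

The two-slash case is kept because POSIX lets an implementation give a
path starting with exactly two slashes a special meaning.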

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 13:24:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:24:15 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com>
Message-ID: <3A7560EF.39D6CF@lemburg.com>

Here are the results of my micro benchmark pybench 0.7:

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1102.30 ms    8.65 us   +7.56%
           BuiltinMethodLookup:     966.75 ms    1.84 us   +4.56%
                 ConcatStrings:    1198.55 ms    7.99 us  +11.63%
                 ConcatUnicode:    1835.60 ms   12.24 us  +19.29%
               CreateInstances:    1556.40 ms   37.06 us   +2.49%
       CreateStringsWithConcat:    1396.70 ms    6.98 us   +5.44%
       CreateUnicodeWithConcat:    1895.80 ms    9.48 us  +31.61%
                  DictCreation:    1760.50 ms   11.74 us   +2.43%
                      ForLoops:    1426.90 ms  142.69 us   -7.51%
                    IfThenElse:    1155.25 ms    1.71 us   -6.24%
                   ListSlicing:     555.40 ms  158.69 us   -4.14%
                NestedForLoops:     784.55 ms    2.24 us   -6.33%
          NormalClassAttribute:    1052.80 ms    1.75 us  -10.42%
       NormalInstanceAttribute:    1053.80 ms    1.76 us   +0.89%
           PythonFunctionCalls:    1127.50 ms    6.83 us  +12.56%
             PythonMethodCalls:     909.10 ms   12.12 us   +9.70%
                     Recursion:     942.40 ms   75.39 us  +23.74%
                  SecondImport:     924.20 ms   36.97 us   +3.98%
           SecondPackageImport:     951.10 ms   38.04 us   +6.16%
         SecondSubmoduleImport:    1211.30 ms   48.45 us   +7.69%
       SimpleComplexArithmetic:    1635.30 ms    7.43 us   +5.58%
        SimpleDictManipulation:     963.35 ms    3.21 us   -0.57%
         SimpleFloatArithmetic:     877.00 ms    1.59 us   -2.92%
      SimpleIntFloatArithmetic:     851.10 ms    1.29 us   -5.89%
       SimpleIntegerArithmetic:     850.05 ms    1.29 us   -6.41%
        SimpleListManipulation:    1168.50 ms    4.33 us   +8.14%
          SimpleLongArithmetic:    1231.15 ms    7.46 us   +1.52%
                    SmallLists:    2153.35 ms    8.44 us  +10.77%
                   SmallTuples:    1314.65 ms    5.48 us   +3.80%
         SpecialClassAttribute:    1050.80 ms    1.75 us   +1.48%
      SpecialInstanceAttribute:    1248.75 ms    2.08 us   -2.32%
                StringMappings:    1702.60 ms   13.51 us  +19.69%
              StringPredicates:    1024.25 ms    3.66 us  -25.49%
                 StringSlicing:    1093.35 ms    6.25 us   +4.35%
                     TryExcept:    1584.85 ms    1.06 us  -10.90%
                TryRaiseExcept:    1239.50 ms   82.63 us   +4.64%
                  TupleSlicing:     983.00 ms    9.36 us   +3.36%
               UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
             UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
             UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
                UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
------------------------------------------------------------------------
            Average round time:   58001.00 ms              +3.30%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)

The benchmark is available here in case someone wants to verify
the results on different platforms:

	http://www.lemburg.com/python/pybench-0.7.zip

The above tests were done on a Linux 2.2 system, AMD K6 233MHz. 
The figures shown compare CVS Python (2.1a1) against stock
Python 2.0. 

As you can see, Python function calls have suffered
a lot for some reason. Unicode mappings and other Unicode database
related methods show the effect of the compression of the Unicode
database -- a clear space/speed tradeoff. 

I can't really explain why Unicode concatenation has had a 
slowdown -- perhaps the new coercion logic has something to
do with this ?!

On the nice side: attribute lookups are faster; probably due to
the string key optimizations in the dictionary implementation.
Loops and exceptions are also a tad faster.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jan 29 13:30:32 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 Jan 2001 13:30:32 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com>
Message-ID: <01fc01c089ef$48072230$0900a8c0@SPIFF>

mal wrote:
>                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
>              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
>              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
>                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
>
> Unicode mappings and other Unicode database related methods
> show the effect of the compression of the Unicode database -- a
> clear space/speed tradeoff.

umm.  the tests don't seem to test the "\N{name}" escapes, so the
only thing that has changed in 2.1 is the "decomposition" method
(used in the UnicodeProperties test).

are you sure you're comparing against 2.0 final?

Cheers /F




From mal at lemburg.com  Mon Jan 29 13:52:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:52:12 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF>
Message-ID: <3A75677C.E4FA82A0@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> >
> > Unicode mappings and other Unicode database related methods
> > show the effect of the compression of the Unicode database -- a
> > clear space/speed tradeoff.
> 
> umm.  the tests don't seem to test the "\N{name}" escapes, so the
> only thing that has changed in 2.1 is the "decomposition" method
> (used in the UnicodeProperties test).

The mappings figure surprised me too: the code has not changed,
but the unicodetype_db.h looks different. Don't know how this
affects performance though.

The differences could also be explained by an increase in Unicode
object creation time (the concatenation is also a lot slower),
so perhaps that's where we should look...

> are you sure you're comparing against 2.0 final?

Yes... after a check of the Makefile I found that I had compiled
Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
a difference w/r to inlining of code. I'll recompile and rerun
the benchmark.
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 29 13:56:49 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 07:56:49 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>

[Ping]
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

[Guido]
> No chance of a time-machine escape, but I *can* say that I agree that
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)
>
> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Thomas since submitted a patch to do the "if key in dict" part (which I
reviewed and accepted, pending resolution of doc issues).

It does not do the "for key in dict" part.  It's not entirely clear whether
you intended to approve that part too (I've simplified away many layers of
quoting in the above <wink>).  In any case, nobody is working on that part.

WRT that part, Ping produced some stats in:

http://mail.python.org/pipermail/python-dev/2001-January/012106.html

> How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> How often do you write 'for x in dict.keys()'?     (std lib says: 49)
>
> How often do you write 'x in dict.values()'?       (std lib says: 0)
> How often do you write 'for x in dict.values()'?   (std lib says: 3)

However, he did not report on occurrences of

    for k, v in dict.items()

I'm not clear exactly which files he examined in the above, or how the
counts were obtained.  So I don't know how this compares:  I counted 188
instances of the string ".items(" in 122 .py files, under the dist/ portion
of current CVS.  A number of those were assignment and return stmts, others
were dict.items() in an arglist, and at least one was in a comment.  After
weeding those out, I was left with 153 legit "for" loops iterating over
x.items().  In all:

    153 iterating over x.items()
    118     "     over x.keys()
     17     "     over x.values()

So I conclude that iterating over .items() is significantly more common
than iterating over .keys().
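
A rough sketch of how such a count could be automated, assuming a
simple line-oriented regular expression is good enough. It is not the
method used for the numbers above, and LOOP_RE, count_dict_loops and
the sample text are hypothetical:

```python
import re

# Match lines that start (after indentation) with a "for" statement
# whose iterable ends in .items(), .keys() or .values().
LOOP_RE = re.compile(
    r"^[ \t]*for\s+.*\bin\b.*\.(items|keys|values)\(\)\s*:",
    re.MULTILINE,
)

def count_dict_loops(source):
    """Count for-loops over dict methods in one source string."""
    counts = {"items": 0, "keys": 0, "values": 0}
    for match in LOOP_RE.finditer(source):
        counts[match.group(1)] += 1
    return counts

sample = """
for k, v in d.items():
    pass
for k in d.keys():
    pass
x = d.items()
# for v in d.values(): commented out
"""

print(count_dict_loops(sample))
```

Assignments and commented-out loops are skipped because the pattern
requires the line to begin with "for"; anything subtler (loops split
across lines, sorted copies of the keys) would still need weeding out by
hand.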

On c.l.py about an hour ago, Thomas complained that two (out of two) of his
coworkers guessed wrong about what

    for x in dict:

would do, but didn't say what they *did* think it would do.  Since Thomas
doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
over either values or the lines of a freshly-opened file named "dict"
<wink>.

So if you did intend to approve "for x in dict" iterating over dict.keys(),
maybe you want to call me out on that "approval post" I forged under your
name.

falls-on-swords-so-often-there's-nothing-left-to-puncture<wink>-ly y'rs
    - tim




From mal at lemburg.com  Mon Jan 29 14:18:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:18:52 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com>
Message-ID: <3A756DBC.8EAC42F5@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Fredrik Lundh wrote:
> >
> > mal wrote:
> > >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> > >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> > >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> > >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> > >
> > > Unicode mappings and other Unicode database related methods
> > > show the effect of the compression of the Unicode database -- a
> > > clear space/speed tradeoff.
> >
> > umm.  the tests don't seem to test the "\N{name}" escapes, so the
> > only thing that has changed in 2.1 is the "decomposition" method
> > (used in the UnicodeProperties test).
> 
> The mappings figure surprised me too: the code has not changed,
> but the unicodetype_db.h looks different. Don't know how this
> affects performance though.
> 
> The differences could also be explained by an increase in Unicode
> object creation time (the concatenation is also a lot slower),
> so perhaps that's where we should look...
> 
> > are you sure you're comparing against 2.0 final?
> 
> Yes... after a check of the Makefile I found that I had compiled
> Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
> a difference w/r to inlining of code. I'll recompile and rerun
> the benchmark.

Looks like there is an effect of choosing -O3 over -O2 (even though
not necessarily positive all the way); what results do you get on
Windows ?

--

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1065.10 ms    8.35 us   +3.93%
           BuiltinMethodLookup:    1286.30 ms    2.45 us  +39.12%
                 ConcatStrings:    1243.30 ms    8.29 us  +15.80%
                 ConcatUnicode:    1449.10 ms    9.66 us   -5.83%
               CreateInstances:    1639.25 ms   39.03 us   +7.95%
       CreateStringsWithConcat:    1453.45 ms    7.27 us   +9.73%
       CreateUnicodeWithConcat:    1558.45 ms    7.79 us   +8.19%
                  DictCreation:    1869.35 ms   12.46 us   +8.77%
                      ForLoops:    1526.85 ms  152.69 us   -1.03%
                    IfThenElse:    1381.00 ms    2.05 us  +12.09%
                   ListSlicing:     547.40 ms  156.40 us   -5.52%
                NestedForLoops:     824.50 ms    2.36 us   -1.56%
          NormalClassAttribute:    1233.55 ms    2.06 us   +4.96%
       NormalInstanceAttribute:    1215.50 ms    2.03 us  +16.37%
           PythonFunctionCalls:    1107.30 ms    6.71 us  +10.55%
             PythonMethodCalls:    1047.00 ms   13.96 us  +26.34%
                     Recursion:     940.35 ms   75.23 us  +23.47%
                  SecondImport:     894.05 ms   35.76 us   +0.59%
           SecondPackageImport:     915.05 ms   36.60 us   +2.14%
         SecondSubmoduleImport:    1131.10 ms   45.24 us   +0.56%
       SimpleComplexArithmetic:    1652.05 ms    7.51 us   +6.67%
        SimpleDictManipulation:    1150.25 ms    3.83 us  +18.72%
         SimpleFloatArithmetic:     889.65 ms    1.62 us   -1.52%
      SimpleIntFloatArithmetic:     900.80 ms    1.36 us   -0.40%
       SimpleIntegerArithmetic:     901.75 ms    1.37 us   -0.72%
        SimpleListManipulation:    1125.40 ms    4.17 us   +4.15%
          SimpleLongArithmetic:    1305.15 ms    7.91 us   +7.62%
                    SmallLists:    2102.85 ms    8.25 us   +8.18%
                   SmallTuples:    1329.55 ms    5.54 us   +4.98%
         SpecialClassAttribute:    1234.60 ms    2.06 us  +19.23%
      SpecialInstanceAttribute:    1422.55 ms    2.37 us  +11.28%
                StringMappings:    1585.55 ms   12.58 us  +11.46%
              StringPredicates:    1241.35 ms    4.43 us   -9.69%
                 StringSlicing:    1206.20 ms    6.89 us  +15.12%
                     TryExcept:    1764.35 ms    1.18 us   -0.81%
                TryRaiseExcept:    1217.40 ms   81.16 us   +2.77%
                  TupleSlicing:     933.00 ms    8.89 us   -1.90%
               UnicodeMappings:    1137.35 ms   63.19 us   -0.49%
             UnicodePredicates:    1632.05 ms    7.25 us   +7.43%
             UnicodeProperties:    1244.05 ms    6.22 us   +5.44%
                UnicodeSlicing:    1252.10 ms    7.15 us   +9.27%
------------------------------------------------------------------------
            Average round time:   58804.00 ms              +4.73%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)
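
A small worked check of the "diff" column, assuming it is computed as
(new - old) / old. That formula is an assumption, not taken from the
pybench source, and implied_baseline is a hypothetical helper. The
inputs are the UnicodeMappings rows of this run and of the earlier -O2
run:

```python
def implied_baseline(new_ms, diff_percent):
    # Invert diff = (new - old) / old * 100 to recover the old timing.
    return new_ms / (1.0 + diff_percent / 100.0)

# UnicodeMappings: earlier run (1631.65 ms, +42.76%), this run
# (1137.35 ms, -0.49%).
earlier = implied_baseline(1631.65, 42.76)
current = implied_baseline(1137.35, -0.49)

print(round(earlier, 1), round(current, 1))
```

Both rows point back to nearly the same 2.0 timing, which is what one
would expect if only the 2.1a1 build changed between the two runs.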

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 14:28:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:28:24 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
Message-ID: <3A756FF8.B7185FA2@lemburg.com>

Tim Peters wrote:
> 
> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().
> 
> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.
> 
> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

Dictionaries are not sequences. I wonder what order a user of
for k,v in dict: (or whatever other variant of this proposal you choose)
will expect...

Please also take into account that dictionaries are *mutable*
and their internal state is not defined to e.g. not change due to
lookups (take the string optimization for example...), so exposing
PyDict_Next() in any way to Python will cause trouble. In the end,
you will need to create a list or tuple to iterate over one way
or another, so why bother overloading for-loops w/r to dictionaries ?
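
A minimal sketch of the mutation hazard, assuming a current CPython 3,
where iterating a dictionary directly is allowed and a size change
during the loop is reported as an error. This is not the behaviour of
2.1, and the variable names are made up:

```python
settings = {"a": 1, "b": 2}

# Adding keys while iterating the dict itself is detected at the next
# iteration step and reported as a RuntimeError.
error = None
try:
    for key in settings:
        settings[key + "_new"] = 0
except RuntimeError as exc:
    error = str(exc)

# Iterating over a snapshot list of the keys is safe, at the cost of
# building that list first.
for key in list(settings):
    settings[key + "_copy"] = settings[key]

print(error is not None, len(settings))
```

The snapshot is exactly the "create a list to iterate over" step; direct
iteration avoids it only as long as the loop body leaves the dictionary
alone.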

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From bckfnn at worldonline.dk  Mon Jan 29 14:48:44 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 13:48:44 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <3a75747e.17414620@smtp.worldonline.dk>

On Mon, 29 Jan 2001 08:26:25 +0200 (IST), you wrote:

>I had to change test_opcodes since it tested that functions with the
>same code compare equal.

Thanks. With this change, Jython too can complete the test_opcodes. In
Jython a code object can never compare equal to anything but itself.

regards,
finn



From moshez at zadka.site.co.il  Mon Jan 29 15:04:47 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 16:04:47 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <3a75747e.17414620@smtp.worldonline.dk>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn at worldonline.dk (Finn Bock) wrote:
 
> Thanks. With this change, Jython too can complete the test_opcodes. In
> Jython a code object can never compare equal to anything but itself.

Great! I'm happy to have helped.
I'm starting to wonder what the tests really test: the language definition,
or accidents of the implementation?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From MarkH at ActiveState.com  Mon Jan 29 15:35:25 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Tue, 30 Jan 2001 01:35:25 +1100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEGHDAAA.MarkH@ActiveState.com>

"M.-A. Lemburg" wrote:
> what results do you get on Windows ?

Win2k, dual 800, relatively quiet!

Python 2.0

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.847605
This machine benchmarks at 11798 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.845104
This machine benchmarks at 11832.9 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.846069
This machine benchmarks at 11819.4 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.849447
This machine benchmarks at 11772.4 pystones/second

Python from CVS today:

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.885801
This machine benchmarks at 11289.2 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.889048
This machine benchmarks at 11248 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.892422
This machine benchmarks at 11205.5 pystones/second


Although I deleted Tim's earlier mail, from memory this is pretty similar in
terms of performance lost.  I'm afraid I have no idea what your benchmarks
are or how to build them <wink>, but did check that the optimizer is set for
"maximize for speed" (/O2).  Other compiler options gave significantly
smaller results (no optimizations around 8500, and "optimize for space"
(/O1) at around 10000).  Other fiddling with the optimizer couldn't get
better results than the existing settings.
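
For reference, a small sketch that turns the pystone figures listed
above into a single slowdown percentage, assuming a plain arithmetic
mean of the runs is a fair summary:

```python
# pystones/second as reported above.
python_20 = [11798.0, 11832.9, 11819.4, 11772.4]
python_cvs = [11289.2, 11248.0, 11205.5]

mean_20 = sum(python_20) / len(python_20)
mean_cvs = sum(python_cvs) / len(python_cvs)

# Relative loss in throughput of the CVS build versus 2.0, in percent.
slowdown = (mean_20 - mean_cvs) / mean_20 * 100.0

print(round(mean_20, 1), round(mean_cvs, 1), round(slowdown, 2))
```

That works out to a loss of a little under 5%.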

Mark.




From guido at digicool.com  Mon Jan 29 15:48:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 09:48:22 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 07:56:49 EST."
             <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> 
Message-ID: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>

> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().

I did a less sophisticated count but come to the same conclusion:
iterations over items() are (somewhat) more common than over keys(),
and values() are 1-2 orders of magnitude less common.  My numbers:

$ cd python/src/Lib
$ grep 'for .*items():' *.py | wc -l
     47
$ grep 'for .*keys():' *.py | wc -l
     43
$ grep 'for .*values():' *.py | wc -l
      2

> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.

I don't attach much value to the readability argument: typically, one will
write "for key in dict" or "for name in dict" and then it's obvious
what is meant.

> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
has even asked me for a has_item() method).  I can live with "x in
list" checking the values and "x in dict" checking the keys.  But I
can *not* live with "x in dict" equivalent to "dict.has_key(x)" if
"for x in dict" would mean "for x in dict.items()".  I also think that
defining "x in dict" but not "for x in dict" will be confusing.

So we need to think more.

How about:

    for key in dict: ...		# ... over keys

    for key:value in dict: ...		# ... over items

This is syntactically unambiguous (a colon is currently illegal in
that position).

This also suggests:

    for index:value in list: ...	# ... over zip(range(len(list)), list)

which doesn't strike me as bad or ugly, and would fulfill my brother's
dearest wish.

(And why didn't we think of this before?)
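For comparison, here is how the same three loops can be spelled on a
present-day Python 3 interpreter (an assumption; this is not the 2.1
code base under discussion), where the colon form is not valid syntax.
The data is made up for illustration.

```python
counts = {"spam": 1, "eggs": 2}      # hypothetical example data
names = ["spam", "eggs"]

# "for key in dict" iterates over the keys.
keys = sorted(key for key in counts)

# The proposed "for key:value in dict" corresponds to items().
pairs = sorted((key, value) for key, value in counts.items())

# The proposed "for index:value in list" corresponds to enumerate(),
# which yields the same pairs as zip(range(len(names)), names).
indexed = [(index, value) for index, value in enumerate(names)]
same_as_zip = indexed == list(zip(range(len(names)), names))
```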

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Jan 29 15:58:16 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 15:58:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:48:22AM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <20010129155816.T962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:48:22AM -0500, Guido van Rossum wrote:

> How about:

>     for key in dict: ...		# ... over keys

>     for key:value in dict: ...		# ... over items

> This is syntactically unambiguous (a colon is currently illegal in
> that position).

I won't comment on the syntax right now, I need to look at it for a while
first :-) However, what about MAL's point about dict ordering, internally ?
Wouldn't FOR_LOOP be forced to generate a list of keys anyway, to avoid
skipping keys ? I know currently the dict implementation doesn't do any
reordering except during adds/deletes, but there is nothing in the language
ref that supports that -- it's an implementation detail. Would we make a
future enhancement where (some form of) gc would 'clean up' large
dictionaries impossible ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 29 16:00:38 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:00:38 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 14:28:24 +0100."
             <3A756FF8.B7185FA2@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
            <3A756FF8.B7185FA2@lemburg.com> 
Message-ID: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>

> Dictionaries are not sequences. I wonder what order a user of
> for k,v in dict: (or whatever other of this proposal you choose)
> will expect...

The same order that for k,v in dict.items() will yield, of course.

> Please also take into account that dictionaries are *mutable*
> and their internal state is not defined to e.g. not change due to
> lookups (take the string optimization for example...), so exposing
> PyDict_Next() in any way to Python will cause trouble. In the end,
> you will need to create a list or tuple to iterate over one way
> or another, so why bother overloading for-loops w/r to dictionaries ?

Actually, I was going to propose to play dangerously here: the

    for k:v in dict: ...

syntax I proposed in my previous message should indeed expose
PyDict_Next().  It should be a big speed-up, and I'm expecting (though
don't have much proof) that most loops over dicts don't mutate the
dict.

Maybe we could add a flag to the dict that issues an error when a new
key is inserted during such a for loop?  (I don't think the key order
can be affected when a key is *deleted*.)
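A sketch of what such a check looks like from the Python side, assuming
a present-day CPython 3 interpreter (where iterating over a dict raises
RuntimeError if the dict changes size mid-loop).  It shows the visible
behaviour only and says nothing about how a flag would be implemented;
insert_while_iterating() is a hypothetical helper.

```python
def insert_while_iterating(d):
    """Try to add a key inside a loop over d; report what happens."""
    try:
        for key in d:
            d[key + "_copy"] = d[key]   # inserts a new key mid-loop
    except RuntimeError as exc:
        return "error: %s" % exc
    return "no error"

result = insert_while_iterating({"a": 1, "b": 2})
```

Looping over a snapshot such as list(d) avoids the error when the loop
really does need to add keys.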

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 29 16:30:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:30:17 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: Your message of "Mon, 29 Jan 2001 16:04:47 +0200."
             <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il> 
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>  
            <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il> 
Message-ID: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>

> I'm starting to wonder what the tests really test: the language definition,
> or accidents of the implementation?

It's good to test conformance to the language definition, but this is
also a regression test for the implementation.  The "accidents of the
implementation" definitely need to be tested.  E.g. if we decide that
repr(s) uses \n rather than \012 or \x0a, this should be tested too.
The language definition gives the implementer a choice here; but once
the implementer has made a choice, it's good to have a test that tests
that this choice is implemented correctly.

Perhaps there should be several parts to the regression test,
e.g. language conformance, library conformance, platform-specific
features, and implementation conformance?
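A minimal sketch of the two kinds of check for the repr() example,
assuming a present-day Python 3 interpreter; the function names are
hypothetical and not taken from the actual test suite.

```python
def check_language_conformance(s):
    # Language-level property: repr() must give a string that
    # evaluates back to the original value.
    return eval(repr(s)) == s

def check_implementation_choice():
    # Implementation-level choice: a newline is shown as backslash-n
    # rather than as \012 or \x0a.
    return repr("\n") == "'\\n'"

conforms = check_language_conformance("line one\nline two")
pinned = check_implementation_choice()
```

An implementation that printed \x0a would still pass the first check
but fail the second, which is the distinction drawn above.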

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 29 16:57:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:57:12 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: Your message of "Sun, 28 Jan 2001 15:40:28 CST."
             <14964.37324.642566.602319@beluga.mojam.com> 
References: <14964.37324.642566.602319@beluga.mojam.com> 
Message-ID: <200101291557.KAA12347@cj20424-a.reston1.va.home.com>

> I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
> been zeroed out.  I then copied a version of it over from another machine
> and reran make a couple times.  Makesetup ran but nothing mentioned in
> Setup.local got built.
> 
> I don't think 2.1 can be released without providing a way for users to
> recover from this change.  I didn't see anything obvious in setup.py.  Am I
> missing something?

Well, Modules/Setup is still used, so it should be trivial to add
Setup.local back too.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 29 10:23:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:23:55 -0800
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <14964.37324.642566.602319@beluga.mojam.com>; from skip@mojam.com on Sun, Jan 28, 2001 at 03:40:28PM -0600
References: <14964.37324.642566.602319@beluga.mojam.com>
Message-ID: <20010129012355.A14763@glacier.fnational.com>

On Sun, Jan 28, 2001 at 03:40:28PM -0600, Skip Montanaro wrote:
> Makesetup ran but nothing mentioned in Setup.local got built.

I believe Setup.local should still work.  One possibility is that
the modules in Setup.local were marked as shared.  Shared modules
from Setup* don't get built by default.  You have to do "make
oldsharedmods".  I'm not sure why oldsharedmods is not included
in the all target.  Andrew, can you think of any reason why it
shouldn't be added?

  Neil



From dgoodger at atsautomation.com  Mon Jan 29 17:19:12 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Mon, 29 Jan 2001 11:19:12 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>

Marc-Andre Lemburg's patch to posixpath.py clears up the path problem.
Thanks!

MACHDEP is qnxJ for QNX 4.25, qnxG for QNX 4.23. I don't know what it is for
QNX 6 (Neutrino). Perhaps test for MACHDEP[:3]=='qnx'?

I'm still stuck at 'python setup.py build':

    unable to execute ld: no such file or directory
    running build
    running build_ext
    building 'struct' extension
    skipping //5/tmp/py/Python-2.1a1/Modules/structmodule.c
(build/temp.qnx-J-PCI-2.1/structmodule.o up-to-date)
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
compatible with distutils? If not, is there a workaround?

Neil Schemenauer asked, "what should LDSHARED say for QNX?". I don't know.
Python 2.0 compiled OK, and its makefile says LDSHARED=ld. However,
Modules/Setup has no uncommented "*shared*" line.

Those of us who rely on Python to get our work done, and who don't have the
bandwidth for the implementation complexities, owe a lot to everyone who
makes it possible to compile Python out-of-the-box. Very much appreciated.
Thank you!

David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger at atsautomation.com



From nas at arctrix.com  Mon Jan 29 10:40:07 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:40:07 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>; from dgoodger@atsautomation.com on Mon, Jan 29, 2001 at 11:19:12AM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>
Message-ID: <20010129014007.C14763@glacier.fnational.com>

On Mon, Jan 29, 2001 at 11:19:12AM -0500, Goodger, David wrote:
> I'm still stuck at 'python setup.py build':
...
> Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
> compatible with distutils? If not, is there a workaround?

The setup.py script only builds shared modules.  You're going to
have to enable modules using the old Setup file.  I think
Setup.dist should go back to including all the modules
(commented out of course).  This would make it easier for people
who can't or don't want to build shared modules.

  Neil



From akuchlin at cnri.reston.va.us  Mon Jan 29 17:50:31 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 11:50:31 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 29, 2001 at 01:23:55AM -0800
References: <14964.37324.642566.602319@beluga.mojam.com> <20010129012355.A14763@glacier.fnational.com>
Message-ID: <20010129115031.B4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 01:23:55AM -0800, Neil Schemenauer wrote:
>from Setup* don't get built by default.  You have to do "make
>oldsharedmods".  I'm not sure why oldsharedmods is not included
>in the all target.  Andrew, can you think of any reason why it
>shouldn't be added?

That's an excellent idea, particularly if we add back Setup.dist, too,
and comment out all but the required modules.  

I'll try to do that today.  Note that I'm leaving on vacation
tomorrow, and will be back next Monday.  Everyone, feel free to check
in changes to setup.py that are required.

--amk




From jeremy at alum.mit.edu  Mon Jan 29 17:48:11 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 11:48:11 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A75677C.E4FA82A0@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
Message-ID: <14965.40651.233438.311104@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Yes... after a check of the Makefile I found that I had
  MAL> compiled Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this
  MAL> makes a difference w/r to inlining of code. I'll recompile and
  MAL> rerun the benchmark.
 
When I was working on the CALL_FUNCTION revision, I compared 2.0 final
with my development working copy using -O3.  At that time, I saw no
significant performance difference between the two.  And I did notice
a difference between -O2 and -O3.

The strange thing is that I notice a difference between -O2 and -O3
with 2.1a1, but in the opposite direction.  On pystone, python -O2
runs consistently faster than -O3; the difference is .05 sec on my
machine.  

Jeremy



From esr at thyrsus.com  Mon Jan 29 18:12:05 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 12:12:05 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 11:48:11AM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain>
Message-ID: <20010129121205.A8337@thyrsus.com>

Jeremy Hylton <jeremy at alum.mit.edu>:
> The strange thing is that I notice a difference between -O2 and -O3
> with 2.1a1, but in the opposite direction.  On pystone, python -O2
> runs consistently faster than -O3; the difference is .05 sec on my
> machine.  

Bizarre.  Makes me wonder if we have a C compiler problem.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In every country and in every age, the priest has been hostile to
liberty. He is always in alliance with the despot, abetting his abuses
in return for protection to his own.
	-- Thomas Jefferson, 1814



From jeremy at alum.mit.edu  Mon Jan 29 18:27:08 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:27:08 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <20010129121205.A8337@thyrsus.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<14965.40651.233438.311104@localhost.localdomain>
	<20010129121205.A8337@thyrsus.com>
Message-ID: <14965.42988.362288.154254@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy at alum.mit.edu>:
  >> The strange thing is that I notice a difference between -O2 and
  >> -O3 with 2.1a1, but in the opposite direction.  On pystone,
  >> python -O2 runs consistently faster than -O3; the difference is
  >> .05 sec on my machine.

  ESR> Bizarre.  Makes me wonder if we have a C compiler problem.

Depends on your definition of "compiler problem" <wink>.  If you mean,
it compiles our code so it runs slower, then, yes, we've got one :-).

One of the differences between -O2 and -O3, according to the man page,
is that -O3 will perform optimizations that involve a space-speed
tradeoff.  It also includes -finline-functions.  I can imagine that
some of these optimizations hurt memory performance enough to make a
difference. 

not-really-understanding-but-not-really-expecting-too-ly y'rs,
Jeremy



From jeremy at alum.mit.edu  Mon Jan 29 18:39:05 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:39:05 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<14965.40651.233438.311104@localhost.localdomain>
Message-ID: <14965.43705.367236.994786@localhost.localdomain>

The recursion test in pybench is testing the performance of the nested
scopes changes, which must do some extra bookkeeping to reference the
recursive function in a nested scope.  To some extent, a performance
hit is a necessary consequence for nested functions with free
variables.

Nonetheless, there are two interesting things to say about this
situation.

First, there is a bug in the current implementation of nested scopes
that the benchmark tickles.  The problem is with code like this:

def outer():
    global f
    def f(x):
        if x > 0:
            return f(x - 1)

The compiler determines that f is free in f.  (It's recursive.)  If f
is free in f, in the absence of the global decl, the body of outer
must allocate fresh storage (a cell) for f each time outer is called
and add a reference to that cell to f's closure.

If f is declared global in outer, then it ought to be treated as a
global in nested scopes, too.  In general terms, a free variable
should use the binding found in the nearest enclosing scope.  If the
nearest enclosing scope has a global binding, then the reference is
global. 

If I fix this problem, the recursion benchmark shouldn't be any slower
than a normal function call.
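The intended behaviour can be observed from Python.  This sketch
assumes a present-day CPython 3 interpreter, not the 2.1a1 code under
discussion; the function names are hypothetical, and each inner
function gets a base case so the recursion returns a value.

```python
def outer_with_global():
    global g
    def g(x):
        if x > 0:
            return g(x - 1)     # g is looked up as a global
        return x

def outer_without_global():
    def f(x):
        if x > 0:
            return f(x - 1)     # f is free here: reached through a cell
        return x
    return f

outer_with_global()
f = outer_without_global()

global_free = g.__code__.co_freevars   # names g reaches through cells
global_closure = g.__closure__
local_free = f.__code__.co_freevars
local_cells = len(f.__closure__)
```

With the global declaration the inner function carries no closure at
all; without it, the closure holds one cell for the function's own name.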

The second interesting thing to say is that frame allocation and
dealloc is probably more expensive than it needs to be in the current
implementation.  The frame object has a new f_closure slot that holds
a tuple that is freshly allocated every time the frame is allocated.
(Unless the closure is empty, then f_closure is just NULL.)

The extra tuple allocation can probably be done away with by using the
same allocation strategy as locals & stack.  If the f_localsplus array
holds cells + frees + locals + stack, then a new frame will never
require more than a single malloc (and often not even that).

Jeremy



From akuchlin at cnri.reston.va.us  Mon Jan 29 18:54:37 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 12:54:37 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 12:27:08PM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain> <20010129121205.A8337@thyrsus.com> <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <20010129125437.E4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 12:27:08PM -0500, Jeremy Hylton wrote:
>Depends on your definition of "compiler problem" <wink>.  If you mean,
>it compiles our code so it runs slower, then, yes, we've got one :-).

Compiling with gcc and -g, with no optimization, 2.0 and 2.1cvs seem
to be very close, with 2.1 slightly slower:

2.0:
Pystone(1.1) time for 10000 passes = 1.04
This machine benchmarks at 9615.38 pystones/second
This machine benchmarks at 9345.79 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9523.81 pystones/second

2.1cvs:
Pystone(1.1) time for 10000 passes = 1.09
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second
This machine benchmarks at 9259.26 pystones/second
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second

Would it be worth experimenting with platform-specific compiler
options to try to squeeze out the last bit of performance?  (That can
wait for the betas, probably.)
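The first figure in each run follows directly from the reported time
for 10000 passes.  A small worked check, using only the two times
quoted above; pystones_per_second() is a hypothetical helper.

```python
def pystones_per_second(seconds, passes=10000):
    """Convert a pystone run time into passes per second."""
    return passes / seconds

rate_20 = pystones_per_second(1.04)    # 2.0
rate_21 = pystones_per_second(1.09)    # 2.1cvs

# Relative slowdown of 2.1cvs against 2.0 on these two runs.
slowdown_percent = (1.09 / 1.04 - 1) * 100
```

That works out to a bit under 5% on these two runs, in line with
"slightly slower".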

--amk



From jeremy at alum.mit.edu  Mon Jan 29 19:04:28 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 13:04:28 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <14965.45228.197778.579989@localhost.localdomain>

I hope another set of benchmarks isn't overkill for the list.  I see
different results comparing 2.1 with 2.0 (both -O3) using pybench
0.6. 

The interesting differences I see in this benchmark that I didn't see
in MAL's are:

DictCreation +15.87%
SecondImport +20.29%

Other curious differences, which show up in both benchmarks, include:
SpecialClassAttribute +17.91%     (private variables)
SpecialInstanceAttribute +15.34%  (__methods__)

Jeremy

PYBENCH 0.6

Benchmark: py21 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     305.05 ms    2.39 us   +4.77%
           BuiltinMethodLookup:     319.65 ms    0.61 us   +2.55%
                 ConcatStrings:     383.70 ms    2.56 us   +1.27%
               CreateInstances:     463.85 ms   11.04 us   +1.96%
       CreateStringsWithConcat:     381.20 ms    1.91 us   +2.39%
                  DictCreation:     508.85 ms    3.39 us  +15.87%
                      ForLoops:     577.60 ms   57.76 us   +5.65%
                    IfThenElse:     443.70 ms    0.66 us   +1.02%
                   ListSlicing:     207.50 ms   59.29 us   -4.18%
                NestedForLoops:     315.75 ms    0.90 us   +3.54%
          NormalClassAttribute:     379.80 ms    0.63 us   +7.39%
       NormalInstanceAttribute:     385.45 ms    0.64 us   +8.04%
           PythonFunctionCalls:     400.00 ms    2.42 us  +13.62%
             PythonMethodCalls:     306.25 ms    4.08 us   +5.13%
                     Recursion:     337.25 ms   26.98 us  +19.00%
                  SecondImport:     301.20 ms   12.05 us  +20.29%
           SecondPackageImport:     298.20 ms   11.93 us  +18.15%
         SecondSubmoduleImport:     339.15 ms   13.57 us  +11.40%
       SimpleComplexArithmetic:     392.70 ms    1.79 us  -10.52%
        SimpleDictManipulation:     350.40 ms    1.17 us   +3.87%
         SimpleFloatArithmetic:     300.75 ms    0.55 us   +2.04%
      SimpleIntFloatArithmetic:     347.95 ms    0.53 us   +9.01%
       SimpleIntegerArithmetic:     356.40 ms    0.54 us  +12.01%
        SimpleListManipulation:     351.85 ms    1.30 us  +11.33%
          SimpleLongArithmetic:     309.00 ms    1.87 us   -5.81%
                    SmallLists:     584.25 ms    2.29 us  +10.20%
                   SmallTuples:     442.00 ms    1.84 us  +10.33%
         SpecialClassAttribute:     406.50 ms    0.68 us  +17.91%
      SpecialInstanceAttribute:     557.40 ms    0.93 us  +15.34%
                 StringSlicing:     336.45 ms    1.92 us   +9.56%
                     TryExcept:     650.60 ms    0.43 us   +1.40%
                TryRaiseExcept:     345.95 ms   23.06 us   +2.70%
                  TupleSlicing:     266.35 ms    2.54 us   +4.70%
------------------------------------------------------------------------
            Average round time:   14413.00 ms              +7.07%

*) measured against: py20 (rounds=10, warp=20)




From skip at mojam.com  Mon Jan 29 19:07:26 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 29 Jan 2001 12:07:26 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>
References: <14964.37324.642566.602319@beluga.mojam.com>
	<20010129012355.A14763@glacier.fnational.com>
Message-ID: <14965.45406.933528.53857@beluga.mojam.com>

    Neil> You have to do "make oldsharedmods".  

This did the trick.  This should be emblazoned in big red letters somewhere
if the decision is made to not include oldsharedmods as a dependency for the
all target.

Thx,

Skip




From gvwilson at ca.baltimore.com  Mon Jan 29 19:19:21 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 13:19:21 -0500
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <20010129162012.32158ED49@mail.python.org>
Message-ID: <001501c08a20$00dca2a0$770a0a0a@nevex.com>

> > > [Ping]
> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...

> "Tim Peters" <tim.one at home.com>
> "if (k, v) in dict" is clearly useless...
> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean "for x in dict.items()".
> I also think that defining "x in dict" but not "for x in dict" will be
> confusing.

[Greg]
Quick poll (four people): if the expression "if a in b" works,
then all four expected "for a in b" to work as well.  This is
also my intuition; are there any exceptions in really existing
Python?

> [Guido]
>     for key in dict: ...		# ... over keys
>     for key:value in dict: ...	# ... over items

[Greg]
I'm probably revealing my ignorance of Python's internals here,
but can the iteration protocol be extended so that the object
(in this case, the dict) is told the number and type(s) of the
values the loop is expecting?  With:

    for key in dict: ...

the dict would be asked for one value; with:

    for (key, value) in dict:

the dict would be told that a two-element tuple was expected,
and so on.  This would allow multi-dimensional structures
(e.g. NumPy arrays) to do things like:

    for (i, j, k) in array:		# please give me three indices

and:

    for ((i, j, k), v) in array:	# three indices and value

> [Guido]
>     for index:value in list: ...	# ... over zip(range(len(list), list)

How do you feel about:

    for i in seq.keys():		# strings, tuples, etc.

"keys()" is kind of strange ("indices" or something would be
more natural), *but* this allows uniform iteration over all
built-in collections:

    def showem(c):
        for i in c.keys():
            print i, c[i]
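Sequences have no keys() method, so the function above needs something
extra to work for them.  A sketch of the same idea, assuming a
present-day Python 3 interpreter: mappings supply their keys, and
anything else is assumed to be indexable from 0 to len(c) - 1.  It
returns the pairs instead of printing them.

```python
def showem(c):
    """Return (key-or-index, value) pairs for a mapping or a sequence."""
    if hasattr(c, "keys"):
        indices = list(c.keys())        # dict-like: use its keys
    else:
        indices = list(range(len(c)))   # string, tuple, list: use indices
    return [(i, c[i]) for i in indices]

from_string = showem("ab")
from_tuple = showem((10, 20))
from_dict = showem({"x": 1})
```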

Greg




From bckfnn at worldonline.dk  Mon Jan 29 19:31:48 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 18:31:48 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il> <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <3a75aba9.31537178@smtp.worldonline.dk>

On Mon, 29 Jan 2001 16:04:47 +0200 (IST), you wrote:

>On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn at worldonline.dk (Finn Bock) wrote:
> 
>> Thanks. With this change, Jython too can complete the test_opcodes. In
>> Jython a code object can never compare equal to anything but itself.
>
>Great! I'm happy to have helped.
>I'm starting to wonder what the tests really test: the language definition,
>or accidents of the implementation?

Based on the amount of code in test_opcodes dedicated to code
comparison, I doubt this particular situation was an accident.

The problems I have had with the test suite are better described as
accidents of the tests themselves. From test_extcall:

  We expected (repr): "g() got multiple values for keyword argument 'b'"
  But instead we got: "g() got multiple values for keyword argument 'a'"

This is caused by a difference in iteration over a dictionary.

Or from test_import:

  test test_import crashed -- java.lang.ClassFormatError:
  java.lang.ClassFormatError: @test$py (Illegal Class name "@test$py")

where '@' isn't allowed in Java class names.

These are failures that have very little to do with the thing the tests
are about and nothing at all to do with the language definition.

regards,
finn



From cgw at alum.mit.edu  Mon Jan 29 19:35:58 2001
From: cgw at alum.mit.edu (Charles G Waldman)
Date: Mon, 29 Jan 2001 12:35:58 -0600 (CST)
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <001501c08a20$00dca2a0$770a0a0a@nevex.com>
References: <20010129162012.32158ED49@mail.python.org>
	<001501c08a20$00dca2a0$770a0a0a@nevex.com>
Message-ID: <14965.47118.135246.700571@sirius.net.home>

Greg Wilson writes:

 > This would allow multi-dimensional structures
 > (e.g. NumPy arrays) to do things like:
 > 
 >     for (i, j, k) in array:		# please give me three indices
 > 
 > and:
 > 
 >     for ((i, j, k), v) in array:	# three indices and value

And what if I had, for example, a 3-dimensional array where the values
are 3-tuples?  Would "for (i,j,k) in array" refer to the indices or the
values?




From mal at lemburg.com  Mon Jan 29 20:03:41 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:03:41 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
	            <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75BE8D.1B7673EE@lemburg.com>

With all this confusion about how to actually write the
iteration on dictionary items, wouldn't it make more sense
to implement an extension module which then provides a __getitem__
style iterator for dictionaries by interfacing to PyDict_Next() ?

The module could have three different iterators:

1. iterate over items
2.     ... over keys
3.     ... over values

The reasoning behind this is that the __getitem__ interface
is well established and this doesn't introduce any new
syntax while still providing speed and flexibility.
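A pure-Python sketch of what the interface could look like, assuming a
present-day Python 3 interpreter (where a for loop still falls back to
__getitem__ when no __iter__ is defined).  DictView is a hypothetical
name, and it snapshots the dict into a list rather than calling
PyDict_Next(), so it shows the interface only, not the speed-up.

```python
class DictView:
    """Hypothetical __getitem__-style iterator over a dictionary."""

    def __init__(self, d, what="items"):
        if what == "items":
            self._data = list(d.items())
        elif what == "keys":
            self._data = list(d.keys())
        elif what == "values":
            self._data = list(d.values())
        else:
            raise ValueError("what must be 'items', 'keys' or 'values'")

    def __getitem__(self, index):
        # The for loop calls this with 0, 1, 2, ... and stops at IndexError.
        return self._data[index]


d = {"a": 1, "b": 2}
items = sorted(pair for pair in DictView(d))
keys = sorted(key for key in DictView(d, "keys"))
values = sorted(value for value in DictView(d, "values"))
```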

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 19:08:16 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 19:08:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
	            <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75B190.3FD2A883@lemburg.com>

Guido van Rossum wrote:
> 
> > Dictionaries are not sequences. I wonder what order a user of
> > for k,v in dict: (or whatever other of this proposal you choose)
> > will expect...
> 
> The same order that for k,v in dict.items() will yield, of course.

And then people find out that the order has some sorting
properties and start to use it... "how to sort a dictionary?"
comes up again, every now and then.
 
> > Please also take into account that dictionaries are *mutable*
> > and their internal state is not defined to e.g. not change due to
> > lookups (take the string optimization for example...), so exposing
> > PyDict_Next() in any way to Python will cause trouble. In the end,
> > you will need to create a list or tuple to iterate over one way
> > or another, so why bother overloading for-loops w/r to dictionaries ?
> 
> Actually, I was going to propose to play dangerously here: the
> 
>     for k:v in dict: ...
> 
> syntax I proposed in my previous message should indeed expose
> PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> don't have much proof) that most loops over dicts don't mutate the
> dict.
> 
> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

You mean: mark it read-only ? That would be a "nice to have"
property for a lot of mutable types indeed -- sort of like
low-level locks. This would be another candidate for an object flag
(much like the one Fred wants to introduce for weak referenced
objects).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan 29 20:22:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 14:22:07 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 19:08:16 +0100."
             <3A75B190.3FD2A883@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>  
            <3A75B190.3FD2A883@lemburg.com> 
Message-ID: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>

> > > Dictionaries are not sequences. I wonder what order a user of
> > > for k,v in dict: (or whatever other of this proposal you choose)
> > > will expect...
> > 
> > The same order that for k,v in dict.items() will yield, of course.
> 
> And then people find out that the order has some sorting
> properties and start to use it... "how to sort a dictionary?"
> comes up again, every now and then.

I don't understand why you bring this up.  We're not revealing
anything new here, the random order of dict items has always been part
of the language.  The answer to "how to sort a dict" should be "copy
it into a list and sort that."

Or am I missing something?
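
A minimal sketch of the "copy it into a list and sort that" advice,
written for a current Python 3 interpreter.  The dict `ages` and its
contents are made up for illustration:

```python
# Hypothetical data; the names are illustrative only.
ages = {"tim": 3, "guido": 1, "barry": 2}

# Copy the (key, value) pairs into a list, then sort the list.
pairs = list(ages.items())
pairs.sort()                      # orders by key, then by value

# Sorting by value instead only needs a different sort key.
by_value = sorted(ages.items(), key=lambda kv: kv[1])
```

The dict itself is never reordered; only the copies are.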

> > > Please also take into account that dictionaries are *mutable*
> > > and their internal state is not defined to e.g. not change due to
> > > lookups (take the string optimization for example...), so exposing
> > > PyDict_Next() in any way to Python will cause trouble. In the end,
> > > you will need to create a list or tuple to iterate over one way
> > > or another, so why bother overloading for-loops w/r to dictionaries ?
> > 
> > Actually, I was going to propose to play dangerously here: the
> > 
> >     for k:v in dict: ...
> > 
> > syntax I proposed in my previous message should indeed expose
> > PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> > don't have much proof) that most loops over dicts don't mutate the
> > dict.
> > 
> > Maybe we could add a flag to the dict that issues an error when a new
> > key is inserted during such a for loop?  (I don't think the key order
> > can be affected when a key is *deleted*.)
> 
> You mean: mark it read-only ? That would be a "nice to have"
> property for a lot of mutable types indeed -- sort of like
> low-level locks. This would be another candidate for an object flag
> (much like the one Fred wants to introduce for weak referenced
> objects).

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gvwilson at ca.baltimore.com  Mon Jan 29 20:38:50 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 14:38:50 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1124 - 13 msgs
In-Reply-To: <20010129193101.7BF83EF62@mail.python.org>
Message-ID: <001a01c08a2b$1ba5a040$770a0a0a@nevex.com>

> Greg Wilson writes:
>  > This would allow multi-dimensional structures
>  > (e.g. NumPy arrays) to do things like:
>  >     for (i, j, k) in array:
>  > and:
>  >     for ((i, j, k), v) in array:	# three indices and value

> Charles Waldman asks:
> And what if I had, for example, a 3-dimensional array where the values
> are 3-tuples?  Would "for (i,j,k) in array" refer to the 
> indices or the values?

Greg Wilson writes:
That would be up to the module's implementer --- my idea was to have
the 'for' loop provide more information to the object being iterated
over, so that it could "do the right thing" (just as objects do right
now with "x[i]").

Greg



From mal at lemburg.com  Mon Jan 29 20:45:46 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:45:46 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>  
	            <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <3A75C86A.3A4236E8@lemburg.com>

Guido van Rossum wrote:
> 
> > > > Dictionaries are not sequences. I wonder what order a user of
> > > > for k,v in dict: (or whatever other variant of this proposal you choose)
> > > > will expect...
> > >
> > > The same order that for k,v in dict.items() will yield, of course.
> >
> > And then people find out that the order has some sorting
> > properties and start to use it... "how to sort a dictionary?"
> > comes up again, every now and then.
> 
> I don't understand why you bring this up.  We're not revealing
> anything new here, the random order of dict items has always been part
> of the language.  The answer to "how to sort a dict" should be "copy
> it into a list and sort that."
> 
> Or am I missing something?

I just wanted to hint at a problem which iterating over items
in an unordered set can cause. Especially new Python users will find 
it confusing that the order of the items in an iteration can change
from one run to the next.

Not much of an argument, but I like explicit programming more
than magic under the cover. What we really want is iterators for
dictionaries, so why not implement these instead of tweaking
for-loops.

If you are looking for speedups w/r to for-loops, applying a
different indexing technique in for-loops would go a lot further
and provide better performance not only to dictionary loops,
but also to other sequences.

I have had good results with a special counter object 
(sort of like a mutable integer) which is used instead of the 
iteration index integer in the current implementation. 

Using an iterator object instead of the integer + __getitem__
call machinery would allow more flexibility for all kinds of
sequences or containers. There could be an iterator type for
dictionaries, one for generic __getitem__ style sequences,
one for lists and tuples, etc. All of these could include
special logic to get the most out of the targeted datatype.
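
A rough sketch of what two such iterator types might look like.  It
assumes the __iter__/__next__ protocol of later Python versions, which
is not something this message specifies; the class names are
hypothetical:

```python
class GetItemIterator:
    """Walk a __getitem__-style sequence by index until IndexError."""

    def __init__(self, seq):
        self._seq = seq
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            item = self._seq[self._index]
        except IndexError:
            raise StopIteration from None
        self._index += 1
        return item


class DictItemIterator:
    """Yield (key, value) pairs from a snapshot of the mapping's keys."""

    def __init__(self, mapping):
        self._mapping = mapping
        self._keys = GetItemIterator(list(mapping))

    def __iter__(self):
        return self

    def __next__(self):
        key = next(self._keys)      # StopIteration propagates at the end
        return key, self._mapping[key]
```

A for-loop could then pick the iterator type that suits the container
instead of always driving an integer index through __getitem__.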

Well, just a thought...
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Mon Jan 29 21:02:47 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:02:47 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 02:22:07PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <20010129150247.B10191@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> > > Maybe we could add a flag to the dict that issues an error when a new
> > > key is inserted during such a for loop?  (I don't think the key order
> > > can be affected when a key is *deleted*.)
> > 
> > You mean: mark it read-only ? That would be a "nice to have"
> > property for a lot of mutable types indeed -- sort of like
> > low-level locks. This would be another candidate for an object flag
> > (much like the one Fred wants to introduce for weak referenced
> > objects).
> 
> Yes.

For different reasons, I'd like to be able to set a constant flag on an
object instance.  Simple semantics: if you try to assign to a
member or method, it throws an exception.

Application?  I have a large Python program that goes to a lot of effort
to build elaborate context structures in core.  It would be nice to know
they can't be even inadvertently trashed without throwing an exception I 
can watch for.
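
One way such a flag could be approximated in pure Python, as a sketch
only.  The names Freezable, freeze() and Context are hypothetical, and
the choice of AttributeError is an assumption, not part of the
proposal:

```python
class Freezable:
    """Mixin: after freeze(), attribute assignment and deletion raise."""

    def freeze(self):
        # Bypass our own __setattr__ so the flag itself can be set.
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if getattr(self, "_frozen", False):
            raise AttributeError("instance is frozen: cannot set %r" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if getattr(self, "_frozen", False):
            raise AttributeError("instance is frozen: cannot delete %r" % name)
        object.__delattr__(self, name)


class Context(Freezable):
    def __init__(self, name):
        self.name = name


ctx = Context("build")
ctx.freeze()
try:
    ctx.name = "trashed"
    outcome = "assigned"
except AttributeError:
    outcome = "refused"
```

This guards attribute rebinding only; a mutable object stored in an
attribute can still be changed in place.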
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256



From esr at thyrsus.com  Mon Jan 29 21:09:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:09:14 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>; from mal@lemburg.com on Mon, Jan 29, 2001 at 08:45:46PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <20010129150914.C10191@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.

Which reminds me...

There's not much I miss from C these days, but one thing I wish Python
had is a more general for-loop.  The C semantics that let you have 
any initialization, any termination test, and any iteration you like
are rather cool.

Yes, I realize that

	for (<init>; <test>; <step>) {<body>}

can be simulated with:

	<init>
	while 1:
		if not <test>:
			break
		<body>
		<step>

Still, having them spatially grouped the way a C for does it is nice.
Makes it easier to see invariants, I think.
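
A concrete instance of the translation, assuming the usual C meaning:
the loop exits when the test is false and the step runs after the
body.  The function name is made up, and the C loop being mimicked is
`for (i = 1; i < limit; i *= 2)`:

```python
def powers_of_two_below(limit):
    result = []
    i = 1                       # <init>
    while True:
        if not (i < limit):     # leave as soon as <test> fails
            break
        result.append(i)        # <body>
        i *= 2                  # <step>
    return result
```

Note that a `continue` in the body would skip the step here, which a C
for-loop would not do.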
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Rightful liberty is unobstructed action, according to our will, within limits
drawn around us by the equal rights of others."
	-- Thomas Jefferson



From moshez at zadka.site.co.il  Mon Jan 29 21:29:53 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 22:29:53 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>
References: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>, <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>  
            <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <20010129202953.D1498A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 10:30:17 -0500, Guido van Rossum <guido at digicool.com> wrote:

> It's good to test conformance to the language definition, but this is
> also a regression test for the implementation.  The "accidents of the
> implementation" definitely need to be tested.  E.g. if we decide that
> repr(s) uses \n rather than \012 or \x0a, this should be tested too.
> The language definition gives the implementer a choice here; but once
> the implementer has made a choice, it's good to have a test that tests
> that this choice is implemented correctly.

I agree.

> Perhaps there should be several parts to the regression test,
> e.g. language conformance, library conformance, platform-specific
> features, and implementation conformance?

This sounds like a good idea...probably for the 2.2 timeline.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Mon Jan 29 22:51:56 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 16:51:56 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>

[Moshe Zadka]
> ...
> I'm starting to wonder what the tests really test: the language
> definition, or accidents of the implementation?

You'd be amazed (appalled?) at how hard it is to separate them.

In two previous lives as a Big Iron compiler hacker, we routinely had to get
our compilers validated by a govt agency before any US govt account would be
allowed to buy our stuff; e.g.,

    http://www.itl.nist.gov/div897/ctg/vpl/language.htm

This usually *started* as a two-day process, flying the inspector to our
headquarters, taking perhaps 2 minutes of machine time to run the test
suite, then sitting around that day and into the next arguing about whether
the "failures" were due to non-standard assumptions in the tests, or
compiler bugs.  It was almost always the former, but sometimes that didn't
get fully resolved for months (if the inspector was being particularly
troublesome, it could require getting an Official Interpretation from the
relevant stds body -- not swift!).  (BTW, this is one reason huge customers
are often very reluctant to move to a new release:  the validation process
can be very expensive and drag on for months)

>>> def f():
...     global g
...     g += 1
...     return g
...
>>> g = 0
>>> d = {f(): f()}
>>> d
{2: 1}
>>>

The Python Lang Ref doesn't really say whether {2: 1} or {1: 2} "should be"
the result, nor does it say it's implementation-defined.  If you *asked*
Guido what he thought it should do, he'd probably say {1: 2} (not much of a
guess:  I asked him in the past, and that's what he did say <wink>).

Something "like that" can show up in the test suite, but buried under layers
of obfuscating accidents.  Nobody is likely to realize it in the absence of
a failure motivating people to search for it.
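
A small probe for this kind of accident, offered as a sketch.  It
records which operand of a dict display is evaluated first rather than
assuming an answer; the helper name `tick` is hypothetical:

```python
calls = []

def tick(label):
    """Record the call and return its 1-based position."""
    calls.append(label)
    return len(calls)

# Which of the two calls runs first is exactly what is being probed.
d = {tick("key"): tick("value")}
order = tuple(calls)
```

Inspecting `order` tells you what a given interpreter does, which is a
different question from what the language reference promises.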

Which is a trap:  sometimes ours was the only compiler (of dozens and
dozens) that had *ever* "failed" a particular test.  This was most often the
case at Cray Research, which had bizarre (but exceedingly fast -- which is
what Cray's customers valued most) floating-point arithmetic.  I recall one
test in particular that failed because Cray's was the only box on earth that
set I to 1 in

    INTEGER I
    I = 6.0/3.0

Fortran doesn't define that the result must be 2.  But-- you guessed
it --neither does Python.

Cute:  at KSR, INT(6.0/3.0) did return 2 -- but INT(98./49.) did not <wink>.

then-again-the-python-test-suite-is-still-shallow-ly y'rs  - tim




From hughett at mercur.uphs.upenn.edu  Mon Jan 29 23:05:22 2001
From: hughett at mercur.uphs.upenn.edu (Paul Hughett)
Date: Mon, 29 Jan 2001 17:05:22 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
Message-ID: <200101292205.RAA18790@mercur.uphs.upenn.edu>

tim says:

> Cray's was the only box on earth that set I to 1 in

>    INTEGER I
>    I = 6.0/3.0

> Fortran doesn't define that the result must be 2.  But-- you guessed
> it --neither does Python.

I would _guess_ that the IEEE 754 floating point standard does require
that, but I haven't actually gotten my hands on a copy of the standard
yet.  If it doesn't, I may have to stop writing code that depends on
the assumption that floating point computation is exact for exactly
representable integers.  If so, then we're reasonably safe; there
aren't many non-IEEE machines left these days.

Un-lurking-ly yours,

Paul Hughett



From tim.one at home.com  Mon Jan 29 23:53:43 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 17:53:43 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101292205.RAA18790@mercur.uphs.upenn.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBEIMAA.tim.one@home.com>

[Paul Hughett]
> I would _guess_ that the IEEE 754 floating point standard does require
> that [6./3. == 2.],

It does, but 754 is silent on how languages may or may not *bind* to its
semantics.  The C99 std finally addresses that (15 years after 754), and
Java does too (albeit in a way Kahan despises), but that's about it for
"name brand" <wink> languages.

> ...
> If it doesn't, I may have to stop writing code that depends on
> the assumption that floating point computation is exact for exactly
> representable integers.  If so, then we're reasonably safe; there
> aren't many non-IEEE machines left these days.

I'm afraid you've got no guarantees even on a box with 100% conforming 754
hardware.  One of the last "mystery bugs" I helped track down at my
previous employer only showed up under Intel's C++ compiler.  It turned out
the compiler was looking for code of the form:

    double *a, *b, scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] / scale;
    }

and rewriting it as:

    double __temp = 1./scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] * __temp;
    }

for speed.  As time goes on, PC compilers are becoming more and more like
Cray's and KSR's in this respect:  float division is much more expensive
than float mult, and so variations of "so multiply by the reciprocal
instead" are hard for vendors to resist.  And, e.g., under 754 double rules,

   (17. * 123.) * (1./123.)

must *not* yield exactly 17.0 if done wholly in 754 double (but then 754
says nothing about how any language maps that string to 754 operations).
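
The effect can be reproduced in Python, assuming its floats are 754
doubles evaluated with round-to-nearest and no extended-precision
intermediates (typical of current builds, but an assumption all the
same).  Variable names are illustrative:

```python
b = 17.0 * 123.0                 # exactly 2091.0
scale = 123.0

divided = b / scale              # what the source code asked for
reciprocal = b * (1.0 / scale)   # what the "optimized" loop computes

same = (divided == reciprocal)
```

Here 1.0/scale is rounded once and the product is rounded again, so
the two results differ in the last bit even though every input is an
exactly representable integer.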

if-you-like-logic-chopping-you'll-love-arguing-stds<wink>-ly y'rs  - tim




From guido at digicool.com  Tue Jan 30 00:59:34 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 18:59:34 -0500
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:30:56 PST."
             <20010123003056.A28309@glacier.fnational.com> 
References: <20010123003056.A28309@glacier.fnational.com> 
Message-ID: <200101292359.SAA20364@cj20424-a.reston1.va.home.com>

> Why is the configure.in file set to always use "install-sh"?
> There is a comment that says:
> 
>     # Install just never works :-(
> 
> I don't think that statement is accurate.  /usr/bin/install works
> quite well on my machine.  The only comments I can find in the
> changelog are:
> 
>     revision 1.16
>     date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
>     add INSTALL_PROGRAM and INSTALL_DATA; check for getopt
> 
> and:
> 
>     revision 1.5
>     date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
>     Simplify value of INSTALL (always 'cp').
> 
> Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
> documentation seems to indicate that it does what we want.

Neil,

It's too long for me to remember, and I bet this was before
AC_PROG_INSTALL.  If there's a reason to prefer a working "install"
over install-sh, feel free to do the right thing!  (You're in charge
of the Makefile anyway now, it seems. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan 30 01:17:25 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 29 Jan 2001 18:17:25 -0600 (CST)
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
Message-ID: <14966.2069.950895.627663@beluga.mojam.com>

After reading through this thread and noticing (but not paying close
attention to) all the related posts on c.l.py (subject: "in for dicts"), it
seems to me that the whole "if/for something in dict" thing needs to be
hashed out in a PEP.  There were a fair amount of "Python's changing too
fast" rants when 2.0 was released.  Adding a major feature such as this at
the 2.1 stage is only going to generate that many more rants.  The fact that
it was easy for Thomas to implement "if key in dict" doesn't make the
overall concept less controversial.  There are apparently lots of varying
opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
well warrant its own.

That said, I have plenty enough on my plate trying to keep Mojam afloat
these days, so I can't step into the crevasse, just observe that it looks to
me like a very long ways to the bottom... ;-)

Skip



From guido at digicool.com  Tue Jan 30 01:22:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 19:22:58 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Mon, 29 Jan 2001 18:17:25 CST."
             <14966.2069.950895.627663@beluga.mojam.com> 
References: <14966.2069.950895.627663@beluga.mojam.com> 
Message-ID: <200101300022.TAA21244@cj20424-a.reston1.va.home.com>

> After reading through this thread and noticing (but not paying close
> attention to) all the related posts on c.l.py (subject: "in for dicts"), it
> seems to me that the whole "if/for something in dict" thing needs to be
> hashed out in a PEP.  There were a fair amount of "Python's changing too
> fast" rants when 2.0 was released.  Adding a major feature such as this at
> the 2.1 stage is only going to generate that many more rants.  The fact that
> it was easy for Thomas to implement "if key in dict" doesn't make the
> overall concept less controversial.  There are apparently lots of varying
> opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
> Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
> well warrant its own.

Excellent.  Good reminder also that this shouldn't go into 2.1 --
clearly the design space is too complicated for a quick decision.

> That said, I have plenty enough on my plate trying to keep Mojam afloat
> these days, so I can't step into the crevasse, just observe that it looks to
> me like a very long ways to the bottom... ;-)

I'm not able to lead such a PEP effort myself either, but I hope
*someone* will be.  This PEP has a good chance for 2.2 though (what
with BDFL approval and all :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)




From tim.one at home.com  Tue Jan 30 02:39:17 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:39:17 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>

[Guido]
> I did a less sophisticated count but come to the same conclusion:
> iterations over items() are (somewhat) more common than over keys(),
> and values() are 1-2 orders of magnitude less common.  My numbers:
>
> $ cd python/src/Lib
> $ grep 'for .*items():' *.py | wc -l
>      47
> $ grep 'for .*keys():' *.py | wc -l
>      43
> $ grep 'for .*values():' *.py | wc -l
>       2

I like my larger sample and anal methodology better <wink>.  A closer look
showed that it may have been unduly biased by the mass of files in
Lib/encodings/, where

encoding_map = {}
for k,v in decoding_map.items():
    encoding_map[v] = k

is at the end of most files (btw, MAL, that's the answer to your question:
people would expect "the same" ordering you expected there, i.e. none in
particular).

> ...
> I don't attach much value to the readability argument: typically, one will
> write "for key in dict" or "for name in dict" and then it's obvious
> what is meant.

Well, "fiddlesticks" comes to mind <0.9 wink>.  If I've got a dict mapping
phone numbers to names, "for name in dict" is dead backwards.

    for vevent in keydefs.keys():
    for x in self.subdirs.keys():
    for name in lsumdict.keys():
    for locale in self.descriptions.keys():
    for name in attrs.keys():
    for func in other.top_level.keys():
    for func in target.keys():
    for i in u2.keys():
    for s in d.keys():
    for url in self.bad.keys():

are other cases in the CVS tree where I don't think the name makes it
obvious in the absence of ".keys()".

But I don't personally give any weight to whether people can guess what
something does at first glance.  My rule is that it doesn't matter, provided
it's (a) easy to learn; and (especially), (b) hard to *forget* once you've
learned it.  A classic example is Python's "points between elements"
treatment of slice indices:  few people guess right what that does at first
glance, but once they "get it" they're delighted and rarely mess up again.

And I think this is "like that".
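
For anyone who has not met the slice rule yet, a short illustration.
The string is arbitrary; the indices name the gaps between characters:

```python
#   gaps:  0   1   2   3   4   5   6
#          | P | y | t | h | o | n |
word = "Python"

middle = word[1:4]               # everything between gap 1 and gap 4
head, tail = word[:2], word[2:]  # cutting at one gap loses nothing
empty = word[3:3]                # same gap twice: nothing in between
```

The length of word[i:j] is j - i whenever both indices are in range,
which is the part people rarely forget once they have seen it.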

> ...
> But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
> has even asked me for a has_item() method).

Yup.

> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean
> "for x in dict.items()".

That's why I brought it up -- it's not entirely clear what's to be done
here.

> I also think that defining "x in dict" but not "for x in dict" will
> be confusing.
>
> So we need to think more.

The hoped-for next step indeed.

> How about:
>
>     for key in dict: ...		# ... over keys
>
>     for key:value in dict: ...		# ... over items
>
> This is syntactically unambiguous (a colon is currently illegal in
> that position).

Cool!  Can we resist adding

    if key:value in dict

for "parallelism"?  (I know I can ...)  2/3rd of these are marginally more
attractive:

    for key: in dict:    # over dict.keys()
    for :value in dict:  # over dict.values()
    for : in dict:       # a delay loop

> This also suggests:
>
>     for index:value in list: ...	# ... over zip(range(len(list)), list)
>
> which doesn't strike me as bad or ugly, and would fulfill my brother's
> dearest wish.

You mean besides the one that you fry in hell for not adding "for ...
indexing"?  Ya, probably.

> (And why didn't we think of this before?)

Best guess:  we were focused exclusively on sequences, and a colon just
didn't suggest itself in that context.  Second-best guess:  having finally
approved one of these gimmicks, you finally got desperate enough to make it
work <wink>.

ponderingly y'rs  - tim




From tim.one at home.com  Tue Jan 30 02:58:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBPIMAA.tim.one@home.com>

[Guido]
> ...
> I'm expecting (though don't have much proof) that most loops over
> dicts don't mutate the dict.

Safe bet!  I do recall writing one once:  it del'ed keys for which the
associated count was 1, because the rest of the algorithm was only
interested in duplicates.

> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

That latter is true but specific to this implementation.  "Can't mutate the
dict period" is easier to keep straight, and probably harmless in practice
(if not, it could be relaxed later).  Recall that a similar trick is played
during list.sort(), replacing the list's type pointer for the duration (to
point to an internal "immutable list" type, same as the list type except the
"dangerous" slots point to a function that raises an "immutable list"
TypeError).  Then no runtime expense is incurred for regular lists to keep
checking flags.  I thought of this as an elegant use for switching types at
runtime; you may still be appalled by it, though!
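
The trick can be imitated at the Python level, as a sketch.  This is
not the C code in listobject.c; the class and function names are
hypothetical, and it relies on `__class__` assignment between two
layout-compatible list subclasses:

```python
class GuardedList(list):
    """Plain list subclass; all the usual mutations are allowed."""


class _ImmutableList(GuardedList):
    """Same layout, but the "dangerous" slots raise TypeError."""

    def _refuse(self, *args, **kwargs):
        raise TypeError("immutable list")

    append = extend = insert = pop = remove = _refuse
    __setitem__ = __delitem__ = _refuse


def guarded_sort(lst, key=None):
    """Sort in place, swapping the type for the duration of the sort."""
    lst.__class__ = _ImmutableList
    try:
        list.sort(lst, key=key)
    finally:
        lst.__class__ = GuardedList   # always restore the mutable type
```

Outside guarded_sort() no flag is ever checked, which is the point of
swapping the type instead of testing a bit on every mutation.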




From tim.one at home.com  Tue Jan 30 03:07:36 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:07:36 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75B190.3FD2A883@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECAIMAA.tim.one@home.com>

[Guido]
> The same order that for k,v in dict.items() will yield, of course.

[MAL]
> And then people find out that the order has some sorting
> properties and start to use it...

Except that it has none.  dict insertion has never used any comparison
outcome beyond "equal"/"not equal", so any ordering you think you see is--
and always was --an illusion.




From guido at digicool.com  Tue Jan 30 03:06:35 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:06:35 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 20:39:17 EST."
             <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com> 
Message-ID: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>

This is all PEP material now.  Tim, do you want to own the PEP?  It
seems just up your alley!

> Cool!  Can we resist adding
> 
>     if key:value in dict
> 
> for "parallelism"?  (I know I can ...)

That's easy to resist because, unlike ``for key:value in dict'', it's
not unambiguous: ``if key:value in dict'' is already legal syntax
currently, with 'key' as the condition and 'value in dict' as the (not
particularly useful) body of the if statement.

> > (And why didn't we think of this before?)
> 
> Best guess:  we were focused exclusively on sequences, and a colon just
> didn't suggest itself in that context.  Second-best guess:  having finally
> approved one of these gimmicks, you finally got desperate enough to make it
> work <wink>.

I'm certainly more comfortable with just ``for key in dict'' than with
the whole slew of extensions using colons.

But, again, that's for the PEP to fight over.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 03:15:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:15:04 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:09:14 EST."
             <20010129150914.C10191@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>  
            <20010129150914.C10191@thyrsus.com> 
Message-ID: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>

[ESR]
> There's not much I miss from C these days, but one thing I wish Python
> had is a more general for-loop.  The C semantics that let you have 
> any initialization, any termination test, and any iteration you like
> are rather cool.
> 
> Yes, I realize that
> 
> 	for (<init>; <test>; <step>) {<body>}
> 
> can be simulated with:
> 
> 	<init>
> 	while 1:
> 		if not <test>:
> 			break
> 		<body>
> 		<step>
> 
> Still, having them spatially grouped the way a C for does it is nice.
> Makes it easier to see invariants, I think.

Hm, I've seen too many ugly C for loops to have much appreciation for
it.  I can recognize and appreciate the few common forms that clearly
iterate over an array; most other forms look rather contorted to me.
Check out the Python C sources; if you find anything more complicated
than ``for (i = n; i > 0; i--)'' I probably didn't write
it. :-)

Common abominations include:

- writing a while loop as for(;<test>;)

- putting arbitrary initialization code in <init>

- having an empty condition, so the <step> becomes an arbitrary
  extension of the body that's written out-of-sequence

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 30 03:19:12 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:19:12 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>

[MAL]
> I just wanted to hint at a problem which iterating over items
> in an unordered set can cause.  Especially new Python users will find
> it confusing that the order of the items in an iteration can change
> from one run to the next.

Do they find "for k, v in dict.items()" confusing now?  Would be the same.

> ...
> What we really want is iterators for dictionaries, so why not
> implement these instead of tweaking for-loops.

Seems an unrelated topic:  would "iterators for dictionaries" solve the
supposed problem with iteration order?

> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.
>
> I have had good results with a special counter object
> (sort of like a mutable integer) which is used instead of the
> iteration index integer in the current implementation.

Please quantify, if possible.  My belief (based on past experiments) is that
in loops fancier than

    for i in range(n):
        pass

the loop overhead quickly falls into the noise even now.

> Using an iterator object instead of the integer + __getitem__
> call machinery would allow more flexibility for all kinds of
> sequences or containers. ...

This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
iteration *protocol* could have major attractions.




From guido at digicool.com  Tue Jan 30 03:17:27 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:17:27 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:02:47 EST."
             <20010129150247.B10191@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>  
            <20010129150247.B10191@thyrsus.com> 
Message-ID: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>

[ESR]
> For different reasons, I'd like to be able to set a constant flag on a
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.
> 
> Application?  I have a large Python program that goes to a lot of effort
> to build elaborate context structures in core.  It would be nice to know
> they can't be even inadvertently trashed without throwing an exception I 
> can watch for.

Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:

- How to spell it?  x.freeze()?  x.readonly()?

- Should this be reversible?  I.e. should there be an x.unfreeze()?

- Should we support something like this for instances too?  Sometimes
  it might be cool to be able to freeze changing attribute values...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 30 03:29:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:29:25 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECBIMAA.tim.one@home.com>

Check out SETL's loop statement.  I think Perl5 is a subset of it <0.9
wink>.




From esr at thyrsus.com  Tue Jan 30 03:34:01 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:34:01 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:15:04PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <20010129213401.A17235@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Common abominations include:
> 
> - writing a while loop as for(;<test>;)

Agreed. Bletch.
 
> - putting arbitrary initialization code in <init>

Not sure what's "arbitrary", unless you mean unrelated to the 
iteration variable.

> - having an empty condition, so the <step> becomes an arbitrary
>   extension of the body that's written out-of-sequence

Again agreed.  Double bletch.

I guess my archetype of the cute C for-loop is the idiom for 
pointer-list traversal:

	struct foo {int data; struct foo *next;} *ptr, *head; 

	for (ptr = head; ptr; ptr = ptr->next)
		do_something_with(ptr->data);

This is elegant.  It separates the logic for list traversal from the
operation on the list element.

Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
a patch for that once, and the discussion sort of died.  Were you dead
set against it, or should I revive this proposal?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"The bearing of arms is the essential medium through which the
individual asserts both his social power and his participation in
politics as a responsible moral being..."
        -- J.G.A. Pocock, describing the beliefs of the founders of the U.S.



From esr at thyrsus.com  Tue Jan 30 03:49:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:49:59 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:17:27PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <20010129214959.B17235@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I like "freeze"; it's a clear imperative where "readonly()" sounds
like a test (e.g. "is this readonly()?")
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Moshe Zadka sent me a hack that handles instances:

> class MarkableAsConstant:
> 
> 	def __init__(self):
> 		self.mark_writable()
> 
> 	def __setattr__(self, name, value):
> 		if self._writable:
> 			self.__dict__[name] = value
> 		else:
> 			raise ValueError, "object is read only"
> 
> 	def mark_writable(self):
> 		self.__dict__['_writable'] = 1
> 
> 	def mark_readonly(self):
> 		self.__dict__['_writable'] = 0
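
For readers on a current Python, the same hack might be spelled roughly as
below.  This is a sketch rather than Moshe's code: only the raise syntax is
updated, and the Context subclass and its "name" attribute are hypothetical
names added to show usage.

```python
class MarkableAsConstant:
    def __init__(self):
        self.mark_writable()

    def __setattr__(self, name, value):
        # Every attribute assignment goes through here.
        if self._writable:
            self.__dict__[name] = value
        else:
            raise ValueError("object is read only")

    def mark_writable(self):
        # Write through __dict__ so the flag itself bypasses __setattr__.
        self.__dict__['_writable'] = 1

    def mark_readonly(self):
        self.__dict__['_writable'] = 0


class Context(MarkableAsConstant):  # hypothetical example class
    def __init__(self, name):
        MarkableAsConstant.__init__(self)
        self.name = name


ctx = Context("build")
ctx.mark_readonly()
try:
    ctx.name = "other"
except ValueError as exc:
    print("blocked:", exc)
```

Note that this only guards plain attribute assignment; deleting attributes
or writing to ctx.__dict__ directly still gets through.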

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

I gave this some thought earlier today.  There are advantages to either
way.  Making freeze a one-way operation would make it possible to use
freezing to get certain kinds of security and integrity guarantees that
you can't have if freezing is reversible.

Fortunately, there's a semantics that captures both.  If we allow
freeze to take an optional key argument, and require that an unfreeze
call must supply the same key or fail, we get both worlds.  We can
even one-way-hash the keys so they don't have to be stored in the
bytecode.

Want to lock a structure permanently?  Pick a random long key.  Freeze
with it.  Then throw that key away...
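
A minimal sketch of that keyed semantics, written for a current Python and
using attribute assignment as the thing being frozen.  The class name
KeyedFreezable, the choice of SHA-256 over repr(key), and the ValueError
exceptions are all assumptions made for illustration; nothing like this
exists in the language.

```python
import hashlib
import secrets


class KeyedFreezable:
    """Hypothetical mixin: freeze(key) can only be undone by unfreeze(key)."""

    @staticmethod
    def _digest(key):
        # Store only a one-way hash of the key, never the key itself.
        return hashlib.sha256(repr(key).encode("utf-8")).hexdigest()

    def freeze(self, key=None):
        self.__dict__['_frozen_digest'] = self._digest(key)

    def unfreeze(self, key=None):
        stored = self.__dict__.get('_frozen_digest')
        if stored is None:
            return  # not frozen, nothing to do
        if self._digest(key) != stored:
            raise ValueError("wrong key")
        self.__dict__['_frozen_digest'] = None

    def __setattr__(self, name, value):
        if self.__dict__.get('_frozen_digest') is not None:
            raise ValueError("object is frozen")
        self.__dict__[name] = value


s = KeyedFreezable()
s.x = 1
s.freeze("sesame")
s.unfreeze("sesame")   # same key: reversible
s.x = 2

# Permanent lock: freeze with a random key and keep no reference to it.
s.freeze(secrets.randbits(128))
```

As written this is advisory only, since anyone can still reach into
s.__dict__; a real guarantee would need support below the Python level.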
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Strict gun laws are about as effective as strict drug laws...It pains
me to say this, but the NRA seems to be right: The cities and states
that have the toughest gun laws have the most murder and mayhem.
        -- Mike Royko, Chicago Tribune



From tim.one at home.com  Tue Jan 30 03:57:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:57:59 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECDIMAA.tim.one@home.com>

[Guido]
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
>
> - How to spell it?  x.freeze()?  x.readonly()?

See below.

> - Should this be reversible?

Of course.  Or x.freeze(solid=1) to default to permanent rigidity, but not
require it.

>  I.e. should there be an x.unfreeze()?

That conveniently answers the first question, since x.unreadonly() reads
horribly <wink>.

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

"Should be" supported for every mutable object.  Next step:  as in endless
C++ debates, endless Python debates about "representation freeze" vs
"logical freeze" ("well, yes, I'm changing this member, but it's just an
invisible cache so I *should* be able to tag the object as const anyway
..."; etc etc etc).

keep-it-simple-ly y'rs  - tim




From guido at digicool.com  Tue Jan 30 03:57:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:57:24 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:34:01 EST."
             <20010129213401.A17235@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>  
            <20010129213401.A17235@thyrsus.com> 
Message-ID: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>

> > - putting arbitrary initialization code in <init>
> 
> Not sure what's "arbitrary", unless you mean unrelated to the 
> iteration variable.

Yes, that.

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:
> 
> 	struct foo {int data; struct foo *next;} *ptr, *head; 
> 
> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);
> 
> This is elegant.  It separates the logic for list traversal from the
> operation on the list element.

And it rarely happens in Python, because sequences are rarely
represented as linked lists.

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Not dead set against something like it, but dead set against the ?:
syntax because then : becomes too overloaded for the human reader, e.g.:

    if foo ? bar : bletch : spam = eggs

If you want to revive this, I strongly suggest writing a PEP first
before posting here.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 03:59:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:59:17 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:49:59 EST."
             <20010129214959.B17235@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>  
            <20010129214959.B17235@thyrsus.com> 
Message-ID: <200101300259.VAA22208@cj20424-a.reston1.va.home.com>

> > - How to spell it?  x.freeze()?  x.readonly()?
> 
> I like "freeze"; it's a clear imperative where "readonly()" sounds
> like a test (e.g. "is this readonly()?")

Agreed.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> Moshe Zadka sent me a hack that handles instances:
[...]

OK, so no special support needed there.

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> I gave this some thought earlier today.  There are advantages to either
> way.  Making freeze a one-way operation would make it possible to use
> freezing to get certain kinds of security and integrity guarantees that
> you can't have if freezing is reversible.
> 
> Fortunately, there's a semantics that captures both.  If we allow
> freeze to take an optional key argument, and require that an unfreeze
> call must supply the same key or fail, we get both worlds.  We can
> even one-way-hash the keys so they don't have to be stored in the
> bytecode.
> 
> Want to lock a structure permanently?  Pick a random long key.  Freeze
> with it.  Then throw that key away...

Way too cute.  My suggestion freeze(0) freezes forever, freeze(1)
can be unfrozen.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 30 04:06:19 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 22:06:19 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:57:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com> <200101300257.VAA22186@cj20424-a.reston1.va.home.com>
Message-ID: <20010129220619.A17713@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Not dead set against something like it, but dead set against the ?:
> syntax because then : becomes too overloaded for the human reader, e.g.:
> 
>     if foo ? bar : bletch : spam = eggs
> 
> If you want to revive this, I strongly suggest writing a PEP first
> before posting here.

Noted.  Will do.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.



From tim.one at home.com  Tue Jan 30 04:18:47 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:18:47 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <20010129214959.B17235@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>

Note that even adding a "frozen" flag would add 4 bytes to every freezable
object on most machines.  That's why I'd rather .freeze() replace the type
pointer and .unfreeze() restore it.  No time or space overhead; no
cluttering up the normal-case (i.e., unfrozen) type implementations with new
tests.
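
The type-pointer swap can be imitated at the Python level by assigning to
__class__.  This is only an analogy to the C-level idea, sketched for a
current Python: Bag and FrozenBag are hypothetical classes, and the trick
needs a user-defined subclass because plain built-in list objects do not
permit __class__ assignment.

```python
class Bag(list):
    """Hypothetical mutable container; the normal, unfrozen type."""


class FrozenBag(Bag):
    """Same data layout as Bag, but every mutating operation raises."""

    def _readonly(self, *args, **kwargs):
        raise TypeError("object is frozen")

    __setitem__ = __delitem__ = __iadd__ = __imul__ = _readonly
    append = extend = insert = pop = remove = _readonly
    clear = sort = reverse = _readonly


def freeze(obj):
    # Swap the type; the object's contents and identity are untouched.
    obj.__class__ = FrozenBag


def unfreeze(obj):
    obj.__class__ = Bag


b = Bag([1, 2])
freeze(b)
try:
    b.append(3)
except TypeError as exc:
    print("blocked:", exc)
unfreeze(b)
b.append(3)
```

The unfrozen Bag carries no flag and no extra checks; all the refusal
logic lives in the frozen type, which is the point of the suggestion.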




From tim.one at home.com  Tue Jan 30 04:57:07 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:57:07 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>

Note that optimizing compilers use a pile of linear-time heuristics to
attempt to solve exponential-time optimization problems (from optimal
register assignment to optimal instruction scheduling, they're all formally
intractable even in isolation).

When code gets non-trivial, not even a compiler's chief designer can
reliably outguess what optimization may do.  It's really not unusual for a
higher optimization level to yield slower code, and especially not when the
source code is pushing or exceeding machine limits (# of registers, # of
instruction pipes, size of branch-prediction buffers; I-cache structure;
dynamic restrictions on execution units; ...).

[Jeremy]
> ...
> One of the differences between -O2 and -O3, according to the man page,
> is that -O3 will perform optimizations that involve a space-speed
> tradeoff.  It also includes -finline-functions.  I can imagine that
> some of these optimizations hurt memory performance enough to make a
> difference.

One of the time-consuming ongoing tasks at my last employer was running
profiles and using them to override counterproductive compiler inlining
decisions (in both directions).  It's not just memory that excessive
inlining can screw up, but also things like running out of registers and so
inserting gobs of register spill/restore code, and inlining so much code
that the instruction scheduler effectively gives up (under many compilers, a
sure sign of this is when you look at the generated code for a function, and
it looks beautiful "at the top" but terrible "at the bottom"; some clever
optimizers tried to get around that by optimizing "bottom-up", and then it
looks beautiful at the bottom but terrible at the top <0.5 wink>; others
work middle-out or burn the candle at both ends, with visible consequences
you should be able to recognize now!).

optimization-is-easier-than-speech-recog-but-the-latter-doesn't-work-
    all-that-well-either-ly y'rs  - tim




From barry at digicool.com  Tue Jan 30 05:13:24 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:13:24 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <14966.16228.548177.112853@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> it seems to me that the whole "if/for something in dict" thing
    SM> needs to be hashed out in a PEP.
    
    SM> There are apparently lots of varying opinions about what's
    SM> reasonable.  This topic seems related to PEP 212 (Loop Counter
    SM> Iteration) and PEP 218 (Adding a Built-In Set Object Type),
    SM> but may well warrant its own.

As keeper of PEP0, I have to agree.  I personally would vastly prefer
a new iterator protocol to syntax such as "for key:value in dict".
I'd really like to see a PEP on an iterator protocol for Python, but
like Skip, I'm too busy at the moment to do it myself.  If nobody
takes it on before then, I might be willing to champion such a PEP for
the 2.2 time frame.  Until then, I'm decidedly -1 on "for/if in dict".

-Barry



From barry at digicool.com  Tue Jan 30 05:25:09 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:25:09 -0500
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <14966.16933.209494.214183@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Yes, this is a good thing.  Easy to do on lists and dicts.
    GvR> Questions:

    GvR> - How to spell it?  x.freeze()?  x.readonly()?

    GvR> - Should this be reversible?  I.e. should there be an
    GvR> x.unfreeze()?

    GvR> - Should we support something like this for instances too?
    GvR> Sometimes it might be cool to be able to freeze changing
    GvR> attribute values...

lock(x) ...? :)

-Barry



From barry at digicool.com  Tue Jan 30 05:26:50 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:26:50 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<20010129214959.B17235@thyrsus.com>
Message-ID: <14966.17034.721204.305315@anthem.wooz.org>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

    ESR> Fortunately, there's a semantics that captures both.  If we
    ESR> allow freeze to take an optional key argument, and require
    ESR> that an unfreeze call must supply the same key or fail, we
    ESR> get both worlds.  We can even one-way-hash the keys so they
    ESR> don't have to be stored in the bytecode.

    ESR> Want to lock a structure permanently?  Pick a random long
    ESR> key.  Freeze with it.  Then throw that key away...

Clever!



From esr at thyrsus.com  Tue Jan 30 05:32:16 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 23:32:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <14966.16933.209494.214183@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 29, 2001 at 11:25:09PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <14966.16933.209494.214183@anthem.wooz.org>
Message-ID: <20010129233215.A18533@thyrsus.com>

Barry A. Warsaw <barry at digicool.com>:
> lock(x) ...? :)

I was thinking that myself, Barry.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Boys who own legal firearms have much lower rates of delinquency and
drug use and are even slightly less delinquent than nonowners of guns."
	-- U.S. Department of Justice, National Institute of
	   Justice, Office of Juvenile Justice and Delinquency Prevention,
	   NCJ-143454, "Urban Delinquency and Substance Abuse," August 1995.



From tim.one at home.com  Tue Jan 30 05:56:09 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 23:56:09 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
Message-ID: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com>

I tried to open an SF bug for the following msg from c.l.py, but SF balked:

    ERROR
    ERROR getting bug_id

Logged out, logged in, tried it again, same outcome.

Intended bug report content:

Good question from c.l.py, assigned to Guido cuz he's a Socket Guy:

From: Clarence Gardner <clarence at netlojix.com>
Subject: RE: Thread Safety
Date: Mon, 29 Jan 2001 09:51:03 -0800

...

I'm going to repeat a question that I posted about a week ago that passed
without comment on the newsgroup. The issue is the SSL support in the socket
module, which raises an exception when the reading socket is at EOF, rather
than returning an empty string. I'm hesitant to call it a "bug", but I
wouldn't have implemented it this way.  There are the names of two people
mentioned at the top of socketmodule.c, but no contact information, so I'm
suggesting here that it be changed to conform to normal file/socket
practice. (SSL was actually added at 2.0, so I'm late to the party with
this; mea culpa, mea culpa.  I delayed trying Python2 because of the
extension rebuilding.)
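
For reference, the "normal file/socket practice" being asked for looks
roughly like this on a plain (non-SSL) socket in a current Python, where
recv() returns bytes.  The sketch uses socket.socketpair() purely to get a
connected pair without a network; it does not exercise the SSL code in
question.

```python
import socket

# A connected pair of sockets; no network or SSL involved.
a, b = socket.socketpair()
a.sendall(b"hello")
a.close()  # the writer closes, so the reader will hit EOF

chunks = []
while True:
    data = b.recv(1024)
    if not data:
        # EOF is signalled by an empty result, not by an exception.
        break
    chunks.append(data)
b.close()
print(b"".join(chunks))
```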




From thomas at xs4all.net  Tue Jan 30 07:14:20 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:14:20 +0100
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <20010129213401.A17235@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 29, 2001 at 09:34:01PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com>
Message-ID: <20010130071420.U962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:34:01PM -0500, Eric S. Raymond wrote:

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:

> 	struct foo {int data; struct foo *next;} *ptr, *head; 

> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);

Note two things: in Python, you would use a list, so 'for x in list' does
exactly what you want here ;) And if you really need it, you could use
iterators for exactly this (once we have them, of course): you are inventing
a new storage type. Quite common in C, since the only one it has is useless
for anything other than strings<wink>, but not so common in Python.
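
For the "if you really need it" case, here is a sketch of what iterating
over a hand-rolled linked list could look like.  It is written with
generators from later Python versions, which did not exist when this was
posted; the Node class and the traverse() name are hypothetical.

```python
class Node:
    """Hypothetical singly linked list node, mirroring struct foo."""

    def __init__(self, data, next=None):
        self.data = data
        self.next = next


def traverse(head):
    # Same shape as the C loop: start at head, stop at the null link.
    ptr = head
    while ptr is not None:
        yield ptr.data
        ptr = ptr.next


head = Node(1, Node(2, Node(3)))
for data in traverse(head):
    print(data)
```

The traversal logic stays in one place and the loop body only sees the
element, which is the separation the C idiom was praised for.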

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Triple blech. Guido will never go for it! (There, increased your chance of
getting it approved! :) Seriously though, I wouldn't like it much, it's too
cryptic a syntax. I notice I use it less and less in C, too.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 30 07:18:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:18:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 29, 2001 at 08:39:17PM -0500
References: <200101291448.JAA11473@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>
Message-ID: <20010130071825.V962@xs4all.nl>

On Mon, Jan 29, 2001 at 08:39:17PM -0500, Tim Peters wrote:

>     for key: in dict:    # over dict.keys()
>     for :value in dict:  # over dict.values()
>     for : in dict:       # a delay loop

Wot's the last one supposed to do ? 'for unused_var in range(len(dict)):' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Tue Jan 30 07:25:51 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 01:25:51 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010130071825.V962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECLIMAA.tim.one@home.com>

>>     for key: in dict:    # over dict.keys()
>>     for :value in dict:  # over dict.values()
>>     for : in dict:       # a delay loop

[Thomas Wouters]
> Wot's the last one supposed to do ? 'for unused_var in
> range(len(dict)):' ?

Well, as the preceding line said in the original:

>>    2/3rd of these are marginally more attractive [than
>>    "if key:value in dict"]:

I think you've guessed which 2/3 those are <wink>.  I don't see that the
last line has any visible semantics whatsoever, so Python can do whatever it
likes, provided it doesn't do anything visible.

You still hang out on c.l.py!  So you gotta know that if something of the
form

    x:y

is suggested, people will line up to suggest meanings for the 3 obvious
variations, along with

    x::y

and

    x:-:y

and

    x lambda y

too <0.9 wink>.




From thomas at xs4all.net  Tue Jan 30 07:26:48 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:26:48 +0100
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14966.2069.950895.627663@beluga.mojam.com>; from skip@mojam.com on Mon, Jan 29, 2001 at 06:17:25PM -0600
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <20010130072648.W962@xs4all.nl>

On Mon, Jan 29, 2001 at 06:17:25PM -0600, Skip Montanaro wrote:

> The fact that it was easy for Thomas to implement "if key in dict" doesn't
> make the overall concept less controversial.

Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
posts on python-list.) In fact, *while implementing it*, I grew from +0 to
-0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
subject of the patch was a weak attempt at 5AM humour, not a venting of an
ancient desire :)

More-5AM-humour-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 30 07:55:16 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:55:16 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Mon, Jan 29, 2001 at 05:27:30PM -0800
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010130075515.X962@xs4all.nl>

On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:

> add note about two kinds of illegal imports that are now checked

> + - The compiler will report a SyntaxError if "from ... import *" occurs
> +   in a function or class scope or if a name bound by the import
> +   statement is declared global in the same scope.  The language
> +   reference has also documented that these cases are illegal, but
> +   they were not enforced.

Woah. Is this really a good idea ? I have seen 'from ... import *' in a
function scope put to good (relatively -- we're talking 'import *' here)
use. I also thought of 'import' as yet another assignment statement, so to
me it's both logical and consistent if 'import' would listen to 'global'.
Otherwise we have to re-invent 'import spam; eggs = spam' if we want eggs to
be global. 

Is there really a reason to enforce this, or are we enforcing the wording of
the language reference for the sake of enforcing the wording of the language
reference ? When writing 'import as' for 2.0, I fixed some of the
inconsistencies in import, making it adhere to 'global' statements in as
many cases as possible (all except 'from ... import *') but I was apparently
not aware of the wording of the language reference. I'd suggest updating the
wording in the language reference, not the implementation, unless there is a
good reason to disallow this.
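
To make the 'import listens to global' case concrete, here is a sketch of
the pattern.  It shows how a current CPython 3 treats it, as far as I
know, not what the checkin under discussion does; the json module stands
in for 'spam', and load_codec is a hypothetical name.

```python
def load_codec():
    # The import binds 'eggs', and 'global' makes that binding module-level.
    global eggs
    import json as eggs  # json is only a stand-in for "spam"


load_codec()
print(eggs.dumps([1, 2]))
```

Without this, the same effect needs the two-step 'import spam; eggs = spam'
spelling with 'eggs' declared global.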

I also have another issue with your recent patches, Jeremy, also in the
backwards-compatibility department :) You gave new.code two new,
non-optional arguments, in the middle of the long argument list. I sent a
note about it to python-checkins instead of python-dev by accident, but Fred
seemed to agree with me there.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mwh21 at cam.ac.uk  Tue Jan 30 09:30:15 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 30 Jan 2001 08:30:15 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "Tim Peters"'s message of "Mon, 29 Jan 2001 22:57:07 -0500"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>
Message-ID: <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>

In the interest of generating some numbers (and filling up my hard
drive), last night I wrote a script to build lots & lots of versions
of python (many of which turned out to be redundant - eg. -O6 didn't
seem to do anything different to -O3 and pybench doesn't work with
1.5.2), and then run pybench with them.  Summarised results below;
first a key:

src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
        (only built this with -O3)
src: CVS from yesterday afternoon
src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc 
        patch applied.  More on this later...
Python-2.0: you can guess what this is.

All runs are compared against Python-2.0-O2:

Benchmark: src-n-O3 (rounds=10, warp=20)
            Average round time:   49029.00 ms              -0.86%
Benchmark: src (rounds=10, warp=20)
            Average round time:   67141.00 ms             +35.76%
Benchmark: src-O (rounds=10, warp=20)
            Average round time:   50167.00 ms              +1.44%
Benchmark: src-O2 (rounds=10, warp=20)
            Average round time:   49641.00 ms              +0.37%
Benchmark: src-O3 (rounds=10, warp=20)
            Average round time:   49104.00 ms              -0.71%
Benchmark: src-O6 (rounds=10, warp=20)
            Average round time:   49131.00 ms              -0.66%
Benchmark: src-obmalloc (rounds=10, warp=20)
            Average round time:   63276.00 ms             +27.94%
Benchmark: src-obmalloc-O (rounds=10, warp=20)
            Average round time:   46927.00 ms              -5.11%
Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
            Average round time:   46146.00 ms              -6.69%
Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
            Average round time:   46456.00 ms              -6.07%
Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
            Average round time:   46450.00 ms              -6.08%
Benchmark: Python-2.0 (rounds=10, warp=20)
            Average round time:   68933.00 ms             +39.38%
Benchmark: Python-2.0-O (rounds=10, warp=20)
            Average round time:   49542.00 ms              +0.17%
Benchmark: Python-2.0-O3 (rounds=10, warp=20)
            Average round time:   48262.00 ms              -2.41%
Benchmark: Python-2.0-O6 (rounds=10, warp=20)
            Average round time:   48273.00 ms              -2.39%

My conclusion?  Python 2.1 is slower than Python 2.0, but not by
enough to care about.
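
For anyone wanting to re-derive the percentages: the Python-2.0-O2
baseline time is not in the list above, so the sketch below infers it from
the src-n-O3 row (49029.00 ms at -0.86%).  That inferred value of roughly
49454 ms is an assumption, not a measured number, and percent_diff is a
hypothetical helper.

```python
# Baseline inferred, not reported: 49029.00 ms at -0.86% implies about 49454 ms.
BASELINE_MS = 49029.00 / (1 - 0.0086)


def percent_diff(round_ms, baseline_ms=BASELINE_MS):
    # Positive means slower than the baseline, negative means faster.
    return (round_ms - baseline_ms) / baseline_ms * 100.0


rows = [
    ("src", 67141.00),
    ("src-O3", 49104.00),
    ("src-obmalloc-O2", 46146.00),
]
for name, ms in rows:
    print("%-16s %+6.2f%%" % (name, percent_diff(ms)))
```

Because the baseline is reconstructed from a rounded percentage, results
can differ from the table in the last digit.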

Interestingly, adding obmalloc speeds things up.  Let's take a closer
look:

$ python pybench.py -c src-obmalloc-O3 -s src-O3      
PYBENCH 0.7

Benchmark: src-O3 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
           BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
                 ConcatStrings:    1068.80 ms    7.13 us   -1.22%
                 ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
               CreateInstances:    1433.55 ms   34.13 us   +9.06%
       CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
       CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
                  DictCreation:    1275.80 ms    8.51 us  +44.22%
                      ForLoops:    1415.90 ms  141.59 us   -0.64%
                    IfThenElse:    1152.70 ms    1.71 us   -0.15%
                   ListSlicing:     397.40 ms  113.54 us   -0.53%
                NestedForLoops:     789.75 ms    2.26 us   -0.37%
          NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
       NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
           PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
             PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
                     Recursion:     838.50 ms   67.08 us   -0.00%
                  SecondImport:     741.20 ms   29.65 us  +25.57%
           SecondPackageImport:     744.25 ms   29.77 us  +18.66%
         SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
       SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
        SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
         SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
      SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
       SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
        SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
          SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
                    SmallLists:    1657.65 ms    6.50 us   +6.63%
                   SmallTuples:    1143.95 ms    4.77 us   +2.90%
         SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
      SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
                StringMappings:    1161.00 ms    9.21 us   +7.30%
              StringPredicates:    1069.65 ms    3.82 us   -5.30%
                 StringSlicing:     846.30 ms    4.84 us   +8.61%
                     TryExcept:    1590.40 ms    1.06 us   -0.49%
                TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
                  TupleSlicing:     681.10 ms    6.49 us   -3.13%
               UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
             UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
             UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
                UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
------------------------------------------------------------------------
            Average round time:   49104.00 ms              +5.70%

*) measured against: src-obmalloc-O3 (rounds=10, warp=20)
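
As an illustration (not part of the original message), the +5.70% on the
summary line is consistent with a plain relative difference between the
two average round times quoted in this message. This is a minimal
sketch, assuming that is how the figure is derived; the helper name is
made up for the example.

```python
def relative_diff_percent(measured_ms, baseline_ms):
    """Percentage by which measured_ms exceeds baseline_ms."""
    return (measured_ms - baseline_ms) / baseline_ms * 100.0


src_o3 = 49104.00           # src-O3 average round time (ms), from the table
src_obmalloc_o3 = 46456.00  # src-obmalloc-O3 average round time (ms)

diff = relative_diff_percent(src_o3, src_obmalloc_o3)
print("%+.2f%%" % diff)  # prints +5.70%
```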

Words fail me slightly, but maybe some tuning of the memory allocation
of longs & complex numbers would be in order?

Time for lectures - I don't think algebraic geometry is going to make
my head hurt as much as trying to explain benchmarks...

Cheers,
M.

-- 
  ARTHUR:  But which is probably incapable of drinking the coffee.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 6




From ping at lfw.org  Tue Jan 30 09:38:12 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:38:12 -0800 (PST)
Subject: [Python-Dev] Read-only function attributes
Message-ID: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>

Hi there.

I see that the function attribute feature specifically allows
assignment to func_code and func_defaults, but no other special
attributes.  This seems really suspect to me.  Why would we want
to allow the reassignment of special attributes at all?

Functions have always been immutable objects, and i can see some
motivation for attaching mutable dictionaries to them, but it's
a more serious move to make the functions mutable themselves.

I don't recall any discussion about changing special attributes;
i don't see a clear purpose to them; and i do see a danger in
making it harder to be certain that a program is safe and predictable.

(Yes, i did notice that function attributes can't be set in
restricted mode, but the addition of extra features requiring
extra security checks makes me uneasy.)


-- ?!ng




From ping at lfw.org  Tue Jan 30 09:52:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:52:43 -0800 (PST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org>

Eric S. Raymond wrote:
> For different reasons, I'd like to be able to set a constant flag on a
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.

Guido van Rossum wrote:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I'm not so sure.  There seem to be many issues here.  More questions:

What's the difference between a frozen list and a tuple?

Is a frozen list hashable?

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

What if two threads lock and then unlock the same structure?

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

If you do this, i bet people will immediately want to freeze
individual attributes.  Some might be confused by

    a.x = [1, 2, 3]
    lock(a.x)        # intend to lock the attribute, not the list
    a.x = 3          # hey, why is this allowed?

What does locking an extension object do?

What happens when you lock an object that implements list or dict
semantics?  Do we care that locking a UserList accomplishes nothing?

Should unfreeze/unlock() be disallowed in restricted mode?


-- ?!ng

No software is totally secure, but using [Microsoft] Outlook is like
hanging a sign on your back that reads "PLEASE MESS WITH MY COMPUTER."
    -- Scott Rosenberg, Salon Magazine




From fredrik at effbot.org  Tue Jan 30 10:05:47 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 10:05:47 +0100
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <01d701c08a9b$d7a9fe60$e46940d5@hagrid>

Ka-Ping Yee wrote:
> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

to allow an IDE to "patch" a running program?

</F>




From gvwilson at ca.baltimore.com  Tue Jan 30 14:08:42 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Tue, 30 Jan 2001 08:08:42 -0500 (EST)
Subject: [Python-Dev] re: Making mutable objects readonly
In-Reply-To: <20010130085202.18E71EAC4@mail.python.org>
Message-ID: <Pine.LNX.4.10.10101300804330.14867-100000@akbar.nevex.com>

> Barry Warsaw:
> lock(x) ...? :)

Greg Wilson:

-1 --- everyone will assume it's mutual exclusion, rather than immutability.






From guido at digicool.com  Tue Jan 30 15:01:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 09:01:15 -0500
Subject: [Python-Dev] Read-only function attributes
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:38:12 PST."
             <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org> 
Message-ID: <200101301401.JAA25600@cj20424-a.reston1.va.home.com>

> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

As Effbot said, this is useful in certain circumstances where a
development environment wants to implement a "better reload".  For
this same reason you can assign to a class's __bases__ and __dict__
and to an instance's __class__ and __dict__.
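
As an illustration of the "better reload" idea (a sketch, not from the
original message): in Python 3 the func_code attribute is spelled
__code__, and assigning to it swaps a function's body while existing
references to the function object stay valid. Assigning to an
instance's __class__ works the same way for objects. All names below
are invented for the example.

```python
def greet():
    return "hello"


def greet_patched():
    return "hello, world"


callback = greet  # another part of the program holds a reference

# Replace the body in place, as a development environment might on reload.
greet.__code__ = greet_patched.__code__
print(callback())


class Old:
    def who(self):
        return "old"


class New:
    def who(self):
        return "new"


obj = Old()
obj.__class__ = New  # the existing instance now uses the new class
print(obj.who())
```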

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at digicool.com  Tue Jan 30 16:00:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:00:58 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:52:43 PST."
             <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org> 
Message-ID: <200101301500.KAA25733@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> > 
> > - How to spell it?  x.freeze()?  x.readonly()?

Ping:
> I'm not so sure.  There seem to be many issues here.  More questions:
> 
> What's the difference between a frozen list and a tuple?

A frozen list can be unfrozen (maybe)?

> Is a frozen list hashable?

Yes -- that's what started this thread (using dicts as dict keys,
actually).

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> What if two threads lock and then unlock the same structure?

That's up to the threads -- it's no different than other concurrent
access.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> If you do this, i bet people will immediately want to freeze
> individual attributes.  Some might be confused by
> 
>     a.x = [1, 2, 3]
>     lock(a.x)        # intend to lock the attribute, not the list
>     a.x = 3          # hey, why is this allowed?

That's a matter of API.  I wouldn't make this a built-in, but rather a
method on freezable objects (please don't call it lock()!).
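
As an illustration (not from the original message), a method-based
freeze might look like the sketch below. FreezableList and Holder are
hypothetical names, only a few mutating methods are guarded, and hashing
a frozen list by its contents is an assumption of this example rather
than anything decided in the thread.

```python
class FreezableList(list):
    """Hypothetical list with a one-way freeze() method."""

    _frozen = False

    def freeze(self):
        self._frozen = True

    def _check_mutable(self):
        if self._frozen:
            raise TypeError("list is frozen")

    # Only two mutators are guarded here; a real type would cover them all.
    def __setitem__(self, index, value):
        self._check_mutable()
        super().__setitem__(index, value)

    def append(self, value):
        self._check_mutable()
        super().append(value)

    def __hash__(self):
        if not self._frozen:
            raise TypeError("unhashable until frozen")
        return hash(tuple(self))


class Holder:
    pass


a = Holder()
a.x = FreezableList([1, 2, 3])
a.x.freeze()      # freezes the list object, not the attribute
frozen = a.x
a.x = 3           # still allowed: this rebinds the attribute
```

Because freeze() is a method on the list, it is clearer that the object
is what gets frozen and that rebinding a.x is a separate matter.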

> What does locking an extension object do?

What does adding 1 to an extension object do?

> What happens when you lock an object that implements list or dict
> semantics?  Do we care that locking a UserList accomplishes nothing?

Who says it doesn't?

> Should unfreeze/unlock() be disallowed in restricted mode?

I don't see why not.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 16:06:57 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:06:57 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:55:16 +0100."
             <20010130075515.X962@xs4all.nl> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>  
            <20010130075515.X962@xs4all.nl> 
Message-ID: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>

> On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> 
> > add note about two kinds of illegal imports that are now checked
> 
> > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > +   in a function or class scope or if a name bound by the import
> > +   statement is declared global in the same scope.  The language
> > +   reference has also documented that these cases are illegal, but
> > +   they were not enforced.

> Woah. Is this really a good idea ? I have seen 'from ... import *'
> in a function scope put to good (relatively -- we're talking 'import
> *' here) use. I also thought of 'import' as yet another assignment
> statement, so to me it's both logical and consistent if 'import'
> would listen to 'global'.  Otherwise we have to re-invent 'import
> spam; eggs = spam' if we want eggs to be global.

Note that Jeremy is only raising errors for "from M import *".

> Is there really a reason to enforce this, or are we enforcing the
> wording of the language reference for the sake of enforcing the
> wording of the language reference ? When writing 'import as' for
> 2.0, I fixed some of the inconsistencies in import, making it adhere
> to 'global' statements in as many cases as possible (all except
> 'from ... import *') but I was apparently not aware of the wording
> of the language reference. I'd suggest updating the wording in the
> language reference, not the implementation, unless there is a good
> reason to disallow this.

I think Jeremy has an excellent reason.  Compilers want to do analysis
of name usage at compile time.  The value of * cannot be determined at
compile time (you cannot know what module will actually be imported at
run time).  Up till now, we were able to fudge this, but Jeremy's new
compiler needs to know exactly which names are defined in all local
scopes, in order to do nested scopes right.
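
As an illustration (not from the original message): current Python 3
rejects "from M import *" inside a function when the source is
compiled, before any code runs. The sketch below only checks that a
SyntaxError is raised and makes no claim about the exact wording of the
message.

```python
bad_source = (
    "def f():\n"
    "    from os import *\n"
)

try:
    compile(bad_source, "<example>", "exec")
except SyntaxError as exc:
    star_import_error = exc   # raised at compile time, f() never runs
else:
    star_import_error = None

print(type(star_import_error).__name__)
```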

> I also have another issue with your recent patches, Jeremy, also in
> the backwards-compatibility department :) You gave new.code two
> new, non-optional arguments, in the middle of the long argument
> list. I sent a note about it to python-checkins instead of
> python-dev by accident, but Fred seemed to agree with me there.

(Tim will love this. :-)

I don't know what those new arguments represent.  If they can
reasonably be assumed to be empty for code that doesn't use the new
features, I'd say move them to the end and default them properly.  If
they must be specified, I'd say too bad, the new module is an accident
of the implementation anyway, and its users should update their code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 16:08:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:08:39 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:26:48 +0100."
             <20010130072648.W962@xs4all.nl> 
References: <14966.2069.950895.627663@beluga.mojam.com>  
            <20010130072648.W962@xs4all.nl> 
Message-ID: <200101301508.KAA25825@cj20424-a.reston1.va.home.com>

> Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
> posts on python-list.) In fact, *while implementing it*, I grew from +0 to
> -0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
> subject of the patch was a weak attempt at 5AM humour, not a venting of an
> ancient desire :)

Can you say "PEP time"? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Tue Jan 30 16:29:43 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 10:29:43 -0500
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <14966.56807.288840.7850@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping at lfw.org> writes:

    KY> I see that the function attribute feature specifically allows
    KY> assignment to func_code and func_defaults, but no other
    KY> special attributes.  This seems really suspect to me.  Why
    KY> would we want to allow the reassignment of special attributes
    KY> at all?

... and actually, none of that changed w/ the function attribute
patch.  You've been able to assign to func_code and func_defaults
since Python 1.6!

-Barry



From thomas at xs4all.net  Tue Jan 30 16:52:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 16:52:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 10:06:57AM -0500
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>
Message-ID: <20010130165204.I962@xs4all.nl>

On Tue, Jan 30, 2001 at 10:06:57AM -0500, Guido van Rossum wrote:
> > On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> > 
> > > add note about two kinds of illegal imports that are now checked
> > 
> > > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > > +   in a function or class scope or if a name bound by the import
> > > +   statement is declared global in the same scope.  The language
> > > +   reference has also documented that these cases are illegal, but
> > > +   they were not enforced.

> > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > in a function scope put to good (relatively -- we're talking 'import
> > *' here) use. I also thought of 'import' as yet another assignment
> > statement, so to me it's both logical and consistent if 'import'
> > would listen to 'global'.  Otherwise we have to re-invent 'import
> > spam; eggs = spam' if we want eggs to be global.

> Note that Jeremy is only raising errors for "from M import *".

No, he says he's also raising errors for 'import spam' if 'spam' is declared
global, like so:

def viking():
    global spam
    import spam

> > Is there really a reason to enforce this, or are we enforcing the
> > wording of the language reference for the sake of enforcing the
> > wording of the language reference ? When writing 'import as' for
> > 2.0, I fixed some of the inconsistencies in import, making it adhere
> > to 'global' statements in as many cases as possible (all except
> > 'from ... import *') but I was apparently not aware of the wording
> > of the language reference. I'd suggest updating the wording in the
> > language reference, not the implementation, unless there is a good
> > reason to disallow this.

> I think Jeremy has an excellent reason.  Compilers want to do analysis
> of name usage at compile time.  The value of * cannot be determined at
> compile time (you cannot know what module will actually be imported at
> run time).  Up till now, we were able to fudge this, but Jeremy's new
> compiler needs to know exactly which names are defined in all local
> scopes, in order to do nested scopes right.

Hrrmm.... I guess I have to agree with that. None the less, I wish we could
have a "ack! this is stupid code! it uses 'from larch import *'! All bets
are off, we do a lot of slow complicated runtime checking now!" mode. The
thing I still enjoy most about Python is that it always does what I want,
and though I'd never want to do 'from different import *' in a local scope,
I do want other, less wise people to have the same experience, where
possible :)

And I also want to be able to do:

def fill_me(with):
    global me
    if with == 1:
        import me
    elif with == 2:
        import me_too as me
    elif with == 3:
        from me.Tools import me_me as me
    elif with == 4:
        me = FakeModule()
        sys.modules['me'] = me
    else:
        raise ValueError

And I can't quite argue that away with 'the compiler needs to know ...' --
it's all there!

> > I also have another issue with your recent patches, Jeremy, also in
> > the backwards-compatibility department :) You gave new.code two
> > new, non-optional arguments, in the middle of the long argument
> > list. I sent a note about it to python-checkins instead of
> > python-dev by accident, but Fred seemed to agree with me there.

> (Tim will love this. :-)

> I don't know what those new arguments represent.  If they can
> reasonably be assumed to be empty for code that doesn't use the new
> features, I'd say move them to the end and default them properly.  If
> they must be specified, I'd say too bad, the new module is an accident
> of the implementation anyway, and its users should update their code.

Okay, I can live with that. It's sure to cause some gripes though. Then
again, from looking at the code I'd say those arguments (freevars and
cellvars) can easily default to empty tuples.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From bckfnn at worldonline.dk  Tue Jan 30 18:34:10 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 30 Jan 2001 17:34:10 GMT
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>   <3A756FF8.B7185FA2@lemburg.com>  <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3a76df10.22007715@smtp.worldonline.dk>

[Guido]

>Maybe we could add a flag to the dict that issues an error when a new
>key is inserted during such a for loop?  

FWIW, some of the Java 2 collections decided to throw a
ConcurrentModificationException in the iterator if the collection was
modified during the iteration. Generally, none of the Java 2 collections
can be modified while iterating over them (the exception is calling
.remove() on the iterator object, and not all collections support that).
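
As an illustration (not from the original message): Python 3 dicts
behave in a comparable way and raise RuntimeError when a key is inserted
while the dict is being iterated over. A small sketch, with made-up
variable names:

```python
counts = {"a": 1, "b": 2}
error = None

try:
    for key in counts:
        counts[key + "_copy"] = counts[key]  # inserts a new key mid-loop
except RuntimeError as exc:
    error = exc

print(type(error).__name__)

# Iterating over a snapshot of the keys avoids the problem.
for key in list(counts):
    counts[key + "_seen"] = True
```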

>(I don't think the key order can be affected when a key is *deleted*.)

Probably also true for the Hashtable which backs our PyDictionary,
but I'd rather not depend too much on it being true.

[Tim]

>That latter is true but specific to this implementation.  "Can't mutate the
>dict period" is easier to keep straight, and probably harmless in practice
>(if not, it could be relaxed later).  

Agree.

>Recall that a similar trick is played
>during list.sort(), replacing the list's type pointer for the duration (to
>point to an internal "immutable list" type, same as the list type except the
>"dangerous" slots point to a function that raises an "immutable list"
>TypeError).  Then no runtime expense is incurred for regular lists to keep
>checking flags.  I thought of this as an elegant use for switching types at
>runtime; you may still be appalled by it, though!

Changing the type of a type? Yuck! 

I might very likely be reading the CPython sources wrongly, but it seems
this trick will cause a BadInternalCall if some other C extension is
trying to modify a list while it is frozen by the type switching trick.
I imagine this would happen if the extension called:

  PyList_SetItem(myList, 0, aValue);

I guess Jython could support this from the Python side, but it's hard to
ensure from the Java side without adding an additional PyList_Check(..)
to all list methods. It just doesn't feel like the right thing to do
since it would cause slower access to all mutable objects.

regards,
finn



From guido at digicool.com  Tue Jan 30 21:42:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 15:42:58 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 16:52:04 +0100."
             <20010130165204.I962@xs4all.nl> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>  
            <20010130165204.I962@xs4all.nl> 
Message-ID: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>

> > > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > > in a function scope put to good (relatively -- we're talking 'import
> > > *' here) use. I also thought of 'import' as yet another assignment
> > > statement, so to me it's both logical and consistent if 'import'
> > > would listen to 'global'.  Otherwise we have to re-invent 'import
> > > spam; eggs = spam' if we want eggs to be global.
> 
> > Note that Jeremy is only raising errors for "from M import *".
> 
> No, he says he's also raising errors for 'import spam' if 'spam' is declared
> global, like so:
> 
> def viking():
>     global spam
>     import spam

Yeah, this was just brought to my attention at our group meeting
today.  I'm with you on this one -- there really isn't a good reason
why this shouldn't work.  (I wonder why that constraint was ever added
to the reference manual; maybe I was just upset that someone would
*do* something as ugly as that, or maybe there was a J[P]ython
reason???.)
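
As an illustration (not from the original message): in current Python 3
the "global spam; import spam" pattern compiles and binds the module
name at module level. The sketch uses the standard json module as a
stand-in for spam and runs the source in its own namespace so the effect
is easy to inspect.

```python
source = """
def viking():
    global json      # stand-in for 'spam' in the example above
    import json      # binds the module-level name, not a local

viking()
"""

namespace = {}
exec(source, namespace)
print(namespace["json"].__name__)  # prints json
```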

> > I think Jeremy has an excellent reason.  Compilers want to do analysis
> > of name usage at compile time.  The value of * cannot be determined at
> > compile time (you cannot know what module will actually be imported at
> > run time).  Up till now, we were able to fudge this, but Jeremy's new
> > compiler needs to know exactly which names are defined in all local
> > scopes, in order to do nested scopes right.
> 
> Hrrmm.... I guess I have to agree with that. None the less, I wish we could
> have a "ack! this is stupid code! it uses 'from larch import *'! All bets
> are off, we do a lot of slow complicated runtime checking now!" mode. The
> thing I still enjoy most about Python is that it always does what I want,
> and though I'd never want to do 'from different import *' in a local scope,
> I do want other, less wise people to have the same experience, where
> possible :)

Hm, maybe, just *maybe* Jeremy can do this if there are no nested
scopes in sight.  But I don't think it's a big deal as long as the
error message is clear -- it's bad style.

> And I also want to be able to do:
> 
> def fill_me(with):
>     global me
>     if with == 1:
>         import me
>     elif with == 2:
>         import me_too as me
>     elif with == 3:
>         from me.Tools import me_me as me
>     elif with == 4:
>         me = FakeModule()
>         sys.modules['me'] = me
>     else:
>         raise ValueError
> 
> And I can't quite argue that away with 'the compiler needs to know ...' --
> it's all there!

Sort of, although I would prefer to do a two-stager here: first some
variation of "import me as meohmy", and then "global me; me = meohmy" .

> > > I also have another issue with your recent patches, Jeremy, also in
> > > the backwards-compatibility department :) You gave new.code two
> > > new, non-optional arguments, in the middle of the long argument
> > > list. I sent a note about it to python-checkins instead of
> > > python-dev by accident, but Fred seemed to agree with me there.
> 
> > (Tim will love this. :-)
> 
> > I don't know what those new arguments represent.  If they can
> > reasonably be assumed to be empty for code that doesn't use the new
> > features, I'd say move them to the end and default them properly.  If
> > they must be specified, I'd say too bad, the new module is an accident
> > of the implementation anyway, and its users should update their code.
> 
> Okay, I can live with that. It's sure to cause some gripes though. Then
> again, from looking at the code I'd say those arguments (freevars and
> cellvars) can easily default to empty tuples.

OK.  I hope Jeremy can fix this when he gets home.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Tue Jan 30 23:30:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 23:30:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 30, 2001 at 05:34:10PM +0000
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <20010130233025.J962@xs4all.nl>

On Tue, Jan 30, 2001 at 05:34:10PM +0000, Finn Bock wrote:

> >Recall that a similar trick is played during list.sort(), replacing the
> >list's type pointer for the duration (to point to an internal "immutable
> >list" type, same as the list type except the "dangerous" slots point to a
> >function that raises an "immutable list" TypeError).  Then no runtime
> >expense is incurred for regular lists to keep checking flags.  I thought
> >of this as an elegant use for switching types at runtime; you may still
> >be appalled by it, though!

> Changing the type of a type? Yuck! 

No, the typeobject itself isn't changed -- that would freeze *all*
dicts/lists/whatever, not just the one we want. We'd be changing the type of
an object (or 'type instance', if you want, but not "type 'instance'"), not
the type of a type.
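
As an illustration (not from the original message, and not the CPython
implementation): the idea of switching one object's type for the
duration of an operation can be mimicked in pure Python by assigning to
__class__. GuardedList, _FrozenGuardedList and guarded_sort are
hypothetical names for this sketch.

```python
class GuardedList(list):
    """List whose guarded_sort() rejects mutation by the key function."""

    def guarded_sort(self, key=None):
        # Switch this one object (not the list type) to the frozen class.
        self.__class__ = _FrozenGuardedList
        try:
            ordered = sorted(self, key=key)
        finally:
            self.__class__ = GuardedList
        self[:] = ordered


def _refuse(self, *args, **kwargs):
    raise TypeError("list is immutable while being sorted")


class _FrozenGuardedList(GuardedList):
    # Only the "dangerous" operations are overridden; reads work as usual.
    append = extend = insert = pop = remove = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = _refuse


numbers = GuardedList([3, 1, 2])
other = GuardedList([9, 8])


def sneaky_key(value):
    numbers.append(0)  # attempted mutation during the sort
    return value


try:
    numbers.guarded_sort(key=sneaky_key)
except TypeError as exc:
    print("rejected:", exc)

other.append(7)        # a different list is unaffected throughout
numbers.guarded_sort()
print(numbers)
```

Ordinary lists pay no cost for flag checks in this scheme, which is the
point made in the quoted text; the price is that the object briefly
reports a different type.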

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause a BadInternalCall if some other C extension is
> trying to modify a list while it is frozen by the type switching trick.
> I imagine this would happen if the extension called:

>   PyList_SetItem(myList, 0, aValue);

Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
(or whatever), and methods/operations that modify the listobject would have
to check if the list is frozen, and raise an appropriate error if so. This
might throw 'unexpected' errors, but only in situations that can't happen
right now!

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Tue Jan 30 23:45:16 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 23:45:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl>
Message-ID: <003501c08b0e$51f975c0$e46940d5@hagrid>

> Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
> 'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
> (or whatever), and methods/operations that modify the listobject would have
> to check if the list is frozen, and raise an appropriate error if so. This
> might throw 'unexpected' errors.

did someone just subscribe me to the perl-porters list?

-1 on "modal freeze" (it's madness)
-0 on an "immutable dictionary" type in the core




From tim.one at home.com  Wed Jan 31 00:53:45 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 18:53:45 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>

[Guido]
> This is all PEP material now.

Yup.

> Tim, do you want to own the PEP?

Not really.  Available time is finite, and this isn't at the top of the list
of things I'd like to see (resuming the discussion of generators +
coroutines + iteration protocol comes to mind first).

>> Cool!  Can we resist adding
>>
>>     if key:value in dict
>>
>> for "parallelism"?  (I know I can ...)

> That's easy to resist because, unlike ``for key:value in dict'', it's
> not unambiguous:

But

    if (key:value) in dict

is.  Just trying to help whoever *does* want the PEP <wink>.

> ...
> I'm certainly more comfortable with just ``for key in dict'' than with
> the whole slew of extensions using colons.

What about just the

    for key:value in dict
    for index:value in sequence

extensions?  The degenerate forms (omitting x or y or both in x:y) are
mechanical variations so are likely to get raised.
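
As an illustration (not from the original message): the colon forms
above are proposals and are not valid Python. The same two loops can be
written today with dict.items() and the enumerate() builtin, as in this
sketch with made-up data:

```python
ages = {"alice": 30, "bob": 25}
letters = ["a", "b", "c"]

# Counterpart of the proposed "for key:value in dict"
pairs = []
for key, value in ages.items():
    pairs.append((key, value))

# Counterpart of the proposed "for index:value in sequence"
indexed = []
for index, value in enumerate(letters):
    indexed.append((index, value))

print(pairs)
print(indexed)
```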

> But, again, that's for the PEP to fight over.

PEPs are easier if you Pronounce on things you hate early so that those can
get recorded in the "BDFL Pronouncements" section without further ado.

whatever-this-may-look-like-it's-not-a-pep-discussion<wink>-ly y'rs  - tim




From nas at arctrix.com  Tue Jan 30 18:12:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:12:15 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <003501c08b0e$51f975c0$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 30, 2001 at 11:45:16PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl> <003501c08b0e$51f975c0$e46940d5@hagrid>
Message-ID: <20010130091215.C18319@glacier.fnational.com>

On Tue, Jan 30, 2001 at 11:45:16PM +0100, Fredrik Lundh wrote:
> did someone just subscribe me to the perl-porters list?
> 
> -1 on "modal freeze" (it's madness)
> -0 on an "immutable dictionary" type in the core

I'm glad I'm not the only one who had that feeling.  I agree with
your votes too.

  Neil



From nas at arctrix.com  Tue Jan 30 18:24:54 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:24:54 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 30, 2001 at 06:53:45PM -0500
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
Message-ID: <20010130092454.D18319@glacier.fnational.com>

[Tim Peters on adding yet more syntactic sugar]
> Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of
> generators + coroutines + iteration protocol comes to mind
> first).

What's the chances of getting generators into 2.2?  The
implementation should not be hard.  Didn't Steven Majewski have
something years ago?  Why do we always get sidetracked on trying
to figure out how to do coroutines and continuations?

Generators would add real power to the language and are simple
enough that most users could benefit from them.  Also, it should be
possible to design an interface that does not preclude the
addition of coroutines or continuations later.

I'm not volunteering to champion the cause just yet.  I just want
to know if there is some issue I'm missing.

  Neil



From barry at digicool.com  Wed Jan 31 01:24:05 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 19:24:05 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
	<20010130092454.D18319@glacier.fnational.com>
Message-ID: <14967.23333.57259.347222@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

    NS> What's the chances of getting generators into 2.2?  The
    NS> implementation should not be hard.  Didn't Steven Majewski
    NS> have something years ago?  Why do we always get sidetracked on
    NS> trying to figure out how to do coroutines and continuations?

I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
renaming it just "Generators" and filling it out for the 2.2 time
frame.  If we want to address coroutines and continuations later, we
can write separate PEPs for them.

Send me a draft.

-Barry



From guido at digicool.com  Wed Jan 31 01:28:44 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:28:44 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 18:53:45 EST."
             <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> 
Message-ID: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>

> Not really.  Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of generators +
> coroutines + iteration protocol comes to mind first).

OK, get going on that one then!

> >> Cool!  Can we resist adding
> >>
> >>     if key:value in dict
> >>
> >> for "parallelism"?  (I know I can ...)
> 
> > That's easy to resist because, unlike ``for key:value in dict'', it's
> > not unambiguous:
> 
> But
> 
>     if (key:value) in dict
> 
> is.  Just trying to help whoever *does* want the PEP <wink>.

OK, I'll pronounce -1 on this one.  It looks ugly to me -- too
reminiscent of C's if (...) required parentheses.  Also it suggests
that (key:value) is a new tuple notation that might be useful in other
contexts -- which it's not.

> > ...
> > I'm certainly more comfortable with just ``for key in dict'' than with
> > the whole slew of extensions using colons.
> 
> What about just the
> 
>     for key:value in dict
>     for index:value in sequence
> 
> extensions?

I'm not against these -- I'd say +0.5.

> The degenerate forms (omitting x or y or both in x:y) are
> mechanical variations so are likely to get raised.

For those, +0.2.

> > But, again, that's for the PEP to fight over.
> 
> PEPs are easier if you Pronounce on things you hate early so that those can
> get recorded in the "BDFL Pronouncements" section without further ado.

At your service -- see above.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 31 01:49:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:49:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 09:24:54 PST."
             <20010130092454.D18319@glacier.fnational.com> 
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>  
            <20010130092454.D18319@glacier.fnational.com> 
Message-ID: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>

> [Tim Peters on adding yet more syntactic sugar]
> > Available time is finite, and this isn't at the top of the list
> > of things I'd like to see (resuming the discussion of
> > generators + coroutines + iteration protocol comes to mind
> > first).
> 
> What's the chances of getting generators into 2.2?  The
> implementation should not be hard.  Didn't Steven Majewski have
> something years ago?  Why do we always get sidetracked on trying
> to figure out how to do coroutines and continuations?

I think there's a very good chance of getting them into 2.2.  But it
*is* true that coroutines are a very attractive piece of land "just
next door".  On the other hand, continuations are a mirage, so don't
try to go there. :-)

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.
> 
> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

There are different ways to do iterators.

Here is a very "tame" proposal (and definitely in the realm of 2.2),
that doesn't require any coroutine-like tricks.  Let's propose that

    for var in expr:
        ...do something with var...

will henceforth be translated into

    __iter = iterator(expr)
    while __iter.more():
        var = __iter.next()
        ...do something with var...

-- or some variation that combines more() and next() (I don't care).

Then a new built-in function iterator() is needed that creates an
iterator object.  It should try two things:

(1) If the object implements __iterator__() (or a C API equivalent),
    call that and be done; this way arbitrary iterators can be
    created.

(2) If the object smells like a sequence (how to test???), use an
    iterator sort of like this:

    class Iterator:

        def __init__(self, sequence):
            self.sequence = sequence
            self.index = 0

        def more(self):
            # Store the item so that each index is tried exactly once
            try:
                self.item = self.sequence[self.index]
            except IndexError:
                return 0
            else:
                self.index = self.index + 1
                return 1

        def next(self):
            return self.item

    (I don't necessarily mean that all those instance variables should
    be publicly available.)
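
A runnable sketch of the translation and the fallback iterator described
above, in present-day Python.  Only more() and next() come from the
proposal; the class name SequenceIterator and the helper run_for_loop
are made up for illustration.

```python
class SequenceIterator:
    """Fallback iterator for anything indexable from 0 upward."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.index = 0

    def more(self):
        # Fetch and store the item so that each index is tried exactly once.
        try:
            self.item = self.sequence[self.index]
        except IndexError:
            return False
        self.index += 1
        return True

    def next(self):
        # Hand back whatever the last successful more() stored.
        return self.item


def run_for_loop(expr, body):
    """Hand-expanded form of ``for var in expr: body(var)``."""
    it = SequenceIterator(expr)
    while it.more():
        var = it.next()
        body(var)


collected = []
run_for_loop("abc", collected.append)
```

Because more() does the indexing and next() only returns the stored
item, a sequence whose __getitem__ has side effects is still touched
once per index.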

The built-in sequence types can use a very fast built-in iterator type
that uses a C int for the index and doesn't store the item in the
iterator.  (This should be as fast as Marc-Andre's for loop
optimization using a C counter.)

Dictionaries can define an appropriate iterator that uses
PyDict_Next().

If the argument to iterator() is itself an iterator (how to test???),
it returns the argument unchanged, so that one can also write

    for var in iterator(obj):
        ...do something with var...

Files of course should have iterators that return the next input line.

We could build filtering and mapping iterators that take an iterator
argument and do certain manipulations with the elements; this would
effectively introduce the notion of lazy evaluation on sequences.
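
A minimal sketch of such filtering and mapping iterators, assuming the
more()/next() pairing from the loop translation above.  The class names
ListIterator, MapIterator and FilterIterator are hypothetical, and the
filter relies on more() being called exactly once before each next().

```python
class ListIterator:
    """Simplest possible source: walks a list by position."""

    def __init__(self, items):
        self.items = list(items)
        self.pos = 0

    def more(self):
        return self.pos < len(self.items)

    def next(self):
        item = self.items[self.pos]
        self.pos += 1
        return item


class MapIterator:
    """Applies func to each element only when that element is requested."""

    def __init__(self, func, inner):
        self.func = func
        self.inner = inner

    def more(self):
        return self.inner.more()

    def next(self):
        return self.func(self.inner.next())


class FilterIterator:
    """Skips elements for which pred is false."""

    def __init__(self, pred, inner):
        self.pred = pred
        self.inner = inner

    def more(self):
        # Advance the inner iterator until something passes, and store it.
        while self.inner.more():
            candidate = self.inner.next()
            if self.pred(candidate):
                self.item = candidate
                return True
        return False

    def next(self):
        return self.item


calls = []


def square(x):
    calls.append(x)  # record when the work actually happens
    return x * x


pipeline = MapIterator(square,
                       FilterIterator(lambda x: x % 2 == 0,
                                      ListIterator(range(6))))
calls_before_loop = list(calls)

results = []
while pipeline.more():
    results.append(pipeline.next())
```

Nothing is squared when the pipeline is built; square() runs only as
the loop pulls each element through, which is the lazy behaviour the
paragraph refers to.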

Etc., etc.

This does not come close to Icon generators -- but it doesn't require
any coroutine-like capabilities, unlike those.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 31 01:55:10 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 19:55:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEJIMAA.tim.one@home.com>

[Finn Bock]
> Changing the type of a type? Yuck!

No, it temporarily changes the type of the single list being sorted, like
so, where "self" is a pointer to a PyListObject (which is a list, not a list
*type* object):

	self->ob_type = &immutable_list_type;
	err = samplesortslice(self->ob_item,
			      self->ob_item + self->ob_size,
			      compare);
	self->ob_type = &PyList_Type;

immutable_list_type is "just like" PyList_Type, except that the slots for
mutating methods point to a function that raises a TypeError.
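
The same idea can be imitated in pure Python by reassigning __class__
around the sort.  This is only an analogy to the C code above, not the
CPython implementation; FrozenList, SortableList and sort_with are
invented names.

```python
import functools


class FrozenList(list):
    """Same layout as SortableList, but every mutating method refuses."""

    def _refuse(self, *args, **kwargs):
        raise TypeError("list is immutable while it is being sorted")

    append = extend = insert = pop = remove = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse
    sort = reverse = _refuse


class SortableList(list):
    def sort_with(self, compare):
        self.__class__ = FrozenList            # freeze for the duration
        try:
            ordered = sorted(self, key=functools.cmp_to_key(compare))
        finally:
            self.__class__ = SortableList      # always thaw again
        self[:] = ordered


data = SortableList([3, 1, 2])
data.sort_with(lambda a, b: a - b)


def meddling_compare(a, b):
    data.append(99)  # mutation attempted from inside a comparison
    return a - b


error = None
try:
    data.sort_with(meddling_compare)
except TypeError as exc:
    error = str(exc)
```

The comparison function that tries to append gets a TypeError, and the
list comes out of the failed sort with its original type and contents.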

Before this drastic step came years of increasingly ugly hacks trying to
stop core dumps when people mutated a list during the sort.  Python's sort
is very complex, and lots of pointers are tucked away -- having the size of
the array, or its position in memory, or the set of objects it contains,
change as a side effect of doing a compare, would be difficult and expensive
to recover from -- and by "difficult" read "nobody ever managed to get it
right before this" <0.5 wink>.

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause a BadInternalCall if some other C extension is
> trying to modify a list while it is frozen by the type-switching trick.
> I imagine this would happen if the extension called:
>
>   PyList_SetItem(myList, 0, aValue);

Well, in CPython it's not "legal" for any other thread to use the C API
while the sort is in progress, because the thread doing the sort holds the
global interpreter lock for the duration.  So this could happen "legally"
only if a comparison function called by the sort called out to a C extension
attempting to mutate the list.  In that case, fine, it *is* a bad call:
mutation is not allowed during list sorting, so they deserve whatever they
get -- and far better a "bad internal call" than a core dump.

If the immutable_list_type were used more generally, it would require more
general support (but I see Thomas already talked about that -- thanks).




From guido at digicool.com  Wed Jan 31 01:55:19 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:55:19 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 19:24:05 EST."
             <14967.23333.57259.347222@anthem.wooz.org> 
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <20010130092454.D18319@glacier.fnational.com>  
            <14967.23333.57259.347222@anthem.wooz.org> 
Message-ID: <200101310055.TAA30250@cj20424-a.reston1.va.home.com>

> I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
> renaming it just "Generators" and filling it out for the 2.2 time
> frame.  If we want to address coroutines and continuations later, we
> can write separate PEPs for them.

I think it's better not to re-use PEP 220 for that.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Wed Jan 31 01:58:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 01:58:32 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 07:28:44PM -0500
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <20010131015832.K962@xs4all.nl>

On Tue, Jan 30, 2001 at 07:28:44PM -0500, Guido van Rossum wrote:

> > What about just the

> >     for key:value in dict
> >     for index:value in sequence

> > extensions?

> I'm not against these -- I'd say +0.5.

What, fractions ? Isn't that against the whole idea of (+|-)(0|1) ? :)
But since we are voting, I'm -0 on this right now, and might end up -1 or
+0, depending on the implementation; I still can't *see* this, though I
wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
into some fairly mind-boggling issues. The worst bit is 'how the f*ck
does FOR_LOOP know if something's a dict or a list'. And the
almost-as-bad bit is 'WTF to do for user classes, extension types and
almost-list/almost-dict practically-builtin types (arrays, the *dbm's,
etc.)'. After some sleep-deprived consideration I gave up and decided we
need an iteration/generator protocol first.

However, my life's been busy (or rather, my work has been) with all kinds
of small and not so small details, and I haven't been getting much sleep
in the last week or so, so I might be overlooking something very simple.
That's why I can go either way based on implementation -- it might prove
me wrong :) Until my boss is back and I stop being 'responsible' (end of
this week, start of next week) and I get a chance to get rid of about 2
months of work backlog (the time he was away) I won't have time to
champion or even contribute to such a PEP. Then again, by that time I
might be preparing for IPC9 (_if_ my boss sends me there) or even my
ApacheCon US presentation (which got accepted today, yay!)

So, if that other message was an attempt to drop the PEP on me, Guido,
the answer is the same as I tend to give to suits that show up next to my
desk wanting to discuss something important (to them) right away:
"b'gg'r 'ff" :)

I'll-save-my-answer-to-PR-officers-doing-the-same-for-when-you-do-something-
 -*really*-offensive-ly <wink> y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Wed Jan 31 02:16:51 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:16:51 -0500
Subject: [Python-Dev] Let's release 2.1a2 Thursday night
Message-ID: <200101310116.UAA30386@cj20424-a.reston1.va.home.com>

Things look good for a release of 2.1a2 this week; we're aiming for
Thursday night.  I won't be in town (speaking to the press at
LinuxWorld Expo in New York) but Jeremy will handle the release
process and the other PythonLabs folks will assist him.

Tomorrow Fred will check in his weak references after making some
changes (mostly making it more Spartan :-) that I suggested in a code
review.

After that, I think we're good for the second (and last!) alpha
release; and enough has changed (e.g. nested scopes, lots of setup.py
changes, flat Makefile) to warrant going ahead now.

Now is the time for those last-minute bugfixes that you're all so
famous for!

I propose a checkin freeze for non-PythonLabs folks Wednesday midnight
US west coast time, to give Jeremy c.s. enough time to build the
release and give it a good work-out.  (An internal freeze is up to
Jeremy to declare, but should probably take Tim's sleep cycle into
account.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

PS. I'll be out of reach from noon US east coast time tomorrow
(Wednesday), traveling to New York by train.  I probably won't check
my email while out there; I'll be back Friday night.



From guido at digicool.com  Wed Jan 31 02:35:25 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:35:25 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
In-Reply-To: Your message of "Mon, 29 Jan 2001 23:56:09 EST."
             <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com> 
Message-ID: <200101310135.UAA30629@cj20424-a.reston1.va.home.com>

> I'm going to repeat a question that I posted about a week ago that passed
> without comment on the newsgroup. The issue is the SSL support in the socket
> module, which raises an exception when the reading socket is at EOF, rather
> than returning an empty string. I'm hesitant to call it a "bug", but I
> wouldn't have implemented it this way.  There are the names of two people
> mentioned at the top of socketmodule.c, but no contact information, so I'm
> suggesting here that it be changed to conform to normal file/socket
> practice. (SSL was actually added at 2.0, so I'm late to the party with
> this; mea culpa, mea culpa.  I delayed trying Python2 because of the
> extension rebuilding.)

I agree that it makes more sense if a read at EOF returns an empty
string, since that's what other file-like objects in Python do.  I
can't do much about this right now, but I'd love to see a patch.  It
could go into 2.1a2 if small enough.

Note that input() and raw_input() are specifically excepted because
they are intended for use in interactive mode by newbies mostly; and
because "" as return value for EOF would be ambiguous for these.
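
For illustration, the usual file-like convention looks like this with
an in-memory stream standing in for the socket; read_all is a
hypothetical helper, not part of any module discussed here.

```python
import io


def read_all(stream, chunk_size=4):
    """Read until EOF, relying on read() returning an empty result there."""
    chunks = []
    while True:
        data = stream.read(chunk_size)
        if not data:  # empty result means EOF, not an error
            break
        chunks.append(data)
    return b"".join(chunks)


payload = read_all(io.BytesIO(b"hello world"))
at_eof = io.BytesIO(b"").read(4)
```

A reader written this way needs no try/except around the loop, which is
why an exception at EOF is awkward for code that treats the SSL object
like any other file.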

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg at cosc.canterbury.ac.nz  Wed Jan 31 05:12:23 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jan 2001 17:12:23 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <200101310412.RAA03140@s454.cosc.canterbury.ac.nz>

<someone whose attribution has been lost>:

>     for index:value in sequence

-1, because we only construct dicts using that
notation, not sequences.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 31 06:21:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 00:21:37 -0500
Subject: [Python-Dev] codecity.com
Message-ID: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>

Should I spread this word, or is this a joke?  The Python quiz
category is laughable.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Sat, 27 Jan 2001 23:16:02 -0800
From:    "Jeff Cordova" <jeffc at codecity.com>
To:      <guido at python.org>
Subject: New, fun way to learn Python.

Hi Guido,

  I wanted to let you know about www.codecity.com After several years of
managing large software projects in Silicon Valley, I realized that I was
spending a lot of time teaching jr. programmers how to write code. So, I
created CodeCity to help me automate some of that. If you go to the site,
you'll see that I've created a category for Python. There's not much depth
to the Python content yet (the site is only a week old) but I'm expecting
the Python community to add their wisdom over a period of time. If you could
spread the word, it would be highly appreciated.

Thankyou,

Jeff C.

------- End of Forwarded Message




From tim.one at home.com  Wed Jan 31 07:16:48 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:16:48 -0500
Subject: [Python-Dev] codecity.com
In-Reply-To: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFJIMAA.tim.one@home.com>

[Guido, on www.codecity.com]
> Should I spread this word, or is this a joke?  The Python quiz
> category is laughable.

While the Python section still seems to have only one question, the first
day this was announced the third choice wasn't today's:

    Python is Open Source code, so it doesn't have a creator

but:

    Martha Stewart

I liked it better before <0.9 wink>.




From moshez at zadka.site.co.il  Wed Jan 31 07:30:07 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 08:30:07 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>
References: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>, <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>  
            <20010130092454.D18319@glacier.fnational.com>
Message-ID: <20010131063007.536ACA83E@darjeeling.zadka.site.co.il>

On Tue, 30 Jan 2001 19:49:24 -0500, Guido van Rossum <guido at digicool.com> wrote:

> There are different ways to do iterators.
> 
> Here is a very "tame" proposal (and definitely in the realm of 2.2),
> that doesn't require any coroutine-like tricks.  Let's propose that
> 
>     for var in expr:
> 	...do something with var...
> 
> will henceforth be translated into
> 
>     __iter = iterator(expr)
>     while __iter.more():
>         var = __iter.next()
>         ...do something with var...

I'm +1 on that...but Tim's "try to use that to write something that
will return the nodes of a binary tree" still haunts me.

Personally, though, I'd thin down the interface to

while 1:
	try:
		var = __iter.next()
	except NoMoreError:
		break # pseudo-break?

With the usual caveat that this is a lie as far as "else" is concerned
(IOW, pseudo-break gets into the else)

> Then a new built-in function iterator() is needed that creates an
> iterator object.  It should try two things:
> 
> (1) If the object implements __iterator__() (or a C API equivalent),
>     call that and be done; this way arbitrary iterators can be
>     created.
 
> (2) If the object smells like a sequence (how to test???), use an
>     iterator sort of like this:

Why not, "if the object doesn't have __iterator__, try this. If it 
won't work, we'll find out by the exception that will be thrown in
our face".

class Iterator:

	def __init__(self, seq):
		self.seq = seq
		self.index = 0

	def next(self):
		try:
			try:
				return self.seq[self.index] # <- smells like
			except IndexError:
				raise NoMoreError(self.index)
		finally:
			self.index += 1
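
A runnable sketch combining the exception-based loop with this fallback
iterator, in present-day Python.  run_for_loop, body and orelse are
hypothetical stand-ins for the compiled loop, its body and its else
clause; unlike the class above, this version advances the index only
after a successful fetch.

```python
class NoMoreError(Exception):
    """Raised by next() when the iterator is exhausted."""


class Iterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def next(self):
        try:
            item = self.seq[self.index]
        except IndexError:
            raise NoMoreError(self.index) from None
        self.index += 1
        return item


def run_for_loop(seq, body, orelse):
    """Expanded ``for var in seq: body(var)`` with an ``else`` clause."""
    it = Iterator(seq)
    while True:
        try:
            var = it.next()
        except NoMoreError:
            orelse()  # the "pseudo-break": exhaustion still runs else
            break
        if body(var):  # a true result stands in for a real ``break``
            break


log = []
run_for_loop([1, 2, 3], lambda v: log.append(v), lambda: log.append("else"))

early = []
run_for_loop([1, 2, 3],
             lambda v: early.append(v) or v == 2,
             lambda: early.append("else"))
```

The first call runs to exhaustion and reaches the else clause; the
second stops at 2 and skips it, matching ordinary for/else behaviour.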

>     (I don't necessarily mean that all those instance variables should
>     be publicly available.)

But what about your poor brother? <wink> Er....I mean, this would make
implementing "indexing" really about just getting the index from the
iterator.

> If the argument to iterator() is itself an iterator (how to test???),

No idea, and this looks problematic. I see your point -- but it's
still problematic.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Wed Jan 31 07:57:26 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:57:26 -0500
Subject: [Python-Dev] Can't enter new Python bugs on SourceForge?
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFLIMAA.tim.one@home.com>

Reported this earlier.  Still can't create a new bug.  Guido either.  Here's
the SF Support request opened on this:

http://sourceforge.net/support/
    index.php?func=detailsupport&support_id=113100&group_id=1

The good(?) news is that Python isn't the only project to report this
problem.




From tim.one at home.com  Wed Jan 31 08:50:18 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 02:50:18 -0500
Subject: [Python-Dev] FW: Python programmer needed (addition to urllib2 and HTTPS support)
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFPIMAA.tim.one@home.com>

Get rich quick!

-----Original Message-----
From: python-list-admin at python.org
[mailto:python-list-admin at python.org]On Behalf Of Albert Chin-A-Young
Sent: Wednesday, January 31, 2001 2:31 AM
To: python-list at python.org
Subject: Python programmer needed (addition to urllib2 and HTTPS
support)


We're in need of a contract Python programmer for the following:
  1. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy (via urllib2.py).
     This should address bug #125217:
     http://sourceforge.net/bugs/?func=detailbug&bug_id=125217&group_id=5470
  2. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy that requires
     BASIC HTTP authentication (via urllib2.py).
  3. Support for non-authenticated clients to connect to a
     HTTPS server
  4. Support for a client to authenticate the HTTPS host (to
     verify that its certificate is valid)

What we might consider adding (depends on cost):
  1. Support for authenticated clients to connect to a HTTPS server.

Please note that solutions to the four items above must be rolled back
into the main Python distribution (implies the "community" and the
Python developers need to agree on the adopted solution).

--
albert chin (china at thewrittenword dot com)
--
http://mail.python.org/mailman/listinfo/python-list




From ping at lfw.org  Wed Jan 31 10:47:10 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 01:47:10 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>

On Tue, 30 Jan 2001, Guido van Rossum wrote:
> 
> Can you say "PEP time"? :-)

Okay, i have written a draft PEP that tries to combine the
"elt in dict", custom iterator, and "for k:v" issues into a
coherent proposal.  Have a look:

    http://www.lfw.org/python/pep-iterators.txt
    http://www.lfw.org/python/pep-iterators.html

Could i get a number for this please?


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From moshez at zadka.site.co.il  Wed Jan 31 11:14:49 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 12:14:49 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>

On Wed, 31 Jan 2001 01:47:10 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:

> Okay, i have written a draft PEP that tries to combine the
> "elt in dict", custom iterator, and "for k:v" issues into a
> coherent proposal.  Have a look:
> 
>     http://www.lfw.org/python/pep-iterators.txt
>     http://www.lfw.org/python/pep-iterators.html

Er....one problem on first reading: you forgot to mention in the
while loop description that 'else:' would be executed if the exception
is raised, so the 'break' is a 'pseudo-break'.

Basic response: I *love* the iter(), sq_iter and __iter__ parts.
I tremble at seeing the rest.
Why not add a method to dictionaries .iteritems() and do

for (k, v) in dict.iteritems():
	pass

(dict.iteritems() would return an iterator over the items)
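
A small sketch of what such an iteritems could do, written as a free
function in present-day Python.  It uses a generator, which Python did
not have when this was posted, so treat it as an illustration of the
behaviour rather than of any proposed implementation.

```python
def iteritems(mapping):
    """Yield (key, value) pairs one at a time instead of building a list."""
    for key in mapping.keys():
        yield key, mapping[key]


table = {"a": 1, "b": 2}
pairs = iteritems(table)
first = next(pairs)   # only one pair has been produced so far
rest = list(pairs)

unpacked = []
for (k, v) in iteritems(table):
    unpacked.append(k * v)
```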

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From MarkH at ActiveState.com  Wed Jan 31 11:34:01 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 21:34:01 +1100
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>

Hi all,
	In an attempt to solve "[ Bug #129293 ] zlib library used for binary win32
distribution can crash"
(https://sourceforge.net/bugs/?func=detailbug&group_id=5470&bug_id=129293),
Tim and I have decided that we should fix the build process of zlib.pyd on
windows.

The current process requires that the builder download _2_ zlib archives - a
binary distribution for zlib.lib, and the source archive for the headers.
We believe that slight differences between the 2 are causing the above bug.
A particular warning-light is that the current process defines ZLIB_DLL even
though we are _not_ currently using the DLL but the static lib.  Removing
this #define generates linker errors.

The new process is very simple, but may break some people's builds.  In theory
it _should_ still work for everyone, but if it fails to build, please check
your directory structure.


From ping at lfw.org  Wed Jan 31 12:00:48 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 03:00:48 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010131015832.K962@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Thomas Wouters wrote:
> I still can't *see* this, though I
> wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
> into some fairly mind-boggling issues. The worst bit is 'how the f*ck
> does FOR_LOOP know if something's a dict or a list'.

I believe the Pythonic answer to that is "see if the appropriate
method is available".

The best definition of "sequence-like" or "mapping-like" i can
come up with is:

    x is sequence-like if it provides __getitem__() but not keys()
    x is mapping-like if it provides __getitem__() and keys()

But in our case, since we need iteration, we can look for specific
methods that have to do with just what we need for iteration and
nothing else.  Thus, e.g. a mapping-like class without a values()
method is no problem if we never ask to iterate over values.
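
The two definitions above translate directly into attribute checks.
The function names and the Countdown class are hypothetical; only the
__getitem__/keys() rule comes from the text.

```python
def is_mapping_like(x):
    """Mapping-like: provides __getitem__() and keys()."""
    return hasattr(x, "__getitem__") and hasattr(x, "keys")


def is_sequence_like(x):
    """Sequence-like: provides __getitem__() but not keys()."""
    return hasattr(x, "__getitem__") and not hasattr(x, "keys")


class Countdown:
    """A pseudo-list: indexable, but not derived from list."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return 3 - index


checks = {
    "list": (is_sequence_like([1, 2]), is_mapping_like([1, 2])),
    "dict": (is_sequence_like({}), is_mapping_like({})),
    "countdown": (is_sequence_like(Countdown()), is_mapping_like(Countdown())),
    "int": (is_sequence_like(42), is_mapping_like(42)),
}
```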

> And the
> almost-as-bad bit is 'WTF to do for user classes, extension types and
> almost-list/almost-dict practically-builtin types

I think it can be done; the draft PEP at

    http://www.lfw.org/python/pep-iterators.html

is a best-attempt at supporting everything just as you would expect.
Let me know if you think there are important cases it doesn't cover.

I know, the table

    mp_iteritems    __iteritems__, __iter__, items, __getitem__
    mp_iterkeys     __iterkeys__, __iter__, keys, __getitem__
    mp_itervalues   __itervalues__, __iter__, values, __getitem__
    sq_iter         __iter__, __getitem__

might look a little frightening, but it's not so bad, and i think
it's about as simple as you can make it while continuing to support
existing pseudo-lists and pseudo-dictionaries.  No instance should
ever provide __iter__ at the same time as any of the other __iter*__
methods anyway.
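
One way to read the table, assuming each row lists the methods tried
from left to right for that slot.  That reading is an assumption, since
the draft itself is not quoted here, and first_available and the two
sample classes are invented for illustration.

```python
FALLBACKS = {
    "mp_iteritems": ("__iteritems__", "__iter__", "items", "__getitem__"),
    "mp_iterkeys": ("__iterkeys__", "__iter__", "keys", "__getitem__"),
    "mp_itervalues": ("__itervalues__", "__iter__", "values", "__getitem__"),
    "sq_iter": ("__iter__", "__getitem__"),
}


def first_available(obj, slot):
    """Return the name of the first method in the row that obj provides."""
    for name in FALLBACKS[slot]:
        if hasattr(obj, name):
            return name
    return None


class OldStyleMapping:
    """Pseudo-dictionary with keys() and items() but no values()."""

    def __init__(self):
        self._data = {"a": 1}

    def keys(self):
        return list(self._data)

    def items(self):
        return list(self._data.items())

    def __getitem__(self, key):
        return self._data[key]


class OldStyleSequence:
    """Pseudo-list that supports indexing only."""

    def __getitem__(self, index):
        if index >= 2:
            raise IndexError(index)
        return index


mapping = OldStyleMapping()
chosen = {slot: first_available(mapping, slot)
          for slot in ("mp_iteritems", "mp_iterkeys", "mp_itervalues")}
sequence_choice = first_available(OldStyleSequence(), "sq_iter")
```

The mapping without values() still resolves mp_itervalues through
__getitem__, which is the "no problem" case described earlier in the
message.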


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From mal at lemburg.com  Wed Jan 31 12:56:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 12:56:12 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A77FD5C.DE8729DC@lemburg.com>

> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv17061/Python
> 
> Modified Files:
>         compile.c 
> Log Message:
> Enforce two illegal import statements that were outlawed in the
> reference manual but not checked: Names bound by import statements may
> not occur in global statements in the same scope. The from ... import *
> form may only occur in a module scope.
> 
> I guess these changes could break code, but the reference manual
> warned about them.

Jeremy, your code breaks all uses of "from package import submodule"
inside packages.

Try distutils for example or setup.py....

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:01:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:01:24 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>  
	            <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <3A77FE94.E5082136@lemburg.com>

Guido van Rossum wrote:
> 
> [ESR]
> > For different reasons, I'd like to be able to set a constant flag on a
> > object instance.  Simple semantics: if you try to assign to a
> > member or method, it throws an exception.
> >
> > Application?  I have a large Python program that goes to a lot of effort
> > to build elaborate context structures in core.  It would be nice to know
> > they can't be even inadvertently trashed without throwing an exception I
> > can watch for.
> 
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

How about .lock() and .unlock() ?
 
> - Should this be reversible?  I.e. should there be an x.unfreeze()?

Yes. These low-level locks could be used in thread programming
since the above calls are C level functions and thus thread safe
w/r to the global interpreter lock.
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Sure :)
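
A minimal sketch of the instance case, assuming a hypothetical Freezable
base class with freeze()/unfreeze() methods (names borrowed from the
spellings floated above; nothing like this exists in the core).  It only
guards attribute assignment and deletion, which is the "simple
semantics" ESR asks for:

```python
class Freezable:
    """Hypothetical base class: attributes can be made read-only."""

    def __init__(self, **attrs):
        # Set the flag through object.__setattr__ so our own guard
        # is not consulted before the flag exists.
        object.__setattr__(self, "_frozen", False)
        for name, value in attrs.items():
            setattr(self, name, value)

    def freeze(self):
        object.__setattr__(self, "_frozen", True)

    def unfreeze(self):
        object.__setattr__(self, "_frozen", False)

    def __setattr__(self, name, value):
        if self._frozen:
            raise TypeError("cannot assign to %r: object is frozen" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise TypeError("cannot delete %r: object is frozen" % name)
        object.__delattr__(self, name)


ctx = Freezable(depth=3)
ctx.freeze()
try:
    ctx.depth = 4            # rejected while frozen
except TypeError as exc:
    print(exc)
ctx.unfreeze()
ctx.depth = 4                # allowed again
```

Mutable values stored in the attributes (lists, dicts) are not protected
by this sketch; only rebinding the attribute names is.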

Eric, could you write a PEP for this ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:08:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:08:15 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>
Message-ID: <3A78002F.DC8F0582@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > ...
> > What we really want is iterators for dictionaries, so why not
> > implement these instead of tweaking for-loops.
> 
> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> supposed problem with iteration order?

No, but it would solve the problem in a more elegant and
generalized way. Besides, it also allows writing code which
is thread safe, since the iterator can take special actions
to assure that the dictionary doesn't change during the
iteration phase (see the other thread about "making mutable objects
readonly").
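
For comparison, a short sketch of how current CPython 3 behaves (this is
not the mechanism proposed in this thread): the dictionary is not locked,
but its iterator notices an insertion made during the loop and raises
RuntimeError.  The helper name grow_while_iterating is made up for the
illustration.

```python
def grow_while_iterating(d):
    # Inserting a new key while iterating changes the dict's size;
    # the iterator detects this on its next step.
    for key in d:
        d[key + "!"] = 0


menu = {"spam": 1, "eggs": 2}
try:
    grow_while_iterating(menu)
except RuntimeError as exc:
    print("caught:", exc)

# The usual workaround: iterate over a snapshot of the keys.
menu = {"spam": 1, "eggs": 2}
for key in list(menu):
    menu[key + "!"] = 0
print(sorted(menu))
```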
 
> > If you are looking for speedups w/r to for-loops, applying a
> > different indexing technique in for-loops would go a lot further
> > and provide better performance not only to dictionary loops,
> > but also to other sequences.
> >
> > I have had good results with a special counter object
> > (sort of like a mutable integer) which is used instead of the
> > iteration index integer in the current implementation.
> 
> Please quantify, if possible.  My belief (based on past experiments) is that
> in loops fancier than
> 
>     for i in range(n):
>         pass
> 
> the loop overhead quickly falls into the noise even now.

I don't remember the figures, but these micro-optimizations do
speed up loops by a noticeable amount. Just compare the performance
of stock Python 1.5 against my patched version.
 
> > Using an iterator object instead of the integer + __getitem__
> > call machinery would allow more flexibility for all kinds of
> > sequences or containers. ...
> 
> This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
> iteration *protocol* could have major attractions.

Not really... the counter object is just a special case of
an iterator -- in this case iteration is over the IN.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:10:43 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:10:43 +0100
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>
Message-ID: <3A7800C3.B5D3203F@lemburg.com>

Tim Peters wrote:
> 
> Note that even adding a "frozen" flag would add 4 bytes to every freezable
> object on most machines.  That's why I'd rather .freeze() replace the type
> pointer and .unfreeze() restore it.  No time or space overhead; no
> cluttering up the normal-case (i.e., unfrozen) type implementations with new
> tests.

Note that Fred's weak ref implementation also needs a flag on every
weakly referenceable object (at least last time I looked at his patches).

Why not add a flag byte or word to these objects -- then we'd have
8 or 16 choices of what to do with them ;-)
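
The type-pointer swap Tim describes in the quote above can be imitated at
the Python level in current CPython by reassigning __class__ between two
list subclasses.  This is only an illustration under that assumption; the
class names FreezableList and FrozenList are invented here, and unlike a
C-level swap these subclasses do carry extra per-instance overhead.

```python
class FreezableList(list):
    """Ordinary list behaviour; freeze() swaps in the read-only type."""

    def freeze(self):
        self.__class__ = FrozenList


class FrozenList(list):
    """Same storage, but every mutating method raises."""

    def _readonly(self, *args, **kwargs):
        raise TypeError("list is frozen")

    append = extend = insert = pop = remove = sort = reverse = clear = _readonly
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _readonly

    def unfreeze(self):
        self.__class__ = FreezableList


xs = FreezableList([1, 2])
xs.freeze()
try:
    xs.append(3)             # rejected: the object's type is now FrozenList
except TypeError as exc:
    print(exc)
xs.unfreeze()
xs.append(3)                 # normal list behaviour restored
print(xs)
```

The unfrozen class contains no "am I frozen?" tests at all, which is the
property Tim is after.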

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From MarkH at ActiveState.com  Wed Jan 31 13:18:12 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 23:18:12 +1100
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>

MAL writes:

> > - How to spell it?  x.freeze()?  x.readonly()?
>
> How about .lock() and .unlock() ?

I'm with Greg here - lock() and unlock() imply an operation similar to
threading.Lock() - ie, exclusivity rather than immutability.

I don't have a strong opinion on the other names, but definitely prefer any
of the others over lock() for this operation.

Mark.




From mal at lemburg.com  Wed Jan 31 13:26:07 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:26:07 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>
Message-ID: <3A78045F.7DB50871@lemburg.com>

Mark Hammond wrote:
> 
> MAL writes:
> 
> > > - How to spell it?  x.freeze()?  x.readonly()?
> >
> > How about .lock() and .unlock() ?
> 
> I'm with Greg here - lock() and unlock() imply an operation similar to
> threading.Lock() - ie, exclusivity rather than immutability.
> 
> I don't have a strong opinion on the other names, but definitely prefer any
> of the others over lock() for this operation.

Funny, I thought that .lock() and .unlock() could be used to
implement exactly what threading.Lock() does...

Anyway, names really don't matter much, so how about: 

.mutable([flag]) -> integer

  If called without argument, returns 1/0 depending on whether
  the object is mutable or not. When called with a flag argument,
  sets the mutable state of the object to the value indicated
  by flag and returns the previous flag state.

The semantics of this interface would be in sync with many other
state APIs in Python and C (e.g. setlocale()).

The advantage of making this a method should be clear: it allows
writing polymorphic code.
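
A rough sketch of the .mutable([flag]) interface described above, written
as a hypothetical list subclass (FlaggedList and read_only_sum are
made-up names).  For brevity it guards only append() and item
assignment; a real implementation would have to cover every mutating
operation.

```python
class FlaggedList(list):
    """Hypothetical list with a .mutable([flag]) state method."""

    def __init__(self, *args):
        super().__init__(*args)
        self._mutable = 1

    def mutable(self, flag=None):
        # Without an argument: report the state.  With an argument:
        # set the state and return the previous one.
        previous = self._mutable
        if flag is not None:
            self._mutable = 1 if flag else 0
        return previous

    def _check(self):
        if not self._mutable:
            raise TypeError("object is read-only")

    def append(self, item):
        self._check()
        super().append(item)

    def __setitem__(self, index, value):
        self._check()
        super().__setitem__(index, value)


def read_only_sum(seq):
    # Polymorphic use: works for any object offering .mutable().
    old = seq.mutable(0)
    try:
        return sum(seq)
    finally:
        seq.mutable(old)     # restore whatever state the caller had


xs = FlaggedList([1, 2, 3])
print(read_only_sum(xs), xs.mutable())
```

Returning the previous state is what makes the save-and-restore pattern
in read_only_sum() possible, much as with setlocale().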

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From pedroni at inf.ethz.ch  Wed Jan 31 13:34:32 2001
From: pedroni at inf.ethz.ch (Samuele Pedroni)
Date: Wed, 31 Jan 2001 13:34:32 +0100 (MET)
Subject: [Python-Dev] weak refs and jython
Message-ID: <200101311234.NAA24584@core.inf.ethz.ch>

Hi.

I have read the weak ref PEP, maybe too late.
I don't know if portability of code using weak refs between python and jython
was a goal or could be one, or to what extent the actual implementation will
correspond to the PEP.

But about

    The callbacks registered with weak references must accept a single
    parameter, which will be the weak-ly referenced object itself.
    The object can be resurrected by creating some other reference to
    the object in the callback, in which case the weak reference
    generating the callback will still be cleared but no remaining
    weak references to the object will be cleared.
    
AFAIK using java weak refs (which I think is a natural choice) I see
no way (at least no worth-the-effort way) to implement this in jython.
Java weak refs cannot be resurrected.
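
For reference, a small sketch of how the weakref module in current
CPython behaves, which differs from the draft text quoted above: the
callback receives the weak reference object rather than the referent,
and by then the referent is gone, so it cannot be resurrected from the
callback.  Node, seen and on_dead are names invented for the example.

```python
import gc
import weakref


class Node:
    pass


seen = []


def on_dead(ref):
    # The argument is the weak reference itself; calling it at this
    # point no longer yields the object.
    seen.append(ref() is None)


node = Node()
r = weakref.ref(node, on_dead)
del node          # drop the last strong reference
gc.collect()      # not needed with reference counting, harmless elsewhere
print(seen, r())
```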

regards, Samuele Pedroni.


PS: Mr. X  is a jython developer.




From bckfnn at worldonline.dk  Wed Jan 31 13:49:22 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 12:49:22 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>   <20010130165204.I962@xs4all.nl>  <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
Message-ID: <3a7809c0.14839067@smtp.worldonline.dk>

>> > Note that Jeremy is only raising errors for "from M import *".
>> 
>> No, he says he's also raising errors for 'import spam' if 'spam' is declared
>> global, like so:
>> 
>> def viking():
>>     global spam
>>     import spam
>
>Yeah, this was just brought to my attention at our group meeting
>today.  I'm with you on this one -- there really isn't a good reason
>why this shouldn't work.  (I wonder why that constraint was ever added
>to the reference manual; maybe I was just upset that someone would
>*do* something as ugly as that, or maybe there was a J[P]ython
>reason???.)

Previously Jython has had problems with "from .. import *" in function
scope, and still has problems when used with the python -> java
compiler:

http://sourceforge.net/bugs/?func=detailbug&bug_id=122834&group_id=12867

Using global on an import name is currently ignored by Jython because
the name assignment is done by the runtime, not the compiler.

regards,
finn



From thomas at xs4all.net  Wed Jan 31 13:59:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 13:59:14 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <3a7809c0.14839067@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Wed, Jan 31, 2001 at 12:49:22PM +0000
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk>
Message-ID: <20010131135914.N962@xs4all.nl>

On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:

> Using global on an import name is currently ignored by Jython because
> the name assignment is done by the runtime, not the compiler.

So it's impossible to do, in Jython, something like:

def fillme():
    global me
    import me

but it is possible to do:

def fillme():
    global me
    import me as _me
    me = _me

? I have to say I don't like that; we're always claiming 'import' (and
'def' and 'class' for that matter) are 'just another way of writing
assignment'. All these special cases break that.
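
A quick sketch of the two spellings as they behave in current CPython,
where the restriction discussed in this thread is no longer in force and
both forms bind a module-level name.  The stdlib module string stands in
for the placeholder module "me", and text_tools is an arbitrary name.

```python
def fillme():
    # "import" is treated as an assignment to the name declared global.
    global string
    import string


def fillme_explicit():
    # The workaround spelling: import under a local alias, then assign.
    global text_tools
    import string as _string
    text_tools = _string


fillme()
fillme_explicit()
print(string is text_tools)
```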

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From bckfnn at worldonline.dk  Wed Jan 31 14:35:36 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 13:35:36 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <20010131135914.N962@xs4all.nl>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>
Message-ID: <3a780eda.16144995@smtp.worldonline.dk>

On Wed, 31 Jan 2001 13:59:14 +0100, you wrote:

>On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:
>
>> Using global on an import name is currently ignored by Jython because
>> the name assignment is done by the runtime, not the compiler.
>
>So it's impossible to do, in Jython, something like:
>
>def fillme():
>    global me
>    import me
>
>but it is possible to do:
>
>def fillme():
>    global me
>    import me as _me
>    me = _me
>
>?

Yes, only the second example will make a global variable.

> I have to say I don't like that; we're always claiming 'import' (and
>'def' and 'class' for that matter) are 'just another way of writing
>assignment'. All these special cases break that.

I don't like it either; I was only reporting what Jython currently does.
The current design used by Jython doesn't lend itself directly towards a
solution, but I don't see anything that makes it impossible to solve.

regards,
finn



From mal at lemburg.com  Wed Jan 31 15:34:19 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:34:19 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A78226B.2E177EFE@lemburg.com>

Michael Hudson wrote:
> 
> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them.  Summarised results below;
> first a key:
> 
> src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
>         (only built this with -O3)
> src: CVS from yesterday afternoon
> src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc
>         patch applied.  More on this later...
> Python-2.0: you can guess what this is.
> 
> All runs are compared against Python-2.0-O2:
> 
> Benchmark: src-n-O3 (rounds=10, warp=20)
>             Average round time:   49029.00 ms              -0.86%
> Benchmark: src (rounds=10, warp=20)
>             Average round time:   67141.00 ms             +35.76%
> Benchmark: src-O (rounds=10, warp=20)
>             Average round time:   50167.00 ms              +1.44%
> Benchmark: src-O2 (rounds=10, warp=20)
>             Average round time:   49641.00 ms              +0.37%
> Benchmark: src-O3 (rounds=10, warp=20)
>             Average round time:   49104.00 ms              -0.71%
> Benchmark: src-O6 (rounds=10, warp=20)
>             Average round time:   49131.00 ms              -0.66%
> Benchmark: src-obmalloc (rounds=10, warp=20)
>             Average round time:   63276.00 ms             +27.94%
> Benchmark: src-obmalloc-O (rounds=10, warp=20)
>             Average round time:   46927.00 ms              -5.11%
> Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
>             Average round time:   46146.00 ms              -6.69%
> Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
>             Average round time:   46456.00 ms              -6.07%
> Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
>             Average round time:   46450.00 ms              -6.08%
> Benchmark: Python-2.0 (rounds=10, warp=20)
>             Average round time:   68933.00 ms             +39.38%
> Benchmark: Python-2.0-O (rounds=10, warp=20)
>             Average round time:   49542.00 ms              +0.17%
> Benchmark: Python-2.0-O3 (rounds=10, warp=20)
>             Average round time:   48262.00 ms              -2.41%
> Benchmark: Python-2.0-O6 (rounds=10, warp=20)
>             Average round time:   48273.00 ms              -2.39%
> 
> My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> enough to care about.

What compiler did you use and on which platform ?

I have had similar experiences with -On with n>3 compared to -O2
using pgcc (gcc optimized for PC processors). BTW, the Linux
kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
as CFLAGS -- perhaps Python should too on Linux ?!
 
Does anybody know about the effect of -fomit-frame-pointer ?
Would it cause problems or produce code which is not compatible
with code compiled without this flag ?

> Interestingly, adding obmalloc speeds things up.  Let's take a closer
> look:
> 
> $ python pybench.py -c src-obmalloc-O3 -s src-O3
> PYBENCH 0.7
> 
> Benchmark: src-O3 (rounds=10, warp=20)
> 
> Tests:                              per run    per oper.  diff *
> ------------------------------------------------------------------------
>           BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
>            BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
>                  ConcatStrings:    1068.80 ms    7.13 us   -1.22%
>                  ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
>                CreateInstances:    1433.55 ms   34.13 us   +9.06%
>        CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
>        CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
>                   DictCreation:    1275.80 ms    8.51 us  +44.22%
>                       ForLoops:    1415.90 ms  141.59 us   -0.64%
>                     IfThenElse:    1152.70 ms    1.71 us   -0.15%
>                    ListSlicing:     397.40 ms  113.54 us   -0.53%
>                 NestedForLoops:     789.75 ms    2.26 us   -0.37%
>           NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
>        NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
>            PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
>              PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
>                      Recursion:     838.50 ms   67.08 us   -0.00%
>                   SecondImport:     741.20 ms   29.65 us  +25.57%
>            SecondPackageImport:     744.25 ms   29.77 us  +18.66%
>          SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
>        SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
>         SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
>          SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
>       SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
>        SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
>         SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
>           SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
>                     SmallLists:    1657.65 ms    6.50 us   +6.63%
>                    SmallTuples:    1143.95 ms    4.77 us   +2.90%
>          SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
>       SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
>                 StringMappings:    1161.00 ms    9.21 us   +7.30%
>               StringPredicates:    1069.65 ms    3.82 us   -5.30%
>                  StringSlicing:     846.30 ms    4.84 us   +8.61%
>                      TryExcept:    1590.40 ms    1.06 us   -0.49%
>                 TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
>                   TupleSlicing:     681.10 ms    6.49 us   -3.13%
>                UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
>              UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
>              UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
>                 UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
> ------------------------------------------------------------------------
>             Average round time:   49104.00 ms              +5.70%
> 
> *) measured against: src-obmalloc-O3 (rounds=10, warp=20)
> 
> Words fail me slightly, but maybe some tuning of the memory allocation
> of longs & complex numbers would be in order?

AFAIR, Vladimir's malloc implementation favours small objects.
All number objects (except longs) fall into this category.

Perhaps we should think about adding his lib to the core ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 15:39:01 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:39:01 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A782385.5B544CD5@lemburg.com>

> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them. 

FYI, I've just updated the archive to also work under Python 1.5.x:

	http://www.lemburg.com/python/pybench-0.7.zip

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mwh21 at cam.ac.uk  Wed Jan 31 16:52:23 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 31 Jan 2001 15:52:23 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 31 Jan 2001 15:34:19 +0100"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>

"M.-A. Lemburg" <mal at lemburg.com> writes:

> > My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> > enough to care about.
> 
> What compiler did you use and on which platform ?

Argh, sorry; I meant to put this in!

$ uname -a
Linux atrus.jesus.cam.ac.uk 2.2.14-1.1.0 #1 Thu Jan 6 05:12:58 EST 2000 i686 unknown
$ gcc --version
2.95.1

It's a Dell Dimension XPS D233 (a 233MHz PII) with a reasonably fast
hard drive (two year old 10G IBM 7200rpm thingy) and quite a lot of
RAM (192Mb).

[snip]
 
> AFAIR, Vladimir's malloc implementation favours small objects.
> All number objects (except longs) fall into this category.

Well, longs & complex numbers don't do any free list handling (like
floats and ints do), so I see two conclusions:

1) Don't add obmalloc to the core, but do simple free list stuff for
   longs (might be tricky) and complex numbers (this should be a
   no-brainer).
2) Integrate obmalloc - then maybe we can ditch all of that icky
   freelist stuff.

> Perhaps we should think about adding his lib to the core ?!

Strikes me as the better solution.  Can anyone try this on Windows?
Seeing as windows malloc reputedly sucks, maybe the differences would
be bigger.

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)




From barry at digicool.com  Wed Jan 31 17:42:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:42:28 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
	<Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.16500.594486.613828@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping at lfw.org> writes:

    KY> Could i get a number for this please?

Looks like you beat Eric to PEP 234. :)

I'll update PEP 0 and let you check in your txt file.  I may want to
do an editorial pass over it.

-Barry



From barry at digicool.com  Wed Jan 31 17:50:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:50:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
	<20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
Message-ID: <14968.16962.830739.920771@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> Basic response: I *love* the iter(), sq_iter and __iter__
    MZ> parts.  I tremble at seeing the rest.  Why not add a method to
    MZ> dictionaries .iteritems() and do

    | for (k, v) in dict.iteritems():
    | 	pass

    MZ> (dict.iteritems() would return an iterator to the items)

Moshe, I had exactly the same reaction and exactly the same idea.  I'm
a strong -1 on introducing new syntax for this when new methods can
handle it in a much more readable way (IMO).

Another idea would be to allow the iterator() method to take an
argument:

    for key in dict.iterator()

a.k.a.

    for key in dict.iterator(KEYS)

and also

    for value in dict.iterator(VALUES)
    for key, value in dict.iterator(ITEMS)

One problem is that the constants KEYS, VALUES, and ITEMS would either
have to be defined some place, or you'd just use values like 0, 1, 2,
which is less readable perhaps than just having iteratoritems(),
iteratorkeys(), and iteratorvalues() methods.  Alternative spellings:

    itemsiter(), keysiter(), valsiter()
    itemsiterator(), keysiterator(), valuesiterator()
    iiterator(), kiterator(), viterator()

ad-nauseum-ly y'rs,
-Barry



From skip at mojam.com  Wed Jan 31 17:11:19 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 10:11:19 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<3A77FE94.E5082136@lemburg.com>
Message-ID: <14968.14631.419491.440774@beluga.mojam.com>

What stimulated this thread about making mutable objects (temporarily)
immutable?  Can someone give me an example where this is actually useful and
can't be handled through some existing mechanism?  I'm definitely with
Fredrik on this one.  Sounds like madness to me.

I'm just guessing here, but since the most common need for immutable objects
is as dictionary keys, I can envision having to test the lock state of a list
or dict that someone wants to use as a key everywhere you would normally
call has_key:

    if l.islocked() and d.has_key(l):
       ...

If you want immutable dicts or lists in order to use them as dictionary
keys, just serialize them first:

    survey_says = {"spam": 14, "eggs": 42}
    sl = marshal.dumps(survey_says)
    dict[sl] = "spam"
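
The same idea as a runnable sketch in current Python, with the lookup
table renamed to table so it does not shadow the built-in dict.  One
caveat worth stating as an assumption: the serialized form reflects the
order in which the dictionary was built, so two equal dictionaries may
serialize to different keys.  The helper frozen_key() is a made-up
alternative that avoids this by sorting the items.

```python
import marshal

survey_says = {"spam": 14, "eggs": 42}
table = {}

key = marshal.dumps(survey_says)    # a bytes snapshot: immutable, hashable
table[key] = "spam"

survey_says["eggs"] = 43            # later changes do not affect the key
print(marshal.loads(key))


def frozen_key(d):
    # Order-independent snapshot; requires sortable keys and
    # hashable values.
    return tuple(sorted(d.items()))


table[frozen_key({"spam": 14, "eggs": 42})] = "also spam"
```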

Here's another pitfall I can envision.

    survey_says = {"spam": 14, "eggs": 42}
    survey_says.lock()
    dict[survey_says] = "Richard Dawson"
    survey_says.unlock()

At this point can I safely iterate over the keys in the dictionary or not?

Skip




From skip at mojam.com  Wed Jan 31 16:57:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 09:57:30 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
References: <20010131015832.K962@xs4all.nl>
	<Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.13802.22823.702114@beluga.mojam.com>

    Ping>     x is sequence-like if it provides __getitem__() but not keys()

So why does this barf?

    >>> [].__getitem__
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: __getitem__

(Obviously, lists *do* understand __getitem__ at some level.  Why isn't it
exposed in the method table?)
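
For what it is worth, Ping's rule can be written down directly.  In
current CPython the built-in types do expose __getitem__ as an
attribute, so the sketch below works there; in the interpreter session
shown above it would misclassify lists.  The function name
is_sequence_like is made up for the illustration.

```python
def is_sequence_like(x):
    """Ping's rule: provides __getitem__() but not keys()."""
    return hasattr(x, "__getitem__") and not hasattr(x, "keys")


for candidate in ([], (), "abc", {}, 42):
    print(repr(candidate), is_sequence_like(candidate))
```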

Skip



From fredrik at pythonware.com  Wed Jan 31 18:19:44 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 Jan 2001 18:19:44 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <007301c08baa$02908220$e46940d5@hagrid>

barry wrote:
> Alternative spellings:
> 
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

shouldn't that be xitems, xkeys, xvalues?

</F>




From mal at lemburg.com  Wed Jan 31 18:21:02 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:21:02 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
		<3A756FF8.B7185FA2@lemburg.com>
		<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
		<3A75B190.3FD2A883@lemburg.com>
		<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
		<20010129150247.B10191@thyrsus.com>
		<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
		<3A77FE94.E5082136@lemburg.com> <14968.14631.419491.440774@beluga.mojam.com>
Message-ID: <3A78497E.8BCF197E@lemburg.com>

Skip Montanaro wrote:
> 
> What stimulated this thread about making mutable objects (temporarily)
> immutable?  Can someone give me an example where this is actually useful and
> can't be handled through some existing mechanism?  I'm definitely with
> Fredrik on this one.  Sounds like madness to me.

This thread is an offspring of the "for something in dict:" thread.
The problem we face when iterating over mutable objects is that
the underlying objects can change. By marking them read-only we can
safely iterate over their contents.

Another advantage of being able to mark mutable objects as read-only is
that they may become usable as dictionary keys. Optimizations such
as self-reorganizing read-only dictionaries would also become
possible (e.g. attribute dictionaries which are read-only could
calculate a second hash value to make the hashing perfect).
 
> I'm just guessing here, but since the most common need for immutable objects
> is as dictionary keys, I can envision having to test the lock state of a list
> or dict that someone wants to use as a key everywhere you would normally
> call has_key:
> 
>     if l.islocked() and d.has_key(l):
>        ...
> 
> If you want immutable dicts or lists in order to use them as dictionary
> keys, just serialize them first:
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     sl = marshal.dumps(survey_says)
>     dict[sl] = "spam"

Sure and that's what .items(), .keys() and .values() do. The idea
was to avoid the extra step of creating lists or tuples first.
 
> Here's another pitfall I can envision.
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     survey_says.lock()
>     dict[survey_says] = "Richard Dawson"
>     survey_says.unlock()
>
> At this point can I safely iterate over the keys in the dictionary or not?

Tim already pointed out that we will need two different read-only
states:

	a) temporary
	b) permanent

For dictionaries to become usable as keys in another dictionary,
they'd have to be marked permanently read-only.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Wed Jan 31 05:35:58 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 30 Jan 2001 23:35:58 -0500 (EST)
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
In-Reply-To: <3A77FD5C.DE8729DC@lemburg.com>
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
	<3A77FD5C.DE8729DC@lemburg.com>
Message-ID: <14967.38446.700271.122029@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  >> Modified Files: compile.c Log Message: Enforce two illegal import
  >> statements that were outlawed in the reference manual but not
  >> checked: Names bound by import statements may not occur in global
  >> statements in the same scope. The from ... import * form may only
  >> occur in a module scope.
  >>
  >> I guess these changes could break code, but the reference manual
  >> warned about them.

  MAL> Jeremy, your code breaks all uses of "from package import
  MAL> submodule" inside packages.

  MAL> Try distutils for example or setup.py....

Quite aside from whether the changes should be preserved, I don't see
how "from package import submodule" is affected.  I ran setup.py
without any problem; I wouldn't have been able to build Python
otherwise.  I wrote some simple test cases and didn't have any trouble
with the form you describe.

Can you provide a concrete example?  It may be that something other
than the changes mentioned above is causing you problems.

Jeremy



From barry at digicool.com  Wed Jan 31 18:20:24 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 12:20:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
	<20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
	<14968.16962.830739.920771@anthem.wooz.org>
	<007301c08baa$02908220$e46940d5@hagrid>
Message-ID: <14968.18776.644453.903217@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

    FL> shouldn't that be xitems, xkeys, xvalues?

Or iitems(), ikeys(), ivalues()?

Personally, I don't much care.  If we get consensus on the more
important issue of going with methods instead of new syntax, I'm sure
Guido will pick whatever method names appeal to him most.

-Barry



From ping at lfw.org  Wed Jan 31 18:14:15 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 09:14:15 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101310903380.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Skip Montanaro wrote:
> Ping> x is sequence-like if it provides __getitem__() but not keys()
> 
> So why does this barf?
> 
>     >>> [].__getitem__

I was describing how to tell if instances are sequence-like.  Before
we get to make that judgement, first we have to look at the C method
table.  So:

    x is sequence-like if it has tp_as_sequence;
        all instances have tp_as_sequence;
            an instance is sequence-like if it has __getitem__() but not keys()

    x is mapping-like if it has tp_as_mapping;
        all instances have tp_as_mapping;
            an instance is mapping-like if it has both __getitem__() and keys()

The "in" operator is implemented this way.

    x customizes "in" if it has sq_contains;
        all instances have sq_contains;
            an instance customizes "in" if it has __contains__()

If sq_contains is missing, or if an instance has no __contains__ method,
we supply the default behaviour by comparing the operand to each member
of x in turn.  This default behaviour is implemented twice: once in
PyObject_Contains, and once in instance_contains.

So I proposed this same structure for sq_iter and __iter__.

    x customizes "for ... in x" if it has sq_iter;
        all instances have sq_iter;
            an instance customizes "for ... in x" if it has __iter__()

If sq_iter is missing, or if an instance has no __iter__ method,
we supply the default behaviour by calling PyObject_GetItem on x
and incrementing the index until IndexError.
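
The lookup order for iteration described above can be sketched in pure
Python. This is only an illustration, not the C implementation: the
function name iterate and the class Squares are made up, and looking up
the hook on type(x) is an assumption that stands in for the sq_iter slot.

```python
def iterate(x):
    """Yield the items of x using the lookup order sketched above."""
    hook = getattr(type(x), "__iter__", None)
    if hook is not None:
        # The object customizes iteration: defer to its own hook.
        yield from hook(x)
        return
    # Default behaviour: index from 0 upwards until IndexError.
    index = 0
    while True:
        try:
            item = x[index]
        except IndexError:
            return
        yield item
        index += 1


class Squares:
    """Defines only __getitem__, so the default path is taken."""

    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i
```

list(iterate(Squares())) walks indices 0 through 3 and stops at the
IndexError, while list(iterate("ab")) goes through the string's own hook.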


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From mal at lemburg.com  Wed Jan 31 18:57:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:57:20 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
		<3A77FD5C.DE8729DC@lemburg.com> <14967.38446.700271.122029@localhost.localdomain>
Message-ID: <3A785200.FFB37CAD@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
> 
>   >> Modified Files: compile.c Log Message: Enforce two illegal import
>   >> statements that were outlawed in the reference manual but not
>   >> checked: Names bound by import statements may not occur in global
>   >> statements in the same scope. The from ... import * form may only
>   >> occur in a module scope.
>   >>
>   >> I guess these changes could break code, but the reference manual
>   >> warned about them.
> 
>   MAL> Jeremy, your code breaks all uses of "from package import
>   MAL> submodule" inside packages.
> 
>   MAL> Try distutils for example or setup.py....
> 
> Quite aside from whether the changes should be preserved, I don't see
> how "from package import submodule" is affected.  I ran setup.py
> without any problem; I wouldn't have been able to build Python
> otherwise.  I wrote some simple test cases and didn't have any trouble
> with the form you describe.

Perhaps you still had old .pyc files in your installation dir ?
 
> Can you provide a concrete example?  It may be that something other
> than the changes mentioned above is causing you problems.

The distutils code is full of imports like these (and other
code I'm running is too):

distutils/cmd.py:

    def __init__ (self, dist):
        """Create and initialize a new Command object.  Most importantly,
        invokes the 'initialize_options()' method, which is the real
        initializer and depends on the actual command being
        instantiated.
        """
        # late import because of mutual dependence between these classes
        from distutils.dist import Distribution

This is the report I got from Benjamin Collar:

> I've gotten the newest CVS tarball, but setup.py is still not
> working; this time with a different error. I will resubmit a bug on
> sourceforge if that's the proper way to handle this. Here's the error:
> 
> ./python ./setup.py build
> Traceback (most recent call last):
>   File "./setup.py", line 12, in ?
>     from distutils.core import Extension, setup
>   File "/usr/src/python/dist/src/Lib/distutils/core.py", line 20, in ?
>     from distutils.cmd import Command
>   File "/usr/src/python/dist/src/Lib/distutils/cmd.py", line 15, in ?
>     from distutils import util, dir_util, file_util, archive_util, dep_util
> SyntaxError: 'from ... import *' may only occur in a module scope
> make: *** [sharedmods] Error 1
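
For comparison, here is a minimal sketch of what the rule is meant to
reject and what it is meant to leave alone. It assumes a current
Python 3 interpreter, where the restriction is enforced at compile time;
the 2.1 alpha discussed here may behave differently. The helper name
compiles is hypothetical.

```python
def compiles(source):
    """Return True if `source` compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# 'from ... import *' inside a function: the form the check targets.
star_in_function = "def f():\n    from os import *\n"

# 'from package import name, name' inside a function: should be fine.
named_in_function = "def f():\n    from os import path, sep\n"

# 'from ... import *' at module scope: should be fine.
star_at_module_level = "from os import *\n"
```

Only the first snippet should be refused; a multi-name import such as
the one on line 15 of cmd.py is of the second kind.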

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Wed Jan 31 19:33:56 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 12:33:56 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78497E.8BCF197E@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<3A77FE94.E5082136@lemburg.com>
	<14968.14631.419491.440774@beluga.mojam.com>
	<3A78497E.8BCF197E@lemburg.com>
Message-ID: <14968.23188.573257.392841@beluga.mojam.com>

    MAL> This thread is an offspring of the "for something in dict:" thread.
    MAL> The problem we face when iterating over mutable objects is that the
    MAL> underlying objects can change. By marking them read-only we can
    MAL> safely iterate over their contents.

I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
(And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
this concept in a reasonable fashion to cover (most of) the other objects
that smell like dictionaries, I think you'll just be adding needless
complications for a feature that can't be used where it's really needed.

I see no problem asking for the items() of an in-memory dictionary in order
to get a predictable list to iterate over, but doing that for disk-based
mappings would be next to impossible.  So, I'm stuck iterating over
something that can change out from under me.  In the end, the programmer will
still have to handle border cases specially.  Besides, even if you *could*
lock your disk-based mapping, are you really going to do that in situations
where it's sharable (that's what databases are there for, after all)?  I
suspect you're going to keep the database mutable and work around any
resulting problems.

If you want to implement "for key in dict:", why not just have the VM call
keys() under the covers and use that list?  It would be no worse than the
situation today where you call "for key in dict.keys():", and with the same
caveats.  If you're dumb enough to do that for an on-disk mapping object,
well, you get what you asked for.
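
A minimal sketch of that idea, written as an ordinary function rather
than anything in the VM. The name iter_keys_snapshot is hypothetical,
and the example assumes an in-memory dictionary small enough to copy.

```python
def iter_keys_snapshot(mapping):
    """Iterate over a list of keys taken up front, as suggested above."""
    for key in list(mapping.keys()):
        yield key


stock = {"spam": 14, "eggs": 42, "ham": 7}
removed = []
for key in iter_keys_snapshot(stock):
    if stock[key] < 20:
        # Safe: we are walking the snapshot, not the live dictionary.
        del stock[key]
        removed.append(key)
```

The loop can delete entries freely because it never looks at the live
key set again; the same caveat applies as for dict.keys() today, namely
that the whole key list is materialized first.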

Skip



From esr at thyrsus.com  Wed Jan 31 18:55:00 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:55:00 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78045F.7DB50871@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:26:07PM +0100
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com>
Message-ID: <20010131125500.C5151@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Anyway, names really don't matter much, so how about: 
> 
> .mutable([flag]) -> integer
> 
>   If called without argument, returns 1/0 depending on whether
>   the object is mutable or not. When called with a flag argument,
>   sets the mutable state of the object to the value indicated
>   by flag and returns the previous flag state.

I'll bear this in mind if things progress to the point where a PEP is
indicated.
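
To make the quoted signature concrete, here is a rough sketch of how
.mutable([flag]) might behave on a dict subclass. Everything in it is
an assumption for illustration: the class name LockableDict, the choice
of TypeError, and the fact that only item assignment and deletion are
guarded (update(), pop(), clear() and friends are not).

```python
class LockableDict(dict):
    """Hypothetical dict subclass with a mutable([flag]) method."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._mutable = 1

    def mutable(self, *flag):
        """No argument: return 1/0.  One argument: set, return old state."""
        previous = self._mutable
        if flag:
            (value,) = flag
            self._mutable = 1 if value else 0
        return previous

    def _check(self):
        if not self._mutable:
            raise TypeError("object is read-only")

    def __setitem__(self, key, value):
        self._check()
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._check()
        super().__delitem__(key)
```

Returning the previous state lets a caller restore it afterwards, e.g.
old = d.mutable(0) followed later by d.mutable(old).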
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>




From tim.one at home.com  Wed Jan 31 20:49:34 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 14:49:34 -0500
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHBIMAA.tim.one@home.com>

[Mark Hammond]
> ...
> The new process is very simple, but may break some people's build.
> ...
> The reason this _should_ not break your build is that you
> _probably_ already have a "..\..\zlib-1.1.3" directory installed
> in the right place so the header files can be located.

Actually, it's certain to break the build for anyone who read
PCbuild\readme.txt.  But I *want* it to break:  changing the directory name
is a strong hint that they should download the zlib source code from the
same place you did (and which is now explained in PCbuild\readme.txt, and
mentioned in the 2.1a2 NEWS file).

Other than that, worked first time, and-- even better --the second time too
<wink>.




From esr at thyrsus.com  Wed Jan 31 18:53:16 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:53:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:01:24PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <3A77FE94.E5082136@lemburg.com>
Message-ID: <20010131125316.B5151@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Eric, could you write a PEP for this ?

Not yet.  I'm about (at Guido's suggestion) to submit a revised ternary-select
proposal.  Let's process that first.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy



From tim.one at home.com  Wed Jan 31 21:28:00 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:28:00 -0500
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIMAA.tim.one@home.com>

[Samuele Pedroni]
> I have read weak ref PEP, maybe too late.
> I don't know if portability of code using weak refs between
> python and jython was a goal or could be one,

CPython generally doesn't want to do anything impossible for Jython, if it
can help it.

> and up to which extent actual impl. will correspond to the PEP.

Don't care about that.

> ...
> AFAIK using java weak refs (which I think is a natural choice) I
> see no way (at least no worth-the-effort way) to implement this
> in jython.  Java weak refs cannot be resurrected.

Thanks for bringing this up!  Fred is looking into it.




From fdrake at acm.org  Wed Jan 31 21:25:51 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 31 Jan 2001 15:25:51 -0500 (EST)
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
References: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>

Samuele Pedroni writes:
 > AFAIK using java weak refs (which I think is a natural choice) I see
 > no way (at least no worth-the-effort way) to implement this in jython.
 > Java weak refs cannot be resurrected.

  This is certainly annoying.
  How about this: the callback receives the weak reference object or
proxy which it was registered on as a parameter.  Since the reference
has already been cleared, there's no way to get the object back, so we
don't need to get it from Java either.
  Would that be workable?  (I'm adjusting my patch now.)
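
  A minimal sketch of that callback convention, written against the
weakref module as it exists in current Python (where the callback does
receive the weak reference object). The names Resource, on_collect and
seen are placeholders.

```python
import gc
import weakref


class Resource:
    """Placeholder class; plain object() instances cannot be weakly referenced."""


seen = []


def on_collect(ref):
    # The callback is handed the weak reference itself; by now it is dead,
    # so there is no way to get the original object back from it.
    seen.append(ref() is None)


target = Resource()
ref = weakref.ref(target, on_collect)
alive_before = ref() is target

del target      # drop the last strong reference
gc.collect()    # not needed with CPython's refcounting, but harmless elsewhere
```

  After the collection, seen holds a single True: the callback ran once
and found the reference already cleared, so nothing needs resurrecting.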


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Wed Jan 31 21:56:52 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:56:52 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>

[Ping]
> x is sequence-like if it provides __getitem__() but not keys()

[Skip]
> So why does this barf?
>
>     >>> [].__getitem__
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: __getitem__
>
> (Obviously, lists *do* understand __getitem__ at some level.  Why
> isn't it exposed in the method table?)

The old type/class split:  list is a type, and types spell their "method
tables" in ways that have little in common with how classes do it.

See PyObject_GetItem in abstract.c for gory details (e.g., dicts spell their
version of getitem via ->tp_as_mapping->mp_subscript(...), while lists spell
it ->tp_as_sequence->sq_item(...); neither has any binding to the attr
"__getitem__"; instance objects fill in both the tp_as_mapping and
tp_as_sequence slots, then map both the mp_subscript and sq_item slots to
classobject.c's instance_item, which in turn looks up "__getitem__").

bet-you're-sorry-you-asked<wink>-ly y'rs  - tim




From tim.one at home.com  Wed Jan 31 22:24:53 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:24:53 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHFIMAA.tim.one@home.com>

[M.-A. Lemburg]
> AFAIR, Vladimir's malloc implementation favours small objects.

It favors the memory alloc/dealloc patterns Vlad recorded while running an
instrumented Python.  Which is mostly good news.  The flip side is that it
favors the specific programs he ran, and who knows whether those are
"typical".  OTOH, vendor mallocs favor the programs *they* ran, which
probably didn't include Python at all <wink>.

> ...
> Perhaps we should think about adding his lib to the core ?!

It's patch 101104 on SF.  I pushed Vlad to push this for 2.0, but he wisely
decided it was too big a change at the time.  It's certainly too much a
change to slam into 2.1 at this late stage too.  There are many reasons to
want this (e.g., list.append() calls realloc every time today, because,
despite over-allocating, it has no idea how much storage *has* already been
allocated; any malloc has to know this info under the covers, but there's no
way for us to know that too unless we add another N bytes to every list
object to record it, or use our own malloc which *can* tell us that info).
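
The trade-off in that parenthetical can be illustrated with a toy array
that does record its own capacity. This is a sketch only, not CPython's
list implementation: the class name GrowableArray, the doubling policy
and the reallocs counter are all assumptions made for the example.

```python
class GrowableArray:
    """Toy append-only array that records how much storage it has reserved."""

    def __init__(self):
        self._slots = []        # stands in for the malloc'ed block
        self._size = 0          # number of slots in use
        self.reallocs = 0       # how many times the block was regrown

    def append(self, item):
        if self._size == len(self._slots):
            # Out of room: grow geometrically instead of by one slot.
            new_capacity = max(4, 2 * len(self._slots))
            extra = new_capacity - len(self._slots)
            self._slots = self._slots + [None] * extra
            self.reallocs += 1
        self._slots[self._size] = item
        self._size += 1

    def __len__(self):
        return self._size


arr = GrowableArray()
for i in range(1000):
    arr.append(i)
```

Because the array knows its capacity, 1000 appends trigger only a
handful of regrowths rather than one per call; the price is the extra
bookkeeping field on every object.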

list.append()-behavior-varies-wildly-across-platforms-today-
    when-the-list-gets-large-because-of-that-ly y'rs  - tim




From tim.one at home.com  Wed Jan 31 22:49:31 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:49:31 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A78002F.DC8F0582@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>

[Tim]
>> Seems an unrelated topic:  would "iterators for dictionaries" solve the
>> supposed problem with iteration order?

[MAL]
> No, but it would solve the problem in a more elegant and
> generalized way.

I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
problem], but it would solve the problem ...".  Can only assume we're
switching topics within single sentences now <wink>.

> Besides, it also allows writing code which is thread safe, since
> the iterator can take special actions to assure that the dictionary
> doesn't change during the iteration phase (see the other thread
> about "making mutable objects readonly").

Sorry, but immutability has nothing to do with thread safety (the latter has
to do with "doing a right thing" in the presence of multiple threads, to
keep data structures internally consistent; raising an exception is never "a
right thing" unless the user is violating the advertised semantics, and if
mutation during iteration is such a violation, the presence or absence of
multiple threads has nothing to do with that).  IOW, perhaps, a critical
section is an area of non-exceptional serialization, not a landmine that
makes other threads *blow up* if they touch it.
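
A small sketch of the distinction, using the standard threading module.
The names state, lock and bump are illustrative only.

```python
import threading

state = {"count": 0}
lock = threading.Lock()


def bump(times):
    for _ in range(times):
        # Other threads block here until the lock is free; none of them
        # gets an exception for arriving while it is held.
        with lock:
            state["count"] += 1


threads = [threading.Thread(target=bump, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every thread eventually gets its turn and the final count is exact; a
read-only flag that raised on contention would instead turn ordinary
concurrent access into errors.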

> ...
> I don't remember the figures, but these micro-optimizations

That's plural, but I thought you were talking specifically about the mutable
counter object.  I don't know which, but the two statements don't jibe.

> do speed up loops by a noticeable amount. Just compare the performance
> of stock Python 1.5 against my patched version.

No time now, but after 2.1 is out, sure, wrt it (not 1.5).




From tim.one at home.com  Wed Jan 31 23:10:12 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:10:12 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>

[Michael Hudson]
> ...
> Can anyone try this on Windows?  Seeing as windows malloc
> reputedly sucks, maybe the differences would be bigger.

No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
Windows.  Helped significantly.  Unclear how much was simply due to
exploiting the global interpreter lock, though.  "Windows" is also a
multiheaded beast (e.g., NT has very different memory performance
characteristics than 95).




From tim.one at home.com  Wed Jan 31 23:43:59 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:43:59 -0500
Subject: generators (was RE: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <20010130092454.D18319@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHIIMAA.tim.one@home.com>

[Neil Schemenauer]
> What's the chances of getting generators into 2.2?

Unknown.  IMO it has more to do with generalizing the iteration protocol
than with generators per se (a generator object that doesn't play nice with
"for" is unpleasant to use; otoh, a generator object that can't be used
divorced from "for" is frustrating too (like when comparing the fringes of
two trees efficiently, which requires interleaving two distinct traversals,
each naturally recursive on its own)).
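
The tree-fringe case can be sketched with the generator syntax Python
later gained. This is an illustration under assumptions: trees are
modelled as nested tuples, and the names fringe and same_fringe are
made up for the example.

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-tuple tree, left to right."""
    for child in tree:
        if isinstance(child, tuple):
            # Each traversal is naturally recursive on its own.
            yield from fringe(child)
        else:
            yield child


def same_fringe(a, b):
    """Interleave two traversals, stopping at the first mismatch."""
    missing = object()  # sentinel for a fringe that ran out early
    pairs = zip_longest(fringe(a), fringe(b), fillvalue=missing)
    return all(x == y for x, y in pairs)
```

Here the two generators are driven by zip_longest rather than by a
"for" over either one, which is the "divorced from for" usage: neither
tree is flattened in full before the comparison can stop.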

> The implementation should not be hard.  Didn't Steven Majewski have
> something years ago?

Yes, but Guido also sketched out a nearly complete implementation within the
last year or so.

> Why do we always get sidetracked on trying to figure out how to
> do coroutines and continuations?

Sorry, I've been failing to find a good answer to that question for a decade
<0.4 wink>.  I should note, though, that Guido's current notion of
"generator" is stronger than Icon/CLU/Sather's (which are "strictly
stack-like"), and requires machinery more elaborate than StevenM (or Guido)
sketched before.

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.

Agreed.

> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

microthreads have an enthusiastic and possibly growing audience.  That gets
into (C) stacklessness, though, as do coroutines.  I'm afraid that once you
go beyond "simple" (Icon) generators, a whole world of other stuff gets
pulled in.  The key trick to implementing simple generators in current
Python is simply to decline decrementing the frame's refcount upon a
"suspend" (of course the full details are more involved than *just* that,
but they mostly follow *from* just that).

everything-is-the-enemy-of-something-ly y'rs  - tim




From skip at mojam.com  Wed Jan 31 23:27:38 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 16:27:38 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
References: <14968.13802.22823.702114@beluga.mojam.com>
	<LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
Message-ID: <14968.37210.886842.820413@beluga.mojam.com>

>>>>> "Tim" == Tim Peters <tim.one at home.com> writes:

    >> (Obviously, lists *do* understand __getitem__ at some level.  Why
    >> isn't it exposed in the method table?)

    Tim> The old type/class split: list is a type, and types spell their
    Tim> "method tables" in ways that have little in common with how classes
    Tim> do it.

The problem that rolls around in the back of my mind from time-to-time is
that since Python doesn't currently support interfaces, checking for
specific methods seems to be the only reasonable way to determine if a
object does what you want or not.

What would break if we decided to simply add __getitem__ (and other sequence
methods) to list object's method table?  Would they foul something up or
would simply sit around quietly waiting for hasattr to notice them?

Skip




From pedroni at inf.ethz.ch  Wed Jan 31 23:29:37 2001
From: pedroni at inf.ethz.ch (Samuele Pedroni)
Date: Wed, 31 Jan 2001 23:29:37 +0100
Subject: [Python-Dev] weak refs and jython
References: <200101311234.NAA24584@core.inf.ethz.ch> <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>
Message-ID: <001f01c08bd5$4c9c9900$7c5821c0@newmexico>

Hi.

[Fred L. Drake, Jr.]

>  > Java weak refs cannot be resurrected.
>
>   This is certainly annoying.
>   How about this: the callback receives the weak reference object or
> proxy which it was registered on as a parameter.  Since the reference
> has already been cleared, there's no way to get the object back, so we
> don't need to get it from Java either.
>   Would that be workable?  (I'm adjusting my patch now.)

Yes, it is workable: clearly we can implement weak refs only under java2,
but this is not (really) an issue.  We can register the refs in a java
reference queue, and poll it lazily or through a low-priority thread in
order to invoke the callbacks.

-- Some remarks
I have used java weak/soft refs to implement some of the internal tables
of jython in order to avoid memory leaks, at least under java2.

I imagine that the idea behind callbacks plus resurrection was to enable
the construction of sophisticated caches.

My intuition is that these features are not present under java because
they would interfere too much with gc and carry a performance penalty.
On the other hand, java offers reference queues and soft references; the
latter cover the common case of caches that should be cleared when there
is little memory left.  (I never tried them seriously, so I don't know
whether the actual impl is fair, or will just wait too long before
starting to discard things => behavior like a primitive gc.)

The main difference I see between the callback and queue approaches is
that with queues it is left to the user when to do the actual cleanup of
his tables/caches, and handling queues internally has a "low" overhead.
With callbacks, what happens really depends on the collection
times/patterns, and the overhead is related to call overhead and to how
non-trivial the code the user puts in the callbacks is.  Clearly general
performance will not be easily predictable.  (From a theoretical
viewpoint one can more or less simulate queues with callbacks and the
other way around.)

Resurrection makes little sense with queues, but I can easily see that
the lack of both resurrection and soft refs limits what can be done with
weak-like refs.

Last thing: one of the things that is really missing in java's ref
features is that one cannot put conditions of the form "as long as A is
not collected, B should not be collected either".  Clearly I'm referring
to the situation where one cannot modify the class of A in order to add a
field, which is quite typical in java.  This should not be a problem with
python and its open/dynamic way-of-life.

regards, Samuele Pedroni.




From mal at lemburg.com  Wed Jan 31 20:03:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 20:03:12 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
		<3A756FF8.B7185FA2@lemburg.com>
		<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
		<3A75B190.3FD2A883@lemburg.com>
		<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
		<20010129150247.B10191@thyrsus.com>
		<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
		<3A77FE94.E5082136@lemburg.com>
		<14968.14631.419491.440774@beluga.mojam.com>
		<3A78497E.8BCF197E@lemburg.com> <14968.23188.573257.392841@beluga.mojam.com>
Message-ID: <3A786170.CD65B8A4@lemburg.com>

Skip Montanaro wrote:
> 
>     MAL> This thread is an offspring of the "for something in dict:" thread.
>     MAL> The problem we face when iterating over mutable objects is that the
>     MAL> underlying objects can change. By marking them read-only we can
>     MAL> safely iterate over their contents.
> 
> I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
> (And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
> this concept in a reasonable fashion to cover (most of) the other objects
> that smell like dictionaries, I think you'll just be adding needless
> complications for a feature that can't be used where it's really needed.

We are currently only talking about Python dictionaries here, even though
other objects could also benefit from this.
 
> I see no problem asking for the items() of an in-memory dictionary in order
> to get a predictable list to iterate over, but doing that for disk-based
> mappings would be next to impossible.  So, I'm stuck iterating over
> something that can change out from under me.  In the end, the programmer will
> still have to handle border cases specially.  Besides, even if you *could*
> lock your disk-based mapping, are you really going to do that in situations
> where it's sharable (that's what databases are there for, after all)?  I
> suspect you're going to keep the database mutable and work around any
> resulting problems.
> 
> If you want to implement "for key in dict:", why not just have the VM call
> keys() under the covers and use that list?  It would be no worse than the
> situation today where you call "for key in dict.keys():", and with the same
> caveats.  If you're dumb enough to do that for an on-disk mapping object,
> well, you get what you asked for.

That's why iterators do a much better job here. In DB design
these are usually called cursors, which allow moving inside
large result sets. But this really is a different topic...

Readonlyness could be put to some good use in optimizing data
structures which you know won't change anymore.
Temporary readonlyness has the nice side effect of allowing low-level
lock implementations and makes thread-safe code easier
to write, because you can make assertions w/r to the immutability
of an object during a certain period of time explicit in your
code.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 21:36:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 21:36:54 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com> <20010131125500.C5151@thyrsus.com>
Message-ID: <3A787766.35453597@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > Anyway, names really don't matter much, so how about:
> >
> > .mutable([flag]) -> integer
> >
> >   If called without argument, returns 1/0 depending on whether
> >   the object is mutable or not. When called with a flag argument,
> >   sets the mutable state of the object to the value indicated
> >   by flag and returns the previous flag state.
> 
> I'll bear this in mind if things progress to the point where a PEP is
> indicated.

Great :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Wed Jan 31 17:23:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:23:37 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Wed, 31 Jan 2001 13:35:36 GMT."
             <3a780eda.16144995@smtp.worldonline.dk> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>  
            <3a780eda.16144995@smtp.worldonline.dk> 
Message-ID: <200101311623.LAA01774@cj20424-a.reston1.va.home.com>

[Finn]
> >> Using global on an import name is currently ignored by Jython because
> >> the name assignment is done by the runtime, not the compiler.

[Thomas]
> >So it's impossible to do, in Jython, something like:
> >
> >def fillme():
> >    global me
> >    import me
> >
> >but it is possible to do:
> >
> >def fillme():
> >    global me
> >    import me as _me
> >    me = _me
> >
> >?

[Finn again]
> Yes, only the second example will make a global variable.
> 
> > I have to say I don't like that; we're always claiming 'import' (and
> >'def' and 'class' for that matter) are 'just another way of writing
> >assignment'. All these special cases break that.
> 
> I don't like it either, I was only reporting what Jython currently does.
> The current design used by Jython doesn't lend itself directly towards a
> solution, but I don't see anything that makes it impossible to solve.

Tentatively, I'd say that this should be documented as a Jython
difference and Jython should strive to fix this.  So I see no good
reason to rule it out in CPython.

That doesn't mean I like Thomas's example!  It should probably be
redesigned along the lines of

    def fillme():
        import me
        return me

    me = fillme()

to avoid needing side effects on globals.
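
A runnable sketch of the two styles under CPython, with the standard
`colorsys` module standing in for the hypothetical module `me`. The helper
`is_bound` is an assumption added only so the effect can be checked.

```python
def fillme_global():
    # Thomas's style: the import statement binds a module-level name.
    global colorsys
    import colorsys


def fillme_return():
    # The suggested redesign: no side effect, the caller does the binding.
    import colorsys as mod
    return mod


def is_bound(name):
    """Report whether `name` is bound in this module's global namespace."""
    return name in globals()


bound_before = is_bound("colorsys")
fillme_global()
bound_after = is_bound("colorsys")

me = fillme_return()
```

In CPython `bound_before` is false and `bound_after` is true, so the
`global` declaration does apply to the name bound by `import`.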

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 31 17:26:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:26:11 -0500
Subject: [Python-Dev] The 2nd Korea Python Users Seminar
Message-ID: <200101311626.LAA01799@cj20424-a.reston1.va.home.com>

Wow...!

Way to go, Christian!

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 31 Jan 2001 22:46:06 +0900
From:    "Changjune Kim" <junaftnoon at yahoo.com>
To:      <guido at python.org>
Subject: The 2nd Korea Python Users Seminar

Dear Mr. Guido van Rossum,

First of all, I can't thank you more for your great contribution to the
presence of Python. It is not a mere computer programming language but a whole
culture, I think.

I am proud to tell you that we are having the 2nd Korea Python Users Seminar
which is wide open to the public. There are already more than 400 people who
registered ahead, and we expect a few more at the site. The seminar will be
held in Seoul, South Korea on Feb 2.

With the effort of Korea Python Users Group, there has been quite a boom or
phenomenon for Python among developers in Korea. Several magazines are
_competitively_ carrying regular articles about Python -- I'm one of the
authors -- and there was an article even on a _normal_ newspaper, one of the
major four big newspapers in Korea, which described the sprouting of Python in
Korea and pointed out its extreme easiness to learn. (moreover, it's the year of
the snake in the 12 zodiac animals)

The seminar is mainly about:

Python 2.0, intro for newbies, Python coding style, ZOPE, internationalization
of Zope for Korean, GUIs such as wxPython, PyQt, Internet programming in
Python, Python with UML, Python C/API, XML with Python, and Stackless Python.

Christian Tismer is coming for SPC presentation with me, and Hostway CEO Lucas
Roh will give a talk about how they are using Python, and one of the Python
evangelists, Brian Lee, CTO of Linuxkorea will give a brief intro to Python
and Python C/API.

I'm so excited and happy to tell you this great news. If there is any message
you want to give to Korea Python Users Group and the audience, it'd be
great -- I could translate it and post it at the site for all the audience.

Thank you again for your wonderful snake.

Best regards,

June from Korea.




_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com

------- End of Forwarded Message




From moshez at zadka.site.co.il  Wed Jan 31 21:32:45 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 22:32:45 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <007301c08baa$02908220$e46940d5@hagrid>
References: <007301c08baa$02908220$e46940d5@hagrid>, <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131203245.E813BA83E@darjeeling.zadka.site.co.il>

[Barry]
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

[/F]
> shouldn't that be xitems, xkeys, xvalues?

I'm so hoping I missed a <wink> there somewhere. Please, no more
of the dreaded 'x'.

thinking-of-ripping-x-from-my-keyboard-ly y'rs, Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From thomas at xs4all.net  Wed Jan 31 22:00:33 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 22:00:33 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 03:34:19PM +0100
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <20010131220033.O962@xs4all.nl>

On Wed, Jan 31, 2001 at 03:34:19PM +0100, M.-A. Lemburg wrote:

> I have made similar experience with -On with n>3 compared to -O2
> using pgcc (gcc optimized for PC processors). BTW, the Linux
> kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
> as CFLAGS -- perhaps Python should too on Linux ?!

Maybe, but the Linux kernel can be quite specific in what version of gcc you
need, and knows in advance on what platform you are using it :) The
stability and actual speedup of gcc's optimization options can and does vary
across platforms. In the above example, -Wall and -Wstrict-prototypes are
just warnings, and -O3 is the same as "-O2 -finline-functions". As for
-fomit-frame-pointer....

> Does anybody know about the effect of -fomit-frame-pointer ?
> Would it cause problems or produce code which is not compatible
> with code compiled without this flag ?

The effect of -fomit-frame-pointer is that the compilation of frame-pointer
handling code is avoided. It doesn't have any effect on compatibility, since
it doesn't matter that other parts/functions/libraries do have such code,
but it does make debugging impossible (on most machines, in any case.) From
GCC's info docs:

`-fomit-frame-pointer'
     Don't keep the frame pointer in a register for functions that
     don't need one.  This avoids the instructions to save, set up and
     restore frame pointers; it also makes an extra register available
     in many functions.  *It also makes debugging impossible on some
     machines.*

     On some machines, such as the Vax, this flag has no effect, because
     the standard calling sequence automatically handles the frame
     pointer and nothing is saved by pretending it doesn't exist.  The
     machine-description macro `FRAME_POINTER_REQUIRED' controls
     whether a target machine supports this flag.  *Note Registers::.

Obviously, for the Linux kernel this is a very good thing, you don't debug
the Linux kernel like a normal program anyway (contrary to some other UNIX
kernels, I might add.) I believe -g turns off -fomit-frame-pointer itself,
but the docs for -g or -fomit-frame-pointer don't mention it. 

One other thing I noted in the gcc docs is that gcc doesn't do loop
unrolling even with -O3, though I thought it would at -O2. You need to add
-funroll-loops to enable loop unrolling, and that might squeeze out some more
performance. This only works for loops with a fixed repetition, though, so
I'm not sure if it matters.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Wed Jan 31 20:14:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 20:14:58 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14968.16962.830739.920771@anthem.wooz.org>; from barry@digicool.com on Wed, Jan 31, 2001 at 11:50:10AM -0500
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org> <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131201457.I922@xs4all.nl>

[ Trimming CC: line ]

On Wed, Jan 31, 2001 at 11:50:10AM -0500, Barry A. Warsaw wrote:

> Moshe, I had exactly the same reaction and exactly the same idea.  I'm
> a strong -1 on introducing new syntax for this when new methods can
> handle it in a much more readable way (IMO).

Same here. I *might* like it if iterators were given a format string (or
tuple object, or whatever) so they knew what the iterating code expected
(so something like this:

  for x,y,z in obj

would translate into 

  iterator(obj)("(x,y,z)")

or maybe just

  iterator(obj)((None,None,None))

or maybe even just

  iterator(obj)(3) # that is, number of elements

or so) but I suspect it might be too cute (and obfuscated) for Python,
especially if it was put to use to distinguish between 'for x:y in obj' and
'for x,y in obj'.
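
For illustration only, a rough sketch of the "number of elements" variant,
assuming a made-up `iterator` factory. The choice that a dict hands out keys
for one loop target and (key, value) pairs for two is an assumption of this
sketch, not something the proposal above specifies.

```python
def iterator(obj):
    """Hypothetical factory: returns a callable that takes the number of
    loop targets and hands back an ordinary Python iterator."""
    def for_targets(count):
        if isinstance(obj, dict):
            if count == 1:
                return iter(obj)            # keys only
            if count == 2:
                return iter(obj.items())    # (key, value) pairs
            raise TypeError("a dict cannot be unpacked into %d targets" % count)
        # Other containers ignore the hint and iterate normally.
        return iter(obj)
    return for_targets


table = {"a": 1, "b": 2}
keys = sorted(iterator(table)(1))      # what "for x in table" would see
pairs = sorted(iterator(table)(2))     # what "for x, y in table" would see
triples = list(iterator([(1, 2, 3), (4, 5, 6)])(3))
```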

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From sjoerd at oratrix.nl  Wed Jan 31 21:05:06 2001
From: sjoerd at oratrix.nl (Sjoerd Mullender)
Date: Wed, 31 Jan 2001 21:05:06 +0100
Subject: [Python-Dev] python setup.py fails with illegal import (+ fix)
Message-ID: <20010131200507.A106931E1AD@bireme.oratrix.nl>

With the current CVS version, running python setup.py as part of the
build process fails with a syntax error:
Traceback (most recent call last):
  File "../setup.py", line 12, in ?
    from distutils.core import Extension, setup
  File "/usr/people/sjoerd/src/python/Lib/distutils/core.py", line 20, in ?
    from distutils.cmd import Command
  File "/usr/people/sjoerd/src/python/Lib/distutils/cmd.py", line 15, in ?
    from distutils import util, dir_util, file_util, archive_util, dep_util
SyntaxError: 'from ... import *' may only occur in a module scope

The fix is to change the from ... import * that the compiler complains
about:
Index: file_util.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/file_util.py,v
retrieving revision 1.7
diff -u -c -r1.7 file_util.py
*** file_util.py 2000/09/30 17:29:35	1.7
--- file_util.py 2001/01/31 20:01:56
***************
*** 106,112 ****
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import *
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):
--- 106,112 ----
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import ST_ATIME, ST_MTIME, ST_MODE, S_IMODE
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):

I didn't check this in because distutils is Greg Ward's baby.
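
A small self-contained check of the same idea, assuming a current Python 3
interpreter and a POSIX-style file system. The helper `file_mode` and the
temporary file are illustrative and are not distutils code.

```python
import os
import tempfile

# A wildcard import inside a function body is rejected at compile time.
bad_source = "def f():\n    from stat import *\n"
try:
    compile(bad_source, "<example>", "exec")
    wildcard_rejected = False
except SyntaxError:
    wildcard_rejected = True


def file_mode(path):
    """Return the permission bits of `path` using explicit stat imports."""
    from stat import ST_MODE, S_IMODE
    return S_IMODE(os.stat(path)[ST_MODE])


with tempfile.NamedTemporaryFile() as tmp:
    os.chmod(tmp.name, 0o640)
    demo_mode = file_mode(tmp.name)
```

Naming the four constants explicitly keeps the import local to the function
while staying legal in a nested scope.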

-- Sjoerd Mullender <sjoerd.mullender at oratrix.com>



From mal at lemburg.com  Wed Jan 31 23:24:43 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:24:43 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>
Message-ID: <3A7890AB.69B893F9@lemburg.com>

Tim Peters wrote:
> 
> [Michael Hudson]
> > ...
> > Can anyone try this on Windows?  Seeing as windows malloc
> > reputedly sucks, maybe the differences would be bigger.
> 
> No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
> Windows.  Helped significantly.  Unclear how much was simply due to
> exploiting the global interpreter lock, though.  "Windows" is also a
> multiheaded beast (e.g., NT has very different memory performance
> characteristics than 95).

We're still in alpha, no ?  

Adding pymalloc is not much of
a deal since it fits nicely with the Python malloc macros and
giving the package a nice spin by putting it into a Python alpha
release would sure create more confidence in this nice piece
of work. We can always take it out again before going into the 
beta phase.

Or do we have a 2.1 feature freeze already ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 23:15:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:15:50 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>
Message-ID: <3A788E96.AB823FAE@lemburg.com>

Tim Peters wrote:
> 
> [Tim]
> >> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> >> supposed problem with iteration order?
> 
> [MAL]
> > No, but it would solve the problem in a more elegant and
> > generalized way.
> 
> I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
> problem], but it would solve the problem ...".  Can only assume we're
> switching topics within single sentences now <wink>.

Sorry, not my brightest day today... what I wanted to say is that
iterators would solve the problem of defining "something" in
"for something in dict" nicely. 

Since iterators can define the order in which a data structure is 
traversed, this would also do away with the second (supposed) 
problem.

> > Besides, it also allows writing code which is thread safe, since
> > the iterator can take special actions to assure that the dictionary
> > doesn't change during the iteration phase (see the other thread
> > about "making mutable objects readonly").
> 
> Sorry, but immutability has nothing to do with thread safety (the latter has
> to do with "doing a right thing" in the presence of multiple threads, to
> keep data structures internally consistent; raising an exception is never "a
> right thing" unless the user is violating the advertised semantics, and if
> mutation during iteration is such a violation, the presence or absence of
> multiple threads has nothing to do with that).  IOW, perhaps, a critical
> section is an area of non-exceptional serialization, not a landmine that
> makes other threads *blow up* if they touch it.

Who said that an exception is raised ? The method I posted
on the mutability thread allows querying the current state just
like you would query the availability of a resource.
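
A minimal sketch of that query-style protocol, following the
`.mutable([flag])` description from the mutability thread. The class
`FlaggedDict` and the helper `try_update` are hypothetical names, and the
flag here is purely advisory: nothing stops code that ignores it, and the
check and the write are not atomic.

```python
class FlaggedDict(dict):
    """dict carrying the proposed .mutable([flag]) method (hypothetical)."""

    _mutable = 1

    def mutable(self, flag=None):
        # Without an argument: report the state as 1/0.
        # With an argument: set the state and return the previous one.
        previous = self._mutable
        if flag is not None:
            self._mutable = 1 if flag else 0
        return previous


def try_update(d, key, value):
    """Cooperative writer: asks first instead of waiting for an exception."""
    if not d.mutable():
        return False
    d[key] = value
    return True


counts = FlaggedDict(a=1)
previous = counts.mutable(0)              # freeze for the duration of a scan
blocked = try_update(counts, "b", 2)      # writer backs off
counts.mutable(previous)                  # restore the earlier state
allowed = try_update(counts, "b", 2)      # writer proceeds
```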

> > ...
> > I don't remember the figures, but these micro optimizations
> 
> That's plural, but I thought you were talking specifically about the mutable
> counter object.  I don't know which, but the two statements don't jibe.

The counter object patch is a micro-optimization and as such will
only give you a gain of a few percent. What makes the difference
is the sum of these micro optimizations.

Here's the patch for Python 1.5 which includes the optimizations:

	http://www.lemburg.com/python/mxPython-1.5.patch.gz
 
> > do speed up loops by a noticeable amount. Just compare the performance
> > of stock Python 1.5 against my patched version.
> 
> No time now, but after 2.1 is out, sure, wrt it (not 1.5).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan  1 01:13:12 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 19:13:12 -0500
Subject: [Python-Dev] Re: Most everything is busted
In-Reply-To: <14926.34447.60988.553140@anthem.concentric.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECIIGAA.tim.one@home.com>

[Barry A. Warsaw]
> There's a stupid, stupid bug in Mailman 2.0, which I've just fixed
> and (hopefully) unjammed things on the Mailman end[1].  We're still
> probably subject to the Postfix delays unfortunately; I think those
> are DNS related, and I've gotten a few other reports of DNS oddities,
> which I've forwarded off to the DC sysadmins.  I don't think that
> particular problem will be fixed until after the New Year.
>
> relax-and-enjoy-the-quiet-ly y'rs,

I would have, except you appear to have ruined it:  hundreds of msgs
disgorged overnight and into the afternoon.  And echoes of email to c.l.py
now routinely come back in minutes instead of days.

Overall, ya, I liked it better when it was broken -- jerk <wink>.

typical-user-ly y'rs  - tim




From tim.one at home.com  Mon Jan  1 02:31:18 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 31 Dec 2000 20:31:18 -0500
Subject: [Python-Dev] Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <200012291652.RAA20251@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>

[Martin von Loewis]
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?

It's nigh unto impossible to get Guido to pay attention to these kinds of
issues until after it's too late -- guess who's still trying to get an FSF
approved license for Python 1.6 <wink>.

What I intend to push for is that nothing be accepted except under the
understanding that copyright is assigned to the Python Software Foundation;
but, since that doesn't exist yet, we're in limbo.

> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

Under U.S. law too.  The difference is that, without an explicit copyright
notice, it's a lot easier to get lawyers to ignore that reality <0.3 wink>.
When the PSF does come into being, the lawyers will doubtless make us hassle
everyone with an explicit copyright notice into signing reams of paperwork.
It's a drain on time and money for all concerned, IMO, with no real payback.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.

Understood, and with sympathy.  Since the status of JPython/Jython is still
muddy, I urged Finn Bock to put his own copyright notice on his Jython work
for exactly the same reason (i.e., to prevent CNRI claiming it later).

Seems to me, though, that it may simplify life down the road if, whenever an
author felt a similar need to assert copyright explicitly, they list Guido
as the copyright holder.  He's not going to screw Python!  And it's
inevitable that all Python copyrights will eventually be owned by him and/or
the PSF anyway.

But, for God's sake, whatever you do, *please* (anyone) don't make us look
at a unique license!  We're not lawyers, but we've been paying lawyers out
of our own pockets to do this crap, and it's expensive and time-consuming.
If you can't trust Guido to do a Right Thing with your code, Python is
better off without it over the long haul.

> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

It's no concern to me -- but then I'm not paranoid <wink>.

cnri-and-the-uc-can-fight-it-out-if-it-comes-to-that-ly y'rs  - tim




From moshez at zadka.site.co.il  Mon Jan  1 11:01:02 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon,  1 Jan 2001 12:01:02 +0200 (IST)
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20001231105812.A12168@newcnri.cnri.reston.va.us>
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>
Message-ID: <20010101100102.2360CA84F@darjeeling.zadka.site.co.il>

On Sun, 31 Dec 2000, Andrew Kuchling <akuchlin at cnri.reston.va.us> wrote:

> It also leads to one section of the FAQ (#3, I think) having something
> like 60 questions jumbled together.  IMHO the FAQ should be a text
> file, perhaps in the PEP format so it can be converted to HTML, and it
> should have an editor who'll arrange it into smaller sections.  Any
> volunteers?  (Must ... resist ...  urge to volunteer myself...  help
> me, Spock...)

Well, Andrew, I know if I leave you any more time, you won't be able
to resist the urge. OK, I'll volunteer. Can't do anything right now,
but expect to see an updated version posted on my site soon. If 
people think it's a good idea, I'll move it to Misc/.
Fred, if the some-xml-format-to-HTML you're working on is in any
sort of readiness, I'll use that to format the FAQ. Having used Perl
in the last couple of weeks, I learned to appreciate the fact that
the FAQ is a standard part of the documentation.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From loewis at informatik.hu-berlin.de  Mon Jan  1 12:43:34 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 1 Jan 2001 12:43:34 +0100 (MET)
Subject: [Python-Dev] Re: Copyrights and licensing (was ... something irrelevant)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCCECLIGAA.tim.one@home.com>
Message-ID: <200101011143.MAA11550@pandora.informatik.hu-berlin.de>

> Seems to me, though, that it may simplify life down the road if, whenever an
> author felt a similar need to assert copyright explicitly, they list Guido
> as the copyright holder.  He's not going to screw Python!  

That's a good solution, which I'll implement in a revised patch.

Thanks for the advice, and Happy New Year,

Martin



From mal at lemburg.com  Mon Jan  1 18:56:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 01 Jan 2001 18:56:20 +0100
Subject: [Python-Dev] Re: Copyright statements ([Patch #103002] Fix for #116285: Properly
 raise UnicodeErrors)
References: <E14Bhs3-0007uf-00@usw-sf-web3.sourceforge.net> <200012290957.KAA17936@pandora.informatik.hu-berlin.de> <3A4C757D.F64E9CEF@lemburg.com>
Message-ID: <3A50C4C4.76A1C5B6@lemburg.com>

Martin von Loewis wrote:
> 
> > My only problem with it is your copyright notice. AFAIK, patches to
> > the Python core cannot contain copyright notices without proper
> > license information. OTOH, I don't think that these minor changes
> > really warrant adding a complete license paragraph.
> 
> I'd like to get an "official" clarification on this question. Is it
> the case that patches containing copyright notices are only accepted
> if they are accompanied with license information?
> 
> I agree that the changes are minor, I also believe that I hold the
> copyright to the changes whether I attach a notice or not (at least
> according to our local copyright law).

True.

> What concerns me is that without such a notice, gencodec.py looks as if
> CNRI holds the copyright to it. I'm not willing to assign the
> copyright of my changes to CNRI, and I'd like to avoid the impression
> of doing so.
>
> What is even more concerning is that CNRI also holds the copyright to
> the generated files, even though they are derived from information
> made available by the Unicode consortium!

The copyright for the files and changes needed for the Unicode 
support was indeed transferred to CNRI earlier this year. This
was part of the contract I had with CNRI.

I don't know why the copyright notice wasn't subsequently removed from
the files after final checkin of the changes, though, because, as
I remember, the copyright line was only added as "search&replace"
token to the files in question in the sign over period.

The codec files were part of the Unicode support patch, even though
they were created by the gencodec.py tool I wrote to create them
from the Unicode mapping files. That's why they also carry the
copyright token.

Note that with strict reading of the CNRI license, there's no
problem with removing the notice from the files in question:

"""
...provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2000 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6 alone or in any derivative
version prepared by Licensee...
"""

The copyright line in the Unicode files is
"(c) Copyright CNRI, All Rights Reserved. NO WARRANTY.", so this
does not match the definition they gave in their license text.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan  1 19:58:36 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 13:58:36 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Fri, 29 Dec 2000 21:59:16 +0100."
             <20001229215915.L1281@xs4all.nl> 
References: <EC$An3AHZGT6EwJP@jessikat.fsnet.co.uk> <LNBBLJKPBEHFEDALKOLCKEODIFAA.tim.one@home.com>  
            <20001229215915.L1281@xs4all.nl> 
Message-ID: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>

Thomas just checked this in, using Tim's words:

> *** ref7.tex	2000/07/16 19:05:38	1.20
> --- ref7.tex	2000/12/31 22:52:59	1.21
> ***************
> *** 243,249 ****
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when no exception occurs
> ! in the \keyword{try} clause.  Exceptions in the \keyword{else} clause are
> ! not handled by the preceding \keyword{except} clauses.
>   \kwindex{else}
>   
> --- 243,251 ----
>     \ttindex{exc_value}\ttindex{exc_traceback}}
>   
> ! The optional \keyword{else} clause is executed when the \keyword{try} clause
> ! terminates by any means other than an exception or executing a
> ! \keyword{return}, \keyword{continue} or \keyword{break} statement.  
> ! Exceptions in the \keyword{else} clause are not handled by the preceding
> ! \keyword{except} clauses.
>   \kwindex{else}

How is this different from "when control flow reaches the end of the
try clause", which is what I really had in mind?  Using the current
wording, this paragraph would have to be changed each time a new
control-flow keyword is added.  Based upon the historical record
that's not a grave concern ;-), but I think the new wording relies too
much on accidentals such as the fact that these are the only control
flow altering events.  It may be that control flow is not rigidly
defined -- but as it is what was really intended, maybe the fix should
be to explain the right concept rather than the current ad-hoc
solution.  This also avoids concerns of readers who are trying to read
too much into the words and might become worried that there are other
ways of altering the control flow that *would* cause the else clause
to be executed; and guides implementors of other Python-like languages
(like vyper) that might have more control-flow altering statements or
events.
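
A small sketch of the behaviour under discussion, assuming a current
CPython. The function `run` and its `action` strings are made up for the
demonstration; the else clause runs only when control flows off the end of
the try clause.

```python
def run(action):
    """Record which clauses execute for a given way of leaving the try."""
    log = []
    for _ in range(1):
        try:
            log.append("try")
            if action == "raise":
                raise ValueError("boom")
            if action == "break":
                break
            if action == "continue":
                continue
            if action == "return":
                return log
        except ValueError:
            log.append("except")
        else:
            # Reached only when the try clause ran to its end.
            log.append("else")
    return log


results = {name: run(name)
           for name in ("fall through", "raise", "break", "continue", "return")}
```

Only the "fall through" case records "else"; the other four leave the try
clause early and skip it.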

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Mon Jan  1 20:00:38 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 1 Jan 2001 20:00:38 +0100
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
Message-ID: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de>

> It appears that CNRI can only think about one thing at a time <0.5
> wink>.  For the last 6 months, that thing has been the license.  If
> they ever resolve the GPL compatibility issue, maybe they can be
> persuaded to think about the PSA.  In the meantime, I'd suggest you
> not renew <ahem>.

I think we need to find a better answer than that, and soon. While
everybody reading this list probably knows not to renew, the PSA is
the first thing that you see when selecting "Python Community" on
python.org. The first paragraph reads

# The continued, free existence of Python is promoted by the
# contributed efforts of many people. The Python Software Activity
# (PSA) supports those efforts by helping to coordinate them. The PSA
# operates web, ftp, and email services, organizes conferences, and
# engages in other activities that benefit the Python user
# community. In order to continue, the PSA needs the membership of
# people who value Python.

If you look at the current members list
(http://www.python.org/psa/Members.html), it appears that many
long-time members indeed have not renewed. This page was last updated
Nov 14 - so it appears that CNRI is still processing applications when
they come. It may well be that many of the newer members ask
themselves by now what happened to their money; it might not be easy
to get an answer to that question. However, there is clearly somebody
to blame here: The Python Community.

So I'd like to request that somebody with write permissions to these
pages changes the text, to something along the lines of replacing the
first paragraph with

# The Python community organizes itself in different ways; people
# interested in discussing development of and with Python usually
# participate in <a href="MailingLists.html">mailing lists</a>.
#
# <p>Organizations that wish to influence further directions of the
# Python language may join the <a href="/consortium">Python
# Consortium</a>.
#
# <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
# National Research Initiatives</a> hosts the Python Software
# Activity, which is described below. The PSA used to provide funding
# for the Python development; that is no longer the case.

If there is a factual error in this text, please let me
know.

Regards,
Martin



From tim.one at home.com  Mon Jan  1 20:20:53 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 14:20:53 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <E14D9Ev-0007ac-00@usw-sf-web3.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>

[gvanrossum, in an SF patch comment]
> Bah.  I don't like this one bit.  More complexity for a little
> bit of extra speed.
> I'm keeping this open but expect to be closing it soon unless I
> hear a really good argument why more speed is really needed in
> this area.  Down with code bloat and creeping featurism!

Without judging "the solution" here, "the problem" is that everyone's first
attempt to use line-at-a-time file input in Perl:

    while (<F>) {
        ... $_ ...;
    }

runs 2-5x faster than everyone's first attempt in Python:

    while 1:
        line = f.readline()
        if not line:
            break
        ... line ...

It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
little bit"; and by the time you walk a newbie thru

    while 1:
        lines = f.readlines(hintsize)
        if not lines:
            break
        for line in lines:
            ... line ...

they feel like maybe Perl isn't so obscure after all <wink>.
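
For illustration only, the double-loop idiom above can be hidden behind a small helper. This is a minimal sketch that assumes a modern Python with generators (not available when this thread was written); the name `buffered_lines` and the default `hintsize` are hypothetical and not part of the patch under discussion.

```python
import io


def buffered_lines(f, hintsize=64 * 1024):
    """Yield lines from f, fetching them in batches with readlines(hintsize)."""
    while True:
        lines = f.readlines(hintsize)  # roughly hintsize bytes of whole lines
        if not lines:                  # empty list means end of file
            break
        for line in lines:
            yield line


# Usage: the caller sees a single loop; the batching stays hidden.
sample = io.StringIO("alpha\nbeta\ngamma\n")
collected = [line.rstrip("\n") for line in buffered_lines(sample, hintsize=8)]
print(collected)
```

The caller writes one `for` loop, while the per-line `readline()` call is replaced by one `readlines()` call per batch.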

Does someone have an elegant way to address this?  I believe Jeff's shot at
elegance was the other part of the patch, using (his new) xreadlines under
the covers to speed the fileinput module.

reading-text-files-is-very-common-ly y'rs  - tim




From guido at digicool.com  Mon Jan  1 20:25:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:25:07 -0500
Subject: [Python-Dev] PSA (Was: FAQ Horribly Out Of Date)
In-Reply-To: Your message of "Mon, 01 Jan 2001 20:00:38 +0100."
             <200101011900.UAA01672@loewis.home.cs.tu-berlin.de> 
References: <200101011900.UAA01672@loewis.home.cs.tu-berlin.de> 
Message-ID: <200101011925.OAA09669@cj20424-a.reston1.va.home.com>

> > It appears that CNRI can only think about one thing at a time <0.5
> > wink>.  For the last 6 months, that thing has been the license.  If
> > they ever resolve the GPL compatibility issue, maybe they can be
> > persuaded to think about the PSA.  In the meantime, I'd suggest you
> > not renew <ahem>.
> 
> I think we need to find a better answer than that, and soon. While
> everybody reading this list probably knows not to renew, the PSA is
> the first thing that you see when selecting "Python Community" on
> python.org. The first paragraph reads
> 
> # The continued, free existence of Python is promoted by the
> # contributed efforts of many people. The Python Software Activity
> # (PSA) supports those efforts by helping to coordinate them. The PSA
> # operates web, ftp, and email services, organizes conferences, and
> # engages in other activities that benefit the Python user
> # community. In order to continue, the PSA needs the membership of
> # people who value Python.
> 
> If you look at the current members list
> (http://www.python.org/psa/Members.html), it appears that many
> long-time members indeed have not renewed. This page was last updated
> Nov 14 - so it appears that CNRI is still processing applications when
> they come. It may well be that many of the newer members ask
> themselves by now what happened to their money; it might not be easy
> to get an answer to that question. However, there is clearly somebody
> to blame here: The Python Community.

I don't know how many memberships CNRI has received, but it can't be
many, since we sent out no reminders.  I'll see if I can get an
answer.

> So I'd like to request that somebody with write permissions to these
> pages changes the text, to something along the lines of replacing the
> first paragraph with
> 
> # The Python community organizes itself in different ways; people
> # interested in discussing development of and with Python usually
> # participate in <a href="MailingLists.html">mailing lists</a>.
> #
> # <p>Organizations that wish to influence further directions of the
> # Python language may join the <a href="/consortium">Python
> # Consortium</a>.
> #
> # <p>The <a href="http://www.cnri.reston.va.us/">Corporation for
> # National Research Initiatives</a> hosts the Python Software
> # Activity, which is described below. The PSA used to provide funding
> # for the Python development; that is no longer the case.
> 
> If there is a factual error in this text, please let me
> know.

I've done something slightly different -- see
http://www.python.org/psa/.  I've kept only your first paragraph, and
inserted a boldface note before that about the obsolescence (or
deprecation :-) of the PSA membership.

I've removed the references to the consortium, since that's also about
to collapse under its own inactivity; instead, the PSF will be formed,
independent from CNRI, to hold the IP rights (insofar as they can be
assigned to the PSF) and for not much else.

I'll see if I can get some more news about the creation of the PSF
(which is supposed to be an initiative of ActiveState and Digital
Creations).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan  1 20:35:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:35:24 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 14:20:53 EST."
             <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> 
Message-ID: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>

> [gvanrossum, in an SF patch comment]
> > Bah.  I don't like this one bit.  More complexity for a little
> > bit of extra speed.
> > I'm keeping this open but expect to be closing it soon unless I
> > hear a really good argument why more speed is really needed in
> > this area.  Down with code bloat and creeping featurism!
> 
> Without judging "the solution" here, "the problem" is that everyone's first
> attempt to use line-at-a-time file input in Perl:
> 
>     while (<F>) {
>         ... $_ ...;
>     }
> 
> runs 2-5x faster than everyone's first attempt in Python:
> 
>     while 1:
>         line = f.readline()
>         if not line:
>             break
>         ... line ...

But is everyone's first thought to time the speed of Python vs. Perl?
Why does it hurt so much that this is a bit slow?

> It would be beneficial to address that *somehow*, cuz 2-5x isn't just "a
> little bit"; and by the time you walk a newbie thru
> 
>     while 1:
>         lines = f.readlines(hintsize)
>         if not lines:
>             break
>         for line in lines:
>             ... line ...
> 
> they feel like maybe Perl isn't so obscure after all <wink>.
> 
> Does someone have an elegant way to address this?  I believe Jeff's shot at
> elegance was the other part of the patch, using (his new) xreadlines under
> the covers to speed the fileinput module.

But of course suggesting fileinput is also not a great solution --
it's relatively obscure (since it's not taught by most tutorials,
certainly not by the standard tutorial).

> reading-text-files-is-very-common-ly y'rs  - tim

So is worrying about performance without a good reason...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan  1 20:49:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 01 Jan 2001 14:49:24 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: Your message of "Mon, 01 Jan 2001 12:01:02 +0200."
             <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> 
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il>  
            <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> 
Message-ID: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>

[Moshe]
> Well, Andrew, I know if I leave you any more time, you won't be able
> to resist the urge. OK, I'll volunteer. Can't do anything right now,
> but expect to see an updated version posted on my site soon. If 
> people will think it's a good idea, I'll move it to Misc/.
> Fred, if the some-xml-format-to-HTML you're working on is in any
> sort of readiness, I'll use that to format the FAQ.

Moshe, if your solution is to turn the FAQ into a document with a
single editor again, I think you're not doing the community a favor.
Granted, we could add some more sections (easy enough for me if
someone tells me the new section headings and which existing questions
go where) and there is a lot of obsolete information.

But I would be very hesitant to drop the notion of maintaining the FAQ
as a group collaboration project.  There's nothing wrong with the FAQ
wizard except that the password (Spam) should be made publicly known...

I've also noticed that Bjorn Pettersen has made a whole slew of useful
updates to various sections, mostly updates about new 2.0 features or
syntax.

> Having used Perl
> in the last couple of weeks, I learned to appreciate the fact that
> the FAQ is a standard part of the documentation.

Does that mean more than that it should be linked to from
http://www.python.org/doc/ ?  It's already there in the side bar; does
it need a more prominent position?  I used to include the FAQ in Misc/
(Ping's Misc/faq2html.py script is a last remnant of that), but gave
up after realizing that the on-line FAQ is much more useful than a
single text file.

In my eyes, the best thing you (and everyone else) could do, if you
find the time, would be to use the FAQ wizard to fix or delete
out-of-date entries.  To delete an entry, change its subject to
"Deleted" and remove its body; I'll figure out a way to delete them
from the index.  Because FAQ entries can refer to each other (and are
referred to from elsewhere) by number, it's not safe to simply
renumber entries.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan  1 21:27:37 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 15:27:37 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101011858.NAA09263@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com>

[Guido]
> Thomas just checked this in, using Tim's words:

[   The optional \keyword{else} clause is executed when no
    exception occurs in the \keyword{try} clause.  Exceptions in
    the \keyword{else} clause are not handled by the preceding
    \keyword{except} clauses.

vs
    The optional \keyword{else} clause is executed when the
    \keyword{try} clause terminates by any means other than an
    exception or executing a \keyword{return}, \keyword{continue}
    or \keyword{break} statement.  Exceptions in the \keyword{else}
    clause are not handled by the preceding \keyword{except} clauses.
]

> How is this different from "when control flow reaches the end of the
> try clause", which is what I really had in mind?

Only in that it doesn't appeal to a new undefined phrase, and is (I think)
unambiguous in the eyes of a non-specialist reader (like Robin's friend).
Note that "reaching the end of the try clause" is at best ambiguous, because
you *really* have in mind "falling off the end" of the try clause.  It
wouldn't be unreasonable to say that in:

    try:
         x = 1
         y = 2
         return 1

"x=1" is the beginning of the try clause and "return 1" is the end.  So if
the reader doesn't already know what you mean, saying "the end" doesn't nail
it (or, if like me, the reader does already know what you mean, it doesn't
matter one whit what it says <wink>).
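
A small sketch of the distinction being drawn here, with hypothetical function names: the else clause runs only when control falls off the end of the try clause, not when the try clause ends by executing a return.

```python
def with_return(log):
    try:
        log.append("try")
        return "returned"      # leaves the try clause without falling off its end
    except ValueError:
        log.append("except")
    else:
        log.append("else")     # not reached: the try clause ended with return


def falls_off_end(log):
    try:
        log.append("try")
    except ValueError:
        log.append("except")
    else:
        log.append("else")     # reached: control fell off the end of try
    return "fell off"


log_a, log_b = [], []
with_return(log_a)
falls_off_end(log_b)
print(log_a)
print(log_b)
```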

> Using the current wording, this paragraph would have to be
> changed each time a new control-flow keyword is added.  Based
> upon the historical record that's not a grave concern ;-),

It was sure no concern of mine ...

> but I think the new wording relies too much on accidentals such
> as the fact that these are the only control flow altering events.
>
> It may be that control flow is not rigidly defined -- but as it is
> what was really intended, maybe the fix should be to explain the
> right concept rather than the current ad-hoc solution.
> ...

OK, except I don't know how to do that succinctly.  For example, if Java had
an "else" clause, the Java spec would say:

    If present, the "else block" is executed if and only if execution
    of the "try block" completes normally, and then there is a choice:

        If the "else block" completes normally, then the
        "try" statement completes normally.

        If the "else block" completes abruptly for reason S,
        then the "try" statement completes abruptly for reason S.

That is, they deal with control-flow issues via appeal to "complete
normally" and "complete abruptly" (which latter comes in several flavors
("reasons"), such as returns and exceptions), and there are pages and pages
and pages of stuff throughout the spec inductively defining when these
conditions obtain.  It's clear, precise and readable; but it's also wordy,
and we don't have anything similar to build on.

As a compromise, given that we're not going to take the time to be precise
(well, I'm sure not ...):

    The optional \keyword{else} clause is executed if and
    when control flows off the end of the \keyword{try}
    clause.\footnote{In Python 2.0, control "flows off the
    end" except in case of exception, or executing a
    \keyword{return}, \keyword{continue} or \keyword{break}
    statement.}
    Exceptions in the \keyword{else} clause are not handled by
    the preceding \keyword{except} clauses.

Now it's all of imprecise, almost precise, specific to Python 2.0, and
robust against any future changes <wink>.
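
The other two claims in the proposed wording can be checked with a short sketch (function names are hypothetical): an exception raised in the else clause is not caught by the preceding except clauses of the same try statement, and leaving the try clause with break skips the else clause.

```python
def else_raises():
    try:
        try:
            pass
        except KeyError:
            return "inner handler"   # would run only if the try clause raised
        else:
            raise KeyError("raised in else")
    except KeyError:
        return "outer handler"       # the exception propagates out to here


def loop_with_break():
    ran_else = False
    for _ in range(1):
        try:
            break                    # leaves the try clause early
        except ValueError:
            pass
        else:
            ran_else = True          # skipped because of the break
    return ran_else


print(else_raises())
print(loop_with_break())
```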




From akuchlin at cnri.reston.va.us  Mon Jan  1 21:35:27 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Mon, 1 Jan 2001 15:35:27 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <200101011949.OAA09804@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:49:24PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com>
Message-ID: <20010101153527.A14116@newcnri.cnri.reston.va.us>

On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
>But I would be very hesitant to drop the notion of maintaining the FAQ
>as a group collaboration project.  There's nothing wrong with the FAQ
>wizard except that the password (Spam) should be made publicly known...

Why multiply the number of mechanisms required to maintain things?  We
already use CVS for other documentation; why not use it for the FAQ as 
well?  

--amk



From tim.one at home.com  Mon Jan  1 22:00:36 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 16:00:36 -0500
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEDLIGAA.tim.one@home.com>

[Andrew Kuchling]
> Why multiply the number of mechanisms required to maintain things?
> We already use CVS for other documentation; why not use it for the
> FAQ as well?

The search facilities of the FAQ wizard are invaluable, and so is the
ability for "just users" to update the info from within their browsers.
There are two problems with the FAQ in practice:

1. It doesn't get updated enough.  We can't fix that by making it harder to
update!

2. It's *only* available via the web interface.  We should ship a text or
HTML snapshot with releases; perhaps even do the usual Usenet periodic
FAQ-posting thing.




From tim.one at home.com  Mon Jan  1 23:34:03 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 17:34:03 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEECIGAA.tim.one@home.com>

[Guido]
> But is everyone's first thought to time the speed of Python vs. Perl?

It's few people's first thought.  It's impossible for bilingual programmers
(or dabblers, or evaluators) not to notice *soon*, though, because:

> Why does it hurt so much that this is a bit slow?

Factors of 2 to 5 aren't "a bit" -- they're obvious when they happen, but
the *cause* is not.  To judge from a decade of c.l.py gripes, most people
write it off to "huh -- guess Python is just slow"; the rest eventually
figure out that their text input is the bottleneck (Tom Christiansen never
got this far <0.5 wink>), but then don't know what to do about it.

At this point I'm going to insert two anonymized pvt emails from last year:

-----Original Message #1 -----

From: TTT
Sent: Monday, March 13, 2000 2:29 AM
To: GGG
Subject: RE: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison

GGG, note especially figure 4 in Lutz Prechelt's report:

>   http://wwwipd.ira.uka.de/~prechelt/Biblio/#jccpprtTR

The submitted Python programs had by far the largest variability in how long
it took to load the dictionary.  My input loop is probably typical of the
"fast" Python programs, which indeed beat most (but not all) of the fastest
Perl ones here:

class Dictionary:
    ...

    def fill_from_file(self, f, BUFFERSIZE=500000):
        """f, BUFFERSIZE=500000 -> fill dictionary from file f.

        f must be an open file, or other object with a readlines()
        method.  It must contain one word per line.  Optional arg
        BUFFERSIZE is used to chunk up input for efficiency, and is
        roughly the # of bytes read at a time.
        """

        addword = self.addword
        while 1:
            lines = f.readlines(BUFFERSIZE)
            if not lines:
                break
            for line in lines:
                addword(line[:-1])  # chop trailing newline

Comparable Perl may have been the one-liner:

    grep(&addword, chomp(<>));

which may account for why Perl's memory use was uniformly higher than
Python's.

Whatever, you really need to be a Python expert to dream up "the fast way"
to do Python input!  Hire me, and I'll fix that <wink>.

nothing-like-blackmail-before-going-to-bed-ly y'rs  - TTT


-----Original Message #2 -----

From: GGG
Sent: Monday, March 13, 2000 7:08 AM
To: TTT
Subject: Re: [Python-Help] C, C++, Java, Perl, Python, Rexx, Tcl comparison


Agreed.  readlines(BUFFERSIZE) is a crock.  In fact, ``for i in
f.readlines()'' should use lazy evaluation -- but that will have to wait for
Py3K unless we add hints so that readlines knows it is being called from a
for loop.

--GGG


-----Back to 2001 -----

I took TTT's advice and read Lutz's report <wink>.  I agree with GGG that
hiding this in .readlines() would be maximally elegant.  xreadlines supplies
most of the lazy machinery GGG favored.  I don't know how hard it would be
to supply the rest of it, but it's such a frequent bitching point that I
would prefer pointing people to an explicit .xreadlines() hack than either
(a) try to convince them that they "shouldn't" care about the speed as much
as they claim to; or, (b) try to explain the double-loop buffering method.
I'd personally rather use an explicit .xreadlines() hack than code the
double-loop buffering too, and don't see an obvious way to do better than
that right now.

>> reading-text-files-is-very-common-ly y'rs  - tim

> So is worrying about performance without a good reason...

Indeed it is.  I'm persuaded that many people making this specific complaint
have a legitimate need for more speed, though, and that many don't persist
with Python long enough to find out how to address this complaint (because
the double-loop method is too obscure for a newbie to dream up).  That makes
this hack score extraordinarily high on my benefit/harm ratio scale (in P3K
xreadlines can be deprecated in favor of readlines <0.9 wink>).

heck-it-doesn't-even-require-a-new-keyword-ly y'rs  - tim




From thomas at xs4all.net  Mon Jan  1 23:46:45 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 1 Jan 2001 23:46:45 +0100
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101011935.OAA09728@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 01, 2001 at 02:35:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010101234645.B5435@xs4all.nl>

On Mon, Jan 01, 2001 at 02:35:24PM -0500, Guido van Rossum wrote:

[ Python lacks a One True Way of doing Perl's 'while(<>)' ]

> > Does someone have an elegant way to address this?  I believe Jeff's shot at
> > elegance was the other part of the patch, using (his new) xreadlines under
> > the covers to speed the fileinput module.

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Is fileinput really obscure ? I personally quite like it. It is enough like
the perl idiom to be very useful for people thinking that way, and it
doesn't require special syntax or considerations. If tutorialization is the
only problem, I'd be happy to fix that, provided Fred or Moshe can TeX my
fix up.

As for speed (which stays a secondary or tertiary consideration at best) do
we really need the xreadlines method to accomplish that ? Couldn't fileinput
get almost the same performance using readlines() with a sizehint ? I
personally don't like the xreadlines because it adds yet another function to
do the same, with a slight, subtle and to the untrained programmer unclear
distinction from the rest. (I don't really like the range/xrange difference
either -- I think Python code shouldn't care whether they're dealing with a
real list or a generator, and as much as possible should just be generators.
And in the case of simple (x)range()es, I have yet to see a case where a
'real' list had significantly better performance than a generator.)

If we *do* start adding methods to (the public API of) file objects, I think
we should consider more than just xreadlines() (I seem to recall other
proposals, but my memory is hazy at the moment -- I haven't slept since last
millennium), add whatever is necessary, and provide a UserFile in the std.
lib that 'emulates' all fileobject functionality using a single readline()
function.

Now, if you'll excuse me, I have a date with a soft bed I haven't seen in
about 40 hours, a pair of aspirin my head is killing for and probably a
hangover that I don't want to think about, right now ;)

Gelukkig-Nieuwjaar-iedereen-ly y'rs

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From jepler at inetnebr.com  Tue Jan  2 02:49:35 2001
From: jepler at inetnebr.com (Jeff Epler)
Date: Mon, 1 Jan 2001 19:49:35 -0600
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com>; from Tim Peters on Mon, Jan 01, 2001 at 02:20:53PM -0500
Message-ID: <20010101194935.19672@falcon.inetnebr.com>

I'd like to speak up about this patch I've submitted on sourceforge.

I consider the xreadlines function/object to be the core of my proposal.
The addition of a method to file objects, as well as the modifications
to fileinput, are secondary in my opinion.

The desire is to iterate over file contents in a way that satisfies the
following criteria:
	* Uses the "for" syntax, because this clearly captures the
	  underlying operation. (files can be viewed as sequences of
	  lines when appropriate)
	* Consumes small amounts of memory even when the file contents
	  are large.
	* Has the lowest overhead that can reasonably be attained.

I think that it is agreed that the ability to use the "for" syntax is
important, since it was the impetus for the xrange function/object.
After all, there's a "while" statement which will give the same effect,
without introducing xrange.

The point under debate, as I see it, is the utility of speeding up the
"benchmarks" of folks who compare the speed of Python and another
language doing a very simple loop over the lines in a file.  Since this
advantage disappears once real work is being done on the file, maybe an
XReadLines class, written in Python, would be more suitable.  In fact,
I've written such a class since I didn't know about fileinput and in
any case I find it less useful to me because of all the weird stuff it
does. (parsing argv, opening files by name, etc)

One shortcoming of my current patch, aside from the ones already named
in another person's response to it, is that it fails when working
on a file-like class which implements .readline but not .readlines.
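
One way such a shortcoming could be worked around, sketched in Python rather than C and assuming modern generators: fall back to line-at-a-time reads when the object has no readlines method. The names `iter_lines` and `ReadlineOnly` are hypothetical and are not taken from the patch.

```python
import io


class ReadlineOnly:
    """Toy file-like object that implements readline() but not readlines()."""

    def __init__(self, lines):
        self._lines = list(lines)

    def readline(self):
        return self._lines.pop(0) if self._lines else ""


def iter_lines(f, hint=64 * 1024):
    """Yield lines from f, batching with readlines(hint) when it exists."""
    readlines = getattr(f, "readlines", None)
    if readlines is None:
        # Slow path: the object only offers readline().
        while True:
            line = f.readline()
            if not line:
                return
            yield line
    else:
        # Fast path: fetch whole batches of lines at once.
        while True:
            lines = readlines(hint)
            if not lines:
                return
            for line in lines:
                yield line


slow = list(iter_lines(ReadlineOnly(["a\n", "b\n"])))
fast = list(iter_lines(io.StringIO("a\nb\n")))
print(slow, fast)
```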

In any case, I wrote xreadlines to learn how to write C extensions to
Python, and submitted it at the suggestion of a fellow Python user in a
private discussion.  I'd like to extinguish one of these eternal
comp.lang.python threads with it too, but maybe it's not to be.

Happy new year, all.

Jeff



From gstein at lyra.org  Tue Jan  2 04:34:31 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 1 Jan 2001 19:34:31 -0800
Subject: [Python-Dev] FAQ Horribly Out Of Date
In-Reply-To: <20010101153527.A14116@newcnri.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Mon, Jan 01, 2001 at 03:35:27PM -0500
References: <20001231105812.A12168@newcnri.cnri.reston.va.us>, <20001231003330.D2188A84F@darjeeling.zadka.site.co.il> <20010101100102.2360CA84F@darjeeling.zadka.site.co.il> <200101011949.OAA09804@cj20424-a.reston1.va.home.com> <20010101153527.A14116@newcnri.cnri.reston.va.us>
Message-ID: <20010101193431.M10567@lyra.org>

On Mon, Jan 01, 2001 at 03:35:27PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 01, 2001 at 02:49:24PM -0500, Guido van Rossum wrote:
> >But I would be very hesitant to drop the notion of maintaining the FAQ
> >as a group collaboration project.  There's nothing wrong with the FAQ
> >wizard except that the password (Spam) should be made publicly known...
> 
> Why multiply the number of mechanisms required to maintain things?  We
> already use CVS for other documentation; why not use it for the FAQ as 
> well?  

That would limit the updaters to just those with CVS access. As Guido just
pointed out, Bjorn made a bunch of updates. And he didn't need CVS to do
that...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Tue Jan  2 04:44:05 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 1 Jan 2001 22:44:05 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101194935.19672@falcon.inetnebr.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEKIGAA.tim.one@home.com>

[Jeff Epler]
> I'd like to speak up about this patch I've submitted on sourceforge.

I'm not sure that's allowed <wink>.

> ...
> The point under debate, as I see it, is the utility of speeding
> up the "benchmarks" of folks who compare the speed of Python and
> another language doing a very simple loop over the lines in a file.

If that were true, I couldn't care less.

> Since this advantage disappears once real work is being done on
> the file, ...

I agree that's true, but submit it's rarely relevant. *Most* file-crunching
apps are dominated by I/O time, which is why this is so visible to so many;
e.g., chewing over massive log files looking for patterns appears to be the
growth industry of the 21st century <wink>.  Even in Lutz's report (see
reference from earlier mail), where the task to be solved was far from
trivial, input time exceeded processing time across all languages (with some
oddball exceptions, when the coder neglected to use a hash table to store
info).  That's thoroughly typical of real file-crunching applications, in my
experience:  Perl has a killer speed advantage in the single most
time-consuming portion of the app, and due to one implementation trick.
Take that advantage away, and Python holds its own in this domain.

Coincidentally, I got pvt email from a newbie today, reading in part:

> If Perl wasn't so gosh darn good and fast at text scrubbing, it
> wouldn't really be a consideration, its syntax is so clunky and
> hard to learn by comparison to both Python and Ruby.

This is just depressing, because I can predict every step of this dance.

> ...
> Happy new year, all.

And to you!  Just make sure it's a fast new year <wink>.





From moshez at zadka.site.co.il  Tue Jan  2 16:24:40 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  2 Jan 2001 17:24:40 +0200 (IST)
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>

On Mon, 1 Jan 2001, Thomas Wouters <thomas at xs4all.net> wrote:

> As for speed (which stays a secondary or tertiary consideration at best) do
> we really need the xreadlines method to accomplish that ? Couldn't fileinput
> get almost the same performance using readlines() with a sizehint ? I

<aol>me too</aol>
Adding xreadlines() to the interface would break half a dozen file-objects all
around the world (just the standard library has StringIO, cStringIO,
GzipFile and probably some others I can't remember)

Adding .readlines(sizehint) to fileinput, and adding a function
to create something similar to fileinput from a file object (as opposed
to a file name) would help everyone, and doesn't seem too hard.
Is there a gotcha I'm just not seeing?

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Tue Jan  2 09:06:32 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 03:06:32 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <20010101234645.B5435@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com>

[Thomas Wouters]
> ...
> As for speed (which stays a secondary or tertiary consideration
> at best) do we really need the xreadlines method to accomplish
> that ?  Couldn't fileinput get almost the same performance using
> readlines() with a sizehint ?

There was a long email discussion among Jeff, Paul Prescod, Neel
Krishnaswami, and Alex Martelli about this.  I started getting copied on it
somewhere midstream, but didn't have time to follow it then (like I do now
<wink>).

About two weeks ago Neel summarized all the approaches then under
discussion:

"""
[Neel Krishnaswami]

...

Quick performance summary of the current solutions:

Slowest: for line in fileinput.input('foo'):     # Time 100
       : while 1: line = file.readline()         # Time 75
       : for line in LinesOf(open('foo')):       # Time 25
Fastest: for line in file.readlines():           # Time 10
         while 1: lines = file.readlines(hint)   # Time 10
         for line in xreadlines(file):           # Time 10

The difference in speed between the slowest and fastest is about
a factor of 10.

LinesOf is Alex's Python wrapper class that takes a file and
uses readlines() with a size-hint to present a sequence interface.
It's around half as fast as the fastest idioms, and 3-4 times
faster than while 1:. Jeff's xreadlines is essentially the same
thing in C, and is indistinguishable in performance from the
other fast idioms.

...

"""

On his box, line-at-a-time is >7x slower than the fastest Python methods,
which latter are usually close (depending on the platform) to Perl
line-at-a-time speeds.  A factor of 7 is too large for most working
programmers to ignore in the interest of somebody else's notion of
theoretical purity <wink>.  Seriously, speed is not a secondary
consideration to me when the gap is this gross, and in an area so visible
and common.

Alex's LinesOf appears a good predictor for how adding
fileinput.readlines(hint) would perform, since it appears to *be* that
(except off on its own).  Then it buys a factor of 3 over line-at-a-time on
Neel's box but leaves a factor of 2.5 on the table.  The cause of the latter
appears mostly to be the overhead of getting a Python method call into the
equation for each line returned.
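
To make the LinesOf idea concrete, here is a minimal sketch of such a wrapper, written for current Python.  It is a hypothetical reconstruction from Neel's description (readlines() with a size hint behind a sequence interface), not Alex's actual code; the class body, the default sizehint, and the forward-only indexing are all assumptions.

```python
import io


class LinesOf:
    """Hypothetical reconstruction: present a file as a forward-only
    sequence of lines, refilling from readlines(sizehint) in chunks."""

    def __init__(self, f, sizehint=1024 * 1024):
        self.f = f
        self.sizehint = sizehint
        self.lines = []   # current chunk of lines
        self.base = 0     # file-wide index of self.lines[0]

    def __getitem__(self, i):
        # Called once per line by the for loop: this Python-level call
        # is the per-line overhead discussed in the text.
        j = i - self.base
        if j >= len(self.lines):
            self.base += len(self.lines)
            self.lines = self.f.readlines(self.sizehint)
            if not self.lines:
                raise IndexError(i)   # end of file stops the for loop
            j = i - self.base
        return self.lines[j]


sample = io.StringIO("one\ntwo\nthree\n")
collected = [line for line in LinesOf(sample, sizehint=1)]
print(collected)
```

The for loop drives __getitem__ with 0, 1, 2, ... until IndexError, so only sequential access works in this sketch; random access into earlier chunks is deliberately not handled.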

Note that Jeff added .xreadlines() as a file object method at Neel's urging.
The way he started this is shown on the last line:  a function.  If we threw
out the fileinput and file method aspects, and just added a new module
xreadlines with a function xreadlines, then what?  I bet it would become as
popular as the string module, and for good reason:  it's a specific approach
that works, to a specific and common problem.

> ...
> And in the case of simple (x)range()es, I have yet to see a case
> where a 'real' list had significantly better performance than
> a generator.)

It varies by platform, but I don't think I've heard of variations larger
than 20% in either direction.  20% is nothing, though; in *this* case we're
talking order of magnitude.  That's go/nogo territory.

> ...
> Gelukkig-Nieuwjaar-iedereen-ly y'rs

I understand people are passionate when reality clashes with the dream of a
wart-free language, but that's no reason to swear at me <wink>.

wishing-you-a-happy-new-year-like-a-civilized-man-ly y'rs  - tim




From paulp at ActiveState.com  Tue Jan  2 11:00:46 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:00:46 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com>
Message-ID: <3A51A6CE.3B15371D@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> But is everyone's first thought to time the speed of Python vs. Perl?
> Why does it hurt so much that this is a bit slow?

I want to interject here that I asked Jeff to submit this patch because
I don't see it as "a little bit slow." When someone transliterates a
program from one scripting language to another and gets a program that
is two to five times slower, that is a big deal!

> But of course suggesting fileinput is also not a great solution --
> it's relatively obscure (since it's not taught by most tutorials,
> certainly not by the standard tutorial).

Fileinput's primary problem is that, IIRC, it is even slower than doing
readline yourself!

> > reading-text-files-is-very-common-ly y'rs  - tim
> 
> So is worrying about performance without a good reason...

I don't understand what constitutes good reason. We're talking about a
relatively minor change that will speed up thousands of programs, answer
a frequently asked question from comp.lang.python, obliterate an obscure
idiom and reduce the number of requests for a Python syntax change
(assignment expression) all in one bold sweep. It seemed to me as if it
was a "pure win."

 Paul Prescod



From paulp at ActiveState.com  Tue Jan  2 11:06:24 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 02:06:24 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <20010101234645.B5435@xs4all.nl>, <LNBBLJKPBEHFEDALKOLCKEDGIGAA.tim.one@home.com> <200101011935.OAA09728@cj20424-a.reston1.va.home.com> <20010102152440.9C26DA84F@darjeeling.zadka.site.co.il>
Message-ID: <3A51A820.50365F02@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> 
> Adding .readlines(sizehint) to fileinput, and adding a function
> to create something similar to fileinput from a file object (as opposed
> to a file name) would help everyone, and doesn't seem too hard.
> Is there a gotcha I'm just not seeing?

Fileinput is inherently slow because there are too many layers of Python
code. I started to consider ways of inverting the logic so that it only
called into Python when it needed to switch files but it would have been
a much larger patch than Jeff's and I thought that a conservative
approach was important.

Fileinput should someday be optimized but we can easily get a
low-hanging fruit improvement with Jeff's patch.

 Paul Prescod



From guido at digicool.com  Tue Jan  2 15:56:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 09:56:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 03:06:32 EST."
             <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> 
Message-ID: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>

Tim's almost as good at convincing me as he is at channeling me!  The
timings he showed almost convinced me that fileinput is hopeless and
xreadlines should be added.  But then I wrote a little timer of my
own...

I am including the timer program below my signature.  The test input
was the current access_log of dinsdale.python.org, which has about 119
Mbytes and 1M lines (as counted by the test program).

I measure about a factor of 3 between readlines with a sizehint (of 1
MB) and fileinput; a change to fileinput that uses readlines with a
sizehint and in-lines the common case in __getitem__ (as suggested by
Moshe) didn't make a difference.

Output (the first time is realtime seconds, the second CPU seconds):

total 119808333 chars and 1009350 lines
count_chars_lines     7.944  7.890
readlines_sizehint    5.375  5.320
using_fileinput      15.861 15.740
while_readline        8.648  8.570

This was on a 600 MHz Pentium-III Linux box (RH 6.2).

Note that count_chars_lines and readlines_sizehint use the same
algorithm -- the difference is that readlines_sizehint uses 'pass' as
the inner loop body, while count_chars_lines adds two counters.

Given that very light per-line processing (counting lines and
characters) already increases the time considerably, I'm not sure I
buy the arguments that the I/O overhead is always considerable.  The
fact that my change to fileinput.py didn't make a difference suggests
that its lack of speed is purely caused by the Python code.

Now what to do?  I still don't like xreadlines very much, but I do see
that it can save some time.  But my test doesn't confirm Neel's times
as posted by Tim:

> Slowest: for line in fileinput.input('foo'):     # Time 100
>        : while 1: line = file.readline()         # Time 75
>        : for line in LinesOf(open('foo')):       # Time 25
> Fastest: for line in file.readlines():           # Time 10
>          while 1: lines = file.readlines(hint)   # Time 10
>          for line in xreadlines(file):           # Time 10

I only see a factor of 3 between fastest and slowest, and
readline is only about 60% slower than readlines_sizehint.
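
For reference, the ratios can be recomputed directly from the real-time column of the output above.  This is only arithmetic on the posted numbers, written for current Python; the dictionary and variable names are made up for the illustration.

```python
# Real-time seconds copied from the timing output in this message.
timings = {
    "count_chars_lines": 7.944,
    "readlines_sizehint": 5.375,
    "using_fileinput": 15.861,
    "while_readline": 8.648,
}

fastest = min(timings.values())

# Slowdown of each variant relative to the fastest one.
ratios = {name: t / fastest for name, t in timings.items()}

for name, r in sorted(ratios.items(), key=lambda kv: kv[1]):
    print("%-20s %.2fx" % (name, r))
```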

--Guido van Rossum (home page: http://www.python.org/~guido/)

import time, fileinput, sys

def timer(func, *args):
    t0 = time.time()
    c0 = time.clock()
    func(*args)
    t1 = time.time()
    c1 = time.clock()
    print "%-20s %6.3f %6.3f" % (func.__name__, t1-t0, c1-c0)

def count_chars_lines(fn, bs=1024*1024):
    nl = 0
    nc = 0
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            nl += 1
            nc += len(line)
    f.close()
    print "total", nc, "chars and", nl, "lines"

def readlines_sizehint(fn, bs=1024*1024):
    f = open(fn, "r")
    while 1:
        buf = f.readlines(bs)
        if not buf:
            break
        for line in buf:
            pass
    f.close()

def using_fileinput(fn):
    f = fileinput.FileInput(fn)
    for line in f:
        pass
    f.close()

def while_readline(fn):
    f = open(fn, "r")
    while 1:
        line = f.readline()
        if not line:
            break
        pass
    f.close()

fn = "/home/guido/access_log"
if sys.argv[1:]:
    fn = sys.argv[1]
timer(count_chars_lines, fn)
timer(readlines_sizehint, fn, 1024*1024)
timer(using_fileinput, fn)
timer(while_readline, fn)
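
A note for anyone rerunning this today: time.clock() was removed in Python 3.8 and print is now a function, so the timer above needs small changes.  The following is a sketch of an equivalent harness, assuming time.perf_counter() for wall time and time.process_time() for CPU time are acceptable substitutes; make_sample() is a hypothetical helper that writes a small stand-in file, since the original access_log is not available.

```python
import os
import tempfile
import time


def timer(func, *args):
    """Run func(*args); print and return (result, wall secs, CPU secs)."""
    t0 = time.perf_counter()
    c0 = time.process_time()
    result = func(*args)
    t1 = time.perf_counter()
    c1 = time.process_time()
    print("%-20s %6.3f %6.3f" % (func.__name__, t1 - t0, c1 - c0))
    return result, t1 - t0, c1 - c0


def readlines_sizehint(fn, bs=1024 * 1024):
    """Same chunked loop as above, but counting lines so the result can be checked."""
    nl = 0
    with open(fn, "r") as f:
        while True:
            buf = f.readlines(bs)
            if not buf:
                break
            for line in buf:
                nl += 1
    return nl


def make_sample(nlines=1000):
    """Hypothetical helper: write a small test file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "w") as f:
        for i in range(nlines):
            f.write("line %d\n" % i)
    return path


path = make_sample()
try:
    count, wall, cpu = timer(readlines_sizehint, path)
finally:
    os.remove(path)
```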



From guido at digicool.com  Tue Jan  2 16:07:06 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:07:06 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: Your message of "Mon, 01 Jan 2001 15:27:37 EST."
             <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEDJIGAA.tim.one@home.com> 
Message-ID: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>

> As a compromise, given that we're not going to take the time to be precise
> (well, I'm sure not ...):
> 
>     The optional \keyword{else} clause is executed if and
>     when control flows off the end of the \keyword{try}
>     clause.\footnote{In Python 2.0, control "flows off the
>     end" except in case of exception, or executing a
>     \keyword{return}, \keyword{continue} or \keyword{break}
>     statement.}
>     Exceptions in the \keyword{else} clause are not handled by
>     the preceding \keyword{except} clauses.
> 
> Now it's all of imprecise, almost precise, specific to Python 2.0, and
> robust against any future changes <wink>.

Sounds good to me.  The reference to 2.0 could be changed to
"Currently".
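
A small illustration of the proposed wording, written for current Python: else runs only when control flows off the end of the try clause, and an exception raised inside else is not caught by the preceding except.  The function names and the "action" strings are invented for the example.

```python
def run(action):
    """Return a log of which clauses executed for the given action."""
    log = []
    try:
        if action == "raise":
            raise ValueError("from try")
        if action == "return":
            return log + ["returned"]   # leaves the try early: no else
        log.append("try done")          # control flows off the end
    except ValueError:
        log.append("handled")
    else:
        log.append("else ran")
    return log


def else_raises():
    """An exception raised in else is not caught by the except above it."""
    try:
        pass
    except ValueError:
        return "handled"
    else:
        raise ValueError("from else")


print(run("fall through"))
print(run("raise"))
print(run("return"))
try:
    else_raises()
except ValueError as exc:
    print("escaped:", exc)
```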

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan  2 16:20:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:20:11 -0500
Subject: [Python-Dev] Re: curses in the core?
In-Reply-To: Your message of "Thu, 28 Dec 2000 18:25:28 EST."
             <20001228182528.A10743@thyrsus.com> 
References: <200012282252.XAA18952@loewis.home.cs.tu-berlin.de>  
            <20001228182528.A10743@thyrsus.com> 
Message-ID: <200101021520.KAA13222@cj20424-a.reston1.va.home.com>

> What does being in the Python core mean?  There are two potential definitions:
> 
> 1. Documentation says it's available on all platforms.
> 
> 2. Documentation restricts it to one of the three platform groups 
>    (Unix/Windows/Mac) but implies that it will be available on any
>    OS in that group.  
> 
> I think the second one is closer to what application programmers
> thinking about which batteries are included expect.  But I could be
> persuaded otherwise by a good argument.

Actually, when *I* have used the term "core" I've typically thought of
this as referring to anything that's in the standard source
distribution, whether or not it is built on all platforms.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Tue Jan  2 09:42:30 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 00:42:30 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 02, 2001 at 09:56:40AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <20010102004230.A29700@glacier.fnational.com>

On Tue, Jan 02, 2001 at 09:56:40AM -0500, Guido van Rossum wrote:
> Now what to do?  I still don't like xreadlines very much, but I do see
> that it can save some time.  But my test doesn't confirm Neel's times
> as posted by Tim:
> 
> > Slowest: for line in fileinput.input('foo'):     # Time 100
> >        : while 1: line = file.readline()         # Time 75
> >        : for line in LinesOf(open('foo')):       # Time 25
> > Fastest: for line in file.readlines():           # Time 10
> >          while 1: lines = file.readlines(hint)   # Time 10
> >          for line in xreadlines(file):           # Time 10
> 
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

Could it be that you're using the CVS version of Python which
includes Andrew's cool glibc getline enhancement?

  Neil



From guido at digicool.com  Tue Jan  2 16:40:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 02 Jan 2001 10:40:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 00:42:30 PST."
             <20010102004230.A29700@glacier.fnational.com> 
References: <LNBBLJKPBEHFEDALKOLCCEEOIGAA.tim.one@home.com> <200101021456.JAA12633@cj20424-a.reston1.va.home.com>  
            <20010102004230.A29700@glacier.fnational.com> 
Message-ID: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>

[me]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

Bingo!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan  2 17:34:31 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 11:34:31 -0500
Subject: [Python-Dev] Fwd: try...else
In-Reply-To: <200101021507.KAA12796@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEFJIGAA.tim.one@home.com>

>>     The optional \keyword{else} clause is executed if and
>>     when control flows off the end of the \keyword{try}
>>     clause.\footnote{In Python 2.0, control "flows off the
>>     end" except in case of exception, or executing a
>>     \keyword{return}, \keyword{continue} or \keyword{break}
>>     statement.}
>>     Exceptions in the \keyword{else} clause are not handled by
>>     the preceding \keyword{except} clauses.

[Guido]
> Sounds good to me.  The reference to 2.0 could be changed to
> "Currently".

Cool.  See

http://sourceforge.net/bugs/?group_id=5470&func=detailbug&bug_id=127098




From tim.one at home.com  Tue Jan  2 21:48:08 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:08 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>

test_compare is broken because the expected-output file has bizarre stuff in
it like:

    cmp(2, [1]) = -108
    cmp(2, (2,)) = -116
    cmp(2, None) = -78

What's up with that?

I'll leave test_minidom to someone who thinks they know what it's doing.

Both failures are very recent.




From tim.one at home.com  Tue Jan  2 21:48:09 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 15:48:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>

[Guido]
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

[Neil]
> Could it be that you're using the CVS version of Python which
> includes Andrew's cool glibc getline enhancement?

[Guido]
> Bingo!

It's a good thing I haven't yet had time to try any speed tests myself:
I don't have a glibc-enabled platform, so Guido and I might have been
tempted to disagree about numbers in public <wink>.

I checked out the source for glibc's getline.  It's pulling the same trick
Perl uses, copying directly from the stdio buffer when it can, instead of
(like Python, and like almost all vendor fgets implementations) doing
getc-in-a-loop.  The difference is that Perl can't do that without breaking
into the FILE* representation in platform-dependent ways.  It's a shame that
almost all vendors missed that fgets was defined as a primitive by the C
committee precisely so that vendors *could* pull this speed trick under the
covers.  It's also a shame that Perl did it for them <wink>.
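
The two strategies can be contrasted in a rough Python analogy (the real code in question is C inside stdio, so this is only a model of the control flow, not of the actual implementations).  readline_getc() makes one call per character; BufferScanner, a hypothetical class, owns a buffer and copies whole runs out of it after a single search for the newline.

```python
import io


def readline_getc(f):
    """getc-in-a-loop: one read call per character."""
    out = bytearray()
    while True:
        c = f.read(1)
        if not c:
            break
        out += c
        if c == b"\n":
            break
    return bytes(out)


class BufferScanner:
    """Hypothetical reader that scans its own buffer for the newline
    and copies the whole run at once, refilling only when it runs dry."""

    def __init__(self, f, bufsize=8192):
        self.f = f
        self.bufsize = bufsize
        self.buf = b""

    def readline(self):
        parts = []
        while True:
            nl = self.buf.find(b"\n")
            if nl >= 0:
                # Newline is in the buffer: take everything through it.
                parts.append(self.buf[:nl + 1])
                self.buf = self.buf[nl + 1:]
                break
            # No newline yet: take the whole buffer and refill.
            parts.append(self.buf)
            self.buf = self.f.read(self.bufsize)
            if not self.buf:
                break   # end of file
        return b"".join(parts)


data = b"hello\nworld\nend"
slow = io.BytesIO(data)
fast = BufferScanner(io.BytesIO(data), bufsize=4)
slow_lines = [readline_getc(slow) for _ in range(4)]
fast_lines = [fast.readline() for _ in range(4)]
print(slow_lines == fast_lines)
```

Both return the same lines; the difference is only in how many calls it takes to get there, which is where the speed gap comes from.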




From barry at digicool.com  Tue Jan  2 22:56:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 2 Jan 2001 16:56:10 -0500
Subject: [Python-Dev] testing, please ignore
Message-ID: <14930.20090.283107.799626@anthem.wooz.org>

Sorry folks, just making sure things are working again.

you-really-didn't-want-email-this-millennium-didja?-ly y'rs,
-Barry




From guido at python.org  Tue Jan  2 21:59:22 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 15:59:22 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 14:59:24 EST."
             <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEGFIGAA.tim.one@home.com> 
Message-ID: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>

> [Guido]
> > I only see a factor of 3 between fastest and slowest, and
> > readline is only about 60% slower than readlines_sizehint.
> 
> [Neil]
> > Could it be that you're using the CVS version of Python which
> > includes Andrew's cool glibc getline enhancement?
> 
> [Guido]
> > Bingo!
> 
> It's a good thing I haven't yet had time to try any speed tests myself:
> I don't have a glibc-enabled platform, so Guido and I might have been
> tempted to disagree about numbers in public <wink>.
> 
> I checked out the source for glibc's getline.  It's pulling the same trick
> Perl uses, copying directly from the stdio buffer when it can, instead of
> (like Python, and like almost all vendor fgets implementations) doing
> getc-in-a-loop.  The difference is that Perl can't do that without breaking
> into the FILE* representation in platform-dependent ways.  It's a shame that
> almost all vendors missed that fgets was defined as a primitive by the C
> committee precisely so that vendors *could* pull this speed trick under the
> covers.  It's also a shame that Perl did it for them <wink>.

Quite apart from whether we should enable xreadlines(), could you look
into doing a similar thing for MSVC stdio?  For most Unix platforms, a
cop-out answer is "use glibc" -- but for Windows it may pay to do our
own hack.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan  2 22:06:05 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 2 Jan 2001 16:06:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:09PM -0500
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
Message-ID: <20010102160605.A5211@kronos.cnri.reston.va.us>

On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
>into the FILE* representation in platform-dependent ways.  It's a shame that
>almost all vendors missed that fgets was defined as a primitive by the C
>committee precisely so that vendors *could* pull this speed trick under the
>covers.  It's also a shame that Perl did it for them <wink>.

So, should Python be changed to use fgets(), available on all ANSI C
platforms, rather than the glibc-specific getline()?  That would be
more complicated than the brain-dead easy course of using getline(),
which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
complicated logic.

When this was discussed in comp.lang.python, someone also mentioned
getc_unlocked(), which saves the overhead of locking the stream every
time, but that didn't seem a fruitful avenue for exploration.

--amk




From tim.one at home.com  Tue Jan  2 23:00:37 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:00:37 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022059.PAA14845@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>

[Guido]
> Quite apart from whether we should enable xreadlines(), could you look
> into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> cop-out answer is "use glibc" -- but for Windows it may pay to do our
> own hack.

There's no question about whether it would pay on Windows, because it pays
big for Perl on Windows.  The question is about cost.  There's no way to
*do* it short of the way Perl does it, which is to write a large pile of
Windows-specific code (roughly the same size and complexity as the glibc
getline implementation -- check it out, it's not trivial, and glibc exploits
compiler inlining to make it bearable) relying on reverse-engineered
accidents of how MS happens to use all the fields from this undocumented
struct (from MS's stdio.h):

struct _iobuf {
        char *_ptr;
        int   _cnt;
        char *_base;
        int   _flag;
        int   _file;
        int   _charbuf;
        int   _bufsiz;
        char *_tmpfname;
        };
typedef struct _iobuf FILE;

in their stdio implementation.  Else it won't play correctly with MS's
stdio.  That's A Project.  Last year I tried extracting the relevant code
from Perl, but, as is usual, gave up after unraveling the third (whatever)
layer of mystery macros with no end in sight.  I bet it would take me a
week.  Is it worth that much to you and DC?  Since the real Windows experts
are hanging out at ActiveState, I bet one of them will volunteer to do it
tonight <wink>.




From tim.one at home.com  Tue Jan  2 23:17:14 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:17:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010102160605.A5211@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEGNIGAA.tim.one@home.com>

[Tim]
> It's a shame that almost all vendors missed that fgets was defined
> as a primitive by the C committee precisely so that vendors *could*
> pull this speed trick under the covers.  It's also a shame that Perl
> did it for them <wink>.

[Andrew Kuchling]
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

The thrust of my original comment above is that fgets is almost never faster
than what Python is doing now, because vendors overwhelmingly do *not*
exploit the opportunity the std gave them.  So, no, switching to fgets()
wouldn't help.

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

Well, getc_unlocked isn't std (not even in C99).  Mentioning it did inspire
me to discover, however, that while the MS fgets() is the typical "getc in a
loop" thing, at least it locks/unlocks the stream once each at function
entry/exit, and uses a special MS flavor of getc ("_getc_lk") inside the
loop.  However, that this helps is an illusion, because the body of their
_getc_lk macro is identical to the body of their getc macro.  Smells like a
bug, or an unfinished project.




From paulp at ActiveState.com  Tue Jan  2 23:40:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:40:39 -0800
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: 
 xrange : range
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com>
Message-ID: <3A5258E7.D52CA2C@ActiveState.com>

Tim Peters wrote:
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code 

> ... Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Mark is busy tonight and the Perl guys are still recovering from
implementing it the first time. :)

 Paul



From guido at python.org  Tue Jan  2 23:46:00 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:46:00 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:06:05 EST."
             <20010102160605.A5211@kronos.cnri.reston.va.us> 
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>  
            <20010102160605.A5211@kronos.cnri.reston.va.us> 
Message-ID: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>

> On Tue, Jan 02, 2001 at 03:48:09PM -0500, Tim Peters wrote:
> >into the FILE* representation in platform-dependent ways.  It's a shame that
> >almost all vendors missed that fgets was defined as a primitive by the C
> >committee precisely so that vendors *could* pull this speed trick under the
> >covers.  It's also a shame that Perl did it for them <wink>.
> 
> So, should Python be changed to use fgets(), available on all ANSI C
> platforms, rather than the glibc-specific getline()?  That would be
> more complicated than the brain-dead easy course of using getline(),
> which is obviously why I didn't do it; PyFile_GetLine() had annoyingly
> complicated logic.

You mean get_line(), which indeed has a complicated API and
corresponding logic: the argument may be a max length, or 0 to
indicate arbitrary length, or negative to indicate raw_input()
semantics. :-(

Unfortunately we can't use fgets(), even if it were faster than
getline(), because it doesn't tell how many characters it read.  On
files containing null bytes, readline() is supposed to treat these
like any other character; if your input is "abc\0def\nxyz\n", the
first readline() call should return "abc\0def\n".  But with fgets(),
you're left to look in the returned buffer for a null byte, and
there's no way (in general) to distinguish this result from an input
file that only consisted of the three characters "abc".  getline()
doesn't seem to have this problem, since its size is also an output
parameter.
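
The ambiguity can be modelled in a few lines of Python.  These helpers only imitate the C calling conventions (a NUL-terminated buffer with no length for fgets, data plus a length for getline); they are illustrative stand-ins with invented names, not bindings to the real functions, and they assume the caller recovers the line with strlen().

```python
def fgets_like(data, size):
    """Imitate fgets(): copy up to size-1 bytes through the first
    newline, append a NUL terminator, and return only the buffer."""
    nl = data.find(b"\n", 0, size - 1)
    n = nl + 1 if nl >= 0 else min(len(data), size - 1)
    return data[:n] + b"\0"


def c_string(buf):
    """What a C caller sees via strlen(): bytes before the first NUL."""
    return buf[:buf.index(b"\0")]


def getline_like(data):
    """Imitate getline(): the same bytes, plus how many were read."""
    nl = data.find(b"\n")
    n = nl + 1 if nl >= 0 else len(data)
    return data[:n], n


with_nul = b"abc\0def\nxyz\n"   # first line contains an embedded null byte
short = b"abc"                  # file holding just three characters

# Through the fgets-style interface the two inputs look identical...
same = c_string(fgets_like(with_nul, 80)) == c_string(fgets_like(short, 80))
print(same)

# ...while the explicit length keeps them apart.
print(getline_like(with_nul))
print(getline_like(short))
```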

> When this was discussed in comp.lang.python, someone also mentioned
> getc_unlocked(), which saves the overhead of locking the stream every
> time, but that didn't seem a fruitful avenue for exploration.

I've never heard of getc_unlocked; it's not in the (old) C standard.
If it's also a glibc thing, I doubt that using it would be faster than
getline().  If it's a new C standard (C9x) thing, we'll have to wait.

Fred reminded me that for e.g. Solaris, while everybody probably
compiles with GCC, that doesn't mean they are using glibc, so
in practice getline() will only help on Linux.

I'm slowly warming up to xreadlines(), although we must be careful to
consider the consequences (do other file-like objects need to support
it too?).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan  2 23:46:18 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 17:46:18 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines ::  xrange : range
In-Reply-To: <3A5258E7.D52CA2C@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHBIGAA.tim.one@home.com>

[Tim]
> ... Since the real Windows experts are hanging out at ActiveState,
> I bet one of them will volunteer to do it tonight <wink>.

[Paul Prescod]
> Mark is busy tonight and the Perl guys are still recovering from
> implementing it the first time. :)

I'm delighted, then, that you have nothing better to do than tease the
decent, hard-working folks on Python-Dev!  I'll be up until about 4am --
feel free to submit your patch anytime before then.

in-a-pinch-i'll-even-accept-it-tomorrow-ly y'rs  - tim




From guido at python.org  Tue Jan  2 23:53:14 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 17:53:14 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 17:00:37 EST."
             <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEGNIGAA.tim.one@home.com> 
Message-ID: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>

> [Guido]
> > Quite apart from whether we should enable xreadlines(), could you look
> > into doing a similar thing for MSVC stdio?  For most Unix platforms, a
> > cop-out answer is "use glibc" -- but for Windows it may pay to do our
> > own hack.
> 
> There's no question about whether it would pay on Windows, because it pays
> big for Perl on Windows.  The question is about cost.  There's no way to
> *do* it short of the way Perl does it, which is to write a large pile of
> Windows-specific code (roughly the same size and complexity as the glibc
> getline implementation -- check it out, it's not trivial, and glibc exploits
> compiler inlining to make it bearable) relying on reverse-engineered
> accidents of how MS happens to use all the fields from this undocumented
> struct (from MS's stdio.h):
> 
> struct _iobuf {
>         char *_ptr;
>         int   _cnt;
>         char *_base;
>         int   _flag;
>         int   _file;
>         int   _charbuf;
>         int   _bufsiz;
>         char *_tmpfname;
>         };
> typedef struct _iobuf FILE;
> 
> in their stdio implementation.  Else it won't play correctly with MS's
> stdio.  That's A Project.  Last year I tried extracting the relevant code
> from Perl, but, as is usual, gave up after unraveling the third (whatever)
> layer of mystery macros with no end in sight.  I bet it would take me a
> week.  Is it worth that much to you and DC?  Since the real Windows experts
> are hanging out at ActiveState, I bet one of them will volunteer to do it
> tonight <wink>.

Yeah.  That's too much.  Too bad.  I'm not holding my breath for
ActiveState though. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan  2 23:52:58 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 16:52:58 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
	<20010102160605.A5211@kronos.cnri.reston.va.us>
	<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <14930.23498.53540.401218@beluga.mojam.com>

    Guido> I'm slowly warming up to xreadlines(), ...

I haven't followed this thread closely, and my brain is a bit frazzled at
the moment, but is there some fundamental reason that the file object's
readlines method can't be made lazy, perhaps only when given a sizehint?

Skip



From paulp at ActiveState.com  Tue Jan  2 23:59:47 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 14:59:47 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
		<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
		<20010102160605.A5211@kronos.cnri.reston.va.us>
		<200101022246.RAA16384@cj20424-a.reston1.va.home.com> <14930.23498.53540.401218@beluga.mojam.com>
Message-ID: <3A525D63.17ABCC87@ActiveState.com>

Skip Montanaro wrote:
> 
>     Guido> I'm slowly warming up to xreadlines(), ...
> 
> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

I suggested this at one point but it was pointed out that there is
probably a lot of code that works with the resulting list *as a list*
i.e. as a random-access, writable sequence object. I really wasn't
thrilled with xreadlines at first either...it's the least of all
possible evils (including the status quo).

 Paul



From nas at arctrix.com  Tue Jan  2 17:09:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:09:15 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 03:48:08PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEGJIGAA.tim.one@home.com>
Message-ID: <20010102080915.A30892@glacier.fnational.com>

On Tue, Jan 02, 2001 at 03:48:08PM -0500, Tim Peters wrote:
> test_compare is broken because the expected-output file has bizarre stuff in
> it like:
> 
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
> 
> What's up with that?

My fault.  I only ran regrtest.py and not "make test".  I'm not
sure why you say bizarre stuff though.  Do you object to testing
that 2 is less than None (something that is not part of the
language spec) or do you think that the results from cmp() should
be clamped between -1 and 1?

  Neil



From guido at python.org  Wed Jan  3 00:06:16 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 02 Jan 2001 18:06:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 02 Jan 2001 16:52:58 CST."
             <14930.23498.53540.401218@beluga.mojam.com> 
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com> <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>  
            <14930.23498.53540.401218@beluga.mojam.com> 
Message-ID: <200101022306.SAA16684@cj20424-a.reston1.va.home.com>

> I haven't followed this thread closely, and my brain is a bit frazzled at
> the moment, but is there some fundamental reason that the file object's
> readlines method can't be made lazy, perhaps only when given a sizehint?

Yes -- readlines() is documented to return a list, and some people do
things to it that require it to be a real list (e.g. sort or reverse
it or modify it in place or concatenate it with other lists).
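
[A minimal sketch of that distinction, written in present-day Python 3
rather than the 2.0 interpreter under discussion.  io.StringIO stands in
for a real file, and lazy_lines is a hypothetical name, not a proposed
API.]

```python
import io

def lazy_lines(f):
    """Hypothetical lazy stand-in for readlines(): yields one line at a time."""
    for line in f:
        yield line

text = "pear\napple\nfig\n"

# A real list supports sorting, in-place modification and concatenation.
lines = io.StringIO(text).readlines()
lines.sort()
lines[0] = lines[0].upper()
combined = lines + ["extra\n"]

# A lazy iterator has none of the list methods, so list-only code breaks.
lazy = lazy_lines(io.StringIO(text))
try:
    lazy.sort()
    lazy_supports_sort = True
except AttributeError:
    lazy_supports_sort = False
```

Code that only loops over the result would work with either object; the
breakage is limited to callers that treat the result as a list.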

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan  3 00:19:14 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 18:19:14 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102080915.A30892@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>

[Tim]
> test_compare is broken because the expected-output file has
> bizarre stuff in it like:
>
>     cmp(2, [1]) = -108
>     cmp(2, (2,)) = -116
>     cmp(2, None) = -78
>
> What's up with that?

[Neil Schemenauer]
> My fault.  I only ran regrtest.py and not "make test".

Neil, my platform doesn't even *have* a "make":  are you saying the test
passes for you when you run regrtest.py?  That's what I did.

> I'm not sure why you say bizarre stuff though.  Do you object to
> testing that 2 is less than None (something that is not part of the
> language spec)

Only in part.  Lang Ref 2.1.3 (Comparisons) says you can compare them, and
guarantees they won't compare equal, but doesn't define it beyond that.  If
Python actually says "less", fine, we can test for that, although to
minimize maintenance down the road it would be better to test for no more
than we expect Python to guarantee across releases and implementations
(suppose Jython says 2 is greater than None:  that's fine too, and it would
be better if the test suite didn't say Jython was broken).

> or do you think that the results from cmp() should be clamped
> between -1 and 1?

Not that either <wink>; cmp() isn't documented that way.

They're "bizarre" simply because they're not what Python returns!

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Dec 17 2000, 01:39:08) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> cmp(2, [1])
-1
>>> cmp(2, (2,))
-1
>>> cmp(2, None)
-1
>>>

The expected-output file is supposed to match what Python actually does.  I
have no idea where things like "-108" came from.  So things like -108 look
bizarre to me.  So long as cmp(2, [1]) returns -1 in reality, an
expected-output file that claims it returns -108 will never work no matter
how you run the tests.

One of us is missing something obvious here <wink>.




From paulp at ActiveState.com  Wed Jan  3 00:26:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 15:26:39 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>  
	            <20010102160605.A5211@kronos.cnri.reston.va.us> <200101022246.RAA16384@cj20424-a.reston1.va.home.com>
Message-ID: <3A5263AF.CE6C8C81@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> I'm slowly warming up to xreadlines(), although we must be careful to
> consider the consequences (do other file-like objects need to support
> it too?).

The implementation is such that it is pretty easy to add the method to
other file-like objects. It is also easy to use the xreadlines module to
get the same behavior for objects that do not have the method. 
Essentially, file.xreadlines is implemented like this:

def xreadlines(self):
    import xreadlines
    return xreadlines.xreadlines(self)

Any object can add the method similarly.
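
[A sketch of how another file-like object might gain the method, in
present-day Python 3.  The module-level xreadlines function here is a
hypothetical stand-in built on readlines(sizehint), not the code from
the patch, and UpperFile is an invented example class.]

```python
import io

def xreadlines(f, sizehint=8192):
    """Yield lines lazily, fetching them in blocks via f.readlines(sizehint)."""
    while True:
        block = f.readlines(sizehint)
        if not block:
            return
        yield from block

class UpperFile:
    """Hypothetical file-like object that upper-cases everything it reads."""

    def __init__(self, f):
        self._f = f

    def readlines(self, sizehint=-1):
        return [line.upper() for line in self._f.readlines(sizehint)]

    def xreadlines(self):
        # One-line delegation to the helper, as described in the message.
        return xreadlines(self)

result = list(UpperFile(io.StringIO("a\nb\nc\n")).xreadlines())
```

The only requirement on the wrapped object in this sketch is a
readlines() method that accepts a size hint.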

 Paul Prescod



From nas at arctrix.com  Tue Jan  2 17:51:48 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 08:51:48 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 06:19:14PM -0500
References: <20010102080915.A30892@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCOEHEIGAA.tim.one@home.com>
Message-ID: <20010102085148.A30986@glacier.fnational.com>

On Tue, Jan 02, 2001 at 06:19:14PM -0500, Tim Peters wrote:
> Neil, my platform doesn't even *have* a "make":  are you saying the test
> passes for you when you run regrtest.py?

Yes.  Isn't checking in code without running regrtest a capital
offence? :)

> Lang Ref 2.1.3 (Comparisons) says you can compare them, and
> guarantees they won't compare equal, but doesn't define it beyond that.

Okay, I'll use == rather than cmp().  When I was working on the coercion
patch I found cmp() useful.  I guess it shouldn't be in the standard
test suite, especially since Jython may implement things differently.

[Neil]
> or, do you think that the results from cmp() should be clamped
> between -1 and 1?

[Tim]
> Not that either <wink>; cmp() isn't documented that way.
> 
> They're "bizarre" simply because they're not what Python returns!

They do on my box:

    Python 2.0 (#19, Nov 21 2000, 18:13:04) 
    [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> cmp(1, None)
    -78

I guess MS uses a different strcmp than GNU.  Do you mind trying the
attached C code?  I get "-78" as output.  I should have thought a little
more before checking in the patch.  -78 is quite obviously a
machine/library dependent thing.
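
[A sketch of where such numbers can come from, in present-day Python 3.
It assumes a strcmp that returns the difference of the first differing
bytes, which the C standard allows; the function names are made up for
the illustration.]

```python
def strcmp_difference(a, b):
    """Model a strcmp that returns the difference of the first differing bytes."""
    a, b = a.encode("ascii"), b.encode("ascii")
    for i in range(max(len(a), len(b))):
        ca = a[i] if i < len(a) else 0   # 0 models the terminating NUL
        cb = b[i] if i < len(b) else 0
        if ca != cb:
            return ca - cb
    return 0

def strcmp_clamped(a, b):
    """Model a strcmp whose result is always -1, 0 or 1."""
    d = strcmp_difference(a, b)
    return (d > 0) - (d < 0)

# Numbers are given the type name "" when compared with non-numbers.
vs_none = strcmp_difference("", "None")
vs_list = strcmp_difference("", "list")
vs_tuple = strcmp_difference("", "tuple")
```

Under that assumption the three values are minus the character codes of
"N", "l" and "t", which matches the -78, -108 and -116 in the
expected-output file.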

[Tim again]
> One of us is missing something obvious here <wink>.

I don't know about that.  The implementation of coercion and comparison
is not simple.  I've been studying it for some time now and I obviously
still don't know what the hell is going on.

AFAICT, the problem is that instances without a comparison method can
compare larger or smaller than numbers depending on where in memory the
objects are stored.

  Neil


#include <stdio.h>
#include <string.h>

int main()
{
    printf("%d\n", strcmp("", "None"));
}



From tim.one at home.com  Wed Jan  3 01:30:26 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 19:30:26 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102085148.A30986@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>

[Neil]
> They do on my box:
>
>     Python 2.0 (#19, Nov 21 2000, 18:13:04)
>     [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> cmp(1, None)
>     -78

Well, who cares about your silly box <wink>?  Messier than I thought!  Yes,
Windows strcmp is always in {-1, 0, 1}.  Rather than run tests, here's the
tail end of MS's strcmp.c:

        if ( ret < 0 )
                ret = -1 ;
        else if ( ret > 0 )
                ret = 1 ;

        return( ret );

Wasted cycles and stupid formatting <wink>.

> ...
> AFAICT, the problem is that instances without a comparison method can
> compare larger or smaller than numbers depending on where in memory
> the objects are stored.

If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
cause that, or was it always this way?  I was able to provoke this badness:

>>> j < c < i
1
>>> j < i
0
>>>

i.e. it violates transitivity, and that's never supposed to happen in the
absence of user-supplied __cmp__.  Here c is an instance of "class C: pass",
and i and j are ints.

>>> type(i), type(j), type(c)
(<type 'int'>, <type 'int'>, <type 'instance'>)
>>> i, j, c
(999999, 1000000, <__main__.C instance at 00791B7C>)
>>> id(i), id(j), id(c)
(7941572, 7744676, 7936892)
>>>

Guido thought he fixed this kind of stuff once (and I believed him <wink>)
by treating all numbers as if they had type name "" (i.e., yes, an empty
string) when compared to non-numbers.  Then the usual "mixed-type
comparisons in the absence of __cmp__ compare via type name string" rule
ensured that numbers would always compare "less than" instances of any other
type.  That's the intent of the tail end:

		else if (vtp->tp_as_number != NULL)
			vname = "";
		else if (wtp->tp_as_number != NULL)
			wname = "";
		/* Numerical types compare smaller than all other types */
		return strcmp(vname, wname);

of PyObject_Compare.  So, in the example above, we *should* have

    i < c == 1
    j < c == 1
    j < c < i == 0

Unfortunately, we actually have

    i < c == 0

in that example.  We're apparently not getting to the "number hack" code
because c is an instance, and I'll confess up front that my eyes always
glazed over long before I got to PyInstance_HalfBinOp <0.half wink>.
Whatever, there's at least one bug somewhere in that path!   We should have
n < i == 1 for any numeric type n and any non-numeric type i (in the absence
of user-defined __cmp__).





From skip at mojam.com  Wed Jan  3 02:27:03 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 2 Jan 2001 19:27:03 -0600 (CST)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A525D63.17ABCC87@ActiveState.com>
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
	<20010102160605.A5211@kronos.cnri.reston.va.us>
	<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
	<14930.23498.53540.401218@beluga.mojam.com>
	<3A525D63.17ABCC87@ActiveState.com>
Message-ID: <14930.32743.525564.69044@beluga.mojam.com>

    Paul> I suggested this at one point but it was pointed out that there is
    Paul> probably a lot of code that works with the resulting list *as a
    Paul> list*

How about this idea?  What if readlines() was allowed to return a lazy
evaluator if a sizehint > 0 was given?  I only saw one example outside of
test cases in the current CVS tree where readlines(sizehint) was used
(Tools/idle/GrepDialog.py), and it used it as expected:

    while 1:
      block = f.readlines(sizehint)
      if not block:
        break
      for line in block:
        more stuff

My suspicion is that most uses of sizehint will be like this.  It hasn't
been around all that long in Python-years (since 1.5a2), so there's probably
not tons of code to break (I agree the semantics would change), and the
majority of code that uses it probably looks like the above, which is almost
safe (if it returned "" instead of an empty evaluator when nothing was left
to read it would be safe).  The advantage would be that the above could
become the more obvious

    for line in f.readlines(sizehint):
      more stuff

and the change to file reading code that is "too slow" becomes much simpler.
(Of course, xreadlines() has that advantage as well.)
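
[A sketch of why the existing idiom is only "almost safe", in
present-day Python 3 with io.StringIO standing in for a file.  The
function name is invented for the illustration.]

```python
import io

def read_in_blocks(f, sizehint):
    """The block-at-a-time idiom quoted above, collecting the lines it sees."""
    seen = []
    while True:
        block = f.readlines(sizehint)
        if not block:        # relies on an empty list being false
            break
        for line in block:
            seen.append(line)
    return seen

lines = read_in_blocks(io.StringIO("one\ntwo\nthree\n"), 4)

# An exhausted iterator is still true unless its class says otherwise,
# so "if not block" would never fire for a plain lazy object.
empty_list_is_false = not []
empty_iterator_is_false = not iter([])
```

A lazy return value would therefore have to test false when nothing is
left to read, or the loop above would never terminate.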

I scanned my own code quickly.  I found about 10 uses with sizehint and 300
without.

I presume we are talking about 2.1 here.  In any case, it seems to me that
in Py3k readlines should be lazy.

Skip

P.S.  Why did FileInput class never grow a readlines method?



From nas at arctrix.com  Tue Jan  2 20:38:53 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:38:53 -0800
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 02, 2001 at 07:30:26PM -0500
References: <20010102085148.A30986@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCKEHIIGAA.tim.one@home.com>
Message-ID: <20010102113853.A31341@glacier.fnational.com>

On Tue, Jan 02, 2001 at 07:30:26PM -0500, Tim Peters wrote:
> > AFAICT, the problem is that instances without a comparison method can
> > compare larger or smaller than numbers depending on where in memory
> > the objects are stored.
> 
> If so, that's a bug ... OK, it *is* a bug, at least in current CVS.  Did you
> cause that, or was it always this way?

To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
I'm ready to check in my coercion overhaul patch, assuming no
vetoes from the list.  It should fix this bug (and introduce a
whole slew of new ones :).

Guido suggested that I remove the "number types compare smaller
than other types" behavior.  What's your take on that?  The
current patch on SF always uses the type names.  It should be
easy to implement the old behavior though.

  Neil



From nas at arctrix.com  Tue Jan  2 20:48:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 2 Jan 2001 11:48:09 -0800
Subject: [Python-Dev] Applying the PEP 208 (coercion overhaul) patch
Message-ID: <20010102114809.B31341@glacier.fnational.com>

I'm almost ready to apply SF patch #102652.  Guido has given the
okay assuming there are no objections from the rest of
python-dev.  The patch is large and modifies some complicated
parts of the interpreter.  I expect there will be some bugs.  If
you would like me to wait, speak now.

Guido has sent me some comments on the patch today which I plan
to review and address tonight.  I will probably apply the patch
tomorrow evening.

  Neil



From tim.one at home.com  Wed Jan  3 04:05:59 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 22:05:59 -0500
Subject: [Python-Dev] Std test failures on WIndows:  test_compare, test_minidom
In-Reply-To: <20010102113853.A31341@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHNIGAA.tim.one@home.com>

[Neil Schemenauer, on a violation of transitivity j < c < i but not j < i]

> To quote Bart Simpson: I didn't do it.  I'm pretty sure the bug
> is in PyInstance_DoBinOp.  I don't think it's worth fixing though.
> I'm ready to check in my coercion overhaul patch, assuming no
> vetoes from the list.  It should fix this bug (and introduce a
> whole slew of new ones :).

Sounds good to me!

> Guido suggested that I remove the "number types compare smaller
> than other types" behavior.  What's your take on that?  The
> current patch on SF always uses the type names.  It should be
> easy to implement the old behavior though.

It doesn't matter that they're specifically smaller, it matters that they
can't violate transitivity.  "numbers compare smaller" was introduced
deliberately (by Guido) because, e.g., before that we had

    99 < [99] < 99L

despite that 99 == 99L, because

   "int" < "list" < "long int"

Even stranger, we had

    100 < [99] < 0L < 100

and

    100 < [] < -101L < -100


Making numbers compare smaller than other types is one way to ensure stuff
like that can't happen; I can't think of a simpler way (although making them
compare larger than other types would be equally simple, as would making
them compare as if their type name were "Neil" <wink>).
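
[A sketch of the two rules, in present-day Python 3.  Values are
modelled as (type name, value) pairs because Python 3 has no long type
and no default mixed-type ordering; cmp_by_type_name and has_cycle are
hypothetical names, and this is a model of the rule, not the C code.]

```python
from itertools import permutations

NUMERIC = {"int", "long int"}

def cmp_by_type_name(a, b, numbers_first):
    """Hypothetical model of the default mixed-type comparison."""
    (ta, va), (tb, vb) = a, b
    if (ta in NUMERIC and tb in NUMERIC) or ta == tb:
        return (va > vb) - (va < vb)       # same kind: compare by value
    if numbers_first:                      # the "numbers compare smaller" rule
        ta = "" if ta in NUMERIC else ta
        tb = "" if tb in NUMERIC else tb
    return (ta > tb) - (ta < tb)           # otherwise compare type names

def has_cycle(items, numbers_first):
    """True if some a < b < c exists for which a < c does not hold."""
    def lt(x, y):
        return cmp_by_type_name(x, y, numbers_first) < 0
    return any(lt(a, b) and lt(b, c) and not lt(a, c)
               for a, b, c in permutations(items, 3))

items = [("int", 100), ("list", [99]), ("long int", 0)]
broken = has_cycle(items, numbers_first=False)
fixed = has_cycle(items, numbers_first=True)
```

With plain type names the model reproduces 100 < [99] < 0L without
100 < 0L; treating every numeric type name as "" removes the cycle for
these three values.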




From paulp at ActiveState.com  Wed Jan  3 04:34:59 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Tue, 02 Jan 2001 19:34:59 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
References: <200101021540.KAA13446@cj20424-a.reston1.va.home.com>
		<LNBBLJKPBEHFEDALKOLCGEGJIGAA.tim.one@home.com>
		<20010102160605.A5211@kronos.cnri.reston.va.us>
		<200101022246.RAA16384@cj20424-a.reston1.va.home.com>
		<14930.23498.53540.401218@beluga.mojam.com>
		<3A525D63.17ABCC87@ActiveState.com> <14930.32743.525564.69044@beluga.mojam.com>
Message-ID: <3A529DE3.D93C3916@ActiveState.com>

Skip Montanaro wrote:
> 
>...
> 
> I presume we are talking about 2.1 here.  In any case, it seems to me that
> in Py3k readlines should be lazy.

I agree, but I'm ambivalent about your suggestion for polymorphic return
values from readlines(). Yet another option is a "lazy=1" option.

 Paul Prescod



From tim.one at home.com  Wed Jan  3 05:33:29 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 2 Jan 2001 23:33:29 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101021456.JAA12633@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>

[Guido, writes a timing program]

[Jeff, if you weren't copied on all this stuff, you can play catch-up
 by reading the archives, at
    http://mail.python.org/pipermail/python-dev/
]

> ...
> I am including the timer program below my signature.  The test input
> was the current access_log of dinsdale.python.org, which has about 119
> Mbytes and 1M lines (as counted by the test program).

For a contrast, I cobbled together a large test file out of various chunks
of C source, .py source, HTML source, and email archives.  I was shooting
for the same size you used (~119Mb), but ended up with more than 3x as many
lines.

> I measure about a factor of 2 between readlines with a sizehint (of 1
> MB) and fileinput;

Factor of 7 here (Jeff, NeilS eventually figured out that Guido was using a
CVS version of Python that has AndrewK's glibc getline patch, a zippier
line-input routine than Python 2.0 has; but it only applies to platforms
using glibc).

> ...
> Output (the first time is realtime seconds, the second CPU seconds):
>
> total 119808333 chars and 1009350 lines
> count_chars_lines     7.944  7.890
> readlines_sizehint    5.375  5.320
> using_fileinput      15.861 15.740
> while_readline        8.648  8.570
>
> This was on a 600 MHz Pentium-III Linux box (RH 6.2).

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

866 MHz P3 Win98SE, current CVS Python.  I have no handy explanation for why
clock() and time() differ on my box (Win98 has no notions of "user time" or
"CPU time" distinct from clock time).

> Note that count_chars_lines and readlines_sizehint use the same
> algorithm -- the difference is that readlines_sizehint uses 'pass' as
> the inner loop body, while count_chars_lines adds two counters.
>
> Given that very light per-line processing (counting lines and
> characters) already increases the time considerably, I'm not sure I
> buy the arguments that the I/O overhead is always considerable.

I disagree that this is "very light processing", although I agree it's hard
to think of lighter processing <wink>:  it's a few Python statements per
line, which I'd say is pretty *typical* processing.  Read a line, run a
string find or regexp search on it, test the result, sometimes fiddle the
line accordingly and sometimes not.  File-crunching apps generally aren't
rocket science!  For example, I changed count_chars_lines to tally the
number of lines containing the string "Guido" instead, and the runtime went
up by just 0.8 seconds (BTW, it found 13808 of them <wink>):  if you're
thinking in C terms, millions of failing searches for "Guido" may seem like
more work, but the number of Python stmts executed usually counts more than
what the stmts do at the C level.
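
[A sketch of the two per-line workloads and a tiny timing harness, in
present-day Python 3.  This is not the timer program from the thread,
which is not reproduced here; the function names and the small
in-memory input are assumptions, and no timings are claimed.]

```python
import io
import time

def count_chars_lines(f):
    """Count characters and lines: two Python statements per line."""
    nchars = nlines = 0
    for line in f:
        nlines += 1
        nchars += len(line)
    return nchars, nlines

def count_matching_lines(f, needle):
    """Count lines containing needle: a search plus a test per line."""
    n = 0
    for line in f:
        if needle in line:
            n += 1
    return n

def timed(func, *args):
    """Return (result, wall-clock seconds, CPU seconds) for one call."""
    t0, c0 = time.perf_counter(), time.process_time()
    result = func(*args)
    return result, time.perf_counter() - t0, time.process_time() - c0

data = "Guido wrote this\nnot this one\n" * 1000
counts, wall1, cpu1 = timed(count_chars_lines, io.StringIO(data))
hits, wall2, cpu2 = timed(count_matching_lines, io.StringIO(data), "Guido")
```

Running both over the same large file is one way to check the claim on
a given platform.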

> ...
> Now what to do?  I still don't like xreadlines very much, but I do
> see that it can save some time.  But my test doesn't confirm Neel's
> times as posted by Tim:
>
>> Slowest: for line in fileinput.input('foo'):     # Time 100
>>        : while 1: line = file.readline()         # Time 75
>>        : for line in LinesOf(open('foo')):       # Time 25
>> Fastest: for line in file.readlines():           # Time 10
>>          while 1: lines = file.readlines(hint)   # Time 10
>>          for line in xreadlines(file):           # Time 10
>
> I only see a factor of 3 between fastest and slowest, and
> readline is only about 60% slower than readlines_sizehint.

I don't know what Neel used for an input file, or which platform he used
either.  And this is bound to vary a lot across platforms.  As above, I saw
a factor of 7 between fastest and slowest and a factor of 3 between readline
and readlines_sizehint.

BTW, on my platform the Perl script (using a recent ActiveState Windows
Perl)

open(FILE, "ga.txt");
while (<FILE>) {
    1;
}

ran in about 6 seconds (I never figured how to get Perl to compute usable
timings itself)-- substantially faster than even readlines_sizehint! --and
changing the body to

$nc = $nl = 0;
while (<FILE>) {
    ++$nl;
    $nc += length;
}
print "$nc $nl\n";

boosted that to about 8 seconds.  So Perl has gotten zippier too over the
years.




From tim.one at home.com  Wed Jan  3 10:32:55 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 04:32:55 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101022253.RAA16482@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com>

[Guido & Tim, wonder about faking getline-like functionality for Windows]

The attached is kinda baffling.  The std tests pass with it, and it changes my test
timings from:

count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

to:

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

Big win?  You bet.  But ...

The baffling parts:

1. That Perl still takes only 6 seconds in line-at-a-time mode.

2. I originally wrote a getline workalike, instead of building directly into a PyString
buffer.  That made my test run *slower*, and I'm talking factor of 2, not a yawn.  To
judge from my usually silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
mallocs required may have triggered the horrid Win9x malloc-thrashing problem I wrote
about while I was still at Dragon.  Consider that another vote for Vlad's PyMalloc --
we've got no handle on x-platform dynamic memory behavior now.  Python's destiny is to
replace both the platform OS and libc anyway <0.9 wink>.

The scary parts:

+ As the "XXX" comments indicate, this is full of little insecurities.

+ Another one I just thought of:  if the user's last operations on the fp were two or more
consecutive ungetc calls, all bets are off.  But then MS doesn't define what happens then
either.

+ This is much less ambitious than I recall Perl's code being:  it doesn't try to guess
anything about the file, and effectively captures only what would happen if you could
unroll the guts of a getc-in-a-loop and optimize the snot out of it.  The good news is
that this means it's much easier to maintain (it touches only two of the MS FILE* fields,
and in ways that are pretty obviously correct).  The bad news is that this seems also
pretty clearly all there *is* to be gotten out of breaking into the FILE* abstraction for
the particular test case I'm using; and increasing TUNEME doesn't save any time at all:
the sucker is flying at full speed already.

+ It (line-at-a-time) drops to a little under 13 seconds if I comment out the thread
macros.

+ I haven't looked at Perl's implementation in a year, and they must have dreamt up
another trick since then.  That's a "scary part" indeed to anyone who has ever looked at
Perl's implementation.
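
[A sketch of the general idea only, in present-day Python 3: scan a
whole buffer for the newline instead of fetching one character per
call.  BufferedLines is a hypothetical class with its own buffer; it
does not touch stdio internals the way the C patch does.]

```python
import io

class BufferedLines:
    """Read lines by searching each buffered chunk for a newline."""

    def __init__(self, raw, chunk_size=8192):
        self.raw = raw
        self.chunk_size = chunk_size
        self.buf = b""
        self.pos = 0

    def readline(self):
        pieces = []
        while True:
            nl = self.buf.find(b"\n", self.pos)
            if nl != -1:
                # Newline found: take everything up to and including it.
                pieces.append(self.buf[self.pos:nl + 1])
                self.pos = nl + 1
                break
            # No newline left in this chunk: keep the tail and refill.
            pieces.append(self.buf[self.pos:])
            self.buf = self.raw.read(self.chunk_size)
            self.pos = 0
            if not self.buf:     # end of input
                break
        return b"".join(pieces)

reader = BufferedLines(io.BytesIO(b"ab\ncd"), chunk_size=2)
first, second, third = reader.readline(), reader.readline(), reader.readline()
```

The tiny chunk size is only there to exercise the refill path; an empty
result signals end of input, as with the built-in readline().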

retreating-into-a-fetal-position-ly y'rs  - tim


Anyone wants to play, the sandbox is fileobject.c.  Do two things:

1. Insert this new chunk somewhere above get_line:

#ifdef MS_WIN32
static PyObject*
win32_getline(FILE *fp)
{
	/* XXX ignores thread safety -- but so does MS's getc macro! */
	PyObject* v;
	char* pBuf;	/* next free slot in v's buffer */
	/* MS's internals are declared in terms of ints, but it's a sure bet
	 * that won't last forever -- use size_t now & live w/ the casting;
	 * ditto for Python's routines
	 */
	size_t total_buf_size = 100;
	size_t free_buf_size = total_buf_size;
#define TUNEME 1000	/* how much to boost the string buffer when exhausted */

	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
	if (v == NULL)
		return NULL;
	pBuf = BUF(v);
	Py_BEGIN_ALLOW_THREADS
	for (;;) {
		int ch;	/* int, not char:  getc() can return EOF */
		size_t ms_cnt;	/* FILE->_cnt shadow */
		char* ms_ptr;	/* FILE->_ptr shadow */
		size_t max_to_copy, i;
		/* stdio buffer empty or in unknown state; rather
		 * than try to simulate every quirk of MS's internals,
		 * let the MS macros deal with it.
		 */
		/* XXX we also wind up here when we simply run out of string
		 * XXX buffer space, but I'm not sure I care:  making this a
		 * XXX double-nested loop doesn't seem worth it
		 */
		ch = getc(fp);
		if (ch == EOF)
			break;
		/* make sure we've got some breathing room */
		if (free_buf_size < 100) {
			size_t currentoffset = pBuf - BUF(v);
			total_buf_size += TUNEME;  /* XXX check for overflow */
			Py_BLOCK_THREADS
			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
				return NULL;
			Py_UNBLOCK_THREADS
			pBuf = BUF(v) + currentoffset;
			free_buf_size = TUNEME;
		}
		/* ch wasn't EOF, so store it */
		*pBuf++ = ch;
		--free_buf_size;
		if (ch == '\n') {
			break;
		}
		ms_cnt = (size_t)fp->_cnt;
		if (!ms_cnt) {
			/* XXX this is a slow way to read one character at
			 * XXX a time if, e.g., the stream is unbuffered
			 */
			continue;
		}
		/* payback!  now we don't have to check for buffer overflows or
		 * EOF inside the loop, nor does the macro _filbuf() branch force
		 *  _ptr and _cnt in and out of memory on each iteration
		 */
		ms_ptr = fp->_ptr;
		assert(ms_cnt > 0);
		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;
		do {
			/* XXX unclear to me why MS's getc macro does "& 0xff" */
			*pBuf++ = ch = *ms_ptr++ & 0xff;
		} while (--i && ch != '\n');
		/* update the shadows & counters */
		fp->_ptr = ms_ptr;
		free_buf_size -= max_to_copy - i;
		fp->_cnt = ms_cnt - (max_to_copy - i);
		if (ch == '\n')
			break;
	}
	Py_END_ALLOW_THREADS
	_PyString_Resize(&v, pBuf - BUF(v));
	return v;
}
#endif

2. Within get_line, add this before the #endif (this is the getline #if block):

#elif defined(MS_WIN32)
	if (n == 0) {
		return win32_getline(fp);
	}




From ping at lfw.org  Wed Jan  3 12:40:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 3 Jan 2001 05:40:47 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <14840.19556.127151.457533@anthem.concentric.net>
Message-ID: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>

Uh... hi.  <sheepish look>

I know i've all but dropped out of existence for a long time, what with
my simultaneous first stints as a grad student, a teaching assistant, and
a house cook (!) and all, but i didn't want to let this work go to waste.

Now that the holidays are here i can *finally* try to get some work done!

So, i've updated inspect.py in response to Barry's comments, and below is
my reply to this old thread.  I also wrote some regression tests.

I tried to submit inspect.py to SourceForge, but i got:

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is
    too long.  Maximum length is 16382

Does anyone know what's going on with that?


Anyway, the latest module and regression tests are available at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

for your perusal.




On Thu, 26 Oct 2000 barry at wooz.org wrote:
> Some thoughts after an initial scan of inspect.py:
> 
> - The doc strings for the is*() functions aren't accurate.
>   E.g. ismodule() says that it asks whether "the object is a module
>   with the __file__ special attribute", but that isn't really what it
>   tests!  Guido points out that builtin modules don't currently have
>   __file__ and besides, you're really testing that the type of the
>   object is ModuleType.

Perhaps a different wording would be better, but i should at least
clarify the intention: i wrote them that way because it seemed that
the current objects export an unofficial "interface" by means of the
special attributes they provide.  The purpose of the "is*()" functions
is to determine whether an object meets one of these interfaces.

A complete interface would provide (1) a type-checker, (2) a constructor,
and (3) the methods.  As for (2), we don't normally allow construction of
these things (except for wizards using the newmodule).  As for (3), i
suppose that one could further encapsulate these interfaces by providing
spelled-out methods like "def getcode(f): return f.func_code", but it
didn't seem worth the trouble.  So that left just (1), and i had the
other parts in mind while trying to describe (1).

The type-checkers aren't of much use unless they accurately reflect
the availability of the special attributes.  Do you see what i'm trying
to do?  Maybe you can suggest a better way of doing it... anyway, i've
tried to compromise in the docstrings as submitted.

> - Don't make the predicate in getmembers() default to "lambda x: 1"
>   Instead make the default None, and skip the predicate test if it is
>   None.

Okay, fine.
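
[A sketch of the suggested default, in present-day Python 3.
getmembers_sketch is a hypothetical stand-in, not the code in
inspect.py.]

```python
def getmembers_sketch(obj, predicate=None):
    """Return sorted (name, value) pairs, filtered by predicate when given."""
    results = []
    for name in dir(obj):
        try:
            value = getattr(obj, name)
        except AttributeError:
            continue                     # dir() may list unreadable names
        # Skip the test entirely when no predicate was supplied.
        if predicate is None or predicate(value):
            results.append((name, value))
    results.sort(key=lambda pair: pair[0])
    return results

class Sample:
    x = 1

    def method(self):
        pass

all_names = [name for name, _ in getmembers_sketch(Sample)]
callables = [name for name, _ in getmembers_sketch(Sample, callable)]
```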

> - getdoc()'s docstring should describe the margin munging it does.

Okay, done.

> - findsource() seems off-by one, e.g.
> 
>    >>> x = inspect.findsource(inspect.findsource)
>    >>> x[1]
>    138
> 
>    but the function really starts on line 139.

138 was the intended result here.  Indeed the function starts
on line 139 if you start counting from 1.  The reason it returns
138 is that it's the index you would use for the array of lines
(thus x[0][x[1]] or file.readlines()[138] is the first line of
the function).
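
[A sketch of the two conventions, in present-day Python 3.  find_def is
a hypothetical helper working on a list of lines, not the findsource()
under review.]

```python
import re

def find_def(lines, name):
    """Return the 0-based index of the line that starts `def name(`."""
    pattern = re.compile(r"\s*def\s+" + re.escape(name) + r"\s*\(")
    for index, line in enumerate(lines):
        if pattern.match(line):
            return index
    raise LookupError(name)

source = "import os\n\ndef first():\n    pass\n\ndef second():\n    pass\n"
lines = source.splitlines(keepends=True)

index = find_def(lines, "second")   # usable directly as lines[index]
line_number = index + 1             # what an editor or traceback would show
```

Returning the index keeps lines[index] correct without adjustment;
returning the line number matches what people see in an editor.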

Which way makes more sense?  Should it be changed?

> - I notice that currentframe() still uses the try/except trick to get
>   the frame object.  It's much more efficient to provide a C
>   trampoline for getting that information.

Sure, if there's a faster way, that's fine.  It just wasn't
something i expected to be used really often, and i wanted to
write the module in pure Python so it could be easily maintained.

I added a line to clobber the pure-Python currentframe() with
sys._getframe() if it exists.
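
[A sketch of that fallback arrangement, in present-day Python 3, where
sys._getframe() exists in CPython.  The function names are chosen for
the illustration and are not taken from inspect.py.]

```python
import sys

def currentframe_fallback():
    """Pure-Python trick: raise, catch, and step back from the traceback."""
    try:
        raise Exception
    except Exception:
        return sys.exc_info()[2].tb_frame.f_back

# Prefer the C-level accessor when the interpreter provides one.
currentframe = getattr(sys, "_getframe", currentframe_fallback)

def where_am_i():
    fast = currentframe().f_code.co_name
    slow = currentframe_fallback().f_code.co_name
    return fast, slow

names = where_am_i()
```

Both calls report the frame of the function that made the call, so the
two implementations are interchangeable for callers.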

> - If this were included in the library, we might want to 2.0-ify it.

It currently doesn't rely on any 2.0 features, and it would be
kind of nice to have it still work with 1.5 (especially if it is
part of a drop-in documentation tool, as it is now, since it goes
with htmldoc).


-- ?!ng

"Computers are useless.  They can only give you answers."
    -- Pablo Picasso





From guido at python.org  Wed Jan  3 13:06:33 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:06:33 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>

Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
widespread that is -- do Linux developers pay attention to this
standard at all?  According to the webpage it's (c) 1997.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 03 Jan 2001 10:58:44 +0200
From:    Erno Kuusela <erno at iki.fi>
To:      guido at python.org
Subject: getc_unlocked note

hello,

i was reading the python-dev archives and saw that someone had noticed
my getline/getc_unlocked post from the newsgroup. a correction to the
python-dev thread: getc_unlocked and friends are in fact standard (not c99
though since c99 doesn't specify threads); they are part of the single
unix specification.

link:
http://www.opennc.org/onlinepubs/007908799/xsh/getc_unlocked.html

   -- erno

------- End of Forwarded Message




From guido at python.org  Wed Jan  3 13:37:11 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 07:37:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 04:32:55 EST."
             <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEIDIGAA.tim.one@home.com> 
Message-ID: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>

> 1. That Perl still takes only 6 seconds in line-at-a-time mode.

Are you sure Perl still uses stdio at all?

If so, does it open the file in binary or in text mode?  Based on the
APIs in MS's libc, I presume that the crlf->lf translation is not done
by stdio proper but by the Unix I/O emulation just underneath it
(open() has an O_BINARY option flag, so read() probably does the
translation).  That comes down to copying most bytes an extra time.

(To test this hypothesis, you could try to open the test file with
mode "rb" and see if it makes a difference.)
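
For readers who want to see the visible effect of the two modes, here is a
small sketch in present-day Python.  One caveat is an assumption worth
stating loudly: Python 3 performs the translation in its own io layer, not
in the MS C runtime discussed here, so this shows only what the caller
observes, not where the extra copy happens.

```python
import os
import tempfile

# Two CRLF-terminated lines, written as raw bytes so nothing is translated.
payload = b"spam\r\neggs\r\n"

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

try:
    # Text mode: every "\r\n" arrives as "\n" (universal newlines).
    with open(path, "r", newline=None) as f:
        text_lines = f.readlines()

    # Binary mode: bytes come through untouched, "\r" included.
    with open(path, "rb") as f:
        binary_lines = f.readlines()
finally:
    os.remove(path)

text_chars = sum(len(line) for line in text_lines)
binary_chars = sum(len(line) for line in binary_lines)
print(text_chars, binary_chars)
```

The character counts differ by one per line, which is why a timing run in
binary mode is faster but reports different totals.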

> 2. I originally wrote a getline workalike, instead of building
> directly into a PyString buffer.  That made my test run *slower*,
> and I'm talking factor of 2, not a yawn.  To judge from my usually
> silent disk (I've got 256Mb RAM on this box), I'm afraid the extra
> mallocs required may have triggered the horrid Win9x
> malloc-thrashing problem I wrote about while I was still at Dragon.
> Consider that another vote for Vlad's PyMalloc -- we've got no
> handle on x-platform dynamic memory behavior now.  Python's destiny
> is to replace both the platform OS and libc anyway <0.9 wink>.
>
> The scary parts:
>
> + As the "XXX" comments indicate, this is full of little
> insecurities.

My biggest worry: thread-safety.  There must be a way to lock the file
(you indicated that fgets() uses it).

> + Another one I just thought of: if the user's last operations on
> the fp were two or more consecutive ungetc calls, all bets are off.
> But then MS doesn't define what happens then either.

Python doesn't have an interface to ungetc(), and I believe the stdio
standard says you can only call ungetc() once consecutively.  Assuming
other C code linked with Python obeys this rule (a pretty safe
assumption), we should be fine.  And if the assumption is violated, I
presume it's really that C code's fault -- plus, code that only
uses getc() would be screwed just as badly.

> + This is much less ambitious than I recall Perl's code being: it
> doesn't try to guess anything about the file, and effectively
> captures only what would happen if you could unroll the guts of a
> getc-in-a-loop and optimize the snot out of it.  The good news is
> that this means it's much easier to maintain (it touches only two of
> the MS FILE* fields, and in ways that are pretty obviously correct).
> The bad news is that this seems also pretty clearly all there *is*
> to be gotten out of breaking into the FILE* abstraction for the
> particular test case I'm using; and increasing TUNEME doesn't save
> any time at all: the sucker is flying at full speed already.

You probably don't have many lines longer than 1000 characters.

> + It (line-at-a-time) drops to a little under 13 seconds if I
> comment out the thread macros.

If you mean the Py_BLOCK_THREADS around the resize, that can be safely
dropped.  (If/when we introduce Vladimir's malloc, we'll have to
decide whether it is threadsafe by itself or whether it requires the
global interpreter lock.  I vote to make it threadsafe by itself.)

> + I haven't looked at Perl's implementation in a year, and they must
> have dreamt up another trick since then.  That's a "scary part"
> indeed to anyone who has ever looked at Perl's implementation.
>
> retreating-into-a-fetal-position-ly y'rs - tim
> 
> 
> Anyone wants to play, the sandbox is fileobject.c.  Do two things:
> insert this new chunk somewhere above get_line:
> 
> #ifdef MS_WIN32
> static PyObject*
> win32_getline(FILE *fp)
> {
> 	/* XXX ignores thread safety -- but so does MS's getc macro! */
> 	PyObject* v;
> 	char* pBuf;	/* next free slot in v's buffer */
> 	/* MS's internals are declared in terms of ints, but it's a sure bet
> 	 * that won't last forever -- use size_t now & live w/ the casting;
> 	 * ditto for Python's routines
> 	 */
> 	size_t total_buf_size = 100;
> 	size_t free_buf_size = total_buf_size;
> #define TUNEME 1000	/* how much to boost the string buffer when exhausted */
> 
> 	v = PyString_FromStringAndSize((char *)NULL, (int)total_buf_size);
> 	if (v == NULL)
> 		return NULL;
> 	pBuf = BUF(v);
> 	Py_BEGIN_ALLOW_THREADS
> 	for (;;) {
> 		char ch;
> 		size_t ms_cnt;	/* FILE->_cnt shadow */
> 		char* ms_ptr;	/* FILE->_ptr shadow */
> 		size_t max_to_copy, i;
> 		/* stdio buffer empty or in unknown state; rather
> 		 * than try to simulate every quirk of MS's internals,
> 		 * let the MS macros deal with it.
> 		 */
> 		/* XXX we also wind up here when we simply run out of string
> 		 * XXX buffer space, but I'm not sure I care:  making this a
> 		 * XXX double-nested loop doesn't seem worth it
> 		 */
> 		ch = getc(fp);
> 		if (ch == EOF)
> 			break;
> 		/* make sure we've got some breathing room */
> 		if (free_buf_size < 100) {
> 			size_t currentoffset = pBuf - BUF(v);
> 			total_buf_size += TUNEME;  /* XXX check for overflow */
> 			Py_BLOCK_THREADS
> 			if (_PyString_Resize(&v, (int)total_buf_size) < 0)
> 				return NULL;
> 			Py_UNBLOCK_THREADS
> 			pBuf = BUF(v) + currentoffset;
> 			free_buf_size = TUNEME;
> 		}
> 		/* ch wasn't EOF, so store it */
> 		*pBuf++ = ch;
> 		--free_buf_size;
> 		if (ch == '\n') {
> 			break;
> 		}
> 		ms_cnt = (size_t)fp->_cnt;
> 		if (!ms_cnt) {
> 			/* XXX this is a slow way to read one character at
> 			 * XXX a time if, e.g., the stream is unbuffered
> 			 */
> 			continue;
> 		}
> 		/* payback!  now we don't have to check for buffer overflows or
> 		 * EOF inside the loop, nor does the macro _filbuf() branch force
> 		 *  _ptr and _cnt in and out of memory on each iteration
> 		 */
> 		ms_ptr = fp->_ptr;
> 		assert(ms_cnt > 0);
> 		i = max_to_copy = ms_cnt < free_buf_size ? ms_cnt : free_buf_size;

Doesn't it make more sense to delay the resize until this point?  I
don't know how much the character copying accounts for, but I could
imagine a strategy based on memchr() and memcpy() that first searches
for a \n, and if found, allocates to the right size before copying.
Typically, the buffer contains many lines, so this could be optimized
into requiring a single exactly-sized malloc() call in the common case
(where the buffer doesn't wrap).  But possibly scanning the buffer for
\n and then copying the bytes separately, even with memchr() and
memcpy(), slows things down too much for this to be faster.
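
The search-then-copy idea can be sketched in Python, with bytes.find()
standing in for memchr() and a slice standing in for the exactly-sized
allocation plus memcpy().  The function name read_line_from_buffer and the
(buf, pos) pair are hypothetical stand-ins for the stdio buffer and its
read pointer; this is an illustration of the strategy, not the code in
fileobject.c.

```python
def read_line_from_buffer(buf, pos):
    """Return (line, new_pos), or (None, pos) if no newline is buffered.

    buf stands in for the stdio buffer and pos for the read pointer.
    """
    # memchr() analogue: locate the newline before copying anything.
    newline_at = buf.find(b"\n", pos)
    if newline_at == -1:
        return None, pos  # caller would fall back to the slow path
    end = newline_at + 1
    # memcpy() analogue: one exactly-sized copy, so no later resize.
    line = bytes(buf[pos:end])
    return line, end


buffer_contents = bytearray(b"first\nsecond\npartial")
position = 0
while True:
    line, position = read_line_from_buffer(buffer_contents, position)
    if line is None:
        break
    print(line)
```

The trailing "partial" fragment has no newline yet, so the sketch leaves it
in place for whatever refills the buffer.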

> 		do {
> 			/* XXX unclear to me why MS's getc macro does "& 0xff" */
> 			*pBuf++ = ch = *ms_ptr++ & 0xff;

I know why.  getchar() returns an int in the range [-1, 255].  If
chars are signed the &0xff is needed else you would get a return in
the range [-128, 127] and -1 would be ambiguous (EOF==-1).  Not sure
if they *are* signed on any MS platform -- if they aren't, whoever
coded this wasn't thinking -- on the other hand the compiler probably
optimizes it out.  But here since you're copying to another character,
it's pointless.
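
The ambiguity is easy to reproduce outside C.  In this small Python sketch
the struct format "b" plays the role of a signed char; EOF = -1 is the
usual stdio value and is an assumption about the platform rather than
something the C standard fixes.

```python
import struct

EOF = -1  # the usual stdio value

raw = b"\xff"  # a perfectly valid data byte

# Read the byte the way a signed char would see it.
(as_signed_char,) = struct.unpack("b", raw)

# Masking restores the 0..255 value that getc() is meant to return.
as_getc_result = as_signed_char & 0xFF

print(as_signed_char == EOF)  # does the unmasked value collide with EOF?
print(as_getc_result)
```

Without the mask, a 0xFF data byte is indistinguishable from end-of-file;
with it, the value is widened safely before being returned as an int.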

> 		} while (--i && ch != '\n');
> 		/* update the shadows & counters */
> 		fp->_ptr = ms_ptr;
> 		free_buf_size -= max_to_copy - i;
> 		fp->_cnt = ms_cnt - (max_to_copy - i);
> 		if (ch == '\n')
> 			break;
> 	}
> 	Py_END_ALLOW_THREADS
> 	_PyString_Resize(&v, pBuf - BUF(v));
> 	return v;
> }
> #endif
> 
> 2. Within get_line, add this before the #endif (this is the getline #if block):
> 
> #elif defined(MS_WIN32)
> 	if (n == 0) {
> 		return win32_getline(fp);
> 	}

Note that get_line() with negative n could be implemented as
get_line(0) with some post processing.  This should be done completely
separately, in PyFile_GetLine.  The negative n case is only used by
raw_input() -- it means strip the \n and raise EOFError for EOF, and I
expect that this is rarely if ever used in a speed-conscious
situation.
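
As a sketch of that layering in Python terms: get_line below is a stand-in
for the n == 0 case, and get_line_stripped (a hypothetical name) adds the
negative-n behaviour on top, stripping the newline and raising EOFError at
end of file.

```python
import io


def get_line(fp):
    """Stand-in for get_line(fp, 0): one line, newline kept, '' at EOF."""
    return fp.readline()


def get_line_stripped(fp):
    """Sketch of the negative-n behaviour layered on top of get_line()."""
    line = get_line(fp)
    if not line:
        raise EOFError("EOF when reading a line")
    if line.endswith("\n"):
        line = line[:-1]
    return line


fp = io.StringIO("one\ntwo")
print(get_line_stripped(fp))
print(get_line_stripped(fp))
try:
    get_line_stripped(fp)
except EOFError as exc:
    print("EOFError:", exc)
```

Keeping the post-processing in a wrapper leaves the fast path free of the
special case.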

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan  3 15:56:31 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 09:56:31 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 03 Jan 2001 07:06:33 EST."
             <200101031206.HAA19182@cj20424-a.reston1.va.home.com> 
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com> 
Message-ID: <200101031456.JAA19990@cj20424-a.reston1.va.home.com>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?  According to the webpage it's (c) 1997.

Erno Kuusela gave me some more info about this; glibc supports it.

I did a quick test which suggests that it is a lot faster than regular
getc() -- on a small test file it's actually faster than GNU
getline(), even with the proper flockfile() / funlockfile() calls.
(The test file was 6Mb -- 10 copies of /etc/termcap, which has short
lines -- avg 43 chars.)

This together with Tim's Win32-specific hacks might be the best we
can do for get_line().  However, raw xreadlines is still almost twice
as fast, so it's still under consideration.

Maybe MS supports a similar unlocked getc macro, and a separate
primitive to lock/unlock a file?  That would allow more unified code.

(Quick research shows that it exists, but only in internal form.  We
could probably call _lock_file() and _unlock_file(), and define our
own getc_lk(), protected by the proper set of macros.  This could all
be presented by config.h as flockfile(), funlockfile(), and
getc_unlocked() macros.)
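
The reason one lock per line beats one lock per character can be shown
with a toy model in Python.  LockedStream is a hypothetical class invented
for this illustration; its method names merely echo flockfile(),
funlockfile() and getc_unlocked(), and it does not call the C library.

```python
import io
import threading


class LockedStream:
    """Toy stand-in for a FILE* with an explicit lock (hypothetical class)."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)
        self._lock = threading.RLock()
        self.lock_count = 0  # how many times the lock was taken

    def flockfile(self):
        self._lock.acquire()
        self.lock_count += 1

    def funlockfile(self):
        self._lock.release()

    def getc_unlocked(self):
        # Caller must already hold the lock.
        return self._buf.read(1)

    def getc(self):
        # Locks and unlocks around every single character.
        self.flockfile()
        try:
            return self.getc_unlocked()
        finally:
            self.funlockfile()


def readline_with_getc(stream):
    chars = []
    while True:
        ch = stream.getc()
        if not ch:
            break
        chars.append(ch)
        if ch == b"\n":
            break
    return b"".join(chars)


def readline_with_getc_unlocked(stream):
    chars = []
    stream.flockfile()  # one lock for the whole line
    try:
        while True:
            ch = stream.getc_unlocked()
            if not ch:
                break
            chars.append(ch)
            if ch == b"\n":
                break
    finally:
        stream.funlockfile()
    return b"".join(chars)


slow = LockedStream(b"hello\nworld\n")
fast = LockedStream(b"hello\nworld\n")
slow_line = readline_with_getc(slow)
fast_line = readline_with_getc_unlocked(fast)
print(slow_line, slow.lock_count)
print(fast_line, fast.lock_count)
```

Both readers return the same line; the difference is only in how often the
lock is taken, which is where the per-character cost comes from.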

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 16:27:09 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:27:09 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:06:33AM -0500
References: <200101031206.HAA19182@cj20424-a.reston1.va.home.com>
Message-ID: <20010103102709.A19451@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 07:06:33AM -0500, Guido van Rossum wrote:
>Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
>widespread that is -- do Linux developers pay attention to this
>standard at all?  According to the webpage it's (c) 1997.

It seems to be in glibc 2.1, but I don't know how much it would help,
and the added complexity of having to lock the file separately worries
me, perhaps due to a superstitious fear of angering the Thread Gods.

--amk



From akuchlin at mems-exchange.org  Wed Jan  3 16:44:57 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 10:44:57 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>; from thomas.heller@ion-tof.com on Wed, Jan 03, 2001 at 04:35:10PM +0100
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>
Message-ID: <20010103104457.A19493@kronos.cnri.reston.va.us>

[Cc'ing to python-dev].  

On Wed, Jan 03, 2001 at 04:35:10PM +0100, Thomas Heller wrote:
>You didn't expect this script to run under Windows?
>(It does not run)

It shouldn't matter, I think, since the makesetup stuff doesn't run on
Windows either; presumably the compiled-in modules are specified by an
MSVC project file, or something similar.  Can anyone confirm that I
don't care if setup.py works on Windows?  (Well, I *know* for a fact I
don't care; but should I? :) )

--amk




From guido at python.org  Wed Jan  3 16:49:43 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 10:49:43 -0500
Subject: [Python-Dev] Help wanted with setup.py script
In-Reply-To: Your message of "Wed, 03 Jan 2001 10:44:57 EST."
             <20010103104457.A19493@kronos.cnri.reston.va.us> 
References: <200012290046.TAA01346@207-172-57-128.s128.tnt2.ann.va.dialup.rcn.com> <017201c0759a$c2b180c0$e000a8c0@thomasnotebook>  
            <20010103104457.A19493@kronos.cnri.reston.va.us> 
Message-ID: <200101031549.KAA20188@cj20424-a.reston1.va.home.com>

> It shouldn't matter, I think, since the makesetup stuff doesn't run on
> Windows either; presumably the compiled-in modules are specified by an
> MSVC project file, or something similar.  Can anyone confirm that I
> don't care if setup.py works on Windows?  (Well, I *know* for a fact I
> don't care; but should I? :) )

Personally, I don't think it's worth making setup.py work for
Windows.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 21:04:07 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 15:04:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>; from noreply@sourceforge.net on Wed, Jan 03, 2001 at 08:47:30AM -0800
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>
Message-ID: <20010103150407.D20301@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 08:47:30AM -0800, GvR wrote:
>Summary: speed up readline() using getc_unlocked()

So what does the performance of this version look like?

--amk



From guido at python.org  Wed Jan  3 21:25:53 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:25:53 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:04:07 EST."
             <20010103150407.D20301@kronos.cnri.reston.va.us> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net>  
            <20010103150407.D20301@kronos.cnri.reston.va.us> 
Message-ID: <200101032025.PAA27457@cj20424-a.reston1.va.home.com>

> >Summary: speed up readline() using getc_unlocked()
> 
> So what does the performance of this version look like?

Very slightly faster than the GNU getline() version.  Without GNU
getline, the old code was about 3.5 times slower.

Here are the current times on a 6 Mb file (fileinput.py has my
sourceforge speedup patch too):

$ ./python ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.943  0.930
readlines_sizehint    0.544  0.540
using_fileinput       2.089  2.090
while_readline        0.956  0.960

For comparison, here's what Python 1.5.2 does with the same test
(which should be pretty close to what the released Python 2.0 does; I
don't have a copy of that handy).

$ python1.5 ~/rltest.py ~/termcapx10 
total 6252720 chars and 146250 lines; average line length 42.8
count_chars_lines     0.836  0.820
readlines_sizehint    0.523  0.520
using_fileinput       5.739  5.740
while_readline        3.670  3.670

I don't know why count_chars_lines got proportionally slower than
readlines_sizehint.  (The += operator didn't make a difference either
way.)
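
The rltest.py script itself is not shown in this thread.  For orientation
only, here is a guess at what three of the timed loops might look like in
present-day Python; the function bodies, the sizehint value and the sample
file are all assumptions, and count_chars_lines is left out because its
definition cannot be inferred from the output.

```python
import fileinput
import os
import tempfile
import time


def while_readline(path):
    # One readline() call per line.
    chars = lines = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            chars += len(line)
            lines += 1
    return chars, lines


def readlines_sizehint(path, sizehint=8 * 1024):
    # Pull in batches of lines, roughly sizehint characters at a time.
    chars = lines = 0
    with open(path) as f:
        while True:
            chunk = f.readlines(sizehint)
            if not chunk:
                break
            for line in chunk:
                chars += len(line)
                lines += 1
    return chars, lines


def using_fileinput(path):
    # Iterate through the fileinput module.
    chars = lines = 0
    with fileinput.input(files=[path]) as f:
        for line in f:
            chars += len(line)
            lines += 1
    return chars, lines


def timed(func, path):
    start = time.perf_counter()
    result = func(path)
    return result, time.perf_counter() - start


def make_sample(line_count=1000):
    # Small throwaway input file; the real test used 10 copies of termcap.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for i in range(line_count):
            tmp.write("line %d of the sample file\n" % i)
        return tmp.name


sample_path = make_sample()
try:
    results = {}
    for func in (while_readline, readlines_sizehint, using_fileinput):
        results[func.__name__], elapsed = timed(func, sample_path)
        print("%-20s %.3f" % (func.__name__, elapsed))
finally:
    os.remove(sample_path)
```

All three loops should agree on the character and line totals; only the
elapsed times are expected to differ.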

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan  3 21:45:38 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 15:45:38 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103082] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 15:25:53 EST."
             <200101032025.PAA27457@cj20424-a.reston1.va.home.com> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us>  
            <200101032025.PAA27457@cj20424-a.reston1.va.home.com> 
Message-ID: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>

I should add that the patches are on SourceForge:

fileinput.py:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103081&group_id=5470

fileobject.c:
http://sourceforge.net/patch/?func=detailpatch&patch_id=103082&group_id=5470

I'm ready to check these in, but I'm waiting 24 hours in case there's
something I've missed.  (I haven't actually tested these on any other
platform besides Linux.)

Jeff Epler's xreadlines patch is here:
http://sourceforge.net/patch/?func=detailpatch&patch_id=102915&group_id=5470

Note that Jeff's patch includes a patch to fileinput.py that does the
same thing as mine but using his xreadlines module instead of directly
using readlines(sizehint) as does mine.  I like my approach better,
mostly because it reduces dependencies.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Wed Jan  3 22:25:30 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 3 Jan 2001 16:25:30 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: <200101032045.PAA27595@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 03:45:38PM -0500
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>
Message-ID: <20010103162530.A20433@kronos.cnri.reston.va.us>

On Wed, Jan 03, 2001 at 03:45:38PM -0500, Guido van Rossum wrote:
>I'm ready to check these in, but I'm waiting 24 hours in case there's
>something I've missed.  (I haven't actually tested these on any other
>platform besides Linux.)

On Solaris 2.6, the configure script doesn't detect that
getc_unlocked() & friends are supported; details available from the
patch.  After editing config.h manually to enable them, the results are:

Before getc_unlocked patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.892  0.730
readlines_sizehint    0.329  0.300
using_fileinput       4.612  4.470
while_readline        2.739  2.670

After patch:
total 1559913 chars and 32513 lines
count_chars_lines     0.698  0.680
readlines_sizehint    0.273  0.270
using_fileinput       2.707  2.700
while_readline        0.778  0.780

With a patched version of fileinput.py:
using_fileinput       1.675  1.680

--amk



From guido at python.org  Wed Jan  3 22:36:07 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 03 Jan 2001 16:36:07 -0500
Subject: [Python-Dev] speed up readline() using getc_unlocked()
In-Reply-To: Your message of "Wed, 03 Jan 2001 16:25:30 EST."
             <20010103162530.A20433@kronos.cnri.reston.va.us> 
References: <E14Dr4U-0006lx-00@usw-sf-web1.sourceforge.net> <20010103150407.D20301@kronos.cnri.reston.va.us> <200101032025.PAA27457@cj20424-a.reston1.va.home.com> <200101032045.PAA27595@cj20424-a.reston1.va.home.com>  
            <20010103162530.A20433@kronos.cnri.reston.va.us> 
Message-ID: <200101032136.QAA07752@cj20424-a.reston1.va.home.com>

> On Solaris 2.6, the configure script doesn't detect that
> getc_unlocked() & friends are supported; details available from the
> patch.

(Fixed now, see the new patch.)

> After editing config.h manually to enable them, the results are:
> 
> Before getc_unlocked patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.892  0.730
> readlines_sizehint    0.329  0.300
> using_fileinput       4.612  4.470
> while_readline        2.739  2.670
> 
> After patch:
> total 1559913 chars and 32513 lines
> count_chars_lines     0.698  0.680
> readlines_sizehint    0.273  0.270
> using_fileinput       2.707  2.700
> while_readline        0.778  0.780
> 
> With a patched version of fileinput.py:
> using_fileinput       1.675  1.680

Thanks!  The bottom line seems to be that your basic readline loop is
still 3x as slow as the fastest way -- so there's still a lot to say
for xreadlines...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Jan  3 22:42:48 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 03 Jan 2001 22:42:48 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib codecs.py,1.13,1.14
References: <E14DvT9-00079N-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A539CD8.367361B8@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv26608/Lib
> 
> Modified Files:
>         codecs.py
> Log Message:
> ...
> 
> This patch closes the bugs #116285 and #119960.

I was too fast... the subject line of #119960 was misleading.
It is still open.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Thu Jan  4 00:13:15 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 18:13:15 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

Pretty sure, but there are so many layers of macros the code is
undecipherable, and I can't step thru macros in the debugger either (that's
assuming I wanted to devote N hours to building Perl from source too --
which I don't).  Perl also makes heavy use of macroizing std library names,
so e.g. when I see "fopen" (which I do!), that doesn't mean I'm getting the
fopen I'm thinking of.  But the MSVC config files define all sorts of macros
to get at the MS stdio _cnt and _ptr (and most other) FILE* fields, and the
version of fopen in the Win32 stuff appears to defer to the platform fopen
(after doing Perlish stuff, like if someone passed "/dev/null" as the file
name, Perl changes it to "NUL").

This is what it's like:  the first line of Perl's win32_fopen is this:

    dTHXo;

That's conditionally defined in perl.h, either as

#define dTHXo			dTHXoa(PERL_GET_THX)

or, if pTHXo is not defined, as

#  define dTHXo		dTHX

dTHX in turn is #defined in 4 different places across 3 different files in 2
different directories.  I'll skip those.  OTOH, dTHXoa is easy!  It's only
defined once:

#define dTHXoa(a)		pTHXo = a

Ah, *that* clears it up <wink>.  Etc.  20 years ago I may have thought this
was fun.  I thought debugging large systems of m4 macros was fun then, and
I'm not sure this is either better or worse than that -- well, it's worse,
because I understood m4's implementation.


> If so, does it open the file in binary or in text mode?

Sorry, but I really don't know and it's a pit to pursue.  If it's not native
text mode, they do a good job of faking it (e.g., Ctrl-Z acts like an EOF
when reading a text file from Perl on Windows -- not something even Larry
would be likely to do on his own <wink>).

> Based on the APIs in MS's libc, I presume that the crlf->lf
> translation is not done by stdio proper but by the Unix I/O
> emulation just underneath it (open() has an O_BINARY option
> flag, so read() probably does the translation).

Yes; and late in the last release cycle, import.c's open_exclusive had a
Windows bug related to this (fdopen() used "wb", but the earlier open()
didn't use O_BINARY, and fdopen *acted* like it had used "w").  Also, the MS
setmode() function works on file handles, not streams.

> That comes down to copying most bytes an extra time.

Understood.  But the CRLF are stored physically on disk, so unless the disk
controller is converting them, *someone's* software (whether MS's or Perl's)
is doing it.  By the time Perl is doing its fast line-input stuff, and doing
what sure looks like a straight copy out of an IO buffer, it's clear from
the code that CRLF has already been translated to LF.

> (To test this hypothesis, you could try to open the test file
> with mode "rb" and see if it makes a difference.)

In Python, that saved about 10% (but got the wrong answers <wink>).  In
Perl, about 15-20%.  But I don't think that tells us who's doing the
translation.  Assuming that the translation takes about the same total time
for each, it makes sense that the percentage would be higher for Perl (since
its total runtime is lower:  same-sized slice of a smaller pie).

> My biggest worry: thread-safety.  There must be a way to lock
> the file (you indicated that fgets() uses it).

Yes, via the unadvertised _lock_str and _unlock_str macros defined in MS
mtdll.h, which is not on the include path:

/*
 * This is an internal C runtime header file. It is used when building
 * the C runtimes only. It is not to be used as a public header file.
 */

The routines and macros it calls are also unadvertised.  After an hour of
thrashing I wasn't able to successfully link any code trying to call these
routines.  Doesn't mean it's impossible, does mean they're internal to MS
libc and aren't meant to be called by anything else.  That's why it's called
"cheating" <wink>.  Perl appears to ignore the whole issue (but Perl's
thread story is muddy at best).

[... ungetc ...]

Not worried here either.

> ...
> You probably don't have many lines longer than 1000 characters.

None, in fact.

>> + It (line-at-a-time) drops to a little under 13 seconds if I
>> comment out the thread macros.

> If you mean the Py_BLOCK_THREADS around the resize, that can be safely
> dropped.

I meant *all* thread-related macros -- was just trying to get a feel for how
much that fiddling cost (it's an expense Perl doesn't seem to have -- yet).
Was measurable but not substantial.  WRT the resize, there's now a "fast
path" that avoids it.

> (If/when we introduce Vladimir's malloc, we'll have to decide whether
> it is threadsafe by itself or whether it requires the global
> interpreter lock.  I vote to make it threadsafe by itself.)

As feared, this thread is going to consume my life <0.5 wink>.

> ...
> Doesn't it make more sense to delay the resize until this point?  I
> don't know how much the character copying accounts for, but I could
> imagine a strategy based on memchr() and memcpy() that first searches
> for a \n, and if found, allocates to the right size before copying.
> Typically, the buffer contains many lines, so this could be optimized
> into requiring a single exactly-sized malloc() call in the common case
> (where the buffer doesn't wrap).  But possibly scanning the buffer for
> \n and then copying the bytes separately, even with memchr() and
> memcpy(), slows things down too much for this to be faster.

Turns out that Perl does very much what I was doing; the Perl code is
actually more burdensome, because its routine is trying to deal not only
with \n-termination, but also arbitrary-string termination (Perl's Awk-like
input record separator), and "paragraph mode", and fixed-size reads, and
some other stuff I can't figure out from the macro names.  In all cases with
a terminator, though, it's doing the same business of both copying and
testing in a very tight inner loop.  It doesn't appear to make any serious
attempts to avoid resizing the buffer.  But, Perl has its own malloc
routines, and I'm guessing they're highly tuned for this stuff.

Since we're stuck with the MS malloc -- and Win9x's in particular seems
lame -- adding this near the start of my stuff did yield a nice speedup:

	if (fp->_cnt > 0 &&
	    (pBuf = (char *)memchr(fp->_ptr, '\n', fp->_cnt)) != NULL) {
	    	/* it's all in the buffer so don't bother releasing the
	    	 * global lock
	    	 */
		total_buf_size = pBuf - fp->_ptr + 1;
		v = PyString_FromStringAndSize(fp->_ptr,
			                       (int)total_buf_size);
		if (v != NULL) {
			pBuf = BUF(v) + total_buf_size;
			fp->_cnt -= total_buf_size;
			fp->_ptr += total_buf_size;
		}
		goto done;
	}

So that builds the result string directly from the stdio buffer when it can.
Times dropped from (before this particular small hack)

count_chars_lines    14.880 14.854
readlines_sizehint    9.280  9.302
using_fileinput      48.610 48.589
while_readline       13.450 13.451

to

count_chars_lines    14.780 14.784
readlines_sizehint    9.550  9.514
using_fileinput      43.560 43.584
while_readline       10.600 10.578

Since I have no long lines in this test data, and the stdio buffer typically
contains thousands of chars, most calls should be satisfied by the fast
path.  Compared to the previous code, the fast path (1) avoids global lock
fiddling (but that didn't account for much in a distinct test); (2) crawls
over the buffer twice instead of once; and, (3) avoids one (shrinking!)
realloc.  So crawling over the buffer an extra time costs nothing compared
to the cost of a resize; and that's likely just more evidence that
malloc/realloc suck on this platform.
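
For scale, the relative change can be worked out from the two tables above.
The snippet below only copies the first timing column from each table and
computes percentages; nothing in it is newly measured.

```python
# Timings copied from the two tables above (first column, in seconds).
before = {
    "count_chars_lines": 14.880,
    "readlines_sizehint": 9.280,
    "using_fileinput": 48.610,
    "while_readline": 13.450,
}
after = {
    "count_chars_lines": 14.780,
    "readlines_sizehint": 9.550,
    "using_fileinput": 43.560,
    "while_readline": 10.600,
}

# Percentage change per test; negative means the fast path made it faster.
change = {
    name: 100.0 * (after[name] - before[name]) / before[name]
    for name in before
}

for name, pct in change.items():
    print("%-20s %+6.1f%%" % (name, pct))
```

The readline-based tests improve by roughly a fifth and a tenth
respectively, while the two tests that do not go through the fast path move
by a few percent at most in either direction.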

CAUTION:  no file locking is going on now (because I haven't found a way to
do it).  My previous claim that the MS getc macro did no locking was wrong,
as I discovered by stepping thru the generated machine code.  stdio.h
#defines getc without locking, but in _MT mode it later gets #undef'ed and
turned into a function call.

>> /* XXX unclear to me why MS's getc macro does "& 0xff" */
>>			*pBuf++ = ch = *ms_ptr++ & 0xff;

> I know why.  getchar() returns an int in the range [-1, 255].  If
> chars are signed the &0xff is needed else you would get a return in
> the range [-128, 127] and -1 would be ambiguous (EOF==-1).

Bingo -- MS chars are signed.

> ...
> But here since you're copying to another character, it's pointless.

Yup!  Gone.

> ....
> Note that get_line() with negative n could be implemented as
> get_line(0) with some post processing.

Andrew's glibc getline code appears to have wanted to do that, but looks to
me like it's unreachable (unless I'm hallucinating, the "n < 0" test after
return from glibc getline can't succeed, because the enclosing block is
guarded by an "n==0" test).

> This should be done completely separately, in PyFile_GetLine.

I assume you have an editor <wink>.

> The negative n case is only used by raw_input() -- it means strip
> the \n and raise EOFError for EOF, and I expect that this is rarely
> if ever used in a speed-conscious situation.

I've never seen raw_input used except when stdin and stdout were connected
to a tty.  When I tried raw_input from a DOS box under the debugger, it
never called get_line.  Something trickier is going on there; I suspect it's
actually calling fgets (eventually) instead in that case.

more-mysteries-than-i-really-need-ly y'rs  - tim




From jeremy at alum.mit.edu  Thu Jan  4 01:06:58 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 3 Jan 2001 19:06:58 -0500 (EST)
Subject: [Python-Dev] Mailman problems?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
Message-ID: <14931.48802.273143.209933@localhost.localdomain>

Tim & Barry,

It looks like there is some problem with Mailman that is garbling
messages to python-dev.  It may only affect lines that begin with a
tab; not sure.

Your most recent message came through with the following line

>    dTHXo;

(This was not the only example.)

I think this was supposed to be a line of C code, but whatever
meaningful contents it had were rendered as gobbledygook.

Jeremy


    
    



From loewis at informatik.hu-berlin.de  Thu Jan  4 01:13:16 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Thu, 4 Jan 2001 01:13:16 +0100 (MET)
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>

> Apparently getc_unlocked() is in the Single Unix spec.  Not sure how
> widespread that is -- do Linux developers pay attention to this
> standard at all?

Ulrich Drepper, who is in charge of glibc, is always interested in
following Single Unix to the letter; getc_unlocked is supported
at least since glibc 2.0.

http://www.sun.com/smcc/solaris-migration/docs/courses/threadsHTML/adv.html

claims that getc_unlocked is already in POSIX.1c; Solaris apparently
supports it at least since Solaris 2.4.

Irix has it since 6.5, Tru64 at least since 4.0d (probably much
longer); HPUX since 11.0, AIX since at least 4.3.

Of the BSDs, only OpenBSD appears to support it; it knows that it is
in ANSI 1003.1 since 1996-07-12.

SCO OpenServer doesn't support it.

Regards,
Martin



From fredrik at effbot.org  Thu Jan  4 01:20:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 01:20:41 +0100
Subject: [Python-Dev] Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com><LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com> <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <011901c075e4$2ce96360$e46940d5@hagrid>

> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
> 
> >    dTHXo;
> 
> (This was not the only example.)
> 
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

also looks like Mailman removed all smileys from
Jeremy's post ;-)

</F>




From thomas at xs4all.net  Thu Jan  4 01:27:54 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 01:27:54 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Thu, Jan 04, 2001 at 01:13:16AM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de>
Message-ID: <20010104012753.D2467@xs4all.nl>

On Thu, Jan 04, 2001 at 01:13:16AM +0100, Martin von Loewis wrote:

> Of the BSDs, only OpenBSD appears to support it; it knows that it is
> in ANSI 1003.1 since 1996-07-12.

BSDI supports getc_unlocked() at least since BSDI 3.1. I don't have any
older boxes to check, but the manpage for getc and all its friends carries
the timestamp 'June 4, 1993', which implies it could have been available a
lot longer. (Note that BSD was once known to *define* the standard ;-)

I concur that FreeBSD does not currently support getc_unlocked, but since
BSDI and FreeBSD are merging, I suspect it will, soonish.

In other words: use it! :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at wooz.org  Thu Jan  4 03:59:01 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 3 Jan 2001 21:59:01 -0500
Subject: [Python-Dev] Re: Mailman problems?
References: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCAEJLIGAA.tim.one@home.com>
	<14931.48802.273143.209933@localhost.localdomain>
Message-ID: <14931.59125.391596.730296@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> It looks like there is some problem with Mailman that is
    JH> garbling messages to python-dev.  It may only affect lines
    JH> that begin with a tab; not sure.

    JH> Your most recent message came through with the following line

    >> dTHXo;

    JH> (This was not the only example.)

    JH> I think this was supposed to be a line of C code, but whatever
    JH> meaningful contents it had were rendered as gobbledygook.

Oh shoot, my bad.  I dropped in an experimental Perl filter module in
the delivery pipeline.  It's been so long since I hacked Perl, I think
I meant to write $%_-> when I really wrote %$_->

-Barry




From tim.one at home.com  Thu Jan  4 05:26:51 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 3 Jan 2001 23:26:51 -0500
Subject: [Python-Dev] RE: Mailman problems?
In-Reply-To: <14931.48802.273143.209933@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>

[Jeremy]
> It looks like there is some problem with Mailman that is garbling
> messages to python-dev.  It may only affect lines that begin with a
> tab; not sure.
>
> Your most recent message came through with the following line
>
>>    dTHXo;
>
> (This was not the only example.)
>
> I think this was supposed to be a line of C code, but whatever
> meaningful contents it had were rendered as gobbledygook.

I have no idea where that "o" came from!  It was supposed to be "o".  Barry,
fix it!

BTW, the second line of Perl implementation functions is usually a lot less
mysterious than the first.  If anyone wants the joy of reverse-engineering
Perl's supernaturally fast input, it's function Perl_sv_gets in file sv.c.
sv.c?  Yes!  The destination of a one-line input is a Scalar Value, hence,
sc.  I expect there's similar method behind all of this stuff, but I never
stumbled into the key.  To get you started, here's the first line of
Perl_sv_gets:

    dTHR;

The line you're looking for is 119 lines down from that:

	    if ((*bp++ = *ptr++) == rslast)  /* really   |  dust */

the-comment-makes-more-sense-in-context<wink>-ly y'rs  - tim




From thomas at xs4all.net  Thu Jan  4 07:51:17 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 07:51:17 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101040037.TAA08699@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 03, 2001 at 07:37:22PM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>
Message-ID: <20010104075116.J402@xs4all.nl>

On Wed, Jan 03, 2001 at 07:37:22PM -0500, Guido van Rossum wrote:
> > In other words: use it! :)
> 
> Mind doing a few platform tests on the (new version of the) patch?

Well, only a bit :) It's annoying that BSDI doesn't come with autoconf, but
I managed to use all my early-morning wit (it's 6:30AM <wink>) to work
around it. I've tested it on BSDI 4.1 and FreeBSD 4.2-RELEASE.

> I already know that it works on Red Hat Linux 6.2 (my box) and Solaris
> 2.6 (Andrew's box).  I would be delighted to know that it works on at
> least one other platform that has getc_unlocked() and one platform
> that doesn't have it!

Sorry, I have to disappoint you. FreeBSD does have getc_unlocked, they
just didn't document it. Hurrah for autoconf ;P Anyway, it worked like a
charm on BSDI:

(Python 2.0)
total 1794310 chars and 37660 lines
count_chars_lines     0.310  0.300
readlines_sizehint    0.150  0.150
using_fileinput       2.013  2.017
while_readline        1.006  1.000

(CVS Python + getc_unlocked)
daemon2:~/python/python/dist/src > ./python test.py termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.354  0.350
readlines_sizehint    0.182  0.183
using_fileinput       1.594  1.583
while_readline        0.363  0.367
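
test.py itself isn't reproduced in this thread.  Going only by the four
labels in the output, a rough guess at what such a script measures might
look like the sketch below.  The function bodies, the sizehint value and
the choice of wall-clock plus CPU time for the two columns are all
assumptions, and it is written for a current Python rather than 2.0:

```python
import fileinput
import time

def count_chars_lines(path):
    # Baseline: slurp the file and count afterwards.  Counting "\n" only
    # matches the other variants when the file ends with a newline.
    with open(path) as f:
        data = f.read()
    return len(data), data.count("\n")

def readlines_sizehint(path, hint=100000):
    # Read batches of lines, roughly `hint` bytes at a time.
    chars = lines = 0
    with open(path) as f:
        while True:
            batch = f.readlines(hint)
            if not batch:
                break
            for line in batch:
                chars += len(line)
                lines += 1
    return chars, lines

def using_fileinput(path):
    chars = lines = 0
    with fileinput.input(files=[path]) as stream:
        for line in stream:
            chars += len(line)
            lines += 1
    return chars, lines

def while_readline(path):
    # One readline() call per line: the case the patch is aimed at.
    chars = lines = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            chars += len(line)
            lines += 1
    return chars, lines

def timed(func, path):
    wall0, cpu0 = time.perf_counter(), time.process_time()
    result = func(path)
    return result, time.perf_counter() - wall0, time.process_time() - cpu0

def main(path):
    tests = (count_chars_lines, readlines_sizehint,
             using_fileinput, while_readline)
    for func in tests:
        (chars, lines), wall, cpu = timed(func, path)
        print("%-20s %6.3f %6.3f" % (func.__name__, wall, cpu))
```

Calling main() with the path of a large text file prints one row per
strategy, in the same shape as the tables in this message.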

But something weird is going on on FreeBSD:

(Standard CVS Python)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.265  0.266
readlines_sizehint    0.148  0.148
using_fileinput       0.943  0.938
while_readline        0.214  0.219

(CVS+getc_unlocked)
> ./python-getc-unlocked  ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.266  0.266
readlines_sizehint    0.151  0.141
using_fileinput       1.066  1.078
while_readline        0.283  0.281

This was sufficiently unexpected that I looked a bit further. The FreeBSD
Python was compiled without editing Modules/Setup, so it was statically
linked, no readline etc, but *with* threads (which are on by default, and
functional on both FreeBSD and BSDI 4.1.) Here are the timings after I enabled
just '*shared*':

(CVS + *shared*)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.276  0.273
readlines_sizehint    0.150  0.156
using_fileinput       0.902  0.898
while_readline        0.206  0.203

(This was not a fluke, I repeated it several times, getting hardly any
variation.) Enabling readline and cursesmodule had no additional effect.
Adding *shared* to the getc_unlocked tree saw roughly the same improvement,
but was still slower than without getc_unlocked.

(CVS + *shared* + getc_unlocked)
> ./python ~thomas/test.py ~thomas/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.272  0.273
readlines_sizehint    0.149  0.148
using_fileinput       1.031  1.031
while_readline        0.267  0.266

Increasing the size of the testfile didn't change anything, other than the
absolute numbers. I browsed stdio.h, where both getc() and getc_unlocked()
are defined as macros. getc_unlocked is defined as:

#define __sgetc(p) (--(p)->_r < 0 ? __srget(p) : (int)(*(p)->_p++))
#define getc_unlocked(fp)       __sgetc(fp)

and getc either as

#define getc(fp)        getc_unlocked(fp)
(without threads) or

static __inline int                     \
__getc_locked(FILE *_fp)                \
{                                       \
        extern int __isthreaded;        \
        int _ret;                       \
        if (__isthreaded)               \
                _FLOCKFILE(_fp);        \
        _ret = getc_unlocked(_fp);      \
        if (__isthreaded)               \
                funlockfile(_fp);       \
        return (_ret);                  \
}
#define getc(fp)        __getc_locked(fp)

_FLOCKFILE(x) is defined as flockfile(x), so that isn't the difference. The
speed difference has to be in the quick-and-easy test for whether the
locking is even necessary. Starting a thread on 'time.sleep(900)' in test.py
shows these numbers:

(standard CVS python)
> ./python-shared-std ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.433  0.445
readlines_sizehint    0.204  0.188
using_fileinput       1.595  1.594
while_readline        0.456  0.453

(getc_unlocked)
> ./python-getc-unlocked-shared ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.441  0.453
readlines_sizehint    0.206  0.195
using_fileinput       1.677  1.688
while_readline        0.509  0.508

So... using getc_unlocked manually for performance reasons isn't a cardinal
sin on FreeBSD only if you are really using threads :-)
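
The stdio.h logic quoted above can be modelled by counting lock operations
instead of timing them.  This is a toy Python stand-in, not FreeBSD's code:
the Stream class and both reader functions are invented for illustration,
and whether the real flockfile() has a fast path of its own is not
something this model says anything about:

```python
import threading

class Stream:
    """Toy stand-in for a stdio FILE: a byte buffer plus a recursive lock."""

    def __init__(self, data, threaded):
        self.data = data
        self.pos = 0
        self.threaded = threaded       # plays the role of __isthreaded
        self.lock = threading.RLock()  # plays the role of the FILE lock
        self.lock_ops = 0              # how many times flockfile() ran

    def flockfile(self):
        self.lock.acquire()
        self.lock_ops += 1

    def funlockfile(self):
        self.lock.release()

    def getc_unlocked(self):
        if self.pos >= len(self.data):
            return -1                  # EOF
        c = self.data[self.pos]
        self.pos += 1
        return c

    def getc(self):
        # Mirrors __getc_locked(): only lock when threads are in use.
        if self.threaded:
            self.flockfile()
        c = self.getc_unlocked()
        if self.threaded:
            self.funlockfile()
        return c

def line_with_getc(s):
    # One getc() per character; locking is left to getc().
    out = bytearray()
    while True:
        c = s.getc()
        if c < 0:
            break
        out.append(c)
        if c == 0x0A:
            break
    return bytes(out)

def line_with_getc_unlocked(s):
    # Lock once per line by hand, then use getc_unlocked() per character.
    out = bytearray()
    s.flockfile()
    try:
        while True:
            c = s.getc_unlocked()
            if c < 0:
                break
            out.append(c)
            if c == 0x0A:
                break
    finally:
        s.funlockfile()
    return bytes(out)
```

In this model a single-threaded getc() loop performs no lock operations at
all, while the hand-locked loop always performs one per line, which is the
direction of the difference seen in the non-threaded numbers.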

Lets-outsmart-the-OS-scheduler-next!-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Thu Jan  4 08:57:26 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 08:57:26 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test/output test_coercion,1.2,1.3
In-Reply-To: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>; from nascheme@users.sourceforge.net on Wed, Jan 03, 2001 at 05:36:27PM -0800
References: <E14DzKN-0005eK-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010104085726.E2467@xs4all.nl>

On Wed, Jan 03, 2001 at 05:36:27PM -0800, Neil Schemenauer wrote:
> Update of /cvsroot/python/python/dist/src/Lib/test/output
> In directory usw-pr-cvs1:/tmp/cvs-serv21710/Lib/test/output
> 
> Modified Files:
> 	test_coercion 
> Log Message:
> Sequence repeat works now for in-place multiply with an integer type
> as the left operand.  I don't know if this is a feature or a bug.

> ! 2 *= [1] => [1, 1]

It's a feature.

x = 2 * [1]

works, so

x = 2
x *= [1]

does, too. Obviously, '2 *= [1]' shouldn't, but I'm assuming you don't
actually execute that (it should give a SyntaxError)
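
A quick way to check both halves of that, as a small sketch: the compiles()
helper is just for illustration, and the SyntaxError check assumes an
interpreter that rejects a literal as the target of an augmented assignment
at compile time.

```python
# In-place multiply with an integer on the left ends up as sequence repeat.
x = 2
x *= [1]          # x is now the list [1, 1]

def compiles(source):
    # True if `source` is accepted by the compiler, False on SyntaxError.
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True

if __name__ == "__main__":
    print(x)
    print(compiles("x *= [1]"), compiles("2 *= [1]"))
```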

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Thu Jan  4 10:32:55 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 4 Jan 2001 10:32:55 +0100
Subject: [Python-Dev] RE: Mailman problems?
References: <LNBBLJKPBEHFEDALKOLCGEKIIGAA.tim.one@home.com>
Message-ID: <00a701c07631$531983b0$e46940d5@hagrid>

tim wrote:
> I have no idea where that "o" came from!  It was supposed to be "o".
> Barry, fix it!

no need.  from the perlguts man page:

    "You can ignore [pad]THX[xo] when browsing the Perl
    headers/sources."

in-my-dictionary-perl's-an-american-physicist-ly yrs /F




From mal at lemburg.com  Thu Jan  4 11:02:35 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 11:02:35 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A544A3B.32B86792@lemburg.com>

Neil Schemenauer wrote:
> 
> Update of /cvsroot/python/python/dist/src/Include
> In directory usw-pr-cvs1:/tmp/cvs-serv21006/Include
> 
> Modified Files:
>         classobject.h
> Log Message:
> Remove PyInstance_*BinOp functions.
> 
> Index: classobject.h
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Include/classobject.h,v
> retrieving revision 2.33
> retrieving revision 2.34
> diff -C2 -r2.33 -r2.34
> *** classobject.h       2000/09/01 23:29:26     2.33
> --- classobject.h       2001/01/04 01:30:34     2.34
> ***************
> *** 60,71 ****
>   extern DL_IMPORT(int) PyClass_IsSubclass(PyObject *, PyObject *);
> 
> - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> -                                                 char *, char *,
> -                                                 PyObject * (*)(PyObject *,
> -                                                                PyObject *));
> -
> - extern DL_IMPORT(int)
> - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> -                       PyObject * (*)(PyObject *, PyObject *), int);

Wouldn't it be safer to provide emulation APIs for these ? There
might be code out there using these APIs.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Thu Jan  4 15:06:53 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:06:53 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include classobject.h,2.33,2.34
In-Reply-To: Your message of "Thu, 04 Jan 2001 11:02:35 +0100."
             <3A544A3B.32B86792@lemburg.com> 
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>  
            <3A544A3B.32B86792@lemburg.com> 
Message-ID: <200101041406.JAA11926@cj20424-a.reston1.va.home.com>

> > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > -                                                 char *, char *,
> > -                                                 PyObject * (*)(PyObject *,
> > -                                                                PyObject *));
> > -
> > - extern DL_IMPORT(int)
> > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > -                       PyObject * (*)(PyObject *, PyObject *), int);
> 
> Wouldn't it be safer to provide emulation APIs for these ? There
> might be code out there using these APIs.

No.  These were never intended to be part of the API (and it was a
mistake that they used DL_IMPORT()).  They had to be extern because
they were defined in one file and used in another.  I'm glad they're
gone.  They are so obscure that I'd be *very* surprised if anybody was
using them, and even more if they even *wanted* emulation under the
new scheme -- I'd expect them to eagerly convert their code to using
new-style numbers right away.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan  4 15:16:39 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 09:16:39 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 07:51:17 +0100."
             <20010104075116.J402@xs4all.nl> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com>  
            <20010104075116.J402@xs4all.nl> 
Message-ID: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>

[Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

Thomas, I really don't understand it.  The getc() source code you
showed calls getc_unlocked().  So how can it be faster?  The answer
must be somewhere else...  Cache line conflicts, the rewriting of the
loop that I did, a compiler bug, the inlining, who knows.  Can you
compare the generated assembly code?  On other platforms,
getc_unlocked() typically speeds the readline() test case up by a
significant factor (as in your BSDI numbers, where it's almost 3x
faster).

Could it be that you're mistaken and that somehow getc_unlocked() is
*not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
is so different that the optimizer might have done something different
to it.  (Check config.h.  When all else fails, I put an #error in the
#ifdef branch that I expect not to be taken.)

Could it be that somehow getc_unlocked() is later defined to be the
same as getc(), so choosing it just adds the overhead of calling
f[un]lockfile() for each line?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan  4 15:59:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 15:59:05 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041416.JAA11983@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 09:16:39AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>
Message-ID: <20010104155904.L402@xs4all.nl>

On Thu, Jan 04, 2001 at 09:16:39AM -0500, Guido van Rossum wrote:
> [Thomas finds that on FreeBSD, getc() is faster than getc_unlocked().]

> Thomas, I really don't understand it.  The getc() source code you
> showed calls getc_unlocked().  So how can it be faster?  The answer
> must be somewhere else...  Cache line conflicts, the rewriting of the
> loop that I did, a compiler bug, the inlining, who knows.  Can you
> compare the generated assembly code?  On other platforms,
> getc_unlocked() typically speeds the readline() test case up by a
> significant factor (as in your BSDI numbers, where it's almost 3x
> faster).

Nono, reread my message, and your code. getc() isn't faster than
getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
etc.) Significantly so when there is only one thread running (which is still
the common case for most systems, and one that FreeBSD's libc has easy inside
knowledge about) and marginally so when there is at least one other thread.
The small advantage in the multi-threaded case can be explained by the
rest of the changes. 

You see, I was comparing a patched tree versus a non-patched tree, not a
getc_unlocked() enabled one versus a disabled one, so I was measuring the
speed difference of the *patch*, not of the use of getc_unlocked() vs
getc(). Here is the speed difference of just the use of getc() vs
getc_unlocked() (same tree, hand-edited config.h) in a non-threaded
environment:

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.149  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.271  0.273
readlines_sizehint    0.148  0.148
using_fileinput       0.898  0.898
while_readline        0.214  0.211


As you see, no significant difference. Here is the difference in a threaded
environment (a second thread that does just 'time.sleep(900)'):

> ./python-getc-disabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.422
readlines_sizehint    0.200  0.211
using_fileinput       1.604  1.594
while_readline        0.465  0.461

> ./python-getc-enabled ~/test.py ~/termcapx10
total 1794310 chars and 37660 lines
count_chars_lines     0.429  0.430
readlines_sizehint    0.201  0.203
using_fileinput       1.600  1.602
while_readline        0.463  0.461

... where I have to note that the getc-disabled version's 'using_fileinput'
time fluctuates a lot more, mostly upwards, in the threaded environment. (I
see it jump to 1.609, 1.617 cputime, every few runs.) Still not a terribly
significant difference, but a hint that we, too, can use inside knowledge ;)

> Could it be that you're mistaken and that somehow getc_unlocked() is
> *not* chosen on FreeBSD?  Then I could believe it, the rewritten loop
> is so different that the optimizer might have done something different
> to it.  (Check config.h.  When all else fails, I put an #error in the
> #ifdef branch that I expect not to be taken.)

Yah, #error is great for debugging, I use it a lot ;) But I'm sure of this.
FreeBSD's getc() is just craftily optimized. Note that if we can get
get_line using getc_unlocked() to run as fast as get_line using getc() on
FreeBSD, it should also benefit other platforms, because the only speed to
be had is in our own code :) Not that I'm saying it can be improved, just
that it apparently got slower, because of this patch. I can't be much help
doing any performance tuning, though, I've about used up my lunch hour and
I'm working late tonight ;P

Good-thing-my-boss-can't-tell-the-difference-between-Apache-and-Python-src-ly
	y'rs, 
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan  4 16:27:28 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 10:27:28 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 15:59:05 +0100."
             <20010104155904.L402@xs4all.nl> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com>  
            <20010104155904.L402@xs4all.nl> 
Message-ID: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>

[Me & Thomas in violent agreement that there's something weird about
the speed of getc_unlocked() vs. getc() on FreeBSD.]

I just realized what's the probable cause.  Read your timing post
again:

# BSDI:
# 
# (Python 2.0)
# while_readline        1.006  1.000
# 
# (CVS Python + getc_unlocked)
# while_readline        0.363  0.367

# FreeBSD:
# 
# (Standard CVS Python)
# while_readline        0.214  0.219
# 
# (CVS+getc_unlocked)
# while_readline        0.283  0.281

Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
getline()!  So on FreeBSD, for this test case, GNU getline() is faster
than getc_unlocked().

So the question is, should I leave the GNU getline() code in?  I'm
inclined against it -- it's not that much faster, and on other
platforms getc_unlocked() is faster.  Given that getc_unlocked() is a
standard (of some sort) and GNU getline() is, well, just that, I'd say
let's stick with getc_unlocked().

(Unfortunately, from a phone conversation I had last night with Tim,
there's not much hope of doing something similar on Windows -- and that
platform sorely needs it!  The hacks that Tim reported earlier are definitely
not thread-safe.  While it's easy to come up with getc_unlocked() for
Windows, the locking operations used internally there by the /MT code
are not exported from MSVCRT.DLL, and that's crucial.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan  4 16:31:39 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:31:39 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 04, 2001 at 10:27:28AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <20010104163139.M402@xs4all.nl>

On Thu, Jan 04, 2001 at 10:27:28AM -0500, Guido van Rossum wrote:
> [Me & Thomas in violent agreement that there's something weird about
> the speed of getc_unlocked() vs. getc() on FreeBSD.]

> I just realized what's the probable cause.  Read your timing post
> again:

> Standard CVS Python, as opposed to Python 2.0 as released, uses GNU
> getline()!

Sorry, no go. You need two things to use getline(): getline() itself, and a
GNU libc. FreeBSD has neither. (And autoconf agrees with me.) If you *really
really* want me to, I can compile 2.0-standard on FreeBSD and show you. But
I'd rather not :)

Now go back and read my other mail about why FreeBSD is faster :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan  4 16:43:15 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 10:43:15 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104155904.L402@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 03:59:05PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>
Message-ID: <20010104104315.C23803@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:
>getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
>the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
>etc.) Significantly so when there is only one thread running (which is still

So it looks like the ALLOW_THREADS should be moved out of the for
loop.  This produced no measurable performance difference on Solaris;
I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
unusually slow thread operation?

--amk



From thomas at xs4all.net  Thu Jan  4 16:59:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 4 Jan 2001 16:59:25 +0100
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104104315.C23803@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 04, 2001 at 10:43:15AM -0500
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us>
Message-ID: <20010104165925.G2467@xs4all.nl>

On Thu, Jan 04, 2001 at 10:43:15AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 04, 2001 at 03:59:05PM +0100, Thomas Wouters wrote:

> >getc_unlocked(). getc() is faster than flockfile(f) + getc_unlocked(f) (+
> >the rearranging of the function, use of PyTHREAD_ALLOW inside the outer loop,
> >etc.) Significantly so when there is only one thread running (which is still

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

Note that I was just guessing there. I did a quick scan of the function, and
noticed that the ALLOW_THREADS statements had moved into the outer loop. I
didn't even contemplate whether that made a difference, so don't trust that
judgement.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan  4 17:10:29 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 4 Jan 2001 11:10:29 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010104165925.G2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 04, 2001 at 04:59:25PM +0100
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl> <20010104104315.C23803@kronos.cnri.reston.va.us> <20010104165925.G2467@xs4all.nl>
Message-ID: <20010104111029.A28510@kronos.cnri.reston.va.us>

On Thu, Jan 04, 2001 at 04:59:25PM +0100, Thomas Wouters wrote:
>Note that I was just guessing there. I did a quick scan of the function, and
>noticed that the ALLOW_THREADS statements had moved into the outer loop. I
>didn't even contemplate whether that made a difference, so don't trust that
>judgement.

According to your benchmark, the performance of the threaded version
was the same whether or not getc_unlocked() was used, so it's not
that flockfile() is really slow.  I can't believe the compiler
optimized the old, ungainly loop better than the newer, tighter loop.
That leaves the ALLOW_THREADS as the most reasonable culprit.

--amk




From akuchlin at mems-exchange.org  Thu Jan  4 18:10:11 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 04 Jan 2001 12:10:11 -0500
Subject: [Python-Dev] SGI's Digital Media SDK
Message-ID: <E14EDtz-0007fS-00@kronos.cnri.reston.va.us>

SGI just made a source release of their digital media SDK for IRIX and
Linux at http://oss.sgi.com/projects/dmsdk/ .  According to the FAQ,
this is derived from previous SGI libraries, "including the Video
Library (VL), the Audio Library (AL), Digital Media Image Convertor
(DMIC), Digital Media Audio Convertor (DMAC), and the Compression
Library (CL)."  Interested parties may want to look into this, because
Python still has the al, cd, cl, and sv modules; maybe they'd work
with the new software with a reasonable amount of fixing, and at least
now there's a reasonable chance that non-IRIX platforms will be
supported.

--amk




From guido at python.org  Thu Jan  4 20:07:13 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 04 Jan 2001 14:07:13 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 04 Jan 2001 10:43:15 EST."
             <20010104104315.C23803@kronos.cnri.reston.va.us> 
References: <200101040013.BAA13436@pandora.informatik.hu-berlin.de> <20010104012753.D2467@xs4all.nl> <200101040037.TAA08699@cj20424-a.reston1.va.home.com> <20010104075116.J402@xs4all.nl> <200101041416.JAA11983@cj20424-a.reston1.va.home.com> <20010104155904.L402@xs4all.nl>  
            <20010104104315.C23803@kronos.cnri.reston.va.us> 
Message-ID: <200101041907.OAA12573@cj20424-a.reston1.va.home.com>

> So it looks like the ALLOW_THREADS should be moved out of the for
> loop.  This produced no measurable performance difference on Solaris;
> I'll leave it to GvR to try it on Linux.  I wonder if FreeBSD has some
> unusually slow thread operation?

I kind of doubt that it's Py_ALLOW_THREADS -- it's in the outer loop,
which typically only gets executed once.  It only goes around a second
time when the line is longer than the initial buffer.  We could tweak
the initial buffer size (currently 100, with increments of 1000).
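
To make the buffer arithmetic concrete, here is a minimal Python sketch
of the growth policy described above.  It assumes the buffer starts at
100 bytes and grows by 1000 bytes whenever the line does not fit; the
function name and the exact "fits" condition are illustrative
assumptions, not the code in fileobject.c.

```python
def outer_loop_passes(line_length, initial=100, increment=1000):
    """Count passes through the outer loop needed to hold one line.

    Assumed model: a pass succeeds when the buffer is at least as long
    as the line; otherwise the buffer grows by `increment` and we retry.
    """
    size = initial
    passes = 1
    while size < line_length:
        size += increment
        passes += 1
    return passes


# Show how many passes a few line lengths need under this model.
for n in (80, 100, 101, 1100, 1101):
    print(n, outer_loop_passes(n))
```

Under this model only lines longer than the initial buffer cost a
second pass, which is why a larger initial size would matter only for
files with long lines.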

--Guido van Rossum (home page: http://www.python.org/~guido/)




From mal at lemburg.com  Thu Jan  4 20:32:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 04 Jan 2001 20:32:15 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include 
 classobject.h,2.33,2.34
References: <E14DzEi-0005T2-00@usw-pr-cvs1.sourceforge.net>  
	            <3A544A3B.32B86792@lemburg.com> <200101041406.JAA11926@cj20424-a.reston1.va.home.com>
Message-ID: <3A54CFBF.CDD2138B@lemburg.com>

Guido van Rossum wrote:
> 
> > > - extern DL_IMPORT(PyObject *) PyInstance_DoBinOp(PyObject *, PyObject *,
> > > -                                                 char *, char *,
> > > -                                                 PyObject * (*)(PyObject *,
> > > -                                                                PyObject *));
> > > -
> > > - extern DL_IMPORT(int)
> > > - PyInstance_HalfBinOp(PyObject *, PyObject *, char *, PyObject **,
> > > -                       PyObject * (*)(PyObject *, PyObject *), int);
> >
> > Wouldn't it be safer to provide emulation APIs for these ? There
> > might be code out there using these APIs.
> 
> No.  These were never intended to be part of the API (and it was a
> mistake that they used DL_IMPORT()).  They had to be extern because
> they were defined in one file and used in another.  I'm glad they're
> gone.  They are so obscure that I'd be *very* surprised if anybody was
> using them, and even more if they even *wanted* emulation under the
> new scheme -- I'd expect them to eagerly convert their code to using
> new-style numbers right away.

I'll see whether I can get mxDateTime working with the new
scheme later this year -- it would be really great to do away
with the coercion hack I was using until now :-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Fri Jan  5 07:04:56 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 01:04:56 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101041527.KAA12181@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENHIGAA.tim.one@home.com>

[Guido van Rossum]
> ...
> (Unfortunately, from a phone conversation I had last night with
> Tim, there's not much hope of doing something there -- and that
> platform [Win32] sorely needs it!  The hacks that Tim reported
> earlier are definitely not thread-safe.  While it's easy to come
> up with getc_unlocked() for Windows, the locking operations used
> internally there by the /MT code are not exported from MSVCRT.DLL,
> and that's crucial.)

The short course is that I still haven't found a workable way to lock
streams on Windows:  they do have a complete set of stream-locking functions
and macros, but there's no way short of deep magic I can find to get at them
("deep magic" == resort to assembler and patch in function addresses).

The only file-locking functions advertised in the C and platform SDK
libraries are trivial variants of Python's msvcrt.locking, but that has to
do with locking specific file byte-position ranges across processes, not
ensuring the integrity of runtime stream structures across threads.

Perl appears to ignore the issue of thread safety here (on Windows and
everywhere else).

Revealing experiment!

1. I threw away my changes and rebuilt from current CVS.

2. I made one change, expanding the getc() call in get_line to what MSVC
*would* expand it to if we weren't building in thread mode:

    if ((c = (--fp->_cnt >= 0 ?
              0xff & *fp->_ptr++ :
              _filbuf(fp))) == EOF) {

That alone reduced the runtime of my "while 1: readline" test case from over
30 seconds to 12.8.  What I did before went beyond that, by also (in effect)
unrolling the loop and optimizing it.  That bought an additional ~2 seconds.
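
The expanded macro may be easier to follow as a toy model.  The Python
sketch below mimics the logic only: a count of bytes left in the
buffer, a position into it, and a refill call when the count runs out.
The class and method names are made up for illustration; this is not
MSVC's FILE structure, and it says nothing about locking.

```python
import io

EOF = -1


class ToyStream:
    """Toy model of a buffered stdio stream (all names are illustrative)."""

    def __init__(self, raw, bufsize=4):
        self.raw = raw          # underlying binary file object
        self.bufsize = bufsize
        self.buf = b""
        self.cnt = 0            # bytes left in the buffer (plays the role of _cnt)
        self.ptr = 0            # next position to read (plays the role of _ptr)

    def filbuf(self):
        """Refill the buffer and return its first byte, or EOF."""
        self.buf = self.raw.read(self.bufsize)
        if not self.buf:
            self.cnt = 0
            return EOF
        self.cnt = len(self.buf) - 1
        self.ptr = 1
        return self.buf[0]

    def getc(self):
        """Fast path: hand out a buffered byte.  Slow path: refill."""
        self.cnt -= 1
        if self.cnt >= 0:
            c = self.buf[self.ptr] & 0xFF
            self.ptr += 1
            return c
        return self.filbuf()


# Read a small in-memory "file" one byte at a time.
stream = ToyStream(io.BytesIO(b"hello\n"))
chars = []
while True:
    c = stream.getc()
    if c == EOF:
        break
    chars.append(c)
print(bytes(chars))
```

The point of the model is that the fast path is a decrement, a compare
and an index, with no function call; a thread-safe getc() has to wrap
that in lock and unlock operations on every character.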

So compared to Perl's 6 seconds, it looks like we're paying (on Win98SE)
approximately:

   17 seconds for compiling with _MT (threadsafe libc)
    6 seconds to do the work <wink>
    5 seconds for "other stuff", best guess mostly a poor
          platform malloc/realloc
    2 seconds for not optimizing the loop
   --
   30 total

Unfortunately, the smoking gun is the only one whose firing pin we can't
file down on this platform.

so-the-good-news-is-that-it's-impossible-for-perl-not-to-be-at-
    least-twice-as-fast<wink>-ly y'rs  - tim




From guido at python.org  Fri Jan  5 16:29:05 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 10:29:05 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
Message-ID: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>

We had our first PythonLabs meeting of the year yesterday, and we went
over the 2.1 release schedule.  The release schedule is posted in PEP
226: http://python.sourceforge.net/peps/pep-0226.html

We found that the schedule previously posted there was a bit too
aggressive, given our goals for this release, so we have adjusted the
dates somewhat.  We have also decided on a date for the first alpha
release (previously unmentioned in the PEP).  So, here are the
relevant dates:

    19-Jan-2001: First 2.1 alpha release
    23-Feb-2001: First 2.1 beta release
    01-Apr-2001: 2.1 final release

We're already in PEP freeze mode -- no more PEPs will be considered
for inclusion in 2.1.  Below is a list of the PEPs that we are
currently considering, with some comments.  But first some general
remarks:

- The alpha release cycle is for testing of tentative features.  Alpha
  releases contain working code that we want to see widely tested;
  however, it's possible that a feature present in an alpha release is
  changed or even retracted in a later release.

- Beta releases represent a feature freeze -- after the first beta
  release, we will resign ourselves to fixing bugs.  Once beta 1 is
  released, no new features will be introduced, and no features will
  be withdrawn.

The alpha cycle is especially important for features (such as nested
scopes) that (may) introduce backwards incompatibilities.  There may
be more than one alpha release depending on feedback on the alpha 1
release.  (But having too many alpha releases is not good -- people
won't bother downloading.)

Thus, we can only introduce a new feature in beta 1 if we're very sure
that it is mature enough to stay without interface changes.  The final
decision on all PEPs under consideration has to be made before the
beta 1 release.

The beta cycle is important to ensure stability of the final release.

Specific PEPs under consideration:

 I    42  pep-0042.txt  Small Feature Requests                 Hylton

	  Actually, most of these won't be fulfilled in 2.1.

 SD  205  pep-0205.txt  Weak References                        Drake

	  Fred is still working on this.  I hope Tim can assist.  But
	  we may have to postpone this.

 S   207  pep-0207.txt  Rich Comparisons                   Lemburg, van Rossum

	  I'm pretty sure that this is a piece of cake now that the
	  coercion patches are checked in.

 S   208  pep-0208.txt  Reworking the Coercion Model           Schemenauer

	  All checked in.  Great work, Neil!

 S   217  pep-0217.txt  Display Hook for Interactive Use       Zadka

	  Moshe, this was accepted ages ago.  Would you mind
	  submitting a patch to SourceForge?  If you don't champion
	  this (and nobody else does), we may have to postpone it
	  still.

 S   222  pep-0222.txt  Web Library Enhancements               Kuchling

	  This is really up to Andrew.  It seems he plans to create
	  new modules, so he won't be introducing incompatibilities in
	  existing APIs.

 S   227  pep-0227.txt  Statically Nested Scopes               Hylton

	  Jeremy is still working on a proper implementation, which he
	  hopes to have ready in time for the first alpha release
	  date.

 S   229  pep-0229.txt  Using Distutils to Build Python        Kuchling

	  I just moved this from pie-in-the-sky to active.  Andrew has
	  a working prototype, it just doesn't work 100% yet, so I'm
	  very hopeful.

 S   230  pep-0230.txt  Warning Framework                      van Rossum

	  All done.

 S   232  pep-0232.txt  Function Attributes                    Warsaw

	  Still waiting for Barry to implement this, but it's pretty
	  straightforward.

 S   233  pep-0233.txt  Python Online Help                     Prescod

	  Paul, what's up with this?  Tim & I recommended to do
	  something simple and working, and then you disappeared from
	  the face of the earth.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Fri Jan  5 16:28:16 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 5 Jan 2001 10:28:16 -0500 (EST)
Subject: [Python-Dev] new "theme" on SourceForge!
Message-ID: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>

  While "theme-ability" is becoming very popular for desktop software
(think about the latest Gnome and KDE systems for Unix, and some of
the multimedia applications for Windows, and the newest MacOS
desktops), it can be a huge drain on Web sites; too many graphics is a
pain, and too many tables just makes it worse.
  SourceForge had definitely fallen prey to the overly-fancy themes,
and all of us developers paid the price with slow rendering.
  But they've fixed that!
  The SF crew has announced a new "theme" called "Ultra Light" which
is optimized for slow connections.  What that really means is less
embedded graphics and fewer nested tables, so rendering is *much*
faster.
  To try the new theme, go to the "Change My Theme" link near the top
of the left-hand navigation area.  Use the form to select "Ultra
Light"; you can preview the theme first if you want.
  Guido also thinks it's cool that the bug & patch report pages are
printable with this theme.  (Sheesh... managers! ;)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Fri Jan  5 18:46:16 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 12:46:16 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Lib fileinput.py,1.5,1.6
In-Reply-To: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>

[Guido]
> Modified Files:
> 	fileinput.py
> Log Message:
> Speed it up by using readlines(sizehint).  It's still slower than
> other ways of reading input. :-(

On my box, it's now head-to-head with (maybe even a little quicker than) the
while 1: line-at-a-time way:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.450  9.459
using_fileinput      29.880 29.884
while_readline       30.480 30.506

(stock CVS Python under Win98SE)

So that's a huge improvement!
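
For anyone who wants to repeat this kind of comparison, here is a rough
sketch of the three loops being timed.  It is written for a current
Python, the function names simply mirror the labels in the table above,
and the sizehint value and the tiny temporary test file are arbitrary
assumptions.  It is not the script that produced the numbers above.

```python
import fileinput
import os
import tempfile
import time


def readlines_sizehint(path, sizehint=1 << 16):
    """Read batches of whole lines, then loop over each batch."""
    count = 0
    with open(path) as f:
        while True:
            lines = f.readlines(sizehint)
            if not lines:
                break
            for line in lines:
                count += 1
    return count


def using_fileinput(path):
    """Let fileinput drive the loop."""
    count = 0
    with fileinput.input(files=path) as stream:
        for line in stream:
            count += 1
    return count


def while_readline(path):
    """The classic one-readline-per-iteration loop."""
    count = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            count += 1
    return count


def time_all(path):
    """Return {name: (line_count, seconds)} for each method."""
    results = {}
    for func in (readlines_sizehint, using_fileinput, while_readline):
        start = time.perf_counter()
        count = func(path)
        results[func.__name__] = (count, time.perf_counter() - start)
    return results


# Build a small throwaway input file and time the three loops on it.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("spam and eggs\n" * 1000)
results = time_all(tmp.name)
os.remove(tmp.name)
for name, (count, seconds) in results.items():
    print("%-20s %6d lines %8.4f s" % (name, count, seconds))
```

A file this small says little about relative speed; point time_all() at
a large file to get meaningful timings.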

the-two-people-using-fileinput-should-be-delighted<wink>-ly y'rs  - tim




From skip at mojam.com  Fri Jan  5 20:05:14 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 13:05:14 -0600 (CST)
Subject: [Python-Dev] fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net>
	<LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>
Message-ID: <14934.6890.160122.384692@beluga.mojam.com>

    Tim> the-two-people-using-fileinput-should-be-delighted<wink>-ly

What do you think contributes to fileinput's relative disfavor?  This whole
thread on Python's file reading performance was started by the eternal whine
"why is Python so much slower than Perl?" which really means why is

   line = f.readline()
   while line:
      process(line)
      line = f.readline()

so much slower than whatever that thing is in Perl that everybody uses as
the be-all-end-all performance benchmark (something with <> in it).

Given that fileinput is supposed to make the I/O loop in Python more
familiar to those people wandering over from Perl (at least in part), you'd
think that people would naturally gravitate to it.  Would it benefit from
some exposure in the Python tutorial?  Is it fast enough now to warrant the
extra exposure?

just-whining-out-loud-ly y'rs

Skip



From tim.one at home.com  Fri Jan  5 20:11:00 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 14:11:00 -0500
Subject: [Python-Dev] new "theme" on SourceForge!
In-Reply-To: <14933.59408.512734.105160@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPJIGAA.tim.one@home.com>

[Fred L. Drake, Jr.]

Who would have guessed that the "L." stands for Light?

> ...
> The SF crew has announced a new "theme" called "Ultra Light" which
> is optimized for slow connections.

Indeed, I think I can cancel my cable modem now and go back to a 28.8 phone
modem.

liking-it!-ly y'rs  - tim




From jeremy at alum.mit.edu  Fri Jan  5 20:14:49 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 14:14:49 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
Message-ID: <14934.7465.360749.199433@localhost.localdomain>

There was a brief discussion of unit testing last millennium, which did
not reach any conclusions.  I'd like to restart the discussion and set
some specific goals.  The action item is a unit testing bake-off, held
next week, to choose a tool.

The primary goal is to choose a unit testing framework for the
regression test suite.  Tests written with this framework would
eventually replace the current regrtest.py framework, based on
comparing test output to expected output.

For the 2.1 release, the goal would be to choose a test framework to
include in the standard distribution and use it to write some or all
of the new tests.  We would need to integrate it in some way with
regrtest.py, so that a single command can be used to run all the
tests.

In the long run, we can migrate existing tests to use the new system.
The new system can help us address some other goals:

    - running an entire test suite to completion instead of stopping
      on the first failure

    - clearer reporting of what went wrong

    - better support for conditional tests, e.g. write a test for
      httplib that only runs if the network is up.  This is tied into
      better error reporting, since the current test suite could only
      report that httplib succeeded or failed.
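
As an illustration of the goals listed above, here is a sketch using
the unittest module found in today's standard library (a descendant of
PyUnit).  The skipTest() call postdates this thread, and the network_up
flag is a stand-in for a real connectivity probe; both are assumptions
made for the example rather than features of any framework under
discussion.

```python
import io
import unittest


def make_case(network_up):
    """Build a TestCase; `network_up` stands in for a real network probe."""

    class DemoTests(unittest.TestCase):
        def test_arithmetic(self):
            self.assertEqual(2 + 2, 4)

        def test_deliberate_failure(self):
            # Fails on purpose to show that the run continues afterwards.
            self.assertEqual("spam".upper(), "EGGS")

        def test_needs_network(self):
            if not network_up:
                self.skipTest("network is down")
            # A real test would talk to a server here.

    return DemoTests


# Run the whole suite and capture the report instead of stopping early.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(make_case(False))
report = io.StringIO()
result = unittest.TextTestRunner(stream=report, verbosity=2).run(suite)
print(report.getvalue())
print("run=%d failed=%d skipped=%d" % (
    result.testsRun, len(result.failures), len(result.skipped)))
```

All three tests are attempted, the failure is reported with its
traceback, and the network test is recorded as skipped rather than as
passed or failed.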

Does anyone disagree with the goal?

Three tools have been proposed: PyUnit, Quixote unittest, and doctest.

doctest has been championed by Peter Funk, who wants a few new
features, but Tim, its author, isn't pushing it as a tool for writing
stand alone tests.  I think the best way to use doctest is for module
writers to consider it when writing a new module.  If doctest is used
from the start for a module, we could integrate it with the regression
test.  It seems quite useful for what it is intended for, but is not a
general solution.
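
For readers who have not seen doctest, a minimal sketch of the style
follows.  The gcd function is only an example, and running the examples
through DocTestFinder and DocTestRunner is one of several ways to do
it, chosen here so that the pass/fail counts can be inspected.

```python
import doctest


def gcd(a, b):
    """Return the greatest common divisor of a and b.

    >>> gcd(12, 18)
    6
    >>> gcd(7, 0)
    7
    """
    while b:
        a, b = b, a % b
    return a


# Collect the examples from gcd's docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(gcd, "gcd", globs={"gcd": gcd}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results)
```

The examples double as documentation, which is why this approach suits
a module written with doctest in mind better than it suits a
stand-alone regression test.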

That leaves PyUnit and Quixote's unittest.  The two tools are fairly
similar, but differ on a number of non-trivial details.  Quixote also
integrates code coverage, which is quite handy.  If we don't adopt its
unittest, we should add code coverage to PyUnit.

Is anyone else interested in the choice between the two?  If so, I
suggest you try writing some tests with each tool and reporting back
with your feedback.  I propose leaving one week for such a bake-off and
making a decision next Friday.

Jeremy



From fredrik at effbot.org  Fri Jan  5 20:55:18 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 5 Jan 2001 20:55:18 +0100
Subject: [Python-Dev] unit testing bake-off
References: <14934.7465.360749.199433@localhost.localdomain>
Message-ID: <004c01c07751$6eed84d0$e46940d5@hagrid>

Jeremy Hylton wrote:
> Is anyone else interested in the choice between the two?

yes.  I suggest adding doctest.py plus one unit test implementation.

> If so, I suggest you try writing some tests with each tool and
> reporting back with your feedback.

we've recently migrated from a 30-minute reimplementation of Kent
Beck's original framework to one of the frameworks you mention.  with
that background, the choice was easy.  let me know when it's time to
vote...

</F>




From guido at python.org  Fri Jan  5 20:55:33 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 14:55:33 -0500
Subject: [Python-Dev] fileinput.py
In-Reply-To: Your message of "Fri, 05 Jan 2001 13:05:14 CST."
             <14934.6890.160122.384692@beluga.mojam.com> 
References: <E14EY6j-0000wX-00@usw-pr-cvs1.sourceforge.net> <LNBBLJKPBEHFEDALKOLCOEPBIGAA.tim.one@home.com>  
            <14934.6890.160122.384692@beluga.mojam.com> 
Message-ID: <200101051955.OAA20190@cj20424-a.reston1.va.home.com>

> What do you think contributes to fileinput's relative disfavor?

In my view, fileinput is one of those unfortunate features that exist
solely to shut up a particular kind of criticism.  Without fileinput,
Perl zealots would have an easy argument for a "trivial reject" of
even considering Python.  Now, when somebody claims the superiority of
Perl's "loop involving a <> thingie", you can point to fileinput to
prevent them from scoring a point.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan  5 21:01:13 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:01:13 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 20:55:18 +0100."
             <004c01c07751$6eed84d0$e46940d5@hagrid> 
References: <14934.7465.360749.199433@localhost.localdomain>  
            <004c01c07751$6eed84d0$e46940d5@hagrid> 
Message-ID: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>

> yes.  I suggest adding doctest.py plus one unit test implementation.

I second this vote for doctest (in addition to a unittest thing).  I
propose that Tim checks in his latest version of doctest.  It should
go under Lib, not under Lib/test, I think.  (Certainly that's how Tim
has been proposing its use.)

It requires LaTeX docs, but since it's got a great docstring, that
should be easy.

> > If so, I suggest you try writing some tests with each tool and
> > reporting back with your feedback.
> 
> we've recently migrated from a 30-minute reimplementation of Kent
> Beck's original framework to one of the frameworks you mention.  with
> that background, the choice was easy.  let me know when it's time to
> vote...

Which framework are you now using?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan  5 21:14:41 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 05 Jan 2001 15:14:41 -0500
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>

Please have a look at this SF patch:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470

This implements control over which names defined in a module are
externally visible: if there's a variable __exports__ in the module,
it is a list of identifiers, and any access from outside the module to
names not in the list is disallowed.  This affects access using the
getattr and setattr protocols (which raise AttributeError for
disallowed names), as well as "from M import v" (which raises
ImportError).
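
The attribute-access half of this behaviour can be approximated in pure
Python, which may help when thinking about the semantics.  The sketch
below is an emulation written for a current Python, not the SF patch:
the class and helper names are invented, double-underscore names are
left reachable as a simplifying assumption, and the "from M import v"
case is not covered.

```python
import types


def _check_export(module, name):
    """Raise AttributeError if `name` is hidden by the module's __exports__."""
    if name.startswith("__") and name.endswith("__"):
        return  # assumption: keep dunder names reachable
    namespace = types.ModuleType.__getattribute__(module, "__dict__")
    exports = namespace.get("__exports__")  # the extra dictionary lookup
    if exports is not None and name not in exports:
        raise AttributeError("%r is not exported" % name)


class ExportControlledModule(types.ModuleType):
    """Module type whose outside attribute access honours __exports__."""

    def __getattribute__(self, name):
        _check_export(self, name)
        return types.ModuleType.__getattribute__(self, name)

    def __setattr__(self, name, value):
        _check_export(self, name)
        types.ModuleType.__setattr__(self, name, value)


# Populate the namespace directly, the way code inside the module would.
m = ExportControlledModule("m")
vars(m).update({"__exports__": ["public"], "public": 1, "_private": 2})

print(m.public)
try:
    m._private
except AttributeError as exc:
    print("blocked read:", exc)
try:
    m.other = 3
except AttributeError as exc:
    print("blocked write:", exc)
```

Note that every access pays for the __exports__ lookup even when the
module does not define it.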

I like it.  This has been asked for many times.  Does anybody see a
reason why this should *not* be added?

Tim remarked that introducing this will prompt demands for a similar
feature on classes and instances, where it will be hard to implement
without causing a bit of a slowdown.  It causes a slight slowdown (an
extra dictionary lookup for each use of "M.v") even when it is not
used, but for accessing module variables that's acceptable.  I'm not
so sure about instance variable references.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Fri Jan  5 21:19:55 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 5 Jan 2001 15:19:55 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101052001.PAA20238@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
	<004c01c07751$6eed84d0$e46940d5@hagrid>
	<200101052001.PAA20238@cj20424-a.reston1.va.home.com>
Message-ID: <14934.11371.879059.610988@localhost.localdomain>

If anyone is interested in experimenting with a test suite, here is a
summary of the code coverage for the current regression test suite as
run on my Linux box.  Pick a module with low code coverage and your
experiment can also improve the regression test suite.

Jeremy

 67.42%    798  Modules/arraymodule.c
 74.39%    773  Modules/audioop.c
 81.84%    380  Modules/binascii.c
 62.36%    449  Modules/bsddbmodule.c
 78.29%    152  Modules/cmathmodule.c
 67.89%    246  Modules/_codecsmodule.c
 47.41%   2647  Modules/cPickle.c
 87.50%      8  Modules/cryptmodule.c
 64.34%    272  Modules/cStringIO.c
  0.00%   1351  Modules/_cursesmodule.c
  0.00%    202  Modules/_curses_panel.c
 99.28%    139  Modules/errnomodule.c
 30.71%    127  Modules/fcntlmodule.c
 81.90%    315  Modules/gcmodule.c
  0.00%      4  Modules/getbuildinfo.c
 47.29%    277  Modules/getpath.c
 72.22%     54  Modules/grpmodule.c
 79.95%    419  Modules/imageop.c
  0.00%     11  Modules/../Include/cStringIO.h
 13.25%    234  Modules/linuxaudiodev.c
 14.80%    223  Modules/_localemodule.c
 30.66%    137  Modules/main.c
 73.20%     97  Modules/mathmodule.c
 98.39%    124  Modules/md5c.c
 69.70%     66  Modules/md5module.c
 48.62%    362  Modules/mmapmodule.c
 66.22%     74  Modules/newmodule.c
 84.91%     53  Modules/operator.c
 50.57%   1236  Modules/parsermodule.c
  0.00%    350  Modules/pcremodule.c
 28.88%   1077  Modules/posixmodule.c
 82.05%     39  Modules/pwdmodule.c
 77.96%    431  Modules/pyexpat.c
  0.00%   1876  Modules/pypcre.c
 50.00%      2  Modules/python.c
  0.00%    189  Modules/readline.c
 78.35%    425  Modules/regexmodule.c
 72.93%    931  Modules/regexpr.c
  0.00%     81  Modules/resource.c
 76.98%    443  Modules/rgbimgmodule.c
 82.70%    289  Modules/rotormodule.c
 82.47%    291  Modules/selectmodule.c
 85.10%    208  Modules/shamodule.c
 81.52%    276  Modules/signalmodule.c
 51.18%    678  Modules/socketmodule.c
 78.64%   1105  Modules/_sre.c
 69.67%    689  Modules/stropmodule.c
 80.49%    656  Modules/structmodule.c
  4.88%    123  Modules/termios.c
 60.71%    140  Modules/threadmodule.c
 68.78%    205  Modules/timemodule.c
 76.92%     65  Modules/ucnhash.c
 87.50%     16  Modules/unicodedatabase.c
 65.83%    120  Modules/unicodedata.c
 68.81%    420  Modules/zlibmodule.c
 64.68%   1005  Objects/abstract.c
 18.77%    261  Objects/bufferobject.c
 68.77%   1204  Objects/classobject.c
 27.59%     58  Objects/cobject.c
 59.41%    271  Objects/complexobject.c
 78.32%    678  Objects/dictobject.c
 52.14%    723  Objects/fileobject.c
 80.43%    368  Objects/floatobject.c
 84.86%    185  Objects/frameobject.c
 60.40%    149  Objects/funcobject.c
 78.68%    455  Objects/intobject.c
 77.66%    779  Objects/listobject.c
 81.17%   1142  Objects/longobject.c
 50.68%    148  Objects/methodobject.c
 58.82%    136  Objects/moduleobject.c
 76.50%    549  Objects/object.c
 15.24%    105  Objects/rangeobject.c
 41.03%     78  Objects/sliceobject.c
 76.63%   1797  Objects/stringobject.c
 77.00%    287  Objects/tupleobject.c
 22.22%     18  Objects/typeobject.c
 84.26%    108  Objects/unicodectype.c
 66.61%   2743  Objects/unicodeobject.c
 90.79%     76  Parser/acceler.c
  0.00%     28  Parser/bitset.c
  0.00%     67  Parser/firstsets.c
 18.18%     22  Parser/grammar1.c
  0.00%    139  Parser/grammar.c
  0.00%     30  Parser/intrcheck.c
  0.00%     38  Parser/listnode.c
  0.00%      2  Parser/metagrammar.c
  0.00%     63  Parser/myreadline.c
 90.70%     43  Parser/node.c
 82.26%    124  Parser/parser.c
 79.38%     97  Parser/parsetok.c
  0.00%    366  Parser/pgen.c
  0.00%     85  Parser/pgenmain.c
  0.00%     60  Parser/printgrammar.c
 76.70%    588  Parser/tokenizer.c
 62.31%   1231  Python/bltinmodule.c
 76.55%   2021  Python/ceval.c
 64.78%    230  Python/codecs.c
 73.85%   2367  Python/compile.c
 76.67%     30  Python/dynload_shlib.c
 75.75%    301  Python/errors.c
 65.59%    401  Python/exceptions.c
  0.00%     31  Python/frozenmain.c
 56.83%    776  Python/getargs.c
100.00%      2  Python/getcompiler.c
100.00%      2  Python/getcopyright.c
 80.00%      5  Python/getmtime.c
 15.62%     32  Python/getopt.c
100.00%      2  Python/getplatform.c
100.00%      4  Python/getversion.c
 61.78%   1167  Python/import.c
 66.67%     42  Python/importdl.c
 51.35%    483  Python/marshal.c
 60.58%    274  Python/modsupport.c
 88.73%     71  Python/mystrtoul.c
  0.00%      2  Python/pyfpe.c
 91.15%    113  Python/pystate.c
 37.80%    635  Python/pythonrun.c
  0.00%      5  Python/sigcheck.c
 12.67%    150  Python/structmember.c
 53.87%    323  Python/sysmodule.c
100.00%      5  Python/thread.c
 53.47%    144  Python/thread_pthread.h
 21.74%    138  Python/traceback.c
 58.65%  48417  TOTAL
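
To pick a target from a listing like the one above, a short script can
rank the files.  The sketch below assumes each row has the three
whitespace-separated columns shown (a percentage, a count that is
presumably lines of code, and a path).  The sample rows are copied from
the table; the min_lines threshold and the function names are arbitrary
choices for illustration.

```python
SAMPLE = """\
 47.41%   2647  Modules/cPickle.c
  0.00%   1351  Modules/_cursesmodule.c
 99.28%    139  Modules/errnomodule.c
 28.88%   1077  Modules/posixmodule.c
 15.24%    105  Objects/rangeobject.c
"""


def parse_rows(text):
    """Turn coverage rows into (percent, lines, path) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3 or not parts[0].endswith("%"):
            continue  # skip anything that is not a data row
        rows.append((float(parts[0].rstrip("%")), int(parts[1]), parts[2]))
    return rows


def least_covered(rows, min_lines=100, limit=3):
    """Return the `limit` least-covered files with at least `min_lines`."""
    big = [row for row in rows if row[1] >= min_lines]
    return sorted(big)[:limit]


for percent, lines, path in least_covered(parse_rows(SAMPLE)):
    print("%6.2f%% %6d  %s" % (percent, lines, path))
```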



From tim.one at home.com  Fri Jan  5 21:46:10 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 5 Jan 2001 15:46:10 -0500
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <14934.6890.160122.384692@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>

[Skip Montanaro]
> What do you think contributes to fileinput's relative disfavor?

Only half jokingly, because I never use it <wink>, and I don't think Fredrik
or Alex Martelli do either.  That means it rarely gets mentioned by the
c.l.py reply bots.  Plus it's not *used* anywhere in the Python
distribution, so nobody stumbles into it that way either.  Plus the docs
require more than one line to explain what it does, and get bogged down
describing the Awk-like (Perl took this from Awk) convolutions before the
simplest (one explicitly named file) case.  It *is* regularly mentioned in
the eternal "while 1:" debate, but that's it.

> This whole thread on Python's file reading performance was started
> by the eternal whine "why is Python so much slower than Perl?"

No, it started with Guido's objections to Jeff's xreadlines patch.  I
dragged Perl into it -- because, like it or not, that was the right thing to
do <wink>.

> which really means why is
>
>    line = f.readline()
>    while line:
>       process(line)
>       line = f.readline()
>
> so much slower than whatever that thing is in Perl that everybody
> uses as the be-all-end-all performance benchmark (something with
> <> in it).

"<FILE>" is simply Perl's way of spelling Python's FILE.readline() (and
FILE.readlines(), when <FILE> appears in an array context; and FILE.read()
when Perl's Awkish "record separator" is disabled; and ...).  "<>" without
an explicit filehandle does all the inherited-from-Awk magic with argv, else
that stuff doesn't come into play.   "<>" (without a filehandle) seems
rarely used in Perl practice, though, *except* in support of

your_shell_prompt> some_perl_script < some_file

That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
bet *most* Perl programmers don't even know "<>" is more general than that.

> Given that fileinput is supposed to make the I/O loop in Python more
> familiar to those people wandering over from Perl (at least in part),
> you'd think that people would naturally gravitate to it.

I guess you didn't actually read the timing results <wink>.  Really, it's
been an outrageously slow way to do input.  That's better now, and I'm much
more likely now than I used to be to use

    for line in fileinput.input('file'):

instead of

    f = open('file')
    while 1:
        line = f.readline()
        if not line:
            break

The relative attraction of the former is obvious if it's reasonably quick.
I don't really have any use for the Awk complications (note that I'm running
on Windows, though, and the shells here don't expand wildcards -- the Awk
gimmicks are much more useful on Unix systems).
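
For comparison, here is a small sketch of what the multi-file side of
fileinput looks like in a current Python.  The temporary files and
their contents are made up for the demonstration; filename(), lineno()
and filelineno() are the module's documented bookkeeping calls.

```python
import fileinput
import os
import shutil
import tempfile

# Create two small throwaway input files.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in (("a.txt", "one\ntwo\n"), ("b.txt", "three\n")):
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

# One loop walks every file in turn and tracks where each line came from.
seen = []
with fileinput.input(files=paths) as stream:
    for line in stream:
        seen.append((os.path.basename(stream.filename()),
                     stream.lineno(),       # line number across all files
                     stream.filelineno(),   # line number within this file
                     line.rstrip("\n")))

shutil.rmtree(tmpdir)
for entry in seen:
    print(entry)
```

Called with no arguments, fileinput.input() takes its file list from
sys.argv[1:] and falls back to standard input, which is the Awk-style
behaviour discussed above.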

> Would it benefit from some exposure in the Python tutorial?

Heh -- that's a tough one.  The *simplest* case is the only one deserving of
promotion.  But in that case, Jeff's xreadlines is about as convenient and
much quicker.  I bet we'll all be afraid to change the tutorial to mention
either <0.9 wink>.

> Is it fast enough now to warrant the extra exposure?

Don't know.  It's the same speed as "while 1:" on *my* box now, but still 3x
slower than the double-loop method.

> just-whining-out-loud-ly y'rs

so-do-*you*-want-to-use-it-now?-ly y'rs  - tim




From thomas at xs4all.net  Fri Jan  5 22:19:42 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 5 Jan 2001 22:19:42 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>; from tim.one@home.com on Fri, Jan 05, 2001 at 03:46:10PM -0500
References: <14934.6890.160122.384692@beluga.mojam.com> <LNBBLJKPBEHFEDALKOLCIEAAIHAA.tim.one@home.com>
Message-ID: <20010105221942.J2467@xs4all.nl>

On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:

> "<>" (without a filehandle) seems
> rarely used in Perl practice, though, *except* in support of
> 
> your_shell_prompt> some_perl_script < some_file
> 
> That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> bet *most* Perl programmers don't even know "<>" is more general than that.

Well, I can't say anything about *most* Perl programmers, but all Perl
programmers I know (including me) know damned well what <> does, and use it
frequently. And in all the ways: no arguments meaning <STDIN>, a list of
files meaning open those files one at a time, using - to include stdin in
that list, accessing the filename and linenumber, etc. None of them can be
called newbies, though.

But then, I like using Python's fileinput, too, so maybe I'm just weird :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From ping at lfw.org  Fri Jan  5 23:01:53 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Fri, 5 Jan 2001 16:01:53 -0600 (CST)
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <20010105221942.J2467@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>

On Fri, 5 Jan 2001, Thomas Wouters wrote:

> On Fri, Jan 05, 2001 at 03:46:10PM -0500, Tim Peters wrote:
> > That is, "<>" is usually used simply as an abbreviation for <STDIN>, and I
> > bet *most* Perl programmers don't even know "<>" is more general than that.
> 
> Well, I can't say anything about *most* Perl programmers, but all Perl
> programmers I know (including me) know damned well what <> does, and use it
> frequently. And in all the ways: no arguments meaning <STDIN>, a list of
> files meaning open those files one at a time, using - to include stdin in
> that list, accessing the filename and linenumber, etc.

I was just about to chime in and say the same thing.  I don't even
program in Perl any more, and I still remember all the ways that <> works.

For text-processing scripts, it's unbeatable.  It does pretty much
exactly everything you want, and the idiom

    while (<>) {
        ...
    }

is simple, quickly learned, frequently used, and instantly recognizable.

    import sys
    if len(sys.argv) > 1:
        file = open(sys.argv[1])
    else:
        file = sys.stdin
    while 1:
        line = file.readline()
        if not line:
            break
        ...

is much more complex, harder to explain, harder to learn, and runs slower.

I have two separate suggestions:

    1.  Include 'sys' in builtins.  It's silly to have to 'import sys'
        just to be able to see sys.argv and sys.stdin.

    2.  Put fileinput.input() in sys.

With both, the while (<>) idiom becomes:

    for line in sys.input():
        ...
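
For comparison, here is a minimal sketch of the same loop using the
fileinput module as it exists in the library (Python 3 syntax).  The
sys.input() spelling above is only a suggestion; the helper name
number_lines is made up for illustration.

```python
import fileinput


def number_lines(paths):
    """Return 'filename:lineno: text' for every line of the named files."""
    out = []
    # An empty list of paths makes fileinput fall back to reading stdin.
    with fileinput.input(files=paths) as stream:
        for line in stream:
            out.append("%s:%d: %s" % (stream.filename(),
                                      stream.filelineno(),
                                      line.rstrip("\n")))
    return out
```

The per-file line number and current filename come for free, which is
the part of <> that a hand-written readline() loop has to rebuild.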


-- ?!ng

"This code is better than any code that doesn't work has any right to be."
    -- Roger Gregory, on Xanadu




From skip at mojam.com  Fri Jan  5 23:19:36 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 16:19:36 -0600 (CST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.11371.879059.610988@localhost.localdomain>
References: <14934.7465.360749.199433@localhost.localdomain>
	<004c01c07751$6eed84d0$e46940d5@hagrid>
	<200101052001.PAA20238@cj20424-a.reston1.va.home.com>
	<14934.11371.879059.610988@localhost.localdomain>
Message-ID: <14934.18552.749081.871226@beluga.mojam.com>

    Jeremy> If anyone is interested in experimenting with a test suite, here
    Jeremy> is a summary of the code coverage for the current regression
    Jeremy> test suite as run on my Linux box.

Speaking of which, I am still running my nightly code coverage thing (still
with warts) whose results are available at

    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Does anyone care?  Should I turn it off?

Skip



From thomas at xs4all.net  Sat Jan  6 00:18:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 00:18:58 +0100
Subject: [Python-Dev] RE: fileinput.py
In-Reply-To: <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>; from ping@lfw.org on Fri, Jan 05, 2001 at 04:01:53PM -0600
References: <20010105221942.J2467@xs4all.nl> <Pine.LNX.4.10.10101051553200.452-100000@skuld.kingmanhall.org>
Message-ID: <20010106001858.B402@xs4all.nl>

On Fri, Jan 05, 2001 at 04:01:53PM -0600, Ka-Ping Yee wrote:

>     while (<>) {
>         ...
>     }

> is simple, quickly learned, frequently used, and instantly recognizable.

>     import sys
>     if len(sys.argv) > 1:
>         file = open(sys.argv[1])
>     else:
>         file = sys.stdin
>     while 1:
>         line = file.readline()
>         if not line:
>             break
>         ...

... Except that it can take more than one filename, and will do the one
after another, and that it takes "-" as a filename for stdin. Doing it in a
script is not dead simple, unless you open up all files at once (which can
be harmful, and Perl, for one, doesn't do) or you do most of the work
fileinput does. That is why I use fileinput (and while-diamond) -- I might
not need it now, but when I do need it, it already works :)
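
A rough sketch of "most of the work fileinput does", assuming Python 3
and a made-up helper name lines_from: open the files one at a time,
treat "-" as stdin, and fall back to stdin when no names are given.

```python
import sys


def lines_from(names):
    """Yield lines from each named file in turn; '-' or no names means stdin."""
    for name in names or ["-"]:
        if name == "-":
            yield from sys.stdin
        else:
            # Only one file is open at any moment, as with Perl's <>.
            with open(name) as f:
                yield from f
```

Typical use would be "for line in lines_from(sys.argv[1:]): ...".  It
still lacks the filename and line-number bookkeeping that fileinput
keeps for you.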

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From moshez at zadka.site.co.il  Sat Jan  6 12:00:33 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat,  6 Jan 2001 13:00:33 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>

On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido at python.org> wrote:

> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Ummmmm.....why do we want this? What's wrong with the current
suggestion of using "_"? __exports__ feels somehow wrong to
me. None of the rest of Python has any access control, and
I really like that. A big -1 from me, for what it's worth.

> I like it.

I'm surprised. Why do you like that?

>  This has been asked for many times.  

So has adding curly-braces as control structure, with all due respect.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From billtut at microsoft.com  Sat Jan  6 04:43:06 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Fri, 5 Jan 2001 19:43:06 -0800 
Subject: [Python-Dev] Add __exports__ to modules
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB8637E@red-msg-07.redmond.corp.microsoft.com>

I think I'm with Moshe on this one: what's wrong with just using underscores
(__) to play the hiding game?

Here's my silly language suggestion for this week:

with self:
  .bar = foo
  bar.blah = .fubar
  .bar = .bar + 1
  # etc....

Bill



From skip at mojam.com  Sat Jan  6 05:15:12 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 5 Jan 2001 22:15:12 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.39888.908416.983794@beluga.mojam.com>

    > On Fri, 05 Jan 2001 15:14:41 -0500, Guido van Rossum <guido at python.org> wrote:
    > Please have a look at this SF patch:
    > 
    > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
    > 
    > This implements control over which names defined in a module are
    > externally visible: if there's a variable __exports__ in the module,
    > it is a list of identifiers, and any access from outside the module to
    > names not in the list is disallowed.  This affects access using the
    > getattr and setattr protocols (which raise AttributeError for
    > disallowed names), as well as "from M import v" (which raises
    > ImportError).

I have to agree with Moshe.  If __exports__ is implemented for modules we'll
have multiple, different access control mechanisms for different things,
some of which thoughtful programmers would be able to get around, some of
which they wouldn't.  Here are the ways I'm aware of to control attribute
visibility (there may be others - I don't usually delve too deeply into this
stuff):

  * preface module globals with "_": This just prevents those globals from
    being added to the current namespace when a programmer executes "from
    module import *".  Programmers can work around this by attribute access
    through the module object or by explicitly importing it: "from module
    import _foo" works, yes?

  * preface class or instance attributes with "__":  This just mangles the
    name by prefacing the visible name with _<classname>.  The programmer
    can still access it by knowing the simple name mangling rule.

In both cases the programmer can still get at the attribute value when
necessary.
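
A small sketch of both mechanisms, written for present-day Python 3.
The names Account and demo_mod are hypothetical, and the module is
built in memory only so that the example is self-contained.

```python
import sys
import types


class Account:
    def __init__(self):
        self.__pin = 1234          # stored on the instance as _Account__pin


acct = Account()
mangled_value = acct._Account__pin  # the mangled name is reachable from outside

# A throwaway module object standing in for a real module on disk.
demo_mod = types.ModuleType("demo_mod")
demo_mod.public = "visible"
demo_mod._private = "hidden from import *"
sys.modules["demo_mod"] = demo_mod

star_ns = {}
exec("from demo_mod import *", star_ns)             # skips names starting with "_"
explicit_ns = {}
exec("from demo_mod import _private", explicit_ns)  # asking by name still works
```

So yes, "from module import _foo" works; the leading underscore only
affects the star form.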

If you were to add some sort of access control to module globals, I would
have thought it would have been along the same lines as the existing
mechanisms in place to "hide" class/instance attributes.  Would it be
possible (or desirable) to add the name mangling restriction to module
globals as an alternative to this more restrictive implementation?  What
about the chances that class/instance attribute hiding will get more
restrictive in the future?  Finally, are the motivations for wanting to
restrict access to module globals and class/instance attributes that much
different from one another that they call for fundamentally different
mechanisms?

Skip



From barry at digicool.com  Sat Jan  6 06:15:20 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 00:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <14934.43496.322436.612746@anthem.wooz.org>

I'm -0 on this, largely for the reasons already brought up: if modules
grow __exports__ then there will be pressure to add it to classes, and
modules already have a limited version of access control through
leading underscore names.

I might be more positive on the addition if __exports__ were added to
classes, because at least there'd be a consistently stronger fence
added to name access rules that prevented even consenting adults from
fiddling with the naughty bits.

-Barry




From nas at arctrix.com  Sat Jan  6 00:20:58 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 5 Jan 2001 15:20:58 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14934.43496.322436.612746@anthem.wooz.org>; from barry@digicool.com on Sat, Jan 06, 2001 at 12:15:20AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <14934.43496.322436.612746@anthem.wooz.org>
Message-ID: <20010105152058.A6016@glacier.fnational.com>

On Sat, Jan 06, 2001 at 12:15:20AM -0500, Barry A. Warsaw wrote:
> I might be more positive on the addition if __exports__ were added to
> classes, because at least there'd be a consistently stronger fence
> added to name access rules that prevented even consenting adults from
> fiddling with the naughty bits.

I think you, Skip and Moshe are missing a big advantage of having
the __exports__ mechanism.  It should allow some attribute access
inside of modules to become faster (like LOAD_FAST for locals).
I think that optimization could be implemented without too much
difficulty.  I've never channeled Guido before so I could be off
the mark.  If the only advantage is encapsulation then I'm -0.

  Neil



From barry at digicool.com  Sat Jan  6 08:09:31 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 02:09:31 -0500
Subject: [Python-Dev] PEP 232 update and patch
Message-ID: <14934.50347.851118.581484@anthem.wooz.org>


I've updated PEP 232, function attributes, and uploaded a patch to SF.
I couldn't coax cvs diff into including the new files
Lib/test/test_funcattrs.py and Lib/test/output/test_funcattrs so I'll
attach them below.

PEP 232:
   http://python.sourceforge.net/peps/pep-0232.html

SF patch #103123:
   http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

Enjoy,
-Barry

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_funcattrs.py
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010106/bf1d0513/attachment-0001.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test_funcattrs
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010106/bf1d0513/attachment-0001.asc>

From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 11:06:49 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 11:06:49 +0100
Subject: [Python-Dev] PEP 208 comment
Message-ID: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>

I just studied PEP 208 for the first time. Overall, it seems all
natural and nice, but there is one aspect I'd like to see changed:
the naming of the type flag.

Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
program should be called "new". The flag will still be there five
years from now, but it won't be new anymore. Also, while the flag
indicates that the style of the numbers is new, it does not say what it
does. So I propose to rename it; if nobody finds a better name, I
propose to call it Py_TPFLAGS_UNCOERCED.

Regards,
Martin



From thomas at xs4all.net  Sat Jan  6 13:52:19 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 13:52:19 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 11:06:49AM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <20010106135219.L2467@xs4all.nl>

On Sat, Jan 06, 2001 at 11:06:49AM +0100, Martin v. Loewis wrote:

> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that the style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Wrong name. The TPFLAGs only indicate whether a struct is large enough to
contain a particular member, not whether that member is going to contain or
do anything. 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate
to me.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 14:36:39 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 14:36:39 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <20010106135219.L2467@xs4all.nl> (message from Thomas Wouters on
	Sat, 6 Jan 2001 13:52:19 +0100)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl>
Message-ID: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>

> Wrong name. The TPFLAGs only indicate whether a struct is large enough to
> contain a particular member, not whether that member is going to contain or
> do anything. 

That may have been the original intention; *this* specific flag is not
of that kind. Please look at abstract.c:binary_op1, which has

	if (v->ob_type->tp_as_number != NULL && NEW_STYLE_NUMBER(v)) {
		slot = NB_BINOP(v->ob_type->tp_as_number, op_slot);
		if (*slot) {
			x = (*slot)(v, w);
			if (x != Py_NotImplemented) {
				return x;
			}
			Py_DECREF(x); /* can't do it */
		}
		if (v->ob_type == w->ob_type) {
			goto binop_error;
		}
	}

Here, no additional member was added: there always was tp_as_number,
and that also supported all possible op_slot values. What is new here
is that the slot may be called even if v and w have different types;
that was not allowed before the PEP 208 changes. Yet it tests for
NEW_STYLE_NUMBER(v), which is

PyType_HasFeature((o)->ob_type, Py_TPFLAGS_NEWSTYLENUMBER)

So the presence of this flag is indeed a promise that a specific
member will do something that it normally wouldn't do.
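
For readers more at home in Python than in abstract.c, a rough
Python-level analogue of the protocol, not the PEP 208 implementation
itself: a binary method receives operands of different types, checks
them itself, and returns NotImplemented when it cannot cope.  The class
Metres is made up for illustration (Python 3 syntax).

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # No coercion happens first: the method sees the other operand
        # as-is and has to check its type.
        if isinstance(other, Metres):
            return Metres(self.value + other.value)
        if isinstance(other, (int, float)):
            return Metres(self.value + other)
        return NotImplemented      # "can't do it"; let the other operand try

    __radd__ = __add__             # addition is symmetric for this type
```

If both operands decline, the interpreter raises TypeError, which
corresponds to the binop_error path in the C code.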

> 'Py_TPFLAGS_HASCOERCE' or some such would seem more appropriate to
> me.

Well, all numbers still have coercion - it just may not be used if the
flag is present. It's not a matter of having or not having something
(well, only the "new style" numbers may have nb_cmp, but calling it
Py_TPFLAGS_HAS_NB_CMP would be beside the point, IMO).

Anyway, I don't want to defend my version too much - I just want to
request that the current name is changed to *something* more
descriptive.

Regards,
Martin



From skip at mojam.com  Sat Jan  6 15:40:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:40:30 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010105152058.A6016@glacier.fnational.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<14934.43496.322436.612746@anthem.wooz.org>
	<20010105152058.A6016@glacier.fnational.com>
Message-ID: <14935.11870.360839.235102@beluga.mojam.com>

    Neil> I think you, Skip and Moshe are missing a big advantage of having
    Neil> the __exports__ mechanism.  It should allow some attribute access
    Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
    Neil> think that optimization could be implemented without too much
    Neil> difficulty.

True enough, that hadn't occurred to me.  Knowing that now, I still don't
think consistency of the interface should suffer as a result of
under-the-covers performance gains.

Skip




From skip at mojam.com  Sat Jan  6 15:42:25 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 08:42:25 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
In-Reply-To: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
Message-ID: <14935.11985.972526.108391@beluga.mojam.com>

Oooo...  I tried to check out Barry's function attribute patch at

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103123&group_id=5470

and got

    Fatal error: Call to a member function on a non-object in
    /usr/local/htdocs/alexandria/www/patch/index.php on line 55

in response.  Any idea whazzup?

Skip



From akuchlin at cnri.reston.va.us  Sat Jan  6 15:47:59 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Sat, 6 Jan 2001 09:47:59 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14934.18552.749081.871226@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 05, 2001 at 04:19:36PM -0600
References: <14934.7465.360749.199433@localhost.localdomain> <004c01c07751$6eed84d0$e46940d5@hagrid> <200101052001.PAA20238@cj20424-a.reston1.va.home.com> <14934.11371.879059.610988@localhost.localdomain> <14934.18552.749081.871226@beluga.mojam.com>
Message-ID: <20010106094759.A13723@newcnri.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 04:19:36PM -0600, Skip Montanaro wrote:
>Speaking of which, I am still running my nightly code coverage thing (still
>with warts) whose results are available at
>    http://musi-cal.mojam.com/~skip/python/Python/dist/src/

Add a link to it from the Python development pages on SourceForge; I
suspect much of the problem is that people don't remember the URL for
it, and don't want to dig through the archives to find it.

--amk




From mal at lemburg.com  Sat Jan  6 16:15:27 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 06 Jan 2001 16:15:27 +0100
Subject: [Python-Dev] PEP 208 comment
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de>
Message-ID: <3A57368F.FC01F78@lemburg.com>

"Martin v. Loewis" wrote:
> 
> I just studied PEP 208 for the first time. Overall, it seems all
> natural and nice, but there is one aspect I'd like to see changed:
> the naming of the type flag.
> 
> Currently, it is called Py_TPFLAGS_NEWSTYLENUMBER. IMHO, nothing in a
> program should be called "new". The flag will still be there five
> years from now, but it won't be new anymore. Also, while the flag
> indicates that the style of the numbers is new, it does not say what it
> does. So I propose to rename it; if nobody finds a better name, I
> propose to call it Py_TPFLAGS_UNCOERCED.

Given that the design could well be applied to other slots as
well, I think you've got a point there. The idea behind the
flag was to signal that slots will no longer make object type
assumptions which they could previously. Right now, only numeric
types support this feature. In the future I could imagine
strings and other types involving coercion would also want
to use the feature.

Given this design idea, how about calling the flag
Py_TPFLAGS_CHECKTYPES ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Sat Jan  6 16:35:20 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 09:35:20 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
Message-ID: <14935.15160.130742.390323@beluga.mojam.com>

You know, I thought of something (which was probably already obvious to the
rest of you) while perusing Barry's patch.  Attaching function attributes to
unbound methods could really function like C++ static data members.  You'd
have to write accessor functions to make setting the attributes look clean,
but that wouldn't be all bad.  Precisely because you couldn't modify them
through the bound method, there'd be no chance you could make the mistake of
modifying them that way and having them transmogrify into instance
attributes.

Here's a quick example:

    class C:
      def __init__(self):
        self.just_resting()
      __init__.howmany = 0

      def __del__(self):
        self.hes_dead()

      def hes_dead(self):
        C.__init__.howmany -= 1

      def just_resting(self):
        C.__init__.howmany += 1

      def howmany(self):
        return C.__init__.howmany

    def howmany():
        return C.__init__.howmany

    c = C()
    print c.howmany()
    d = C()
    print d.howmany()
    del c
    print d.howmany()

After applying Barry's patch, if I execute this script from the command line
it displays

    1
    2
    1

as one would expect, but then catches an attribute error during cleanup:

    Exception exceptions.AttributeError: "'None' object has no attribute
    '__init__'" in <method C.__del__ of C instance at 0x80ffc14> ignored

If I add "del d" to the end of the script the exception disappears.  I
suspect there is a cleanup order problem of some sort.  It seems like C is
getting reclaimed before d (not possible), or that d's __class__ attribute
is set to None before its __del__ method is called.  Is this a known problem
or something introduced by Barry's patch?

Skip




From barry at digicool.com  Sat Jan  6 17:09:47 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 6 Jan 2001 11:09:47 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #103123] PEP 232 implementation (function attributes)
References: <E14En6H-0000ol-00@usw-sf-web1.sourceforge.net>
	<14935.11985.972526.108391@beluga.mojam.com>
Message-ID: <14935.17227.634808.132783@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> and got

    |     Fatal error: Call to a member function on a non-object in
    |     /usr/local/htdocs/alexandria/www/patch/index.php on line 55

    SM> in response.  Any idea whazzup?

I got a similar error on SF when I tried to find my patch on the
patches page.  I still think the patch manager just gives you no way
to see all the patches when there's more than what fits on one page.
The error dropped a cookie in my lap that logged me out too.

After I logged in again, it all seemed to work.

-Barry




From martin at loewis.home.cs.tu-berlin.de  Sat Jan  6 16:20:51 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 6 Jan 2001 16:20:51 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <3A57368F.FC01F78@lemburg.com> (mal@lemburg.com)
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <3A57368F.FC01F78@lemburg.com>
Message-ID: <200101061520.f06FKpu03218@mira.informatik.hu-berlin.de>

> Given this design idea, how about calling the flag
> Py_TPFLAGS_CHECKTYPES ?!

Sounds good to me.

Martin



From thomas at xs4all.net  Sat Jan  6 17:47:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 6 Jan 2001 17:47:24 +0100
Subject: [Python-Dev] PEP 208 comment
In-Reply-To: <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>; from martin@loewis.home.cs.tu-berlin.de on Sat, Jan 06, 2001 at 02:36:39PM +0100
References: <200101061006.f06A6nn01722@mira.informatik.hu-berlin.de> <20010106135219.L2467@xs4all.nl> <200101061336.f06DadP02895@mira.informatik.hu-berlin.de>
Message-ID: <20010106174724.M2467@xs4all.nl>

On Sat, Jan 06, 2001 at 02:36:39PM +0100, Martin v. Loewis wrote:

> That may have been the original intention; *this* specific flag is not
> of that kind. Please look at abstract.c:binary_op1, which has

You're right, I stand corrected, I retract my proposal :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Sat Jan  6 23:05:23 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:05:23 -0500
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: Your message of "Sat, 06 Jan 2001 09:35:20 CST."
             <14935.15160.130742.390323@beluga.mojam.com> 
References: <14935.15160.130742.390323@beluga.mojam.com> 
Message-ID: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>

> You know, I thought of something (which was probably already obvious to the
> rest of you) while perusing Barry's patch.  Attaching function attributes to
> unbound methods could really function like C++ static data members.  You'd
> have to write accessor functions to make setting the attributes look clean,
> but that wouldn't be all bad.  Precisely because you couldn't modify them
> through the bound method, there'd be no chance you could make the mistake of
> modifying them that way and having them transmogrify into instance
> attributes.
> 
> Here's a quick example:
> 
>     class C:
>       def __init__(self):
> 	self.just_resting()
>       __init__.howmany = 0
> 
>       def __del__(self):
> 	self.hes_dead()
> 
>       def hes_dead(self):
> 	C.__init__.howmany -= 1
> 
>       def just_resting(self):
> 	C.__init__.howmany += 1
> 
>       def howmany(self):
> 	return C.__init__.howmany
> 
>     def howmany():
> 	return C.__init__.howmany
> 
>     c = C()
>     print c.howmany()
>     d = C()
>     print d.howmany()
>     del c
>     print d.howmany()

Skip, I don't find this better than the existing solution, which uses
C._howmany instead of C.__init__.howmany.

True, you can access it as self._howmany and if you assign to
self._howmany you'd transform it into an instance attribute -- but
that falls in the "then don't do that" category.
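
A minimal sketch of that existing solution in Python 3 syntax, with a
made-up helper shadow_demo showing the "then don't do that" case.  The
counter only drops promptly on delete under CPython's reference
counting, which is an assumption about the interpreter.

```python
class C:
    _howmany = 0                 # one counter, shared through the class

    def __init__(self):
        C._howmany += 1

    def __del__(self):
        C._howmany -= 1

    def howmany(self):
        return C._howmany


def shadow_demo():
    """Assign through the instance and report what each name now sees."""
    a = C()
    a._howmany = 99              # creates an instance attribute on a only
    return a._howmany, C._howmany
```

After the assignment, a._howmany reads 99 while C._howmany is untouched,
so the shared counter stays correct and only the instance's view is
wrong.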

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat Jan  6 23:14:44 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:14:44 -0500
Subject: [Python-Dev] Rehabilitating fgets
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>

[Guido]
> ...
> Unfortunately we can't use fgets(), even if it were faster than
> getline(), because it doesn't tell how many characters it read.

Let's think about that a little harder, because it appears to be our only
hope on Windows (the MS fgets isn't optimized like the Perl inner loop, but
it does lock/unlock the stream only at routine entry/exit, and uses a hidden
non-locking (== much faster) variant of getc in the guts -- we've seen that
the "locking" part of MS getc accounts for 17 of 30 seconds in my test
case).

> On files containing null bytes, readline() is supposed to treat
> these like any other character;

fgets does too (at least it does on Windows, and I believe that's std
behavior).  The problem is that it also makes up a null byte on its own.

> If your input is "abc\0def\nxyz\n", the first readline() call
> should return "abc\0def\n".

Yes.

> But with fgets(), you're left to look in the returned buffer for
> a null byte,

Also yes.  But suppose I search "from the right", and ensure the buffer is
free of null bytes before the fgets.  For your input file above, fgets
overwrites the initial 9 bytes of the buffer (assuming the buffer is at
least 9 bytes long ...) with

    "abc\0def\n\0"

and there's no problem if I search from the right.

> and there's no way (in general) to distinguish this result from
> an input file that only consisted of the three characters "abc".

As above, I'm not convinced of that.  The input file "abc" would overwrite
the first four bytes of the buffer with

    "abc\0"

and leave the tail end alone (well, the MS fgets leaves the tail alone,
although I'm not sure ANSI C guarantees that).

Of course I've *read* any number of Unix(tm) FAQs that also claim it's
impossible, but I never believed them either <wink>.

This extra buffer fiddling is surely an expense I don't want to pay, but the
timing evidence on Windows so far says that I can probably search and/or
copy the whole buffer 100 times and still be faster than enduring the
threadsafe getc.

Am I missing something obvious?





From guido at python.org  Sat Jan  6 23:33:00 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 06 Jan 2001 17:33:00 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: Your message of "Sat, 06 Jan 2001 17:14:44 EST."
             <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com> 
References: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com> 
Message-ID: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>

[Tim suggests to use fgets(), preparing the buffer with non-null
bytes, and searching for a null byte from the right.]

If this is really sufficiently fast, I'd say, go for it.  Looks
bullet-proof as long as the source code to MSVCRT doesn't change. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim_one at email.msn.com  Sat Jan  6 23:34:42 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 17:34:42 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBOIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>

[Tim, pondering]
> ... But suppose I search "from the right", and ensure the buffer is
> free of null bytes before the fgets.

Even better, suppose I ensure the buffer is free of both null bytes and
newlines before the fgets; then if I search from the *left* for a newline
and find one, it must be that fgets found a line and it ends right there,
and this should usually obtain.  There's no need to search from the right
unless I don't find a newline ...
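
The search logic can be tried out without a C compiler.  What follows is
a simulation in Python, not the fileobject.c code: fake_fgets and
read_chunk are made-up names, and fake_fgets assumes (as the MS fgets
does) that the tail of the buffer beyond the terminating null is left
alone.

```python
FILL = 0x01  # any byte that is neither a null byte nor a newline


def fake_fgets(buf, stream):
    """Mimic C fgets: fill buf up to and including a newline, then add a null.

    Returns False if EOF is hit before any byte is read.
    """
    limit = len(buf) - 1         # leave room for the terminating null
    n = 0
    while n < limit:
        ch = stream.read(1)
        if not ch:
            break
        buf[n] = ch[0]
        n += 1
        if ch == b"\n":
            break
    if n == 0:
        return False
    buf[n] = 0                   # the null byte fgets makes up on its own
    return True


def read_chunk(stream, size=16):
    """Return exactly the bytes fgets read, embedded null bytes included."""
    buf = bytearray([FILL]) * size        # no nulls, no newlines beforehand
    if not fake_fgets(buf, stream):
        return b""
    nl = buf.find(b"\n")                  # search from the left first
    if nl >= 0:
        return bytes(buf[:nl + 1])        # the line ends right there
    end = buf.rfind(b"\0")                # else the rightmost null is fgets's
    return bytes(buf[:end])
```

Fed "abc\0def\nxyz\n" through an io.BytesIO, the first call returns
"abc\0def\n" and the second "xyz\n"; a file holding only "abc" comes
back as the three bytes "abc".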





From skip at mojam.com  Sun Jan  7 02:15:08 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 19:15:08 -0600 (CST)
Subject: [Python-Dev] function attributes as "true" class attributes & reclamation error
In-Reply-To: <200101062205.RAA23603@cj20424-a.reston1.va.home.com>
References: <14935.15160.130742.390323@beluga.mojam.com>
	<200101062205.RAA23603@cj20424-a.reston1.va.home.com>
Message-ID: <14935.49948.574427.668588@beluga.mojam.com>

    Skip> Attaching function attributes to unbound methods could really
    Skip> function like C++ static data members....

    Guido> Skip, I don't find this better than the existing solution, which
    Guido> uses C._howmany instead of C.__init__.howmany.

It was more a "hey, I never thought of it quite that way" than a "hey, I
think this would be a great new idiom".  In fact, I believe the more
important part of my note was the bit about the attribute error on exit.

I'm sure function attributes will attract their fair share of abuse. ;-)

Skip





From tim_one at email.msn.com  Sun Jan  7 04:16:31 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:16:31 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
Message-ID: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>

I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_builtin fails because raw_input() isn't stripping a trailing newline.
I've got my own code in this area that *may* be to blame, but I don't see
how it could be.  I note that fileobject.c's new function get_line_raw has
the comment

/* Internal routine to get a line for raw_input():
   strip trailing '\n', raise EOFError if EOF reached immediately
*/

but the code doesn't look for a trailing newline (let alone strip one).





From tim_one at email.msn.com  Sun Jan  7 04:33:02 2001
From: tim_one at email.msn.com (Tim Peters)
Date: Sat, 6 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <200101062233.RAA23942@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>

> [Tim suggests to use fgets(), preparing the buffer with non-null
> bytes, and searching for a null byte from the right.]

[Guido]
> If this is really sufficiently fast, I'd say, go for it.  Looks
> bullet-proof as long as the source code to MSVCRT doesn't change. :-)

Surprise?  Despite all the memsets, memchrs (looking for a newline), and
one-at-a-time backward searches (looking for a null byte), it's a huge win
on Windows:

total 117615824 chars and 3237568 lines
readlines_sizehint    9.550  9.578
using_fileinput      28.790 28.781
while_readline       13.120 13.134

The last one was 30.5 seconds before the fgets hackery.
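
The benchmark script itself is not shown here. A minimal sketch of the three line-reading idioms that the row labels appear to name, assuming that reading of the table; the function names mirror the labels, while the 64 KB size hint and the choice to count lines rather than process them are illustrative assumptions:

```python
import fileinput

def while_readline(path):
    # One readline() call per line; an empty string signals EOF.
    count = 0
    f = open(path)
    try:
        while 1:
            line = f.readline()
            if not line:
                break
            count += 1
    finally:
        f.close()
    return count

def readlines_sizehint(path, sizehint=64 * 1024):
    # Fetch whole lines in batches of roughly sizehint bytes.
    count = 0
    f = open(path)
    try:
        while 1:
            lines = f.readlines(sizehint)
            if not lines:
                break
            count += len(lines)
    finally:
        f.close()
    return count

def using_fileinput(path):
    # Let the fileinput module drive the iteration.
    count = 0
    try:
        for line in fileinput.input([path]):
            count += 1
    finally:
        fileinput.close()
    return count
```

All three should agree on the line count for the same file; only their speed differs.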

I'll check it in tomorrow after sleeping on it (there's a large pile of
messy endcases (not only does fgets() invent a null byte, it can't tell you
whether it stopped reading due to EOF, so maybe the last line in the file
ends with 10000 null bytes + no newline + exactly lines up with a buffer
boundary -- etc); test_builtin is failing in a closely related area but
nobody would have checked in code that failed a std test <wink>; and it's
been a frustrating day all around).

i-want-my-cable-modem-back-now-ly y'rs  - tim





From esr at thyrsus.com  Sun Jan  7 05:01:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 6 Jan 2001 23:01:25 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>; from tim_one@email.msn.com on Sat, Jan 06, 2001 at 10:33:02PM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>
Message-ID: <20010106230125.A29058@thyrsus.com>

Tim Peters <tim_one at email.msn.com>:
> > [Tim suggests to use fgets(), preparing the buffer with non-null
> > bytes, and searching for a null byte from the right.]

No, I haven't forgotten about the curses autoconfig stuff.  But...

This mess reminds me.  For some work I'm doing right now, it would be
very useful if there were a way to query the end-of-file status of a
file descriptor without actually doing a read.

I don't see this ability anywhere in the 2.0 API.  Questions:

1. Am I missing something obvious?

2. If the answer to 1 is that I am not, in fact, being a dumbass, what
   is the right way to support this?  The obvious alternatives are an 
   eof member (analogous to the existing `closed' member) or an eof()
   method.  I favor the latter.

3. If we agree on a design, I'm willing to implement this at least for
   Unix.  Should be a small project.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The direct use of physical force is so poor a solution to the problem of
limited resources that it is commonly employed only by small children and
great nations.
	-- David Friedman



From skip at mojam.com  Sun Jan  7 05:05:22 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 6 Jan 2001 22:05:22 -0600 (CST)
Subject: [Python-Dev] readline module seems crippled - am I missing something?
Message-ID: <14935.60162.726131.593211@beluga.mojam.com>

For a more-or-less throwaway script I'm working on I need a little input
function similar to Emacs's read-from-minibuffer, which accepts both a
prompt and an initial string for the input buffer.  Seems like I ought to be
able to whip something up using readline, but it's not happening.  GNU
readline's docs aren't the greatest, but I thought this simple script would
work:

    import readline
    readline.insert_text("default")
    x = raw_input("?")
    print x

I expected to see an editable "default" displayed after the prompt and have
x default to "default" if I just hit the return key.  I see nothing
displayed after the question mark, and x is the empty string if I just hit
return.  

This does print "default":

    readline.insert_text("default")
    x = readline.get_line_buffer()
    print x

so I know that insert_text and get_line_buffer seem to be working as
intended.  Looking at call_readline in Modules/readline.c I see nothing that
would disrupt the line buffer before the call to readline().

Am I missing something totally obvious about how GNU readline works or the
conditions under which readline is used (only at the interactive prompt?) or
is some required bit of GNU readline not exposed through Python's readline
module?

Skip



From tim.one at home.com  Sun Jan  7 11:09:02 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 7 Jan 2001 05:09:02 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010106230125.A29058@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> For some work I'm doing right now, it would be very useful if
> there were a way to query the end-of-file status of a file
> descriptor without actually doing a read.
>
> I don't see this ability anywhere in the 2.0 API.

When someone says "API", I think "C API".  In that case you can use
feof(stream) directly, or whatever the heck your platform supports for
handles (_eof(handle) on Windows, which I know is an OS you're secretly
longing to master <wink>).

I don't believe there's a way to find out from Python short of trying to
read, though.  Well, I suppose you could try to compare f.tell() to the
size, if you knew that f.tell() and "the size" made sense for f ...
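
A minimal sketch of the tell()-versus-size comparison, assuming a seekable regular file opened in binary mode that nobody else is appending to; at_eof is a hypothetical helper name, not an existing file method:

```python
import os

def at_eof(f):
    # Compare the current position with the size reported by fstat().
    # This says nothing useful for pipes, sockets or terminals.
    size = os.fstat(f.fileno()).st_size
    return f.tell() >= size
```

The answer can go stale as soon as another thread or process touches the file, which is the muddiness described further down.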

> 1. Am I missing something obvious?

I don't know!  I never asked Guido about this, and given that he's not on
vacation now I'm not allowed to channel him.  I would hazard a guess,
though, that he thinks "you do or don't get something back when you read" is
clearer than "you may or may not get something back when you read,
regardless of which answer I give you in response to .eof() -- depending".
The latter is particularly muddy in a threaded environment, even for plain
old disk files.

> 2. If the answer to 1 is that I am not, in fact, being a dumbass,
>    what is the right way to support this?  The obvious alternatives
>    are an eof member (analogous to the existing `closed' member), or
>    an eof() method.  I favor the latter.
>
> 3. If we agree on a design, I'm willing to implement this at least
>    for Unix.  Should be a small project.

I agree an .eof() method would be better than a data member.  Note that
whenever Python internals hit stream EOF today, they call clearerr(), so
simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
make sure that feof() would never be useful <0.8 wink>.

one-of-life's-little-mysteries-ly y'rs  - tim




From gstein at lyra.org  Sun Jan  7 11:46:54 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 02:46:54 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.96,2.97
In-Reply-To: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Fri, Jan 05, 2001 at 06:43:07AM -0800
References: <E14EY5D-0000pm-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010107024654.W17220@lyra.org>

On Fri, Jan 05, 2001 at 06:43:07AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv3183
> 
> Modified Files:
> 	fileobject.c 
> Log Message:
> Restructured get_line() for clarity and speed.
> 
> - The raw_input() functionality is moved to a separate function.
> 
> - Drop GNU getline() in favor of getc_unlocked(), which exists on more
>   platforms (and is even a tad faster on my system).

The "configure" tests for getline() can be punted if we won't use it any
more...

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Sun Jan  7 13:27:57 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 04:27:57 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 03:14:41PM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <20010107042757.X17220@lyra.org>

It feels wrong. Whatever happened to the "we're all adults here" mantra?

Besides people asking for it, what is a good reason *for* it to be added?

Cheers,
-g

On Fri, Jan 05, 2001 at 03:14:41PM -0500, Guido van Rossum wrote:
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).
> 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
Greg Stein, http://www.lyra.org/
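
The access rule in the quoted proposal can be approximated in pure Python with a proxy object. This is only a sketch of the described behaviour, not the SF patch (which changes the module object itself); ExportGuard is a hypothetical name, and the sketch does not cover the "from M import v" case:

```python
class ExportGuard:
    """Proxy exposing only the names listed in the module's __exports__."""

    def __init__(self, module):
        # Write through __dict__ so our own __setattr__ is not triggered.
        self.__dict__["_module"] = module

    def _check(self, name):
        exports = getattr(self._module, "__exports__", None)
        # A module without __exports__ stays fully visible.
        if exports is not None and name not in exports:
            raise AttributeError("%r is not exported" % (name,))

    def __getattr__(self, name):
        self._check(name)
        return getattr(self._module, name)

    def __setattr__(self, name, value):
        self._check(name)
        setattr(self._module, name, value)
```

Each guarded access costs one extra membership test, which mirrors the slight slowdown mentioned in the quoted text.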



From guido at python.org  Sun Jan  7 17:52:11 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 11:52:11 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sat, 06 Jan 2001 23:01:25 EST."
             <20010106230125.A29058@thyrsus.com> 
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com>  
            <20010106230125.A29058@thyrsus.com> 
Message-ID: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>

> This mess reminds me.  For some work I'm doing right now, it would be
> very useful if there were a way to query the end-of-file status of a
> file descriptor without actually doing a read.

I hope you really mean file object (== wrapper around stdio FILE
object).  A file descriptor (small little integer in Unix) doesn't
have a way to find this out.

Even for file objects, it is typically only known that there's an EOF
condition after a lowest-level read operation returned 0 bytes.  So in
effect you must still do a read in order to determine EOF status.

I just ran a small test program, and fread() appears to set the eof
status when it returns a short count.  Normally, Python's read() uses
fread() so this might be useful.  However after a readline(), you
can't know the eof status (unless the last line of the file doesn't
end in a newline).

> I don't see this ability anywhere in the 2.0 API.  Questions:
> 
> 1. Am I missing something obvious?
> 
> 2. If the answer to 1 is that I am not, in fact, being a dumbass, what
>    is the right way to support this?  The obvious alternatives are an 
>    eof member (analogous to the existing `closed' member) or an eof()
>    method.  I favor the latter.
> 
> 3. If we agree on a design, I'm willing to implement this at least for
>    Unix.  Should be a small project.

Before adding an eof() method, can you explain what your program is
trying to do?  Is it reading from a pipe or socket?  Then select() or
poll() might be useful.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan  7 19:30:32 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:30:32 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 05:09:02AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>
Message-ID: <20010107133032.F4586@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I agree an .eof() method would be better than a data member.  Note that
> whenever Python internals hit stream EOF today, they call clearerr(), so
> simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> make sure that feof() would never be useful <0.8 wink>.

That's inconvenient, but only means the internal Python state flag
that feof() would inspect would have to be checked after each read.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"...The Bill of Rights is a literal and absolute document. The First
Amendment doesn't say you have a right to speak out unless the
government has a 'compelling interest' in censoring the Internet. The
Second Amendment doesn't say you have the right to keep and bear arms
until some madman plants a bomb. The Fourth Amendment doesn't say you
have the right to be secure from search and seizure unless some FBI
agent thinks you fit the profile of a terrorist. The government has no
right to interfere with any of these freedoms under any circumstances."
	-- Harry Browne, 1996 USA presidential candidate, Libertarian Party



From esr at thyrsus.com  Sun Jan  7 19:45:41 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 13:45:41 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101071652.LAA31411@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 11:52:11AM -0500
References: <200101062233.RAA23942@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAECHIHAA.tim_one@email.msn.com> <20010106230125.A29058@thyrsus.com> <200101071652.LAA31411@cj20424-a.reston1.va.home.com>
Message-ID: <20010107134541.G4586@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > This mess reminds me.  For some work I'm doing right now, it would be
> > very useful if there were a way to query the end-of-file status of a
> > file descriptor without actually doing a read.
> 
> I hope you really mean file object (== wrapper around stdio FILE
> object).  A file descriptor (small little integer in Unix) doesn't
> have a way to find this out.

You're right, my bad.
 
> Even for file objects, it is typically only known that there's an EOF
> condition after a lowest-level read operation returned 0 bytes.  So in
> effect you must still do a read in order to determine EOF status.
> 
> I just ran a small test program, and fread() appears to set the eof
> status when it returns a short count.  Normally, Python's read() uses
> fread() so this might be useful.  However after a readline(), you
> can't know the eof status (unless the last line of the file doesn't
> end in a newline).

I considered trying a zero-length read() in Python, but this strikes me 
as inelegant even if it would work.

> Before adding an eof() method, can you explain what your program is
> trying to do?  Is it reading from a pipe or socket?  Then select() or
> poll() might be useful.

Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
it's a situation where a markup file can contain sections in two different
languages.  The design requires the first interpreter to exit on seeing
either EOF or a marker that says "switching to second language".  For
reasons too complicated to explain, it would be best if the parser for
the first language didn't simply call the second parser.

The logic I wanted to write amounts to:

while 1:
    line = fp.readline()
    if not line or line == "history":
        break
    interpret_in_language_1(line)

if not fp.feof():
    while 1:
        line = fp.readline()
        if not line:
            break
        interpret_in_language_2(line)

I just tested the zero-length-read method.  That worked.  I guess I'll
use it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy



From martin at loewis.home.cs.tu-berlin.de  Sun Jan  7 19:45:15 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 7 Jan 2001 19:45:15 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>

Authors of extension packages often find the need to auto-import some
of their modules. This is often needed for registration, e.g. a codec
author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
may need to register a search function with codecs.register. This is
currently only possible by writing into sitecustomize.py, which must
be done by the system administrator manually.

To enhance the service of site.py, I've written the patch

http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470

which treats lines in PTH files which start with "import" as
statements and executes them, instead of appending these lines to
sys.path.
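
A minimal sketch of the line handling described above, assuming the rule is simply "lines starting with import are executed, everything else is a path entry". process_pth_lines is a hypothetical helper, not the code from the patch, and real site.py processing does more (for example, checking that directories exist):

```python
import os
import sys

def process_pth_lines(lines, sitedir, path=None):
    # 'path' defaults to sys.path; pass a plain list to experiment safely.
    if path is None:
        path = sys.path
    for line in lines:
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("import"):
            # Run the statement instead of treating it as a directory name.
            exec(line, {})
            continue
        entry = os.path.join(sitedir, line)
        if entry not in path:
            path.append(entry)
    return path
```

A bare startswith("import") test would also catch a directory whose name merely begins with "import", so a stricter check on "import" followed by whitespace may be preferable.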

The patch is relatively small, but since it is an extension: Do I need
to write a PEP for it?

Regards,
Martin



From tismer at tismer.com  Sun Jan  7 19:05:21 2001
From: tismer at tismer.com (Christian Tismer)
Date: Sun, 07 Jan 2001 20:05:21 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
		<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
		<14934.43496.322436.612746@anthem.wooz.org>
		<20010105152058.A6016@glacier.fnational.com> <14935.11870.360839.235102@beluga.mojam.com>
Message-ID: <3A58AFE1.3AB619BD@tismer.com>


Skip Montanaro wrote:
> 
>     Neil> I think you, Skip and Moshe are missing a big advantage of having
>     Neil> the __exports__ mechanism.  It should allow some attribute access
>     Neil> inside of modules to become faster (like LOAD_FAST for locals).  I
>     Neil> think that optimization could be implemented without too much
>     Neil> difficulty.
> 
> True enough, that hadn't occurred to me.  Knowing that now, I still don't
> think consistency of the interface should suffer as a result of
> under-the-covers performance gains.

Ok, vice versa:
Given that we can support access control via __exports__
for modules, classes and instances as well, *and* if we
can think up a scheme that allows a LOAD_FAST like speedup
for all of these cases at the same time,
then I would say +1, otherwise -0, half-hearted solution.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at python.org  Sun Jan  7 22:13:01 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:13:01 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 13:30:32 EST."
             <20010107133032.F4586@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>  
            <20010107133032.F4586@thyrsus.com> 
Message-ID: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>

> Tim Peters <tim.one at home.com>:
> > I agree an .eof() method would be better than a data member.  Note that
> > whenever Python internals hit stream EOF today, they call clearerr(), so
> > simply adding an feof() wrapper wouldn't suffice.  Guido seemed to try to
> > make sure that feof() would never be useful <0.8 wink>.
> 
[ESR]
> That's inconvenient, but only means the internal Python state flag
> that feof() would inspect would have to be checked after each read.

This was done because some platforms set feof() when there's still a
possibility to read more (e.g. after an interactive user typed ^D),
while others don't.  It's inconvenient to get an endless stream of
EOFs from stdin when a user typed ^D to one particular prompt, so I
decided to clear the EOF status.

[ESR in a later message]
> I considered trying a zero-length read() in Python, but this strikes me 
> as inelegant even if it would work.

I doubt that a zero-length read conveys any information.  It should
return "" whether or not there is more to read!  Plus, look at the
implementation of readline() (file_readline() in
Objects/fileobject.c): it shortcuts the n == 0 case and returns an
empty string without touching the file.

[me]
> > Before adding an eof() method, can you explain what your program is
> > trying to do?  Is it reading from a pipe or socket?  Then select() or
> > poll() might be useful.

[ESR again]
> Sadly, it's exactly the wrong case.  Hmmm...omitting irrelevant details,
> it's a situation where a markup file can contain sections in two different
> languages.  The design requires the first interpreter to exit on seeing
> either EOF or a marker that says "switching to second language".  For
> reasons too complicated to explain, it would be best if the parser for
> the first language didn't simply call the second parser.
> 
> The logic I wanted to write amounts to:
> 
> while 1:
>     line = fp.readline()
>     if not line or line == "history":
>         break
>     interpret_in_language_1(line)
> 
> if not fp.feof():
>     while 1:
>         line = fp.readline()
>         if not line:
>             break
>         interpret_in_language_2(line)
> 
> I just tested the zero-length-read method.  That worked.  I guess I'll
> use it.

Bizarre (given what I know about zero-length read).  But in the above
code, you can replace "if not fp.feof()" with "if line".  In other
words, you just have to carry the state over within your program.

So, I see no reason why the logic in your program couldn't take care
of this, which in general is a better way to solve a problem than
changing the language.

Also note that in Python it's no sin to attempt to read a line even
when the file is already at EOF -- you will simply get an empty line
again.
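
A minimal sketch of carrying the state in the program, assuming line-oriented input. parse_two_sections and its two handler arguments are hypothetical names, and the marker must include its trailing newline because readline() keeps it:

```python
def parse_two_sections(fp, marker, handle_first, handle_second):
    line = fp.readline()
    while line and line != marker:
        handle_first(line)
        line = fp.readline()
    # Here line is "" only if EOF was hit; otherwise it holds the marker,
    # so "if line" plays the role of "if not fp.feof()".
    if line:
        line = fp.readline()
        while line:
            handle_second(line)
            line = fp.readline()
```

With a file containing "a", "b", "history", "c" on separate lines and marker "history\n", the first handler sees the first two lines and the second handler sees only the last.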

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sun Jan  7 22:29:46 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 7 Jan 2001 22:29:46 +0100
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com>              <20010107133032.F4586@thyrsus.com>  <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <035901c078f0$f6180f70$e46940d5@hagrid>

Guido van Rossum wrote:
> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.

and if that's too hard, just hide the state in
a class:

class FileWrapper:

    def __init__(self, file):
        self.__file = file
        self.__line = None

    def __more(self):
        # try reading another line
        if not self.__line:
            self.__line = self.__file.readline()

    def eof(self):
        self.__more()
        return not self.__line

    def readline(self):
        self.__more()
        line = self.__line
        self.__line = None
        return line

file = open("myfile.txt")

file = FileWrapper(file)

while not file.eof():
    print repr(file.readline())

</F>




From guido at python.org  Sun Jan  7 22:32:26 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 07 Jan 2001 16:32:26 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin test_charmapcodec test_pow
In-Reply-To: Your message of "Sat, 06 Jan 2001 22:16:31 EST."
             <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com> 
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com> 
Message-ID: <200101072132.QAA32627@cj20424-a.reston1.va.home.com>

> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.
> 
> test_builtin fails because raw_input() isn't stripping a trailing newline.
> I've got my own code in this area that *may* be to blame, but I don't see
> how it could be.  I note that fileobject.c's new function get_line_raw has
> the comment
> 
> /* Internal routine to get a line for raw_input():
>    strip trailing '\n', raise EOFError if EOF reached immediately
> */
> 
> but the code doesn't look for a trailing newline (let alone strip one).

My bad.  Try the latest CVS now.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan  7 23:15:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 17:15:27 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101072113.QAA32467@cj20424-a.reston1.va.home.com>; from guido@python.org on Sun, Jan 07, 2001 at 04:13:01PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>
Message-ID: <20010107171527.A5093@thyrsus.com>

Guido van Rossum <guido at python.org>:
> [ESR in a later message]
> > I considered trying a zero-length read() in Python, but this strikes me 
> > as inelegant even if it would work.
> 
> I doubt that a zero-length read conveys any information.  It should
> return "" whether or not there is more to read!

Duh.  Of course it would.  

You know, I've always been half-consciously dissatisfied with Python's
use of "" as an EOF marker, and now I know why.  It's precisely
because there's no way to distinguish these cases.  I think a zero-length
read ought to return "" and a read on EOF ought to return None.

> Bizarre (given what I know about zero-length read).  But in the above
> code, you can replace "if not fp.feof()" with "if line".  In other
> words, you just have to carry the state over within your program.
> 
> So, I see no reason why the logic in your program couldn't take care
> of this, which in general is a better way to solve a problem than
> changing the language.

OK, two objections, one practical and one (more important) esthetic:

Practical: I guess I oversimplified the code for expository purposes.
What's actually going on is that I have two parser classes both based
on shlex -- they do character-at-a-time input and don't actually
*have* accessible line buffers.

Esthetic: Yes, I can have the first parser set a flag, or return some
EOF token.  But this seems deeply wrong to me, because EOFness is not
a property of the parser but of the underlying stream object.  It
seems to me that my program ought to be able to ask the stream object
whether it's at EOF rather than carrying its own flag for that state.

In Python as it is, there's no clean way to do this.  I'd have to do a
nonzero-length read to test it (I failed to check the right alternate
case before when I tried zero-length).  That's really broken.  What if
neither the underlying stream nor the parser supports pushback?

Do you see now why I think this is a more general issue?

Now, another and more general way to handle this would be to make an
equivalent of the old FIONREAD ioctl part of Python's standard set of
file object methods -- a way to ask "how many bytes are ready to be
read in this stream?"

Trivial to make it work for plain files, of course.  Harder to make it
work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
results of the fstat st_size field (corrected for the current seek address
if you're reading a plain file) would be a good start.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.



From tismer at tismer.com  Sun Jan  7 23:37:55 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 00:37:55 +0200
Subject: [Python-Dev] ANN: Stackless Python 2.0
Message-ID: <3A58EFC3.5A722FF0@tismer.com>

Dear community,

I'm happy to announce that

		Stackless Python 2.0

is finally ready and available for download.

Stackless Python for Python 1.5.2+ also got some minor
enhancements. Both versions are available as Win32
installer files here:

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc15-win32.exe

Speed: Stackless Python for Python 2.0 is again a bit faster
than the original. This time even better: About 9-10 percent.
I have to say that optimization was much harder this time.
My speed patches are now done by a Python script, which will
make maintenance and diff reading much easier in the future.

There is now also a bit of example code available, like
the uthread9.py Microthreads module from Will Ware, Just van Rossum,
and Mike Fletcher.

Source code and an update to the website will become available in
the next days.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Mon Jan  8 01:26:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 01:26:00 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin 
 test_charmapcodec test_pow
References: <LNBBLJKPBEHFEDALKOLCMECFIHAA.tim_one@email.msn.com>
Message-ID: <3A590918.E90031AA@lemburg.com>

Tim Peters wrote:
> 
> I'm pretty sure the test_pow and test_charmapcodec failures aren't my doing.

test_charmapcodec is my fault... I should run the tests in a
clean room environment before checkin: my PYTHONPATH picked up
some other file which it was not supposed to do.

I'll fix it next week.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan  8 05:13:26 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 7 Jan 2001 23:13:26 -0500
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>

The "Win32" readline() hack is now checked in, but there's really nothing
Win32-specific about it anymore.  It makes one mild assumption about what
the C std doesn't clearly address but may have intended:  that in case of a
non-NULL return, fgets doesn't overwrite any of the buffer positions beyond
the terminating null byte (the std is clear that it doesn't overwrite
anything at all in case of a NULL-because-EOF return, but I can't say
whether they're pointing that out as a consequence, or pointing that out as
an exception).

I'm curious about how it performs (relative to the getc_unlocked hack) on
other platforms.  If you'd like to try that, just recompile fileobject.c
with

    USE_MS_GETLINE_HACK

#define'd.  It should *work* on any platform with fgets() meeting the
assumption.  The new test_bufio.py std test gives it a pretty good
correctness workout, if you're worried about that.
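
The real code is C in fileobject.c; the buffer trick can still be illustrated with a small Python simulation. This is a sketch under stated assumptions: fake_fgets and read_chunk are hypothetical names, fake_fgets only imitates the fgets() behaviour assumed above (nothing past the terminating null is overwritten), and the fill byte and buffer size are arbitrary:

```python
def fake_fgets(buf, data, pos):
    """Imitate C fgets(buf, len(buf), stream) reading from data[pos:].

    Returns the new position, or None at EOF with nothing read.
    """
    room = len(buf) - 1              # one slot is reserved for the null
    if pos >= len(data):
        return None                  # EOF: the buffer is left untouched
    end = pos + room
    nl = data.find(b"\n", pos, end)
    if nl != -1:
        end = nl + 1                 # stop just after the newline
    end = min(end, len(data))
    n = end - pos
    buf[:n] = data[pos:end]
    buf[n] = 0                       # terminating null; later bytes untouched
    return end

def read_chunk(data, pos, bufsize=16):
    buf = bytearray(b"\n" * bufsize)  # prepare the buffer with non-null bytes
    newpos = fake_fgets(buf, data, pos)
    if newpos is None:
        return None, pos
    # Search from the right: the rightmost null is the one fgets appended,
    # even when the data itself contains null bytes.
    n = buf.rindex(b"\0")
    return bytes(buf[:n]), newpos
```

A strlen()-style scan from the left would stop at the first embedded null and silently truncate the line; searching from the right avoids that.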




From esr at snark.thyrsus.com  Mon Jan  8 05:16:53 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Sun, 7 Jan 2001 23:16:53 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
Message-ID: <200101080416.f084GrM10912@snark.thyrsus.com>

Setting things up so curses is autoconfigured into the default build
if your system has it in the expected places turned out to be dead
easy.  Some clever person (the BDFL himself?) wrote the build process
so that there is *already* a Setup.config.in that gets configure
expansions done on it, with the generated Setup.config used when
makesetup does its magic.

As a bonus, I've also added autoconfiguration for readline.  A small
detail, but one which I suspect many people building their own Pythons
frequently trip over.

The technique generalizes easily.  The archetype for a facility for
autoconfiguring libfoo with a Python extension foo.c if it's present
has just two steps:

Add this to Modules/Setup.config.in:

@USE_FOO_MODULE@foo foo.c -lfoo

Add this to configure.in:

# This is used to generate Setup.config
AC_SUBST(USE_FOO_MODULE)
AC_CHECK_LIB(foo, random_foo_function, 
	[USE_FOO_MODULE=""],
	[USE_FOO_MODULE="#"])
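
The effect of the substitution can be sketched in a few lines of Python. This only imitates what configure does to an @NAME@ placeholder; expand_setup_line is a hypothetical helper, not part of the build.

```python
def expand_setup_line(template, substitutions):
    """Replace each @NAME@ placeholder with its value, as configure
    does for variables registered with AC_SUBST."""
    for name, value in substitutions.items():
        template = template.replace("@%s@" % name, value)
    return template


line = "@USE_FOO_MODULE@foo foo.c -lfoo"

# libfoo found: the placeholder expands to nothing, so the line is active.
print(expand_setup_line(line, {"USE_FOO_MODULE": ""}))   # foo foo.c -lfoo

# libfoo missing: the placeholder expands to "#", commenting the line out.
print(expand_setup_line(line, {"USE_FOO_MODULE": "#"}))  # #foo foo.c -lfoo
```

The trick is that the placeholder sits flush against the module name, so an empty expansion leaves a normal Setup line and a "#" expansion disables it.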

(Apologies for the lack of description with the patch.  I tripped over
a SourceForge interface bug.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The possession of arms by the people is the ultimate warrant
that government governs only with the consent of the governed.
        -- Jeff Snyder



From tim.one at home.com  Mon Jan  8 06:34:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 00:34:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <3A590918.E90031AA@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>

An update:  test_builtin works again (thanks, Guido!), and test_charmapcodec
will "next week" (thanks, MAL!).

Still unknown (to me):  is the test_pow failure unique to Windows?  One
response from a Unix(tm) geek would settle that.




From nas at arctrix.com  Sun Jan  7 23:59:49 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 14:59:49 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 12:34:20AM -0500
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com>
Message-ID: <20010107145949.A14166@glacier.fnational.com>

On Mon, Jan 08, 2001 at 12:34:20AM -0500, Tim Peters wrote:
> Still unknown (to me):  is the test_pow failure unique to Windows?  One
> response from a Unix(tm) geek would settle that.

It works fine for me on Linux.  I thought I tested on Windows
before checking in the coerce patch.  I'll try again.

  Neil



From nas at arctrix.com  Mon Jan  8 00:29:14 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:29:14 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107145949.A14166@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 02:59:49PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com>
Message-ID: <20010107152914.A14228@glacier.fnational.com>

On Sun, Jan 07, 2001 at 02:59:49PM -0800, Neil Schemenauer wrote:
> It works fine for me on Linux.  I thought I tested on Windows
> before checking in the coerce patch.  I'll try again.

Weird. rt.bat does not run the test_pow script.  If I run
"regrtest test_pow" then the test fails.  It could be a problem
with line endings (I copied the source from a Unix CVS checkout).

Anyhow, I found the bug.  I don't know how test_pow was passing
under Linux.  Time to reboot again.

  Neil



From tim.one at home.com  Mon Jan  8 07:39:20 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 01:39:20 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>

[NeilS]
> Weird. rt.bat does not run the test_pow script.

Works for me, else I never would have noticed <wink>.  Also works for me in
single-test mode:

C:\Code\python\dist\src\PCbuild>rt test_pow

C:\Code\python\dist\src\PCbuild>python ../lib/test/regrtest.py test_pow
test_pow
The actual stdout doesn't match the expected stdout.
This much did match (between asterisk lines):
**********************************************************************
test_pow
Testing integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing long integer mode...
    Testing 2-argument pow() function...
    Testing 3-argument pow() function...
Testing floating point mode...
    Testing 3-argument pow() function...
The number in both columns should match.
3 3
-5 -5
-1 -1
5 5
-3 -3
-7 -7

3L 3L
-5L -5L
-1L -1L
5L 5L
-3L -3L
-7L -7L

3.0 3.0
-5.0 -5.0
-1.0 -1.0
-7.0 -7.0

**********************************************************************
Then ...
We expected (repr): ''
But instead we got: 'Float mismatch:'
test test_pow failed -- Writing: 'Float mismatch:', expected: ''
1 test failed: test_pow

C:\Code\python\dist\src\PCbuild>

That may point to the problem, too:  the canned output file is truncated?

> If I run "regrtest test_pow" then the test fails.  It could be a
> problem with line endings (I copied the source from a Unix CVS
> checkout).

Don't understand; e.g., "copied" what, from where to where?  I'm not sure I
gave you write access to my box, and hacking into Windows machines is uncool
because it's not challenging <wink>.

> Anyhow, I found the bug.  I don't know how test_pow was passing
> under Linux.  Time to reboot again.

Cool!  BTW, Windows solves the "don't reboot enough" problem for you via
automation, sometimes on an hourly basis.

Thanks for sharing the brain cells, Neil!




From thomas at xs4all.net  Mon Jan  8 07:44:11 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 07:44:11 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101080416.f084GrM10912@snark.thyrsus.com>; from esr@snark.thyrsus.com on Sun, Jan 07, 2001 at 11:16:53PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com>
Message-ID: <20010108074411.N2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> Setting things up so curses is autoconfigured into the default build
> if your system has it in the expected places turned out to be dead
> easy.  Some clever person (the BDFL himself?) wrote the build process
> so that there is *already* a Setup.config.in that gets configure
> expansions done on it, with the generated Setup.config used when
> makesetup does its magic.

Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
auto-detect bsddb. However, I still think it should be a separate
'configure', in the Modules directory. Especially now that Andrew is
practically checking in the distutils setup ;) The main configure can make
an educated guess whether Python and distutils are available, and call
configure with some passed-through options if not. It does depend on what
the distutils setup does, though, and I'll shamefully admit that I haven't
looked at that ;P

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Mon Jan  8 00:51:16 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:51:16 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 01:39:20AM -0500
References: <20010107152914.A14228@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCGEEKIHAA.tim.one@home.com>
Message-ID: <20010107155116.A14312@glacier.fnational.com>

On Mon, Jan 08, 2001 at 01:39:20AM -0500, Tim Peters wrote:
> [NeilS]
> > If I run "regrtest test_pow" then the test fails.  It could be a
> > problem with line endings (I copied the source from a Unix CVS
> > checkout).
> 
> Don't understand; e.g., "copied" what, from where to where?

I should have been clearer.  I mean the problem with rt.bat not
running test_pow.  I copied the CVS source from my Linux ext2
filesystem to a VFAT filesystem.  I was too lazy to fix the line
endings.

  Neil



From nas at arctrix.com  Mon Jan  8 00:52:38 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 7 Jan 2001 15:52:38 -0800
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107152914.A14228@glacier.fnational.com>; from nas@arctrix.com on Sun, Jan 07, 2001 at 03:29:14PM -0800
References: <3A590918.E90031AA@lemburg.com> <LNBBLJKPBEHFEDALKOLCOEEGIHAA.tim.one@home.com> <20010107145949.A14166@glacier.fnational.com> <20010107152914.A14228@glacier.fnational.com>
Message-ID: <20010107155238.A14291@glacier.fnational.com>

On Sun, Jan 07, 2001 at 03:29:14PM -0800, Neil Schemenauer wrote:
> I don't know how test_pow was passing under Linux.

Under Linux with the buggy float_pow:

    >>> pow(10.0, 0, 10)
    nan
    >>> pow(10.0, 0, 10) == 1
    1
    >>> pow(10.0, 0, 10) == 0
    1

Under Windows NAN obviously behaves differently.

  floating-point-is-fun-ly y'rs Neil



From esr at thyrsus.com  Mon Jan  8 07:49:45 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 01:49:45 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108074411.N2467@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 07:44:11AM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>
Message-ID: <20010108014945.A19516@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> On Sun, Jan 07, 2001 at 11:16:53PM -0500, Eric S. Raymond wrote:
> > Setting things up so curses is autoconfigured into the default build
> > if your system has it in the expected places turned out to be dead
> > easy.  Some clever person (the BDFL himself?) wrote the build process
> > so that there is *already* a Setup.config.in that gets configure
> > expansions done on it, with the generated Setup.config used when
> > makesetup does its magic.
> 
> Skip, actually, IIRC. It was added in the last stages of 2.0 development, to
> auto-detect bsddb. However, I still think it should be a separate
> 'configure', in the Modules directory.

You may be right.  Still, this patch solves the immediate problem in a
reasonably clean way, and I urge that it should go in.  We can do a
more complete reorganization of the build process later.  (I'll help with
that; I'm pretty expert with autoconf and friends.)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"As to the species of exercise, I advise the gun. While this gives
[only] moderate exercise to the body, it gives boldness, enterprise,
and independence to the mind.  Games played with the ball and others
of that nature, are too violent for the body and stamp no character on
the mind. Let your gun, therefore, be the constant companion to your
walks."
        -- Thomas Jefferson, writing to his teenaged nephew.



From tim.one at home.com  Mon Jan  8 08:05:46 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:05:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010106110033.52127A84F@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>

Well, I like __exports__ (but not some details of the patch, for which see
my SF comments).  Guido is aware of the optimization possibilities, but
that's not what's driving it.  I don't know why he likes it; I like it
because the only normal use for a module is to do module.attr, or "from
module import attr", and dir(module) very often exposes stuff today that the
module author had no intention of exporting.  For example, if I do

    import os
    dir(os)

under CVS Python today, on my box I see that os exports "i".  It's bound to
_exit.  That's baffling, and is purely an accident of how module os.py
initialization works when you're running on Windows.

Couple that with the fact that I've hardly ever seen (or bothered to write) a
module docstring spelling out everything a module *intends* to export, and an
__exports__ line near the top (when present) would also automagically give a
solid answer to that question.

modules aren't classes or instances, and in normal practice modules
accumulate all sorts of accidental attrs (due to careless (== normal)
imports, and module init code).  It doesn't make any *sense* that os exports
"sys" either, or that random exports "cos", or that cgi exports "string", or
... this inelegance is ubiquitous.
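
A small sketch of how to see the accidental attrs for yourself, written for a current Python. The helper imported_modules is hypothetical, and what it prints varies by platform and version.

```python
import os
import types


def imported_modules(module):
    """Return the public names in `module` that are bound to other modules,
    i.e. names that usually got there through a plain `import`."""
    return sorted(
        name
        for name, value in vars(module).items()
        if isinstance(value, types.ModuleType) and not name.startswith("_")
    )


# Lists things like "sys" that os imports for its own use.  (It also
# lists "path", which os exports on purpose, so the check is only a hint.)
print(imported_modules(os))
```

Nothing in the module itself says which of those names are intentional, which is the gap an explicit export list would fill.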

In a world with an __exports__ that gets used, though, I do wonder whether
people will or won't export their test() functions.  I really like that they
do now.

or-maybe-it's-just-that-i-like-modules-that-*have*-a-
    test-function<wink>-ly y'rs  - tim





From gstein at lyra.org  Mon Jan  8 08:25:32 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 7 Jan 2001 23:25:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:05:46AM -0500
References: <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010107232532.V17220@lyra.org>

On Mon, Jan 08, 2001 at 02:05:46AM -0500, Tim Peters wrote:
>...
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

Simple question: so what?

"Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Mon Jan  8 08:29:39 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 02:29:39 -0500
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <20010107155238.A14291@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>

[Neil Schemenauer]
> Under Linux with the buggy float_pow:
>
>     >>> pow(10.0, 0, 10)
>     nan
>     >>> pow(10.0, 0, 10) == 1
>     1
>     >>> pow(10.0, 0, 10) == 0
>     1
>
> Under Windows NAN obviously behaves differently.

Comparisons with NaN are a platform-dependent accident, partly because some
C compilers generate nonsense code, partly because Python isn't coded to
cater to NaN's peculiarities either.  The behavior under Windows is
(accidentally) better in these cases today (NaN should never compare equal
to anything -- not even to itself -- and, curiously, MSVC's codegen mistakes
cancel out Python's mistakes in this case!).
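
For comparison, a short sketch of the rule as a current CPython on IEEE 754 hardware applies it. This shows today's behavior, not the behavior of the 2.0/2.1 builds discussed in this thread.

```python
import math

nan = float("nan")

print(nan == nan)                         # False
print(nan != nan)                         # True
print(nan < 1.0, nan > 1.0, nan == 1.0)   # False False False
print(math.isnan(nan))                    # True
```

Since no ordering or equality test succeeds, math.isnan() (or the x != x idiom) is the reliable way to detect a NaN.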

Thank you for fixing the bug.  Only test_charmapcodec is failing for me now,
and MAL knows the cause and cure.

nothing-can-stop-the-alpha-now-ly y'rs  - tim




From thomas at xs4all.net  Mon Jan  8 08:42:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 08:42:30 +0100
Subject: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 02:29:39AM -0500
References: <20010107155238.A14291@glacier.fnational.com> <LNBBLJKPBEHFEDALKOLCIEENIHAA.tim.one@home.com>
Message-ID: <20010108084230.O2467@xs4all.nl>

On Mon, Jan 08, 2001 at 02:29:39AM -0500, Tim Peters wrote:

> (NaN should never compare equal to anything -- not even to itself

You know that's impossible, in Python, right ? (Due to the shortcut taken by
'==', based on object identity.) Is that going to be 'fixed', too ? :)
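
A sketch of where an identity shortcut shows up in a current CPython. Today the shortcut applies to container operations rather than to == on two floats, so this illustrates the general idea and not the 2.0-era comparison code being discussed.

```python
nan = float("nan")

# Direct comparison of the object with itself: no shortcut for floats.
print(nan == nan)                 # False

# Containers test identity before equality, so the same object "matches".
print(nan in [nan])               # True
print([nan] == [nan])             # True

# A different NaN object gets no help from the identity check.
print([nan] == [float("nan")])    # False
```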

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From ping at lfw.org  Mon Jan  8 08:51:11 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 7 Jan 2001 23:51:11 -0800 (PST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>

Hi again.

Sorry to bother you if you're busy -- i haven't seen any responses
about inspect.py for a few days and wanted to know what your
reactions were.  The module and test suite are still at:

    http://www.lfw.org/python/inspect.py
    http://www.lfw.org/python/test_inspect.py

The only change since my announcement last Wednesday is that
getframe() has been renamed to getframeinfo().

Thanks,


-- ?!ng

"Old code doesn't die -- it just smells that way."
    -- Bill Frantz




From tim.one at home.com  Mon Jan  8 09:17:57 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:57 -0500
Subject: NaN nonsense (was RE: [Python-Dev] Std tests failing, Windows: test_builtin  test_charmapcodec test_pow)
In-Reply-To: <20010108084230.O2467@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEPIHAA.tim.one@home.com>

>> (NaN should never compare equal to anything -- not even to itself

[Thomas Wouters]
> You know that's impossible, in Python, right ? (Due to the
> shortcut taken by '==', based on object identity.)

Surely you jest:  I probably knew that while you were still nursing <wink>.

OTOH, Python on WinTel comes remarkably close (by accident):

C:\Code\python\dist\src\PCbuild>python
Python 2.0 (#8, Jan  5 2001, 00:33:19) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> inf = 1e300**2
>>> inf
1.#INF
>>> nan = inf - inf
>>> nan
-1.#IND
>>> nan2 = nan * 1.0
>>> nan2
-1.#IND
>>> nan == nan2
0
>>>

> Is that going to be 'fixed', too ? :)

Not if I can help it.  I'd be in favor of adding an fcmp function that needs
to be called explicitly when you want the full complexity of 754
comparisons.  Count them all up, and there are 32 distinct 754 binary float
comparison operators!  The 754 std says 26 (from memory, may be 2 more or
less) of those have to be supplied, but-- since 754 is not a language
std --says nothing about how they're to be spelled.
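
A minimal sketch of what an explicit comparison helper could look like. The name fcmp comes from the paragraph above, but the signature and return values here are assumptions; it only reports which of the four mutually exclusive 754 relations holds.

```python
import math


def fcmp(x, y):
    """Hypothetical helper: classify two floats as 'less', 'equal',
    'greater', or 'unordered' (the case where either operand is a NaN)."""
    if math.isnan(x) or math.isnan(y):
        return "unordered"
    if x < y:
        return "less"
    if x > y:
        return "greater"
    return "equal"


nan = float("nan")
print(fcmp(1.0, 2.0))   # less
print(fcmp(2.0, 2.0))   # equal
print(fcmp(nan, 2.0))   # unordered
```

The many 754 comparison predicates are combinations of these four outcomes, plus a choice of whether an unordered result should signal an exception, which this sketch ignores.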

OTOH, C99 resolutely tries to map that into C, and 754 True Believers will
use that as a club.

On the third hand, as Tom MacDonald posted here earlier (he was X3J11
chair), he's not sure anyone will ever implement C99 in whole.  The
complexities of full 754 support are a large part of why he worries about
that.

too-much-too-late-ly y'rs  - tim




From tim.one at home.com  Mon Jan  8 09:17:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:17:59 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>

[Greg Stein]
> Simple question: so what?
>
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Couldn't care less about the module author.  It's the module user who has to
sort this stuff out.  "Don't use 'import *'" is good advice but not followed
either, and after I do

from MyPackage import sys  # intentionally exports its own sys
from GregSnort import *    # accidentally exports some other sys

madness ensues.  Like I said, it's inelegant, and at best.

Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
a break.
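
A sketch of the import * leak and of an explicit export list curing it, written for a current Python. It uses __all__, the spelling Python adopted in 2.1, as a stand-in for the proposed __exports__; the module names and both helpers are made up for the demonstration.

```python
import sys
import types


def make_module(name, source):
    """Build an in-memory module from source text and register it
    in sys.modules so the import statement can find it."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


def star_import(name):
    """Return the names that `from <name> import *` would bind."""
    namespace = {}
    exec("from %s import *" % name, namespace)
    namespace.pop("__builtins__", None)   # added by exec, not by the import
    return sorted(namespace)


make_module("leaky_demo", "import sys\ndef useful(): return 42\n")
make_module(
    "tidy_demo",
    "import sys\n__all__ = ['useful']\ndef useful(): return 42\n",
)

print(star_import("leaky_demo"))   # ['sys', 'useful']
print(star_import("tidy_demo"))    # ['useful']
```

With the list in place, the second import no longer rebinds the importer's own sys.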




From gstein at lyra.org  Mon Jan  8 09:26:03 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 00:26:03 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:17:59AM -0500
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
Message-ID: <20010108002603.X17220@lyra.org>

On Mon, Jan 08, 2001 at 03:17:59AM -0500, Tim Peters wrote:
> [Greg Stein]
> > Simple question: so what?
> >
> > "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*
> 
> Couldn't care less about the module author.  It's the module user who has to
> sort this stuff out.  "Don't use 'import *'" is good advice but not followed
> either, and after I do
> 
> from MyPackage import sys  # intentionally exports its own sys
> from GregSnort import *    # accidentally exports some other sys
> 
> madness ensues.  Like I said, it's inelegant, and at best.
> 
> Simple question for you:  what would __exports__ hurt?  "Oh, no!  Tim's
> module explicitly lists what it intended to export!  Oh, woe is me!".  Gimme
> a break.

hehe... adding __exports__ to your module is fine. Adding more crud to
Python, in opposition to the "we're all adults" motto, doesn't seem Right.

Somebody wants to use "from foo import *" on a module not designed for it?
Too bad for them. If you're suggesting __exports__ is to patch over problems
caused by "from foo import *", then I think you're barking up the wrong tree
:-)

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Mon Jan  8 17:50:57 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon,  8 Jan 2001 18:50:57 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010107232532.V17220@lyra.org>
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>

[Tim Peters]
> modules aren't classes or instances, and in normal practice modules
> accumulate all sorts of accidental attrs (due to careless (== normal)
> imports, and module init code).  It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

[Greg Stein]
> Simple question: so what?
> 
> "Oh, no! My module exposes mod.sys! Oh, woe is me!"  *snort*

Let me "me too" here:
Put another way, what Greg said is just a rephrase of "don't use from
foo import * unless foo's docos say it's OK". Add to that the simple
access control of a leading underscore, and I don't see any place
which needs it.

Something better to do would be to use

    import foo as _foo

in some standard library modules, and minimize using from foo import bar
in them. Since everyone knows that leading underscore means "implementation
detail - ignore at your convenience, use at your peril", this would keep
the "we're all adults" philosophy of Python, with all the advantages
*I* see in __exports__.
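
A short sketch of the underscore convention at work, for a current Python. The module circle_demo is invented for the example and built in memory so the snippet is self-contained.

```python
import sys
import types

source = (
    "import math as _math\n"
    "def area(r):\n"
    "    return _math.pi * r * r\n"
)

# Build a throwaway module and register it so it can be imported by name.
module = types.ModuleType("circle_demo")
exec(source, module.__dict__)
sys.modules["circle_demo"] = module

namespace = {}
exec("from circle_demo import *", namespace)
public = sorted(name for name in namespace if not name.startswith("__"))

print(public)                      # ['area']
print(hasattr(module, "_math"))    # True
```

The helper module is still reachable as circle_demo._math for anyone who wants it, but import * leaves it alone.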

One more point against __exports__, which I hoped I would not have to
make (but when I'm up against the timbot *and* Guido, I need to pull
out the heavy artillery): it would *totally* stop any hope in the
future of module level __getattr__ (or at least complicate the semantics).
I think Alex M. is thinking of a PEP, but he's taking his time, since
no PEPs can be considered until 2.1 is out.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Jan  8 09:49:58 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 03:49:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108002603.X17220@lyra.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEFBIHAA.tim.one@home.com>

[Greg Stein]
> hehe... adding __exports__ to your module is fine. Adding more
> crud to Python, in opposition to the "we're all adults" motto,
> doesn't seem Right.

My idea of what's Right is copied from my boss <wink>.

> Somebody wants to use "from foo import *" on a module not designed
> for it?  Too bad for them.

How is someone supposed to know whether a module "was designed" for import*?
Even Tkinter (which just about everyone does "import *" on) also exports
sys, and everything from the "types" module, by accident too.

> If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the
> wrong tree
> :-)

Indeed.  But I'm suggesting that the problems that *can* arise from
"import*" illustrate the fundamental silliness of exporting things by
accident.  It's come up much more often for me when I'm looking over
someone's shoulder, teaching them how to use dir() in an interactive shell
to answer their own damn questions <0.5 wink>.  It's usually the case that
dir(M) shows them something that isn't documented, and over time I am *not*
pleased that "oh, I guess the 'string' in there is just crap" is how they
learn to view it.

I can live without __exports__; but I'd prefer not to, because I would
always use it if it were there.

if-i'd-both-use-it-and-heartily-recommend-it-it's-hard-to-
    oppose-it-ly y'rs  - tim




From m.favas at per.dem.csiro.au  Mon Jan  8 12:48:40 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Mon, 08 Jan 2001 19:48:40 +0800
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
Message-ID: <3A59A918.E0D02E0D@per.dem.csiro.au>

I last successfully downloaded from CVS, compiled, linked and tested on
Dec. 22 last year. For the last week or so, the current CVS
_cursesmodule.c gives a bunch of compiler warning messages of the form:

cc: Warning: ./_cursesmodule.c, line 619: In this statement,
"derwin(...)" of ty
pe "int", is being converted to "pointer to struct _win_st".
(cvtdiftypes)
  win = derwin(self->win,nlines,ncols,begin_y,begin_x);
--^
cc: Warning: ./_cursesmodule.c, line 1259: In this statement,
"subpad(...)" of t
ype "int", is being converted to "pointer to struct _win_st".
(cvtdiftypes)
    win = subpad(self->win, nlines, ncols, begin_y, begin_x);
----^
cc: Warning: ./_cursesmodule.c, line 1488: In this statement,
"termname(...)" of
 type "int", is being converted to "pointer to const char".
(cvtdiftypes)
NoArgReturnStringFunction(termname)
^
(more elided)

and

cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg1" is
fetched 
but not initialized.  And there may be other such fetches of this
variable that 
have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
cc: Warning: ./_cursesmodule.c, line 305: The scalar variable "arg2" is
fetched 
but not initialized.  And there may be other such fetches of this
variable that 
have not been reported in this compilation. (uninit1)
Window_NoArg2TupleReturnFunction(getparyx, int, "(ii)")
^
(more elided)

and at link time, fails with:

ld:
Unresolved:
getbegyx
getmaxyx
getparyx


I've held off bothering anyone about this, but it begins to look as
though no-one else has noticed... My platform? Tru64 Unix, V4.0F (aka
OSF1). The recent pow() bug hit this platform, too. Happy to do any
testing...



-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From guido at python.org  Mon Jan  8 15:27:50 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:27:50 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 01:49:45 EST."
             <20010108014945.A19516@thyrsus.com> 
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl>  
            <20010108014945.A19516@thyrsus.com> 
Message-ID: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>

> You may be right.  Still, this patch solves the immediate problem in a
> reasonably clean way, and I urge that it should go in.  We can do a
> more complete reorganization of the build process later.  (I'll help with
> that; I'm pretty expert with autoconf and friends.)

I expect Andrew's code to go in before 2.1 is released.  So I don't
see a reason why we should hurry and check in a stop-gap measure.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 15:33:09 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 09:33:09 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 00:26:03 PST."
             <20010108002603.X17220@lyra.org> 
References: <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
            <20010108002603.X17220@lyra.org> 
Message-ID: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>

> hehe... adding __exports__ to your module is fine. Adding more crud to
> Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> 
> Somebody wants to use "from foo import *" on a module not designed for it?
> Too bad for them. If you're suggesting __exports__ is to patch over problems
> caused by "from foo import *", then I think you're barking up the wrong tree
> :-)

You haven't been answering many newbie questions lately, have you? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 16:06:28 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:06:28 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Sun, 07 Jan 2001 17:15:27 EST."
             <20010107171527.A5093@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com>  
            <20010107171527.A5093@thyrsus.com> 
Message-ID: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>

> > So, I see no reason why the logic in your program couldn't take care
> > of this, which in general is a preferred way to solve a problem than
> > to change the language.
> 
> OK, two objections, one practical and one (more important) esthetic:
> 
> Practical: I guess I oversimplified the code for expository purposes.
> What's actually going on is that I have two parser classes both based
> on shlex -- they do character-at-a-time input and don't actually
> *have* accessible line buffers.

And what's wrong with always starting the second parser?  If the
stream was at EOF it will simply process zero lines.  Or does your
parser have a problem with empty input?

> Esthetic: Yes, I can have the first parser set a flag, or return some
> EOF token.  But this seems deeply wrong to me, because EOFness is not
> a property of the parser but of the underlying stream object.  It
> seems to me that my program ought to be able to ask the stream object
> whether it's at EOF rather than carrying its own flag for that state.

Eric, before we go further, can you give an exact definition of
EOFness to me?

> In Python as it is, there's no clean way to do this.  I'd have to do a
> nonzero-length read to test it (I failed to check the right alternate
> case before when I tried zero-length).  That's really broken.  What if
> neither the underlying stream nor the parser supports pushback?
> 
> Do you see now why I think this is a more general issue?

No.  What's wrong with just setting the parser loose on the input and
letting it deal with EOF?  In your example, apparently a line
containing the word "history" signals that the rest of the file must
be parsed by the second parser.  What if "history" is the last line of
the file?  The eof() test can't tell you *that*!
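
A minimal sketch of that approach, assuming a line-oriented input where a line reading "history" hands the rest of the stream to a second parser. The function parse_two_phase and the use of shlex.split on whole lines are illustrative only; the real parsers in question read a character at a time.

```python
import io
import shlex


def parse_two_phase(text):
    """Tokenize lines up to a 'history' marker with the first parser,
    then hand whatever remains to the second one."""
    stream = io.StringIO(text)
    first, second = [], []

    # readline() returns "" at EOF, which ends the loop.
    for line in iter(stream.readline, ""):
        if line.strip() == "history":
            break
        first.extend(shlex.split(line))

    # Always start the second parser.  If the stream is already at EOF,
    # it simply processes zero lines.
    for line in iter(stream.readline, ""):
        second.extend(shlex.split(line))

    return first, second


print(parse_two_phase("a b\nhistory\nc d\n"))   # (['a', 'b'], ['c', 'd'])
print(parse_two_phase("a b\nhistory\n"))        # (['a', 'b'], [])
print(parse_two_phase(""))                      # ([], [])
```

Neither loop needs to ask the stream whether it is at EOF beforehand, including the case where "history" is the last line.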

> Now, another and more general way to handle this would be to make an
> equivalent of the old FIONREAD ioctl part of Python's standard set of
> file object methods -- a way to ask "how many bytes are ready to be
> read in this stream?"

There's no portable way to do that.

> Trivial to make it work for plain files, of course.  Harder to make it   
> work usefully for pipes/fifos/sockets/terminals.  Having it pass up the
> results of the fstat.size field (corrected for the current seek address
> if you're reading a plain file) would be a good start.

This seems totally the wrong level to solve your problem.
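
For reference, the plain-file case mentioned in the quoted text can be sketched as below, in a current Python. The helper bytes_remaining is hypothetical, and it says nothing useful about pipes, FIFOs, sockets, or terminals, which is the portability problem noted above.

```python
import os
import tempfile


def bytes_remaining(f):
    """Bytes between the current position and the end of a plain file,
    computed from the fstat size and the current seek address."""
    return os.fstat(f.fileno()).st_size - f.tell()


with tempfile.TemporaryFile() as f:   # opened in binary mode by default
    f.write(b"hello world\n")         # 12 bytes
    f.flush()                         # make sure the size is on disk
    f.seek(0)
    f.read(5)
    print(bytes_remaining(f))         # 7
    f.read()
    print(bytes_remaining(f))         # 0
```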

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Tue Jan  9 00:13:21 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 01:13:21 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
            <20010108002603.X17220@lyra.org>
Message-ID: <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido at python.org> wrote:
> > hehe... adding __exports__ to your module is fine. Adding more crud to
> > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > 
> > Somebody wants to use "from foo import *" on a module not designed for it?
> > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > caused by "from foo import *", then I think you're barking up the wrong tree
> > :-)
> 
> You haven't been answering many newbie questions lately, have you? :-)

Well, I have. 
And frankly, I think having "from foo import *" issue a warning at 2.1
is a *much* better solution.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan  8 16:15:20 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 10:15:20 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Tue, 09 Jan 2001 01:13:21 +0200."
             <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> 
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>  
            <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> 
Message-ID: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>

[Greg]
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > > 
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)

[Guido]
> > You haven't been answering many newbie questions lately, have you? :-)

[Moshe]
> Well, I have. 
> And frankly, I think having "from foo import *" issue a warning at 2.1
> is a *much* better solution.

(1) For what problem?

(2) Under exactly what circumstances do you want from foo import *
    to issue a warning?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan  8 16:26:21 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:26:21 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101071845.f07IjFi01249@mira.informatik.hu-berlin.de>
Message-ID: <3A59DC1D.29DE500B@lemburg.com>

"Martin v. Loewis" wrote:
> 
> Authors of extension packages often find the need to auto-import some
> of their modules. This is often needed for registration, e.g. a codec
> author (like Tamito KAJIYAMA, who wrote the JapaneseCodecs package)
> may need to register a search function with codecs.register. This is
> currently only possible by writing into sitecustomize.py, which must
> be done by the system administrator manually.
> 
> To enhance the service of site.py, I've written the patch
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=103134&group_id=5470
> 
> which treats lines in PTH files which start with "import" as
> statements and executes them, instead of appending these lines to
> sys.path.
> 
> The patch is relatively small, but since it is an extension: Do I need
> to write a PEP for it?

Just curious: wouldn't this introduce a /tmp-style problem to
Python ?

The scenario is quite simple: a Python script runs under root.
The script could pick up a lingering .pth file (e.g. from /tmp
or one of its subdirs -- distutils does this !) and then executes
arbitrary code as *root*.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at interet.com  Mon Jan  8 16:43:05 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 10:43:05 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <3A59E009.96922CA5@interet.com>

There are a number of problems which frequently recur on c.l.p
that can serve as a source of Python improvement ideas.
On December 30, 2000 gerson.kurz at t-online.de (Gerson Kurz) writes:

   If I embed Python in a Win32 console application (using
   Demo\embed.c), everything works fine. If I take the very same piece of
   code and put it in a Win32 Windows application (not MFC, just a plain
   WinMain()) I see no output (and more importantly so, no errors),
   because the application does not have a stdout/stderr set up.

This is well known.  Windows developers must replace sys.stdout and
sys.stderr with alternative mechanisms.  Unfortunately this solution
does not completely work because errors can occur before sys.stdout
is replaced.  I propose patching pythonw.exe (WinMain.c) and adding
a new module to fix this so it Just Works.  The patch is completely
Windows specific.  I am not sure if this constitutes a PEP, but would
like everyone's feedback anyway.

Design Requirements

1) "pythonw.exe myfile.py" will give the usual error message if
   myfile.py does not exist.

2) "pythonw.exe myfile.py" will give the usual traceback for a
   syntax error in myfile.py.

3) python.exe will provide a useful C-language stdout/stderr so
   the user does not have to replace sys.stdout/err herself.

4) None of the above will interfere with the user's replacement
   of sys.stdout/err for her own purposes.

Description of Patch

A new module winstdoutmodule.c (138 lines) is included in Windows
builds. It contains a C entry point PyWin_StdoutReplace() which
creates a valid C stdout/err, and code to display output
in a popup dialog box.  There is a Python entry point
winstdout.print() to display output, but it is only used
for special purposes, and the typical user will never import
winstdout.

The file WinMain.c calls PyWin_StdoutReplace() before it
calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This
is meant to display startup error messages.  Normally,
any available output is displayed when the system is idle.

Technical Details

Some experimentation (as opposed to documentation) shows that
Win32 programs have a valid FILE * stdout, but fileno(stdout)
gives INVALID_HANDLE_VALUE; the FILE * has an invalid OS file
object.  It is tempting to hack the FILE structure directly.
But it is more prudent to use the only documented way to
replace stdout, namely the standard call "freopen()" (also
available on Unix).  The design uses this call to open a
temporary file to append stdout and stderr output.  To
display output, the file is checked when the system is
idle, and MessageBox() is called with the file contents if any.
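
A Python-level analogue of this design, as a rough sketch only: the
actual patch works in C with freopen() and MessageBox(), whereas the
code below swaps sys.stdout and sys.stderr for a temporary file and
hands any collected text to a callback.  The names run_captured() and
display are hypothetical.

```python
import sys
import tempfile
import traceback


def run_captured(func, display):
    """Run func with stdout/stderr sent to a temporary file.

    Any text collected is passed to display(), which stands in for the
    MessageBox() call described above.  Hypothetical helper.
    """
    saved = sys.stdout, sys.stderr
    with tempfile.TemporaryFile(mode="w+") as log:
        sys.stdout = sys.stderr = log
        try:
            func()
        except Exception:
            traceback.print_exc()   # written to the temporary file
        finally:
            sys.stdout, sys.stderr = saved
        log.seek(0)
        text = log.read()
    if text:
        display(text)
    return text


if __name__ == "__main__":
    def broken():
        print("starting up")
        raise ValueError("boom")

    run_captured(broken, lambda text: sys.stdout.write(text))
```

Nothing is displayed when the file is empty, which mirrors the "file
contents if any" condition in the description.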

Status

After a few false starts, I now have working code.

Is this a good idea?  If so, is the implementation optimal
(comments from MarkH especially welcome)?

JimA



From mal at lemburg.com  Mon Jan  8 16:52:32 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 16:52:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>  
	            <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59E240.7F77790E@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 08 Jan 2001 09:33:09 -0500, Guido van Rossum <guido at python.org> wrote:
> > > hehe... adding __exports__ to your module is fine. Adding more crud to
> > > Python, in opposition to the "we're all adults" motto, doesn't seem Right.
> > >
> > > Somebody wants to use "from foo import *" on a module not designed for it?
> > > Too bad for them. If you're suggesting __exports__ is to patch over problems
> > > caused by "from foo import *", then I think you're barking up the wrong tree
> > > :-)
> >
> > You haven't been answering many newbie questions lately, have you? :-)
> 
> Well, I have.
> And frankly, I think having "from foo import *" issue a warning at 2.1
> is a *much* better solution.

Why raise a warning ? "from xyz import *" is still very useful in
interactive sessions and also has some merits when it comes to
importing all subpackages of a package (well, at least those listed
in __all__).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From barry at digicool.com  Mon Jan  8 16:54:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 8 Jan 2001 10:54:10 -0500
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <14937.58018.792925.31985@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> it would *totally* stop any hope in the future of module level
    MZ> __getattr__ (or at least complicate the semantics).  I think
    MZ> Alex M. is thinking of a PEP, but he's taking his time, since
    MZ> no PEPs can be considered until 2.1 is out.

Given the current discussion, I'm now -1 on __exports__ unless a PEP
is written.  I think enough issues and interactions have been brought
up that a PEP is warranted first.

-Barry




From moshez at zadka.site.co.il  Tue Jan  9 01:03:00 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 02:03:00 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org>  
            <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
Message-ID: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 10:15:20 -0500, Guido van Rossum <guido at python.org> wrote:

> (1) For what problem?

Users seeing things they didn't expect in their modules.

> (2) Under exactly what circumstances do you want from foo import *
>     to issue a warning?

All.
If you want to be less extreme, don't warn if the module defines
a __from_star_ok__
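
As a rough sketch of that rule, assuming the star-import is modelled by
an ordinary function: import_star() is a hypothetical stand-in for what
the interpreter would do, and __from_star_ok__ is only the name floated
above, not an existing attribute.

```python
import types
import warnings


def import_star(module, namespace):
    """Copy a module's public names into namespace, warning unless the
    module opts in with __from_star_ok__ (hypothetical convention)."""
    if not getattr(module, "__from_star_ok__", False):
        warnings.warn(
            "'from %s import *' on a module not marked __from_star_ok__"
            % module.__name__,
            stacklevel=2,
        )
    for name, value in vars(module).items():
        if not name.startswith("_"):
            namespace[name] = value


if __name__ == "__main__":
    foo = types.ModuleType("foo")
    foo.spam = 1
    ns = {}
    import_star(foo, ns)        # emits a UserWarning
    foo.__from_star_ok__ = True
    import_star(foo, ns)        # silent
```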

But in any case, I'm done with this thread. We probably won't
manage to convince each other.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan  8 17:04:58 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 11:04:58 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Mon, 08 Jan 2001 10:54:10 EST."
             <14937.58018.792925.31985@anthem.wooz.org> 
References: <20010107232532.V17220@lyra.org> <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>  
            <14937.58018.792925.31985@anthem.wooz.org> 
Message-ID: <200101081604.LAA04464@cj20424-a.reston1.va.home.com>

> Given the current discussion, I'm now -1 on __exports__ unless a PEP
> is written.  I think enough issues and interactions have been brought
> up that a PEP is warranted first.

I have to agree.  I am no longer championing this patch.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Mon Jan  8 17:27:17 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 8 Jan 2001 10:27:17 -0600 (CST)
Subject: [Python-Dev] inspect.py
In-Reply-To: <Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10011021617550.800-100000@skuld.kingmanhall.org>
	<Pine.LNX.4.10.10101072348530.1032-100000@skuld.kingmanhall.org>
Message-ID: <14937.60005.951163.80255@beluga.mojam.com>

    Ping> Sorry to bother you if you're busy -- i haven't seen any responses
    Ping> about inspect.py for a few days and wanted to know what your
    Ping> reactions were.

Fiddling code bits is not the sort of stuff I do very often, but every time
I do I wind up having to reacquaint myself with all sorts of object details
that slip out of my brain shortly after the latest need is gone.  Having a
module that hides the details seems like a good idea to me.

+1.  I vote it go into 2.1 assuming a bit for the library reference can be
written in time.

Skip



From akuchlin at mems-exchange.org  Mon Jan  8 17:31:09 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:31:09 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108113109.C7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
>I expect Andrew's code to go in before 2.1 is released.  So I don't
>see a reason why we should hurry and check in a stop-gap measure.

But it might not; the final version might be unacceptable or run into
some intractable problem.  Assuming the patch is correct (I haven't
looked at it), why not check it in?  The work has already been done to
write it, after all.

--amk




From akuchlin at mems-exchange.org  Mon Jan  8 17:41:10 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:41:10 -0500
Subject: [Python-Dev] _cursesmodule.c clobbered since Christmas
In-Reply-To: <3A59A918.E0D02E0D@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Mon, Jan 08, 2001 at 07:48:40PM +0800
References: <3A59A918.E0D02E0D@per.dem.csiro.au>
Message-ID: <20010108114110.D7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 07:48:40PM +0800, Mark Favas wrote:
>I last successfully downloaded from CVS, compiled, linked and tested on
>Dec. 22 last year. For the last week or so, the current CVS
>_cursesmodule.c gives a bunch of compiler warning messages of the form:

Hmm... on Dec. 22 there was a sizable change to export a C API from
the module; since then there's only been one minor change.  Perhaps
the last version you compiled successfully was from before I checked
in those changes.  In any case, I'll look into it as soon as my Compaq
test drive account is usable and I have access to a Tru64 4.0
machine again.  Thanks for the report!

Once the PEP 229 changes go in, many more modules will be tried on
many more platforms.  It might be worth considering setting up a
Tinderbox for Python, or at least doing a systematic test on several
platforms before releases.

--amk




From paulp at ActiveState.com  Mon Jan  8 17:46:47 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:46:47 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
Message-ID: <3A59EEF7.BB4118BD@ActiveState.com>

Tim Peters wrote:
> 
> ... It doesn't make any *sense* that os exports
> "sys" either, or that random exports "cos", or that cgi exports "string", or
> ... this inelegance is ubiquitous.

I agree strongly. I think that Python people are careless about what
their module dictionaries look like. My two main annoyances are modules
that export other modules randomly and modules that export huge whacks of
constants.

> Indeed.  But I'm suggesting that the problems that *can* arise from
> "import*" illustrate the fundamental silliness of exporting things by
> accident.  It's come up much more often for me when I'm looking over
> someone's shoulder, teaching them how to use dir() in an interactive shell
> to answer their own damn questions <0.5 wink>.  It's usually the case that
> dir(M) shows them something that isn't documented, and over time I am *not*
> pleased that "oh, I guess the 'string' in there is just crap" is how they
> learn to view it.

Screw dir()! Let's talk about important stuff: Komodo. And Idle. And
WingIDE. And PythonWorks and PythonWin. :)

How are class browsers and "intellisense prompters" supposed to know
that it "makes sense" to prompt the user with os.path but not
CGIHTTPServer.os.path?

Overall, I think Tim is right. We are all adults here and part of being
adults is keeping your privates private and your nose clean.

 Paul Prescod



From paulp at ActiveState.com  Mon Jan  8 17:47:39 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:47:39 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <20010107232532.V17220@lyra.org>, <20010106110033.52127A84F@darjeeling.zadka.site.co.il> <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
Message-ID: <3A59EF2B.792801E5@ActiveState.com>

Moshe Zadka wrote:
> 
> ...
> Let me "me too" here:
> Put another way, what Greg said is just a rephrase of "don't use from
> foo import * unless foo's docos say it's OK". 

That's not the issue. It's not about keeping people out of your module.
In fact I would propose that mod.__dict__ should be as loose as ever.

It's a user interface issue. If we encourage people to learn about
modules in interactive environments like the prompt using dir(), class
browsers and IDEs then we need to create modules that are friendly for
those users. I think that the current situation is pretty bad that way.
Why does CGIHTTPServer export BaseHTTPServer? And why is
CGIHTTPServer.CGIHTTPServer a class but CGIHTTPServer.BaseHTTPServer is
a module?

We go to great lengths to make the syntax newbie friendly. I think that
we should make similar efforts in a cleanly reflective class library.

> Add to that the simple
> access control of a leading underscore, and I don't see any place
> which needs it.
> 
> Something better to do would be to use
> import foo as _foo

It's pretty clear that nobody does this now and nobody is going to start
doing it in the near future. It's too invasive and it makes the code too
ugly. Why obfuscate thousands of lines of code when a simple feature can
mitigate that?

>...
> One more point against __exports__, which I hoped I would not have to
> make (but when I'm up against the timbot *and* Guido, I need to pull
> out the heavy artillery): it would *totally* stop any hope in the
> future of module level __getattr__ (or at least complicate the semantics).
> I think Alex M. is thinking of a PEP, but he's taking his time, since
> no PEPs can be considered until 2.1 is out.

__exports__ would merely be considered an implementation detail of the
"default __getattr__". Custom __getattr__'s could decide whether to
respect it or not. It doesn't complicate anything much.

 Paul Prescod



From nas at arctrix.com  Mon Jan  8 10:54:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 01:54:55 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>; from jim@interet.com on Mon, Jan 08, 2001 at 10:43:05AM -0500
References: <3A59E009.96922CA5@interet.com>
Message-ID: <20010108015455.A15138@glacier.fnational.com>

On Mon, Jan 08, 2001 at 10:43:05AM -0500, James C. Ahlstrom wrote:
> Is this a good idea?  If so, is the implementation optimal
> (comments from MarkH especially welcome)?

The general idea sounds good to me.  Having tracebacks go nowhere
when running pythonw is un-Python-like.  I don't know enough
about MFC, etc. to comment on the specifics of your patch.

  Neil



From akuchlin at mems-exchange.org  Mon Jan  8 17:49:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 11:49:13 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EEF7.BB4118BD@ActiveState.com>; from paulp@ActiveState.com on Mon, Jan 08, 2001 at 08:46:47AM -0800
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com>
Message-ID: <20010108114913.E7563@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 08:46:47AM -0800, Paul Prescod wrote:
>How are class browsers and "intellisense prompters" supposed to know
>that it "makes sense" to prompt the user with os.path but not
>CGIHTTPServer.os.path?

Could we then simply adopt __exports__ as a convention for such
browsers, but with no changes to core Python to support it?  Browsers
would then follow the algorithm "Use __exports__ if present, dir() if
not."  

--amk



From paulp at ActiveState.com  Mon Jan  8 17:51:26 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 08:51:26 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A59F00E.53A0A32A@ActiveState.com>

Tim Peters wrote:
> 
> ....
> 
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

If you can create a sample program that demonstrates the unsafety I'll
anonymously submit it as a bug on our internal system and ensure that
the next version of Perl is as slow as Python. :)

Seriously: If someone comes at me with
Perl-IO-is-way-faster-than-Python-IO, I'd like to know what concretely
they've given up in order to achieve that performance. And even just
for my own interest I'd like to understand the cost/benefit of
stream thread safety. For instance would it make sense to just write
a thread-safe wrapper for streams used from multiple threads?
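
One shape such a wrapper could take, as a minimal sketch: LockedStream
is a hypothetical name, and it serializes only write() and readline(),
so it says nothing about what Perl or Python actually do internally.

```python
import io
import threading


class LockedStream:
    """Serialize access to an underlying stream with a single lock.

    Hypothetical wrapper; only write() and readline() are covered.
    """

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.Lock()

    def write(self, data):
        with self._lock:
            return self._stream.write(data)

    def readline(self):
        with self._lock:
            return self._stream.readline()


if __name__ == "__main__":
    out = LockedStream(io.StringIO())
    workers = [
        threading.Thread(target=out.write, args=("line %d\n" % n,))
        for n in range(4)
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
```

Streams used from a single thread would skip the wrapper and pay no
locking cost, which is the trade-off being asked about.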

 Paul Prescod



From paulp at ActiveState.com  Mon Jan  8 18:01:49 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 08 Jan 2001 09:01:49 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <3A59F27D.C27B8CD0@ActiveState.com>

Andrew Kuchling wrote:
> 
> ...
> 
> Could we then simply adopt __exports__ as a convention for such
> browsers, but with no changes to core Python to support it?  Browsers
> would then follow the algorithm "Use __exports__ if present, dir() if
> not."

dir() is one of the "interactive tools" I'd like to work better in the
presence of __exports__. On the other hand, dir() works pretty poorly
for object instances today so maybe we need something new anyhow. 
Perhaps attrs()? 

If there were an "attrs()" and it basically returned __exports__ if it
existed and dir() if it didn't, then I would buy it. Graphical apps
would just build on attrs().
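
A minimal sketch of that idea, assuming __exports__ is simply a list of
names on the module.  Neither attrs() nor __exports__ exists in Python;
both are the hypothetical names used in this thread.

```python
import types


def attrs(obj):
    """Return __exports__ if the object defines it, else fall back to dir()."""
    exports = getattr(obj, "__exports__", None)
    if exports is not None:
        return sorted(exports)
    return dir(obj)


if __name__ == "__main__":
    demo = types.ModuleType("demo")
    demo.os = object()            # an "accidental" export
    demo.handler = lambda: None
    demo.__exports__ = ["handler"]
    print(attrs(demo))
```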

 Paul



From MarkH at ActiveState.com  Mon Jan  8 18:04:31 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 09:04:31 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A59E009.96922CA5@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>

> Is this a good idea?  If so, is the implementation optimal

I'm really on the fence here.  Note however that your solution does not solve
the original problem.  Eg, your example is:

> On December 30, 2000 gerson.kurz at t-online.de (Gerson Kurz) writes:
>
>    If I embed Python in a Win32 console application (using
>    Demo\embed.c), everything works fine. If I take the very same piece

But your solution involves:

> The file WinMain.c calls PyWin_StdoutReplace() before it
> calls Py_Main(), and PyWin_StdoutPrint() afterwards.  This

Note that the original problem was _embedding_ Python - thus, you need to
patch _their_ WinMain to make it work for them - something you can't do.

Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I
am not convinced they would - it is almost certain they will still need to
redirect output to somewhere useful, so why bother redirecting it
temporarily just to redirect it for real immediately after?

Finally, I am slightly concerned about the possibility of "hanging" certain
programs. For example, I believe that DCOM will often invoke a COM server in
a different "desktop" than the user (this is also true for Services, but
Python services don't use pythonw.exe).  Thus, a Python program may end up
hanging with a dialog box, but in the context where no user is able to see
it.  However, this could be addressed by adding a command-line option to
prevent this new behaviour kicking in.

I would prefer to see a decent API for extracting error and traceback
information from Python.  On the other hand, I _do_ see the problem for
"newbies" trying to use pythonw.exe.

So - I guess I am saying that I don't see this as optimal, and it doesn't
solve the original problem you pointed at - but in the interests of making
pythonw.exe seem "less broken" for newbies, I could live with this as long
as I could prevent it when necessary.

Another option would be to use the Win32 Console APIs, and simply attempt to
create a console for the error message.  Eg, maybe PyErr_Print() could be
changed to check for the existence of a console, and if not found, create
it.  However, the problem with this approach is that the error message will
often be printed just as the process is terminating - meaning you will see a
new console with the error message for about 0.025 of a second before it
vanishes due to process termination.  Any sort of "press any key to
terminate" option then leaves us in the same position - if no user can see
the message, the process appears hung.

Mark.




From andreas at andreas-jung.com  Mon Jan  8 18:06:16 2001
From: andreas at andreas-jung.com (Andreas Jung)
Date: Mon, 8 Jan 2001 18:06:16 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A58EFC3.5A722FF0@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 12:37:55AM +0200
References: <3A58EFC3.5A722FF0@tismer.com>
Message-ID: <20010108180616.A18993@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> Dear community,
> 
> I'm happy to announce that
> 
> 		Stackless Python 2.0
> 
> is finally ready and available for download.
> 
> Stackless Python for Python 1.5.2+ also got some minor
> enhancements. Both versions are available as Win32
> installer files here:

Are there patches available against the standard Python 2.0 
source code tree ?

Andreas 



From tismer at tismer.com  Mon Jan  8 17:15:55 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 08 Jan 2001 18:15:55 +0200
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de>
Message-ID: <3A59E7BB.6908B7E2@tismer.com>


Andreas Jung wrote:
> 
> On Mon, Jan 08, 2001 at 12:37:55AM +0200, Christian Tismer wrote:
> > Dear community,
> >
> > I'm happy to announce that
> >
> >               Stackless Python 2.0
> >
> > is finally ready and available for download.
> >
> > Stackless Python for Python 1.5.2+ also got some minor
> > enhancements. Both versions are available as Win32
> > installer files here:
> 
> Are there patches available against the standard Python 2.0
> source code tree ?

I had no time yet to put the source trees on the web.
Should happen in one or two days.
Then I will probably not provide patches, hoping that
some other Unix people will catch up and provide that
part. This worked the same for the 1.5.2 version.

The 2.0 port consists of 10 or so files, which can be used
as direct replacements for the same files in the 2.0 distro.
I think on Unix this is the right way to go.
For me it is simpler to have my own little tree, since I'm
working with Windows, and I just have to modify my VC++
project file.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From moshez at zadka.site.co.il  Tue Jan  9 02:30:09 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 03:30:09 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
References: <3A59F27D.C27B8CD0@ActiveState.com>, <LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com> <3A59EEF7.BB4118BD@ActiveState.com> <20010108114913.E7563@kronos.cnri.reston.va.us>
Message-ID: <20010109013009.37D6DA82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 09:01:49 -0800, Paul Prescod <paulp at ActiveState.com> wrote:

> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 
> Perhaps attrs()? 
> 
> If there were an "attrs()" and it basically returned __exports__ if it
> existed and dir() if it didn't, then I would buy it. Graphical apps
> would just build on attrs().

Even better, __exports__ could be what was imported in 
from foo import *.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From andreas at andreas-jung.com  Mon Jan  8 18:25:36 2001
From: andreas at andreas-jung.com (Andreas Jung)
Date: Mon, 8 Jan 2001 18:25:36 +0100
Subject: [Python-Dev] Re: ANN: Stackless Python 2.0
In-Reply-To: <3A59E7BB.6908B7E2@tismer.com>; from tismer@tismer.com on Mon, Jan 08, 2001 at 06:15:55PM +0200
References: <3A58EFC3.5A722FF0@tismer.com> <20010108180616.A18993@yetix.sz-sb.de> <3A59E7BB.6908B7E2@tismer.com>
Message-ID: <20010108182536.A20361@yetix.sz-sb.de>

On Mon, Jan 08, 2001 at 06:15:55PM +0200, Christian Tismer wrote:
> 
> The 2.0 port consists of 10 or so files, which can be used
> as direct replacements for the same files in the 2.0 distro.
> I think on Unix this is the right way to go.
> For me it is simpler to have my own little tree, since I'm
> working with Windows, and I just have to modify my VC++
> project file.

I would prefer a tar.gz archive that contains just the modified files.
With this approach it is easily possible to extract the archive inside
the Python source tree.

Andreas



From loewis at informatik.hu-berlin.de  Mon Jan  8 18:51:28 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 18:51:28 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
Message-ID: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>

> Just curious: wouldn't this introduce a /tmp-style problem to
> Python ?

I tried, but I could not produce such a problem.

> The scenario is quite simple: a Python script runs under root.
> The script could pick up a lingering .pth file (e.g. from /tmp
> or one of its subdirs -- distutils does this !) and then executes
> arbitrary code as *root*.

No, Python looks only in a few places for pth files:
{<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}

so it won't pick up pth files in /tmp.
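
The behaviour of the patch under discussion (lines in a pth file that
start with "import" are executed, all other lines are appended to
sys.path) can be sketched roughly as below.  process_pth_lines() is a
hypothetical helper, not the code in site.py, and it skips details such
as checking that a directory exists.

```python
import os


def process_pth_lines(lines, sitedir, path, namespace=None):
    """Handle the lines of one pth file found in sitedir.

    Hypothetical sketch: "import" lines are executed, other non-comment
    lines are treated as directories relative to sitedir.
    """
    if namespace is None:
        namespace = {}
    for line in lines:
        line = line.rstrip()
        if not line or line.startswith("#"):
            continue
        if line.startswith(("import ", "import\t")):
            exec(line, namespace)   # runs with the privileges of the process
        else:
            path.append(os.path.join(sitedir, line))
    return path


if __name__ == "__main__":
    print(process_pth_lines(["# demo", "pkgdir", "import math"], "/opt/site", []))
```

Because the "import" lines run with whatever privileges the interpreter
has, the safety of the scheme rests entirely on which directories are
searched for pth files.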

Regards,
Martin



From esr at thyrsus.com  Mon Jan  8 19:01:37 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 13:01:37 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081506.KAA03404@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 10:06:28AM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>
Message-ID: <20010108130137.E22834@thyrsus.com>

Guido van Rossum <guido at python.org>:
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

A file is at EOF when attempts to read more data from it will fail
returning no data.

> What's wrong with just setting the parser loose on the input and
> letting it deal with EOF?

Nothing wrong in theory, but it's a problem in practice.  I don't want
to import the second parser unless it's actually needed, because it's much
larger than the first one.

>                                In your example, apparently a line
> containing the word "history" signals that the rest of the file must
> be parsed by the second parser.  What if "history" is the last line of
> the file?  The eof() test can't tell you *that*!

Right.  That case never happens.  I mean it *really* never happens :-).

What we're talking about is a game system.  The first parser recognizes
a spec language for describing games of a particular class (variants of
Diplomacy, if that's meaningful to you).  The system keeps logfiles which
consist of a section in the game description language, optionally 
followed by the token "history" and an order log.

The parser for the order log language is a *lot* larger than the one
for the description language.  This is why I said I don't want the
first parser to just call the second.  I want to test for EOF to
know whether I have to import the second parser at all!

Here's the beginning of my problem: the first parser can't export a line
buffer, because it doesn't *have* a line buffer.  It's a subclass of
shlex and does single-character reads.

There are two ways I can cope with this.  One is to do a (nonzero)
length read after the first parser exits; the other is to have the
first parser set a state flag controlling whether the second parser
loads.

This is where it bites that I can't test for EOF with a read(0). The
second shlex parser only has token-level pushback!  If I do a
nonzero-length read and I get data, I'm screwed.  On the other hand
(as I said before) setting a lexer state flag seems wrong, because
EOFness is a property of the underlying stream rather than the parser.
I'd be duplicating state that exists in the stdio stream structure
anyway; it ought to be accessible.

> > Now, another and more general way to handle this would be to make an
> > equivalent of the old FIONREAD ioctl part of Python's standard set of 
> > file object methods -- a way to ask "how many bytes are ready to be
> > read in this stream?"
> 
> There's no portable way to do that.

Actually, fstat(2) is portable enough to support a very useful
approximation of FIONREAD.  I know, because I tried it.

Last night I coded up a "waiting" method for file objects that calls
fstat(2) on the associated file descriptor.  For a plain file, it
then subtracts the result of ftell() from the fstat size field and
returns that -- for other files, it simply returns the size field.

I then tested this on plain files, FIFOs, and sockets under Linux. It
turns out fstat(2) gives useful information in all three cases (a
count of characters waiting in the buffer in the latter two).  I expected
this; it should be true under all current Unixes.
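
A minimal Python-level sketch of the idea, assuming a file object with
a usable fileno().  The function name waiting follows the description
above, but this is an illustration written against os.fstat, not the
submitted C patch, and the non-plain-file branch is only as meaningful
as the platform's size field.

```python
import os
import stat
import tempfile


def waiting(f):
    """Approximate number of bytes still unread on file object f."""
    st = os.fstat(f.fileno())
    if stat.S_ISREG(st.st_mode):
        # Plain file: total size minus the current read position.
        return st.st_size - f.tell()
    # FIFO, socket, etc.: return the size field as-is (platform-dependent).
    return st.st_size


with tempfile.TemporaryFile() as f:
    f.write(b"abcdef")
    f.flush()
    f.seek(0)
    before = waiting(f)      # nothing read yet
    f.read(4)
    after = waiting(f)       # four bytes consumed
    f.seek(0, os.SEEK_END)
    at_end = waiting(f)      # positioned at end of file
```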

fstat(2) does not give useful size-field results for Linux block
devices.  I didn't test the character (terminal) devices.  (I
documented my results in Python's Doc/lib/stat.tex, in a patch I have
already submitted to SourceForge.)

I would be quite surprised if the plain-file case didn't work on Mac
and Windows.  I would be a little surprised if the socket case failed,
because all three probably inherited fstat(2) from the ancestral BSD
TCP/IP stack.

Just having the plain-file case work would, IMHO, be justification
enough for this method.  If it turns out to be portable across Mac and
Windows sockets as well, *huge* win.  Could this be tested by someone
with access to Windows and Mac systems?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

An armed society is a polite society.  Manners are good when one 
may have to back up his acts with his life.
        -- Robert A. Heinlein, "Beyond This Horizon", 1942




From mal at lemburg.com  Mon Jan  8 19:10:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:10:50 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <3A5A02AA.675A35D1@lemburg.com>

Martin von Loewis wrote:
> 
> > Just curious: wouldn't this introduce a /tmp-style problem to
> > Python ?
> 
> I tried, but I could not produce such a problem.
> 
> > The scenario is quite simple: a Python script runs under root.
> > The script could pick up a lingering .pth file (e.g. from /tmp
> > or one of its subdirs -- distutils does this !) and then executes
> > arbitrary code as *root*.
> 
> No, Python looks only in a few places for .pth files:
> {<prefix>,<exec_prefix>}{,/lib/python<version>/site-packages,/lib/site-python}
> 
> so it won't pick up pth files in /tmp.

Hmm, but what if the Python script picks up a site.py which is
different from the standard one distributed with Python ?

The code adding (and with the patch: executing) the .pth files
is defined in site.py and it is rather easy to override this
file by adding a modified site.py file to the current working dir...
a potential security hole in its own right, I guess :(

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Jan  8 19:30:34 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:30:34 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: Your message of "Mon, 08 Jan 2001 13:01:37 EST."
             <20010108130137.E22834@thyrsus.com> 
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com>  
            <20010108130137.E22834@thyrsus.com> 
Message-ID: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>

Eric, take a hint.  You're not going to get your eof() method no
matter what arguments you bring up.  But I'll explain it to you again
anyway... :-)

> Guido van Rossum <guido at python.org>:
> > Eric, before we go further, can you give an exact definition of
> > EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

I was afraid you would say this.  That's not a condition that's easy
to calculate without doing I/O, *and* that's not the condition that
you are interested in for your problem.  According to your definition,
f.eof() should be true in this example:

    f = open("/etc/passwd")
    f.seek(0, 2)                 # Seek to end of file
    print f.eof()                # What will this print???
    print `f.readline()`         # Will print ''

But getting the right result here requires a lot of knowledge about
how the file is implemented!  While you've explained how this can be
implemented on Unix, it can't be implemented with just the tools that
stdio gives us.  Going beyond stdio in order to implement a feature is
a grave decision.  After all, Python is portable to many
less-than-mainstream operating systems (VxWorks, OS/9, VMS...).  Now,
if this was just a speed hack (like xreadlines) I could accept having
some platform-dependent code, if at least there was a portable way to
do it that was just a bit slower.  But here you can't convince me that
this can be done in a portable way, and I don't want to force porters
to figure out how to do this for their platform before their port can
work.  I also don't want to make f.eof() a non-portable feature: *if*
it is provided, it's too important for that.

Note that stdio's feof() doesn't have this definition!  It is set when
the last *read* (or getc(), etc.) stumbled upon an EOF condition.
That's also of limited value; it's mostly defined so you can
distinguish between errors and EOF when you get a short read.  The
stdio feof() flag would be false in the above example.
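
To make the point concrete, here is a small sketch in present-day
Python (an assumption; the snippet above is Python 2.0 syntax) using a
temporary file instead of /etc/passwd.  After seeking to the end, the
only portable way shown here to learn that nothing is left is to
attempt the read.

```python
import os
import tempfile

with tempfile.TemporaryFile() as f:
    f.write(b"root:x:0:0\n")
    f.seek(0, os.SEEK_END)   # position at end of file; no read attempted yet
    position = f.tell()      # tells us where we are, not whether data remains
    line = f.readline()      # only this read reveals that nothing is left
```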

> > What's wrong with just setting the parser loose on the input and
> > letting it deal with EOF?
> 
> Nothing wrong in theory, but it's a problem in practice.  I don't want
> to import the second parser unless it's actually needed, because it's much
> larger than the first one.

So be practical and let the first parser set a global flag that tells
you whether it's necessary to load the second one.

> >                                In your example, apparently a line
> > containing the word "history" signals that the rest of the file must
> > be parsed by the second parser.  What if "history" is the last line of
> > the file?  The eof() test can't tell you *that*!
> 
> Right.  That case never happens.  I mean it *really* never happens :-).
> 
> What we're talking about is a game system.  The first parser recognizes
> a spec language for describing games of a particular class (variants of
> Diplomacy, if that's meaningful to you).  The system keeps logfiles which
> consist of a section in the game description language, optionally 
> followed by the token "history" and an order log.
> 
> The parser for the order log language is a *lot* larger than the one
> for the description language.  This is why I said I don't want the
> first parser to just call the second.  I want to test for EOF to
> know whether I have to import the second parser at all!
> 
> Here's the beginning of my problem: the first parser can't export a line
> buffer, because it doesn't *have* a line buffer.  It's a subclass of
> shlex and does single-character reads.
> 
> There are two ways I can cope with this.  One is to do a (nonzero)
> length read after the first parser exits; the other is to have the
> first parser set a state flag controlling whether the second parser
> loads.

Do the latter.  Nothing wrong with it that I can see.
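
A minimal sketch of the flag approach, assuming a shlex subclass as
described.  The class name DescriptionParser, the saw_history
attribute, and the orderlog module name are all hypothetical, and the
code is written for present-day Python rather than 2.0.

```python
import io
import shlex


class DescriptionParser(shlex.shlex):
    """Hypothetical first-stage parser: reads tokens up to an optional 'history'."""

    def __init__(self, stream):
        shlex.shlex.__init__(self, stream)
        self.saw_history = False   # the state flag under discussion
        self.description = []

    def parse(self):
        while True:
            token = self.get_token()
            if token == self.eof:      # shlex hands back its eof marker at end of input
                break
            if token == "history":
                self.saw_history = True
                break
            self.description.append(token)
        return self.description


def parse_log(stream):
    first = DescriptionParser(stream)
    first.parse()
    if first.saw_history:
        # Only at this point would the large second parser be imported, e.g.
        #   import orderlog   # hypothetical module name
        pass
    return first.description, first.saw_history


with_history = parse_log(io.StringIO("players 7\nhistory\nmove a b\n"))
without_history = parse_log(io.StringIO("players 7\n"))
```

The flag records what the parser saw, so no question about the state of
the underlying stream ever has to be asked.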

> This is where it bites that I can't test for EOF with a read(0).

And can you tell me a system where you *can* test for EOF with a
read(0)?  I've never heard of such a thing.  The Unix read() system
call has the same properties as Python's f.read().  I'm pretty sure
that fread() with a zero count also doesn't give you the information
you're after.

> The
> second shlex parser only has token-level pushback!  If I do a
> nonzero-length read and I get data, I'm screwed.  On the other hand
> (as I said before) setting a lexer state flag seems wrong, because
> EOFness is a property of the underlying stream rather than the parser.
> I'd be duplicating state that exists in the stdio stream structure
> anyway; it ought to be accessible.

Bullshit.  The EOFness that you're after (according to your own
definition) is not the same as the EOFness of the stdio stream.  The
EOFness in the stdio stream could help you, but Python resets it -- so
that making it available wouldn't be as easy as you claim.  Anyway,
you seem to have a sufficiently vague idea of what "EOFness" means
that I don't think providing access to whatever low-level EOFness
condition might exist would do you much good.

> > > Now, another and more general way to handle this would be to make an
> > > equivalent of the old FIONREAD ioctl part of Python's standard set of 
> > > file object methods -- a way to ask "how many bytes are ready to be
> > > read in this stream?"
> > 
> > There's no portable way to do that.
> 
> Actually, fstat(2) is portable enough to support a very useful
> approximation of FIONREAD.  I know, because I tried it.
> 
> Last night I coded up a "waiting" method for file objects that calls
> fstat(2) on the associated file descriptor.  For a plain file, it
> then subtracts the result of ftell() from the fstat size field and
> returns that -- for other files, it simply returns the size field.
> 
> I then tested this on plain files, FIFOs, and sockets under Linux. It
> turns out fstat(2) gives useful information in all three cases (a
> count of characters waiting in the buffer in the latter two).  I expected
> this; it should be true under all current Unixes.
> 
> fstat(2) does not give useful size-field results for Linux block
> devices.  I didn't test the character (terminal) devices.  (I
> documented my results in Python's Doc/lib/stat.tex, in a patch I have
> already submitted to SourceForge.)
> 
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.  I would be a little surprised if the socket case failed,
> because all three probably inherited fstat(2) from the ancestral BSD
> TCP/IP stack.
> 
> Just having the plain-file case work would, IMHO, be justification
> enough for this method.  If it turns out to be portable across Mac and
> Windows sockets as well, *huge* win.  Could this be tested by someone
> with access to Windows and Mac systems?

I don't see the huge win.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan  8 19:33:26 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 13:33:26 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:10:50 +0100."
             <3A5A02AA.675A35D1@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>  
            <3A5A02AA.675A35D1@lemburg.com> 
Message-ID: <200101081833.NAA05325@cj20424-a.reston1.va.home.com>

Discussions based on Python running as root and picking up untrusted
code from $PYTHONPATH are pointless.  Of course this is a security
hole.  If root runs *any* Python script in a way that could pick up
even a single untrusted module, there's a security hole.  site.py or
*.pth files are just a special case of this, so I don't see why this
is used as an example.
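
The general case can be sketched as follows: any directory on sys.path
that someone else can write to is enough, because module-level code
runs at import time with the importer's privileges.  The module name
innocent_helper is made up for this example, and the temporary
directory stands in for an untrusted location.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as untrusted_dir:
    # Stand-in for a directory an attacker can write to.
    with open(os.path.join(untrusted_dir, "innocent_helper.py"), "w") as fh:
        fh.write("ran_at_import = True\n")   # arbitrary code would go here

    sys.path.insert(0, untrusted_dir)
    try:
        importlib.invalidate_caches()
        module = importlib.import_module("innocent_helper")
    finally:
        # Clean up so the example leaves no trace in the running process.
        sys.path.remove(untrusted_dir)
        sys.modules.pop("innocent_helper", None)

executed = module.ran_at_import
```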

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan  8 19:48:40 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 13:48:40 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEGFIHAA.tim.one@home.com>

[Moshe]
> Something better to do would be to use
> import foo as _foo

[Paul]
> It's pretty clear that nobody does this now and nobody is going
> to start doing it in the near future. It's too invasive and it
> makes the code too ugly.

Actually, this function is one of my std utilities:

def _pvt_import(globs, modname, *items):
    """globs, modname, *items -> import into globs with leading "_".

    If *items is empty, set globs["_" + modname] to module modname.
    If *items is not empty, import each item similarly but don't
    import the module into globs.
    Leave names that already begin with an underscore as-is.

    # import math as _math
    >>> _pvt_import(globals(), "math")
    >>> round(_math.pi, 0)
    3.0

    # import math.sin as _sin and math.floor as _floor
    >>> _pvt_import(globals(), "math", "sin", "floor")
    >>> _floor(3.14)
    3.0
    """

    mod = __import__(modname, globals())
    if items:
        for name in items:
            xname = name
            if xname[0] != "_":
                xname = "_" + xname
            globs[xname] = getattr(mod, name)
    else:
        xname = modname
        if xname[0] != "_":
            xname = "_" + xname
        globs[xname] = mod

Note that it begins with an underscore because it's *meant* to be exported
<0.5 wink>.  That is, the module importing this does

    from utils import _pvt_import

because they don't already have _pvt_import to automate adding the
underscore, and without the underscore almost everyone would accidentally
export "pvt_import" in turn.  IOW,

    import M
    from N import M

not only import M, by default they usually export it too, but the latter is
rarely *intended*.  So, over the years, I've gone thru several phases of
naming objects I *intend* to export with a leading underscore.  That's the
only way to prevent later imports from exporting by accident.  I don't
believe I've distributed any code using _pvt_import, though, because it
fights against the language and expectations.  Metaprogramming against the
grain should be a private sin <0.9 wink>.
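
The accidental re-export can be seen in a small sketch.  The modules
here are built in memory with types.ModuleType purely for illustration,
and star_names is a hypothetical helper that approximates what
"from module import *" binds when no explicit export list exists.

```python
import types

# A throwaway module whose source does a plain "import math".
plain = types.ModuleType("plain")
exec("import math\ndef area(r):\n    return math.pi * r * r\n", plain.__dict__)

# The same module, with the import bound to an underscore-prefixed name.
private = types.ModuleType("private")
exec("import math as _math\ndef area(r):\n    return _math.pi * r * r\n",
     private.__dict__)


def star_names(module):
    """Public names, i.e. those not starting with an underscore."""
    return sorted(name for name in vars(module) if not name.startswith("_"))


plain_exports = star_names(plain)      # includes the imported module
private_exports = star_names(private)  # only what was meant to be public
```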

_metaprogramming-ly y'rs  - tim




From mal at lemburg.com  Mon Jan  8 19:40:37 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 19:40:37 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de>  
	            <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A09A5.D0DC33A1@lemburg.com>

Guido van Rossum wrote:
> 
> Discussions based on Python running as root and picking up untrusted
> code from $PYTHONPATH are pointless.  Of course this is a security
> hole.  If root runs *any* Python script in a way that could pick up
> even a single untrusted module, there's a security hole.  site.py or
> *.pth files are just a special case of this, so I don't see why this
> is used as an example.

Agreed; see my reply to Martin.

Still, wouldn't it be wise to add some logic to Python to prevent
importing untrusted modules, e.g. by making sys.path read-only and
disabling the import hook usage using a command-line option?

This would at least prevent the most obvious attacks. I wonder how
RedHat works around these problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at interet.com  Mon Jan  8 20:16:45 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Mon, 08 Jan 2001 14:16:45 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPGEDKCOAA.MarkH@ActiveState.com>
Message-ID: <3A5A121D.FDD8C2C1@interet.com>

Mark Hammond wrote:

> Note that the original problem was _embedding_ Python - thus, you need to
> patch _their_ WinMain to make it work for them - something you can't do.

Correct, if they don't use pythonw.exe, but use a different
main program, the new stdout will not be installed.  But then
they must have their own main.c, and they can add the C call.
 
> Even if PyWin_StdoutReplace() was a public symbol so they _could_ call it, I

Yes, the symbol PyWin_StdoutReplace() is public, and they
can call it.

> am not convinced they would - it is almost certain they will still need to
> redirect output to somewhere useful, so why bother redirecting it
> temporarily just to redirect it for real immediately after?

Redirecting it temporarily is valuable, because if the sys.stdout
replacement occurs in (for example) myprog.py, then
"pythonw.exe myprog.py" will fail to produce any error messages for
a syntax error in myprog.py.

Also, I was hoping further sys.stdout redirection would be unnecessary.
 
> Finally, I am slightly concerned about the possibility of "hanging" certain
> programs. For example, I believe that DCOM will often invoke a COM server in
> a different "desktop" than the user (this is also true for Services, but
> Python services don't use pythonw.exe).  Thus, a Python program may end up
> hanging with a dialog box, but in the context where no user is able to see
> it.  However, this could be addressed by adding a command-line option to
> prevent this new behaviour kicking in.

Limiting the code to pythonw.exe instead of trying to install
it in python20.dll was supposed to prevent damage to the use
of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
I am assuming there is a screen.  The dialog box is started with
MessageBox() and a window handle of GetForegroundWindow().  So
there doesn't need to be an application window.  I have tested it
with GUI programs, and it also works when run from a console.

Having said that, you may be right that there is some way to
hang on a dialog box which can not be seen.  It depends on what
MessageBox() and GetForegroundWindow() actually do.  If it seems
that this patch has merit, I would be grateful if you would review
the code to look for issues of this type.
 
> I would prefer to see a decent API for extracting error and traceback
> information from Python.  On the other hand, I _do_ see the problem for
> "newbies" trying to use pythonw.exe.

There could be an API added to the winstdout module such as
  msg = winstdout.GetMessageText()
which would return saved text, control its display etc.
But then the problem remains of actually displaying the messages
especially in the context of tracebacks and errors.  And it is
probably easier to redirect sys.stdout so it does what you want
rather than use the API.

I do not view winstdout as a "newbie" feature, but rather a
generally useful C-language addition to Python.

> So - I guess I am saying that I don't see this as optimal, and it doesn't
> solve the original problem you pointed at - but in the interests of making
> pythonw.exe seem "less broken" for newbies, I could live with this as long
> as I could prevent it when necessary.

I guess I am saying, perhaps incorrectly, that the mechanism provided
will make further redirection of sys.stdout unnecessary 99% of the
time.  Experimentation shows that Python composes tracebacks and
error messages a line or partial line at a time.  That is, you can
not display each call to printf(), but must wait until the system is
idle to be sure that multiple calls to printf() are complete.  So this
forces you to use the idle processing loop, not rocket science but
at least inconvenient.  And the only source of stdout/err is tracebacks,
error messages and the "print" statement.  What would you do with
these in a Windows program except display an "OK" dialog box?
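
The buffering requirement can be sketched at the Python level.  This is
not the winstdout C code; MessageCollector and its on_idle hook are
hypothetical names, and the display callback stands in for whatever
would put up the dialog box.

```python
class MessageCollector:
    """Hypothetical stdout/stderr stand-in that batches partial writes."""

    def __init__(self, show):
        self._show = show      # callback that displays one complete message
        self._parts = []

    def write(self, text):
        # Tracebacks arrive a line or a fragment at a time; just collect them.
        self._parts.append(text)
        return len(text)

    def flush(self):
        pass

    def on_idle(self):
        """Call from the GUI idle loop: display everything collected so far."""
        if self._parts:
            self._show("".join(self._parts))
            self._parts = []


shown = []
collector = MessageCollector(shown.append)
print("Traceback (most recent call last):", file=collector)
print("  ...", file=collector)
print("SyntaxError: invalid syntax", file=collector)
collector.on_idle()   # one dialog for the whole traceback, not one per write
```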

If someone out there knows of a different example of sys.stdout
redirection in use in the real world, it would be helpful if
they would describe it.  Maybe it could be incorporated.

> Another option would be to use the Win32 Console APIs, and simply attempt to
> create a console for the error message.  Eg, maybe PyErr_Print() could be
> changed to check for the existence of a console, and if not found, create
> it.  However, the problem with this approach is that the error message will
> often be printed just as the process is terminating - meaning you will see a
> new console with the error message for about 0.025 of a second before it
> vanishes due to process termination.  Any sort of "press any key to
> terminate" option then leaves us in the same position - if no user can see
> the message, the process appears hung.

Yes, this is a problem with the console API approach.  Another is that
popping up a black console for output instead of the usual "OK"
dialog box is unnatural, and will force the user to replace sys.stdout.
I was hoping this C stdout will make this unnecessary.

JimA



From esr at thyrsus.com  Mon Jan  8 20:17:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 8 Jan 2001 14:17:50 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <200101081830.NAA05301@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 01:30:34PM -0500
References: <20010106230125.A29058@thyrsus.com> <LNBBLJKPBEHFEDALKOLCKECNIHAA.tim.one@home.com> <20010107133032.F4586@thyrsus.com> <200101072113.QAA32467@cj20424-a.reston1.va.home.com> <20010107171527.A5093@thyrsus.com> <200101081506.KAA03404@cj20424-a.reston1.va.home.com> <20010108130137.E22834@thyrsus.com> <200101081830.NAA05301@cj20424-a.reston1.va.home.com>
Message-ID: <20010108141750.C23214@thyrsus.com>

Guido van Rossum <guido at python.org>:
> [Eric]
> > A file is at EOF when attempts to read more data from it will fail
> > returning no data.
> 
> I was afraid you would say this.  That's not a condition that's easy
> to calculate without doing I/O, *and* that's not the condition that
> you are interested in for your problem.  According to your definition,
> f.eof() should be true in this example:
> 
>     f = open("/etc/passwd")
>     f.seek(0, 2)                 # Seek to end of file
>     print f.eof()                # What will this print???
>     print `f.readline()`         # Will print ''

I agree that after f.seek(0, 2) f is in an end-of-file condition.  But
I think it's precisely the definition that would be useful for my
problem.  Contrary to what you say, I think my definition of EOF is
quite sharp -- a sequential read would return no data.

Better to think of what I need as an "is there data waiting?" query.
I should have framed it that way, rather than about EOFness, from the
beginning.

> But getting the right result here requires a lot of knowledge about
> how the file is implemented!  While you've explained how this can be
> implemented on Unix, it can't be implemented with just the tools that
> stdio gives us.

Granted.  However, it looks possible that "is there data waiting"
*can* be portably implemented with the help of fstat(2), which by
precedent is also part of Python's toolkit.

> I also don't want to make f.eof() a non-portable feature: *if*
> it is provided, it's too important for that.

Agreed.

> Note that stdio's feof() doesn't have this definition!  It is set when
> the last *read* (or getc(), etc.) stumbled upon an EOF condition.
> That's also of limited value; it's mostly defined so you can
> distinguish between errors and EOF when you get a short read.  The
> stdio feof() flag would be false in the above example.

OK.  You're right about that.  I should have thought more clearly about
the difference between the state of stdio and the state of the underlying
file or device.  Access to stdio state won't do by itself.

> > This is where it bites that I can't test for EOF with a read(0).
> 
> And can you tell me a system where you *can* test for EOF with a
> read(0)?  I've never heard of such a thing.  The Unix read() system
> call has the same properties as Python's f.read().  I'm pretty sure
> that fread() with a zero count also doesn't give you the information
> you're after.

I'd have to test -- but what Unix read(2) does in this case isn't
really my point.  My real point is that I can't probe for whether
there's data waiting to be read in what seems like the obvious way.  I
expect Python to compensate for the deficiencies of the underlying C,
not reflect them.

> > Just having the plain-file case work would, IMHO, be justification
> > enough for this method.  If it turns out to be portable across Mac and
> > Windows sockets as well, *huge* win.  Could this be tested by someone
> > with access to Windows and Mac systems?
> 
> I don't see the huge win.

Try "polling after a non-blocking open".  A lower-overhead and more 
natural way to do it than with a poller object.  (This is on my mind 
because I used a poller object to query FIFOs just last week.)

The game system I'm working on, BTW, has another point of interest for
this list.  It is a rather large and complex suite of C programs that
makes heavy use of dynamic-memory allocation; I am translating to
Python partly in order to avoid chronic misallocation problems (leaks
and wild pointers) and partly because the thing needed to be rewritten
anyway to eliminate global state so I can embed it in a multithreaded
server.

Side-by-side comparison of the original C and its translation should
be quite an interesting educational experience once it's done.  That
just might be my next year's paper.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is the assumption of this book that a work of art is a gift, not a
commodity.  Or, to state the modern case with more precision, that works of
art exist simultaneously in two "economies," a market economy and a gift
economy.  Only one of these is essential, however: a work of art can survive
without the market, but where there is no gift there is no art.
	-- Lewis Hyde, The Gift: Imagination and the Erotic Life of Property



From guido at python.org  Mon Jan  8 20:36:02 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 14:36:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 19:40:37 +0100."
             <3A5A09A5.D0DC33A1@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>  
            <3A5A09A5.D0DC33A1@lemburg.com> 
Message-ID: <200101081936.OAA05440@cj20424-a.reston1.va.home.com>

> Still, wouldn't it be wise to add some logic to Python to prevent
> importing untrusted modules, e.g. by making sys.path read-only and
> disabling the import hook usage using a command-line option?
> 
> This would at least prevent the most obvious attacks. I wonder how
> RedHat works around these problems.

I don't understand what kind of attacks you are thinking of.  What
would making sys.path read-only prevent?  You seem to be thinking that
some malicious piece of code could try to subvert you by setting
sys.path.  But what you forget is that if this piece of code cannot be
trusted with sys.path, it should not be trusted to run at all!

--Guido van Rossum (home page: http://www.python.org/~guido/)




From loewis at informatik.hu-berlin.de  Mon Jan  8 20:45:44 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 8 Jan 2001 20:45:44 +0100 (MET)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com> (mal@lemburg.com)
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com>
Message-ID: <200101081945.UAA12178@pandora.informatik.hu-berlin.de>

> The code adding (and with the patch: executing) the .pth files
> is defined in site.py and it is rather easy to override this
> file by adding a modified site.py file to the current working dir...
> a potential security hole in its own right, I guess :(

Indeed - independent of my patch changing the other site.py :-)

Regards,
Martin



From skip at mojam.com  Mon Jan  8 20:49:22 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 8 Jan 2001 13:49:22 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59EF2B.792801E5@ActiveState.com>
References: <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
	<3A59EF2B.792801E5@ActiveState.com>
Message-ID: <14938.6594.44596.509259@beluga.mojam.com>

    Paul> It's not about keeping people out of your module.  In fact I would
    Paul> propose that mod.__dict__ should be as loose as ever.

Okay, how about this as a compromise first step?  Allow programmers to put
__exports__ lists in their modules but don't do anything with them *except*
modify dir() to respect that if it exists?  That would pretty up dir()
output for newbies, almost certainly not break anything, improve the
internal documentation of the modules that use __exports__, and still allow
us to move in a more restrictive direction at a later time if we so choose.
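
A minimal sketch of what such a dir() might do, assuming a module-level
list named __exports__ as proposed.  The function exported_dir is a
hypothetical stand-in for the modified builtin, and the demo module is
built in memory for illustration.

```python
import types


def exported_dir(module):
    """Like dir(module), but honor a module-level __exports__ list if present."""
    exports = getattr(module, "__exports__", None)
    if exports is None:
        return dir(module)     # no list: behave exactly as before
    return sorted(exports)


demo = types.ModuleType("demo")
exec("import os\n__exports__ = ['greet']\ndef greet():\n    return 'hi'\n",
     demo.__dict__)

listing = exported_dir(demo)
still_reachable = hasattr(demo, "os")   # the module dict is as loose as ever
```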

Skip



From moshez at zadka.site.co.il  Tue Jan  9 05:04:23 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:04:23 +0200 (IST)
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: <3A5A02AA.675A35D1@lemburg.com>
References: <3A5A02AA.675A35D1@lemburg.com>, <200101081751.SAA08918@pandora.informatik.hu-berlin.de>
Message-ID: <20010109040423.68AA4A82D@darjeeling.zadka.site.co.il>

On Mon, 08 Jan 2001 19:10:50 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> Hmm, but what if the Python script picks up a site.py which is
> different from the standard one distributed with Python ?

Then the site.py can do whatever it wants.
No need to go through PTHs
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Mon Jan  8 20:59:48 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 14:59:48 -0500
Subject: feof status (was: Re: [Python-Dev] Rehabilitating fgets)
In-Reply-To: <20010108130137.E22834@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGHIHAA.tim.one@home.com>

Quickie:

[Guido]
> Eric, before we go further, can you give an exact definition of
> EOFness to me?

[Eric]
> A file is at EOF when attempts to read more data from it will fail
> returning no data.

To be very clear about this, that's not what C's feof() means:  in general,
the end-of-file indicator in std C stream input is set only *after* you've
attempted a read that "didn't work".  For example,

#include <stdio.h>

int
main(void)
{
	FILE* fp = fopen("guts", "wb");
	fputs("abc", fp);
	fclose(fp);
	fp = fopen("guts", "rb");
	for (;;) {
		int c;
		c = getc(fp);
		printf("getc returned %c (%d)\n", c, c);
		printf("At EOF after getc? %d\n", feof(fp));
		if (c == EOF)
			break;
	}
}

Unless your C is broken, feof() will return 0 after getc() returns 'a', and
again after 'b', and again after 'c'.  It's not until getc() returns EOF
that feof() first returns a non-zero result.

Then add these two lines after the "for":

	fseek(fp, 0L, SEEK_END);
	printf("after seeking to the end, feof() says %d\n", feof(fp));

Unless your fseek() is non-std, that clears the end-of-file indicator,
regardless of where you seek to.  So the std behavior throughout libc is
much like Python's behavior:  there's nothing that can tell you whether
you're at the end of the file, in general, short of trying to read and
failing to get something back.
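The same operational definition can be sketched on the Python side: keep reading until a read comes back empty. This is an illustration only. It uses an in-memory io.BytesIO stream as a stand-in for a real file, and the helper name read_until_eof is made up.

```python
import io


def read_until_eof(stream, chunk_size=2):
    """Collect chunks until a read returns nothing, which is the only
    portable end-of-file signal."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:          # empty result: the read "didn't work"
            break
        chunks.append(chunk)
    return chunks


stream = io.BytesIO(b"abc")
print(read_until_eof(stream))   # [b'ab', b'c']
print(stream.read(0))           # b'' -- a zero-byte read says nothing about EOF
```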

In your case you seem to *know* that you have a "plain old file", meaning
that its size is well-defined and that ftell() makes sense for it.  You also
seem to know that you don't have to worry about anyone else, e.g., appending
to it (or in any other way changing its size, or changing your stream's file
position), while you're mucking with it.  So why not just do f.tell() and
compare that to the size yourself?  This sounds easy for you to do, but in
this particular case you enjoy the benefits of a world of assumptions that
aren't true in general.
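A rough sketch of that comparison, under exactly the assumptions listed above: a regular file, opened in binary mode, that nobody else is resizing or repositioning. The helper name at_eof is hypothetical, and a scratch temporary file stands in for the real one.

```python
import os
import tempfile


def at_eof(f):
    """True if the file position equals the file's current size.

    Assumes a regular file opened in binary mode that nobody else is
    resizing; these are the assumptions spelled out in the text above.
    """
    return f.tell() == os.fstat(f.fileno()).st_size


with tempfile.TemporaryFile() as f:
    f.write(b"abc")
    f.flush()
    f.seek(0)
    before = at_eof(f)      # position 0, size 3
    f.read()
    after = at_eof(f)       # position 3, size 3
    print(before, after)    # False True
```

For pipes, sockets, ttys, or files being appended to, this check is meaningless, which is the point made above.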

> ...
> This is where it bites that I can't test for EOF with a read(0).

You can't in std C using an fread of 0 bytes either -- that has no effect on
the end-of-file indicator.  Add

		if (c == 'c') {
			char buf[100];
			size_t i = fread(buf, 1, 0, fp);
			printf("after fread of 0 bytes, feof() says %d\n",
			       feof(fp));
		}

before the "(c == EOF)" test above to try that on your platform.

> ...
> I would be quite surprised if the plain-file case didn't work on Mac
> and Windows.

Don't know about Mac.  On Windows everything is grossly complicated because
of line-end translations in text mode.  Like the C std says, the only
*portable* thing you can do with an ftell() result for a text file is feed
it back unaltered to fseek().  It so happens that on Windows, using MS's
libc, if f.readline() returns "abc\n" for the first line of a native text
file, f.tell() returns 5, reflecting the actual byte offset in the file
(including the \r that .readline() doesn't show you).  So you *can* get away
with comparing f.tell() to the file's size on Windows too (using the MS C
compiler; don't know about others).

the-operational-defn-of-eof-is-the-only-portable-defn-
    there-is-ly y'rs  - tim




From moshez at zadka.site.co.il  Tue Jan  9 05:08:29 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue,  9 Jan 2001 06:08:29 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
References: <14938.6594.44596.509259@beluga.mojam.com>, <20010107232532.V17220@lyra.org>
	<20010106110033.52127A84F@darjeeling.zadka.site.co.il>
	<LNBBLJKPBEHFEDALKOLCIEEMIHAA.tim.one@home.com>
	<20010108165057.8FED8A82D@darjeeling.zadka.site.co.il>
	<3A59EF2B.792801E5@ActiveState.com>
Message-ID: <20010109040829.BDB66A82D@darjeeling.zadka.site.co.il>

[Paul Prescod] 
> It's not about keeping people out of your module.  In fact I would
> propose that mod.__dict__ should be as loose as ever.

[Skip Montanaro]
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?  That would pretty up dir()
> output for newbies, almost certainly not break anything, improve the
> internal documentation of the modules that use __exports__, and still allow
> us to move in a more restrictive direction at a later time if we so choose.

I'm +1 on that personally. 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan  8 21:38:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 08 Jan 2001 21:38:00 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com>  
	            <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>
Message-ID: <3A5A2528.C289BE1D@lemburg.com>

Guido van Rossum wrote:
> 
> > Still, wouldn't it be wise to add some logic to Python to prevent
> > importing untrusted modules, e.g. by making sys.path read-only and
> > disabling the import hook usage using a command line ?
> >
> > This would at least prevent the most obvious attacks. I wonder how
> > RedHat works around these problems.
> 
> I don't understand what kind of attacks you are thinking of.  What
> would making sys.path read-only prevent?  You seem to be thinking that
> some malicious piece of code could try to subvert you by setting
> sys.path.  But what you forget is that if this piece of code cannot be
> trusted with sys.path, it should not be trusted to run at all!

I was thinking of an attack where knowledge of common temporary
execution locations is used to trick Python into executing
untrusted code -- the untrusted code would only have to be
copied to the known temporary execution directory, and it would then
be executed by Python the next time the program using the temporary
location is invoked.

But you're right: this is possible whether or not sys.path is
writeable.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan  8 21:45:57 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 21:45:57 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <200101081427.JAA03146@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 08, 2001 at 09:27:50AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>
Message-ID: <20010108214557.H402@xs4all.nl>

On Mon, Jan 08, 2001 at 09:27:50AM -0500, Guido van Rossum wrote:
> > You may be right.  Still, this patch solves the immediate problem in a
> > reasonably clean way, and I urge that it should go in.  We can do a
> > more complete reorganization of the build process later.  (I'll help with
> > that; I'm pretty expert with autoconf and friends.)

> I expect Andrew's code to go in before 2.1 is released.  So I don't
> see a reason why we should hurry and check in a stop-gap measure.

Oh, we're gonna distribute binaries of Python 2.0/1.5.2-with-distutils for
every known platform that can run configure ? :) I still think there are
more than enough platforms without Python to warrant using autoconf for
configuring modules. The module list and their demands are stable enough to
make maintenance a fair breeze, IMHO.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Mon Jan  8 22:57:58 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:57:58 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108214557.H402@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 08, 2001 at 09:45:57PM +0100
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl>
Message-ID: <20010108165758.B9260@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 09:45:57PM +0100, Thomas Wouters wrote:
>every known platform that can run configure ? :) I still think there are
>more than enough platforms without Python to warrant using autoconf for
>configuring modules. The module list and their demands are stable enough to
>make maintenance a fair breeze, IMHO.

Umm... the proposed PEP 229 patch would compile a Python binary with
sre, posix, and strop statically linked; this minimal Python is then
used to run the setup.py script.  You shouldn't require a preinstalled
Python, though the current version of the patch doesn't meet this
requirement yet.

--amk




From tim.one at home.com  Mon Jan  8 21:59:40 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 15:59:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <3A59F00E.53A0A32A@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>

[Tim]
> Perl appears to ignore the issue of thread safety here (on Windows and
> everywhere else).

[Paul Prescod]
> If you can create a sample program that demonstrates the unsafety
> I'll anonymously submit it as a bug on our internal system

I don't want to spend time on that, as I *assume* it's already well-known
within the Perl thread community.  Besides, the last version of Perl I got
from ActiveState <wink> complains:

     No threads in this perl at temp.pl line 14

if I try to use Perl threads.  That's:

> \perl\bin\perl -v

This is perl, v5.6.0 built for MSWin32-x86-multi-thread
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2000, Larry Wall

Binary build 620 provided by ActiveState Tool Corp.
http://www.ActiveState.com
Built 18:31:05 Oct 31 2000

...

If I can repair that by downloading a more recent release, let me know.

> and ensure that the next version of Perl is as slow as Python. :)

I don't want to slow them down!  To the contrary, now I've got a solid
reason for why I keep using Perl for simple high-volume text-crunching jobs
<wink>.

> Seriously: If someone comes at me with Perl-IO-is-way-faster-than-
> Python-IO, I'd like to know what concretely they've given up in order
> to achieve that performance.

My line-at-a-time test case used (rounding to nearest whole integers) 30
seconds in Python and 6 in Perl.  The result of testing many changes to
Python's implementation was that the excess 24 seconds broke down like so:

    17   spent inside internal MS threadsafe getc() lock/unlock
             routines
     5   uncertain, but evidence suggests much of it due to MS
             malloc/realloc (Perl does its own memory mgmt)
     2   for not copying directly out of the platform FILE*
             implementation struct in a highly optimized loop (like
             Perl does)

My last checkin to fileobject.c reclaimed 17 seconds on Win98SE while
remaining threadsafe, via a combination of locking per line instead of per
character, and invoking realloc much less often (only for lines exceeding
200 chars).  (BTW, I'm still curious to know how that compares to the
getc_unlocked hack on a platform other than Windows!)
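To make the "locking per line instead of per character" idea concrete, here is a toy Python model that only counts lock acquisitions. It is not the fileobject.c change itself: CountingLock and both reader functions are invented for illustration, and an in-memory stream stands in for a FILE*.

```python
import io
import threading


class CountingLock:
    """A lock wrapper that counts how often it is acquired (illustrative)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.acquisitions = 0

    def __enter__(self):
        self._lock.acquire()
        self.acquisitions += 1
        return self

    def __exit__(self, *exc_info):
        self._lock.release()


def readline_lock_per_char(stream, lock):
    """Take the lock around every single-byte read, like a locked getc()."""
    line = bytearray()
    while True:
        with lock:
            ch = stream.read(1)
        line += ch
        if not ch or ch == b"\n":
            return bytes(line)


def readline_lock_per_line(stream, lock):
    """Take the lock once, then read the whole line under it."""
    with lock:
        return stream.readline()


data = b"hello world\n"
per_char, per_line = CountingLock(), CountingLock()
assert readline_lock_per_char(io.BytesIO(data), per_char) == data
assert readline_lock_per_line(io.BytesIO(data), per_line) == data
print(per_char.acquisitions, per_line.acquisitions)
```

Both readers return the same line; the per-character version pays one lock round trip per byte, the per-line version pays one per line.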

> And even just for my own interest I'd like to understand the cost/
> benefit of stream thread safety.

If you're not *using* threads, or not using them to muck with the same
stream at the same time, the ratio is infinite.  And that's usually the
case.

> For instance would it make sense to just write a thread-safe
> wrapper for streams used from multiple threads?

Alas, on Windows you can't pick and choose:  you get the threadsafe libc, or
you don't.  So long as anyone may want to use threads for any reason
whatsoever, we must link with threadsafe libraries.  But, as above, on
Windows we're not paying much for that anymore in this case (unless maybe
the threadsafe MS malloc family is also outrageously slower than its
careless counterpart ...).  It does prevent me from pursuing the "optimized
inner loop" business, because MS doesn't expose its locking primitives (so I
can't do in C everything I would need to do to optimize the inner loop while
remaining threadsafe).

there-are-damn-few-pieces-of-libc-we-wouldn't-be-better-off-
    writing-ourselves-but-then-we'd-have-a-much-harder-time-
    playing-with-others'-code-ly y'rs  - tim




From akuchlin at mems-exchange.org  Mon Jan  8 22:15:34 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 8 Jan 2001 16:15:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108161534.A2392@kronos.cnri.reston.va.us>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
>200 chars).  (BTW, I'm still curious to know how that compares to the
>getc_unlocked hack on a platform other than Windows!)

On Solaris and Linux, the results seemed to be lost in the noise.
Repeated runs of filetest.py were sometimes faster than without
USE_MS_GETLINE_HACK, so the variation is probably large enough to
swamp any difference between the two.  (Assuming I enabled the getline
hack correctly of course; someone please replicate...)

--amk

Linux: w/o USE_MS_GETLINE_HACK
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.186  0.190
readlines_sizehint    0.108  0.110
using_fileinput       0.447  0.450
while_readline        0.184  0.180

Linux w/ USE_MS_GETLINE_HACK:
kronos Python-2.0>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.178  0.180
readlines_sizehint    0.108  0.110
using_fileinput       0.434  0.430
while_readline        0.183  0.190

Solaris w/o USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.640  0.630
readlines_sizehint    0.278  0.280
using_fileinput       1.874  1.820
while_readline        0.839  0.840

Solaris w/ USE_MS_GETLINE_HACK:
amarok src>./python ~/filetest.py
total 1559913 chars and 32513 lines
count_chars_lines     0.569  0.570
readlines_sizehint    0.275  0.280
using_fileinput       1.902  1.900
while_readline        0.769  0.770



From gstein at lyra.org  Mon Jan  8 22:29:40 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 13:29:40 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:15:34PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com> <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <20010108132940.G4141@lyra.org>

On Mon, Jan 08, 2001 at 04:15:34PM -0500, Andrew Kuchling wrote:
> On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> >200 chars).  (BTW, I'm still curious to know how that compares to the
> >getc_unlocked hack on a platform other than Windows!)
> 
> On Solaris and Linux, the results seemed to be lost in the noise.

Your times are so small... I'd suggest doing a few iterations within
filetest.py so your margin of error isn't so noticeable.
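One way to do that, sketched with the standard timeit module: run each test several times per trial, repeat the trial, and keep the best figure. filetest.py's internals are not shown in this thread, so sample_workload below is only a placeholder for one of its test functions, and best_time is a made-up helper name.

```python
import timeit


def best_time(func, repeat=5, number=10):
    """Run func `number` times per trial, `repeat` trials, and return the
    best per-call time in seconds."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


def sample_workload():
    # Placeholder for one of the filetest.py test functions.
    return sum(len(line) for line in ["spam\n"] * 1000)


print("best per-call time: %.6f s" % best_time(sample_workload))
```

Taking the minimum over several trials tends to damp the run-to-run variation that makes small differences look like noise.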

Cheers,
-g

>...
> Linux: w/o USE_MS_GETLINE_HACK
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.186  0.190
> readlines_sizehint    0.108  0.110
> using_fileinput       0.447  0.450
> while_readline        0.184  0.180
> 
> Linux w/ USE_MS_GETLINE_HACK:
> kronos Python-2.0>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.178  0.180
> readlines_sizehint    0.108  0.110
> using_fileinput       0.434  0.430
> while_readline        0.183  0.190
> 
> Solaris w/o USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.640  0.630
> readlines_sizehint    0.278  0.280
> using_fileinput       1.874  1.820
> while_readline        0.839  0.840
> 
> Solaris w/ USE_MS_GETLINE_HACK:
> amarok src>./python ~/filetest.py
> total 1559913 chars and 32513 lines
> count_chars_lines     0.569  0.570
> readlines_sizehint    0.275  0.280
> using_fileinput       1.902  1.900
> while_readline        0.769  0.770

-- 
Greg Stein, http://www.lyra.org/



From thomas at xs4all.net  Mon Jan  8 22:59:17 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 22:59:17 +0100
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <20010108165758.B9260@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Mon, Jan 08, 2001 at 04:57:58PM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108214557.H402@xs4all.nl> <20010108165758.B9260@kronos.cnri.reston.va.us>
Message-ID: <20010108225916.P2467@xs4all.nl>

On Mon, Jan 08, 2001 at 04:57:58PM -0500, Andrew Kuchling wrote:

> Umm... the proposed PEP 229 patch would compile a Python binary with
> sre, posix, and strop statically linked; this minimal Python is then
> used to run the setup.py script.  You shouldn't require a preinstalled
> Python, though the current version of the patch doesn't meet this
> requirement yet.

Apologies. I should've bothered to read the PEP first, but I haven't found
the time yet :P I retract all my comments on the subject until I do.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Mon Jan  8 23:08:50 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 8 Jan 2001 23:08:50 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Tue, Jan 09, 2001 at 02:03:00AM +0200
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>, <200101081433.JAA03185@cj20424-a.reston1.va.home.com>, <20010107232532.V17220@lyra.org> <LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com> <20010108002603.X17220@lyra.org> <20010108231321.AB08FA82D@darjeeling.zadka.site.co.il> <200101081515.KAA03474@cj20424-a.reston1.va.home.com> <20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
Message-ID: <20010108230850.Q2467@xs4all.nl>

On Tue, Jan 09, 2001 at 02:03:00AM +0200, Moshe Zadka wrote:

> > (2) Under exactly what circumstances do you want from foo import *
> >     to issue a warning?

> All.
> If you want to be less extreme, don't warn if the module defines
> a __from_star_ok__

We already have a perfectly acceptable way of turning off warnings in
particular circumstances. I'm +1 on warning against using 'from spam import
*' by the way, though it would be even better (+2!) if there was a 'import *
considered harmful' page/chapter in the documentation somewhere, so we could
point to it.
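Assuming the mechanism meant here is the filter list of the standard warnings module, a sketch of silencing one particular warning could look like this. The warning text and the risky_import_star function are made up; no such warning is actually issued for 'import *'.

```python
import warnings


def risky_import_star():
    # Stand-in for the proposed warning; the message text is made up.
    warnings.warn("'from spam import *' is discouraged", UserWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    risky_import_star()                      # reported
    # Newer filters are consulted first, so this one wins from here on.
    warnings.filterwarnings("ignore", message=".*import \\*.*")
    risky_import_star()                      # silenced by the filter

print(len(caught))  # 1
```

With a filter like this available, a separate per-module opt-out marker may not be needed.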

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan  8 23:23:02 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 08 Jan 2001 17:23:02 -0500
Subject: [Python-Dev] Extending startup code: PEP needed?
In-Reply-To: Your message of "Mon, 08 Jan 2001 21:38:00 +0100."
             <3A5A2528.C289BE1D@lemburg.com> 
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>  
            <3A5A2528.C289BE1D@lemburg.com> 
Message-ID: <200101082223.RAA05858@cj20424-a.reston1.va.home.com>

> I was thinking of an attack where knowledge of common temporary
> execution locations is used to trick Python into executing
> untrusted code -- the untrusted code would only have to be
> copied to the known temporary execution directory and then
> gets executed by Python next time the program using the temporary
> location is invoked.

When does Python execute code from a predictable common temporary
location?  When is that likely to be used from a Python script running
as root?

Note that if you use tempfile.TemporaryFile(), you can create a
temporary file that's not subvertible.
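A short usage sketch of tempfile.TemporaryFile(), using the context-manager form found in current Python versions. The payload string is arbitrary; the point is that the caller only ever holds an open file object, not a guessable path.

```python
import tempfile

# TemporaryFile() hands back an open file object; on POSIX systems the
# directory entry is removed immediately (or never created), so there is
# no predictable path for another user to replace.
with tempfile.TemporaryFile(mode="w+") as scratch:
    scratch.write("print('trusted payload')\n")
    scratch.seek(0)
    contents = scratch.read()

print(contents, end="")
```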

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Mon Jan  8 23:35:17 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 8 Jan 2001 17:35:17 -0500 (EST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010108230850.Q2467@xs4all.nl>
References: <200101081515.KAA03474@cj20424-a.reston1.va.home.com>
	<200101081433.JAA03185@cj20424-a.reston1.va.home.com>
	<20010107232532.V17220@lyra.org>
	<LNBBLJKPBEHFEDALKOLCAEFAIHAA.tim.one@home.com>
	<20010108002603.X17220@lyra.org>
	<20010108231321.AB08FA82D@darjeeling.zadka.site.co.il>
	<20010109000300.DF2A5A82D@darjeeling.zadka.site.co.il>
	<20010108230850.Q2467@xs4all.nl>
Message-ID: <14938.16549.944123.917467@cj42289-a.reston1.va.home.com>

Thomas Wouters writes:
 > *' by the way, though it would be even better (+2!) if there was a 'import *
 > considered harmful' page/chapter in the documentation somewhere, so we could
 > point to it.

  Care to write it?


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From MarkH at ActiveState.com  Tue Jan  9 00:00:01 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Mon, 8 Jan 2001 15:00:01 -0800
Subject: [Python-Dev] Create a synthetic stdout for Windows?
In-Reply-To: <3A5A05DA.86B3EB86@interet.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>

> Limiting the code to pythonw.exe instead of trying to install
> it in python20.dll was supposed to prevent damage to the use
> of Python in servers.  Since pythonw.exe is a Windows (GUI) program,
> I am assuming there is a screen.

Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
console window.  pythonw is used in this case.  COM uses pythonw.exe in just
this way, and when executed by DCOM, it will be executed in a context where
the user can not see any such dialog.

However, I would be happy to ensure the correct command-line is used to
prevent this behaviour in this case.

Indeed, in _every_ case I use pythonw.exe I would disable this - but I
accept that other users have simpler requirements.

> Having said that, you may be right that there is some way to
> hang on a dialog box which can not be seen.  It depends on what
> MessageBox() and GetForegroundWindow() actually do.  If it seems
> that this patch has merit, I would be grateful if you would review
> the code to look for issues of this type.

There will be no issues in the code - it is just that Win2k will execute in
a different "workspace" (I think that is the term).  This is identical to
the problem of a service attempting to display a messagebox - the code is
perfect and works perfectly - just in a context where no one can see it, or
dismiss it.

> > I would prefer to see a decent API for extracting error and traceback
> > information from Python.  On the other hand, I _do_ see the problem for
> > "newbies" trying to use pythonw.exe.
>
> There could be an API added to the winstdout module such as
>   msg = winstdout.GetMessageText()
> which would return saved text, control its display etc.

I was thinking more of a "Py_GetTraceback()", which would return a complete
exception string.

Thus, embedders could write code similar to:

  whatever = Py_BuildValue(...);
  ret = PyObject_CallObject(foo, whatever);
  ...
  if (!ok) {
    char *text = Py_GetTraceback();
    MsgBox(text);
  }

Thus, with only a small amount of work, they have _complete_ control over
the output.  However, I agree this doesn't really solve pythonw.exe's
problems.
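For comparison, the Python-level analogue of "give me the complete exception string" can be sketched with the standard traceback module. Py_GetTraceback() above is a proposed C API, not an existing one, and call_and_capture below is just an illustrative helper name.

```python
import traceback


def call_and_capture(func, *args):
    """Call func; return (result, None) or (None, traceback_text)."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() returns the same text the interpreter would print.
        return None, traceback.format_exc()


result, text = call_and_capture(int, "not a number")
print(text)
```

The caller then decides what to do with the text: show it, log it, or throw it away.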

> I do not view winstdout as a "newbie" feature, but rather a
> generally useful C-language addition to Python.

Hrm.  I don't believe a commercial app, for example, would find this
suitable - they would roll their own solution.

Hence I see this purely for newbie users.  Advanced users have complete
control now - a simple try/except block around their main code, and you are
pretty good.  A builtin module for displaying a messagebox is as robust as
an experienced user needs to emulate this, IMO.

> I guess I am saying, perhaps incorrectly, that the mechanism provided
> will make further redirection of sys.stdout unnecessary 99% of the
> time.

Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
said, I see this as a feature so the newbie will not believe pythonw.exe is
broken.  Advanced users can already do similar things themselves.

> Experimentation shows that Python composes tracebacks and
> error messages a line or partial line at a time.  That is, you can
> not display each call to printf(), but must wait until the system is
> idle to be sure that multiple calls to printf() are complete.  So this
> forces you to use the idle processing loop, not rocket science but
> at least inconvenient.

What "idle processing loop"?

> And the only source of stdout/err is tracebacks,
> error messages and the "print" statement.  What would you do with
> these in a Windows program except display an "OK" dialog box?

Log the error to a file, and display a "friendly" dialog - possibly offering
to automatically submit a support request/bug report.
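A sketch of that pattern using the standard logging module: the full traceback goes to a file, and only a friendly message is returned for display. The function name run_guarded, the logger name, the log location, and the message wording are all assumptions made for illustration.

```python
import logging
import os
import tempfile


def run_guarded(main, log_path):
    """Run main(); on failure, log the traceback and return a friendly
    message suitable for a dialog box (the wording is made up)."""
    logger = logging.getLogger("app.errors")
    handler = logging.FileHandler(log_path)
    logger.addHandler(handler)
    try:
        main()
        return None
    except Exception:
        logger.exception("unhandled error")   # writes the full traceback
        return "Sorry, something went wrong. Details were saved to a log."
    finally:
        logger.removeHandler(handler)
        handler.close()


def broken_main():
    raise RuntimeError("boom")


log_path = os.path.join(tempfile.mkdtemp(), "errors.log")
message = run_guarded(broken_main, log_path)
print(message)
```

The user never sees the traceback, but it is still on disk for a support request or bug report.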

The casual user is going to be _very_ scared by a Python traceback.  This is
a sin of a similar magnitude to those crappy applications with unhandled VB
exceptions.

IMO, nothing looks more unprofessional than an app that displays an internal
VB error message.  Python is no different IMO.  For real applications, there
is a good chance that the majority of your users have never heard of Python.

Thus, I don't believe your solution suitable for the real, professional,
commercial user.  However, I agree that your solution does not prevent this
user doing the "right thing"...

But all this does keep me believing this is a "newbie" helper.

>
> If someone out there knows of a different example of sys.stdout
> redirection in use in the real world, it would be helpful if
> they would describe it.  Maybe it could be incorporated.

Sure.  Komodo to a file with a friendly dialog (sometimes ;-).

Pythonwin actually attempts a few things first - eg, not every exception
Pythonwin causes at startup should be logged.

Python services write unhandled errors to the event log.

I don't believe I have worked on 2 projects with the same requirement
here!!!

Mark.




From nas at arctrix.com  Mon Jan  8 17:22:10 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 8 Jan 2001 08:22:10 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 08, 2001 at 03:59:40PM -0500
References: <3A59F00E.53A0A32A@ActiveState.com> <LNBBLJKPBEHFEDALKOLCEEGKIHAA.tim.one@home.com>
Message-ID: <20010108082210.A16149@glacier.fnational.com>

On Mon, Jan 08, 2001 at 03:59:40PM -0500, Tim Peters wrote:
> My line-at-a-time test case used (rounding to nearest whole integers) 30
> seconds in Python and 6 in Perl.  The result of testing many changes to
> Python's implementation was that the excess 24 seconds broke down like so:
> 
>     17   spent inside internal MS threadsafe getc() lock/unlock
>              routines
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)
>      2   for not copying directly out of the platform FILE*
>              implementation struct in a highly optimized loop (like
>              Perl does)

Have you tried pymalloc?  

  Neil



From billtut at microsoft.com  Tue Jan  9 01:38:14 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Mon, 8 Jan 2001 16:38:14 -0800 
Subject: [Python-Dev] Create a synthetic stdout for Windows?
Message-ID: <58C671173DB6174A93E9ED88DCB0883D0A6202@red-msg-07.redmond.corp.microsoft.com>

> From: 	Mark Hammond [mailto:MarkH at ActiveState.com] 

> There will be no issues in the code - it is just that Win2k will execute in
> a different "workspace" (I think that is the term).  This is identical to
> the problem of a service attempting to display a messagebox - the code is
> perfect and works perfectly - just in a context where no one can see it, or
> dismiss it.


The term Mark is looking for here is "window station", and it's an NT thing,
not just a Win2k thing. Window stations have been around for ages.

Bill



From ping at lfw.org  Tue Jan  9 02:51:15 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 17:51:15 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <14938.6594.44596.509259@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Skip Montanaro wrote:
> Okay, how about this as a compromise first step?  Allow programmers to put
> __exports__ lists in their modules but don't do anything with them *except*
> modify dir() to respect that if it exists?

I'd say: Just have dir() and import * pay attention to __exports__.
Don't mess with getattr or __dict__.
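A sketch of the 'import *' half of that suggestion, as a plain function that computes which names a star-import would bind. __exports__ is the attribute proposed in this thread and star_import_names is a made-up helper; nothing here changes how import really behaves.

```python
import types


def star_import_names(module):
    """Names that 'from module import *' would bind if it honored the
    proposed __exports__ list (hypothetical attribute)."""
    exports = getattr(module, "__exports__", None)
    if exports is not None:
        return list(exports)
    # Without a declaration: every name without a leading underscore.
    return [name for name in vars(module) if not name.startswith("_")]


demo = types.ModuleType("demo")
demo.public = 1
demo.also_public = 2
demo._private = 3
print(star_import_names(demo))   # ['public', 'also_public']
demo.__exports__ = ["public"]
print(star_import_names(demo))   # ['public']
# Plain attribute access is untouched, as suggested above.
print(demo._private)             # 3
```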


-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 




From ping at lfw.org  Tue Jan  9 03:00:08 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 8 Jan 2001 18:00:08 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A59F27D.C27B8CD0@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101081751530.5156-100000@skuld.kingmanhall.org>

On Mon, 8 Jan 2001, Paul Prescod wrote:
> dir() is one of the "interactive tools" I'd like to work better in the
> presence of __exports__. On the other hand, dir() works pretty poorly
> for object instances today so maybe we need something new anyhow. 

I suggest a built-in function "methods()" that works like this:

    from types import InstanceType, MethodType, BuiltinMethodType

    def methods(obj):
        if type(obj) is InstanceType: return methods(obj.__class__)
        results = []
        if hasattr(obj, '__bases__'):
            for base in obj.__bases__:
                results.extend(methods(base))
        results.extend(
            filter(lambda k, o=obj: type(getattr(o, k)) in
                   [MethodType, BuiltinMethodType], dir(obj)))
        return unique(results)

    def unique(seq):
        dict = {}
        for item in seq: dict[item] = 1
        results = dict.keys()
        results.sort()
        return results


    >>> import sys
    >>> 
    >>> methods(sys.stdin)
    ['close', 'fileno', 'flush', 'isatty', 'read', 'readinto', 'readline', 'readlines', 'seek', 'tell', 'truncate', 'write', 'writelines']
    >>>        
    >>> import SocketServer
    >>> 
    >>> methods(SocketServer.ForkingTCPServer)
    ['__init__', 'collect_children', 'fileno', 'finish_request', 'get_request', 'handle_error', 'handle_request', 'process_request', 'serve_forever', 'server_activate', 'server_bind', 'verify_request']
    >>> 
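
For readers on present-day Python, here is a minimal sketch of the same idea. It assumes Python 3, where InstanceType no longer exists, and leans on the standard inspect module rather than walking __bases__ by hand. The name methods() and the include_dunder flag are hypothetical; no such built-in exists.

```python
import inspect


def methods(obj, include_dunder=False):
    """Return a sorted list of method names defined on obj's class and its bases.

    Hypothetical helper for illustration only; not a real built-in.
    """
    # For an instance, look at its class, as the original sketch did.
    cls = obj if inspect.isclass(obj) else type(obj)
    # getmembers() already follows the inheritance chain via dir().
    names = [name for name, _ in inspect.getmembers(cls, inspect.isroutine)]
    if not include_dunder:
        names = [n for n in names
                 if not (n.startswith("__") and n.endswith("__"))]
    return sorted(names)


class Base:
    def ping(self):
        return "ping"


class Derived(Base):
    def pong(self):
        return "pong"


print(methods(Derived()))
```

Special methods are filtered out by default because every Python 3 class inherits a long list of them from object; pass include_dunder=True to see them.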



-- ?!ng

Happiness comes more from loving than being loved; and often when our
affection seems wounded it is only our vanity bleeding. To love, and
to be hurt often, and to love again--this is the brave and happy life.
    -- J. E. Buchrose 




From gstein at lyra.org  Tue Jan  9 03:20:56 2001
From: gstein at lyra.org (Greg Stein)
Date: Mon, 8 Jan 2001 18:20:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects fileobject.c,2.102,2.103
In-Reply-To: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 08, 2001 at 06:00:13PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010108182056.C4640@lyra.org>

On Mon, Jan 08, 2001 at 06:00:13PM -0800, Guido van Rossum wrote:
>...
> Modified Files:
> 	fileobject.c 
> Log Message:
> Tsk, tsk, tsk.  Treat FreeBSD the same as the other BSDs when defining
> a fallback for TELL64.  Fixes SF Bug #128119.
>...
> *** fileobject.c	2001/01/08 04:02:07	2.102
> --- fileobject.c	2001/01/09 02:00:11	2.103
> ***************
> *** 59,63 ****
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */
> --- 59,63 ----
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
>   /* NOTE: this is only used on older
>      NetBSD prior to f*o() funcions */

All of those #ifdefs could be tossed and it would be more robust (long term)
if an autoconf macro were used to specify when TELL64 should be defined.

[ I've looked thru fileobject.c and am a bit confused: the conditions for
  defining TELL64 do not match the conditions for *using* it. that would
  seem to imply a semantic error somewhere and/or a potential gotcha when
  they get skewed (like I assume what happened to FreeBSD). simplifying with
  an autoconf macro may help to rationalize it. ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Tue Jan  9 05:29:02 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:29:02 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108161534.A2392@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>

[Andrew Kuchling]

I'll chop everything except while_readline (which is most affected by this
stuff):

> Linux: w/o USE_MS_GETLINE_HACK
> while_readline        0.184  0.180
>
> Linux w/ USE_MS_GETLINE_HACK:
> while_readline        0.183  0.190
>
> Solaris w/o USE_MS_GETLINE_HACK:
> while_readline        0.839  0.840
>
> Solaris w/ USE_MS_GETLINE_HACK:
> while_readline        0.769  0.770

So it's probably a wash.  In that case, do we want to maintain two hacks for
this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
Windows" approach probably works everywhere (although its speed relies on
the platform factoring out at least the locking/unlocking in fgets).

Both methods lack a refinement I would like to see, but can't achieve in
"the Windows way":  ensure that consistency is on no worse than a per-line
basis.  Right now, both methods lock/unlock the file only for the extent of
the current buffer size, so that two threads *can* get back different
interleaved pieces of a single long line.  Like so:

import thread

def read(f):
    x = f.readline()
    print "thread saw " + `len(x)` + " chars"
    m.release()

f = open("ga", "w") # a file with one long line
f.write("x" * 100000 + "\n")
f.close()

m = thread.allocate_lock()
for i in range(10):
    print i
    f = open("ga", "r")
    m.acquire()
    thread.start_new_thread(read, (f,))
    x = f.readline()
    print "main saw " + `len(x)` + " chars"
    m.acquire(); m.release()
    f.close()

Here's a typical run on Windows (current CVS Python):

0
main saw 95439 chars
thread saw 4562 chars
1
main saw 97941 chars
thread saw 2060 chars
2
thread saw 43801 chars
main saw 56200 chars
3
thread saw 8011 chars
main saw 91990 chars
4
main saw 46546 chars
thread saw 53455 chars
5
thread saw 53125 chars
main saw 46876 chars
6
main saw 98638 chars
thread saw 1363 chars
7
main saw 72121 chars
thread saw 27880 chars
8
thread saw 70031 chars
main saw 29970 chars
9
thread saw 27555 chars
main saw 72446 chars

So, yes, it's threadsafe now:  between them, the threads always see a grand
total of 100001 characters.  But what friggin' good is that <wink>?  If,
e.g., Guido wants multiple threads to chew over his giant logfile, there's
no guarantee that .readline() ever returns an actual line from the file.

Not that Python 2.0 was any better in this respect ...




From tim.one at home.com  Tue Jan  9 05:48:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:48:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <20010108082210.A16149@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIAIHAA.tim.one@home.com>

[Tim]
>      5   uncertain, but evidence suggests much of it due to MS
>              malloc/realloc (Perl does its own memory mgmt)

[NeilS]
> Have you tried pymalloc?

Not recently, and don't expect to find time for it this week.  IIRC,
Vladimir did get significant speedups-- lo those many years ago! --when he
tried it on Windows, though.  Maybe (or maybe not) that was due to
exploiting the global lock (i.e., exploiting that pymalloc didn't need to do
its own serialization, when called from the Python core).




From tim.one at home.com  Tue Jan  9 05:52:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 8 Jan 2001 23:52:25 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEIAIHAA.tim.one@home.com>

[Tim]
> ...
> Here's a typical run on Windows (current CVS Python):
>
> 0
> main saw 95439 chars
> thread saw 4562 chars
> 1
> main saw 97941 chars
> thread saw 2060 chars
> 2
> thread saw 43801 chars
> main saw 56200 chars
> 3
> thread saw 8011 chars
> main saw 91990 chars
> 4
> main saw 46546 chars
> thread saw 53455 chars
> 5
> thread saw 53125 chars
> main saw 46876 chars
> 6
> main saw 98638 chars
> thread saw 1363 chars
> 7
> main saw 72121 chars
> thread saw 27880 chars
> 8
> thread saw 70031 chars
> main saw 29970 chars
> 9
> thread saw 27555 chars
> main saw 72446 chars

Oops!  I lied.  That was the released 2.0.  Current CVS is either better or
worse, depending on whether you think "working" by accident more often is a
good thing or leads to false confidence <wink>:

0
main saw 100001 chars
thread saw 0 chars
1
main saw 100001 chars
thread saw 0 chars
2
main saw 100001 chars
thread saw 0 chars
3
main saw 100001 chars
thread saw 0 chars
4
main saw 100001 chars
thread saw 0 chars
5
thread saw 25802 chars
main saw 74199 chars
6
thread saw 802 chars
main saw 99199 chars
7
main saw 100001 chars
thread saw 0 chars
8
main saw 100001 chars
thread saw 0 chars
9
main saw 100001 chars
thread saw 0 chars




From mal at lemburg.com  Tue Jan  9 08:23:42 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 09 Jan 2001 08:23:42 +0100
Subject: [Python-Dev] Extending startup code: PEP needed?
References: <200101081751.SAA08918@pandora.informatik.hu-berlin.de> <3A5A02AA.675A35D1@lemburg.com> <200101081833.NAA05325@cj20424-a.reston1.va.home.com> <3A5A09A5.D0DC33A1@lemburg.com> <200101081936.OAA05440@cj20424-a.reston1.va.home.com>  
	            <3A5A2528.C289BE1D@lemburg.com> <200101082223.RAA05858@cj20424-a.reston1.va.home.com>
Message-ID: <3A5ABC7E.E953962B@lemburg.com>

Guido van Rossum wrote:
> 
> > I was thinking an attack where knowledge of common temporary
> > execution locations is used to trick Python into executing
> > untrusted code -- the untrusted code would only have to be
> > copied to the known temporary execution directory and then
> > gets executed by Python next time the program using the temporary
> > location is invoked.
> 
> When does Python execute code from a predictable common temporary
> location?  When is that likely to be used from a Python script running
> as root?
> 
> Note that if you use tempfile.TemporaryFile(), you can create a
> temporary file that's not subvertible.

It's not Python itself that's running temporary files. Tools
like distutils, RPM, etc. tend to run Python code in temporary
locations during build stages. That's what I was thinking about.
OTOH, root should know where these tools run their code, so
I guess it's moot to discuss whose fault this really is, e.g.
distutils style distributions should never be unzipped to /tmp
for subsequent installation, but nobody will prevent root
from doing so.
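
A minimal sketch of the safer habit, assuming present-day Python 3: unpack and build in a freshly created private directory instead of a predictable path under /tmp. tempfile.mkdtemp() is a real standard-library call; the helper name make_private_build_dir and the "build-" prefix are made up for illustration.

```python
import os
import stat
import tempfile


def make_private_build_dir(prefix="build-"):
    """Create a new directory with an unpredictable name.

    On POSIX systems mkdtemp() creates it readable, writable and
    searchable only by the creating user.
    """
    return tempfile.mkdtemp(prefix=prefix)


build_dir = make_private_build_dir()
was_dir = os.path.isdir(build_dir)
mode = stat.S_IMODE(os.stat(build_dir).st_mode)
print(build_dir, oct(mode))
os.rmdir(build_dir)  # clean up the empty demo directory
```

This does nothing about an administrator who unpacks into a shared location by hand, which is the case described above.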

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Tue Jan  9 08:35:09 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 02:35:09 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101031237.HAA19244@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIHAA.tim.one@home.com>

[Guido]
> Are you sure Perl still uses stdio at all?

I've got solid answers now, but I'll paraphrase them anonymously to save the
bother of untangling multi-person email etiquette snarls:

+ Yes, Perl uses platform stdio.  Usually.  Yes on Windows anyway.

+ But Perl "cheats" on Windows (well, everywhere it can ...), as I've
explained in great detail half a dozen times over the years.  No reason to
retract any of that.

+ The cheating is not thread-safe.

+ The last stab at threads accessible from Perl was an experiment that got
dropped.  There are no user-muckable threads in std Perl builds.

+ But there is a notion of threads available at the C level.

+ This latter notion of threads is used to implement Perl's fork() on
Windows, so can be exploited to test Windows Perl thread safety without
writing a Perl extension module in C.

+ This Perl program (very much like the 2-threaded one I just posted for
Python) uses that trick:

-------------------------------------------------------------------
sub counter {
    my $nc = 0;
    while (<FILE>) {
        $nc += length;
    }
    print "num bytes seen = $nc\n";
}

open(FILE, "ga");
binmode FILE;

fork();
&counter();
-------------------------------------------------------------------

Under the covers, that really shares the FILE filehandle on Windows via
threads.  Running it multiple times yields multiple wild results; the number
of bytes seen by parent and child rarely sum to the number of bytes actually
in the input file ("ga").  The most common output for me is that one thread
sees the entire file, while the other sees "a lot" of it (since the Perl
inner loop registerizes its FILE* struct member shadows for as long as
possible, that's actually what I expected).

So the code is exactly as thread-unsafe as it looked.

bosses-demand-answers-but-they-forget-their-questions<wink>-ly
    y'rs  - tim




From guido at python.org  Tue Jan  9 14:41:24 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 09 Jan 2001 08:41:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 08 Jan 2001 23:29:02 EST."
             <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEHPIHAA.tim.one@home.com> 
Message-ID: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>

> So it's probably a wash.  In that case, do we want to maintain two hacks for
> this?  I can't use the FLOCKFILE/etc approach on Windows, while "the
> Windows" approach probably works everywhere (although its speed relies on
> the platform factoring out at least the locking/unlocking in fgets).

I'm much more confident about the getc_unlocked() approach than about
fgets() -- with the latter we need much more faith in the C library
implementers.  (E.g. that fgets() never writes beyond the null bytes
it promises, and that it locks/unlocks only once.)  Also, you're
relying on blindingly fast memchr() and memset() implementations.

> Both methods lack a refinement I would like to see, but can't achieve in
> "the Windows way":  ensure that consistency is on no worse than a per-line
> basis.  [Example omitted]

The only portable way to ensure this that I can see, is to have a
separate mutex in the Python file object.  Since this is hardly a
common thing to do, I think it's better to let the application manage
that lock if they need it.

(Then why are we bothering with flockfile(), you may ask?  Because
otherwise, accidental multithreaded reading from the same file could
cause core dumps.)
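
A minimal sketch of what "let the application manage that lock" could look like, assuming present-day Python 3 and its threading module. The class LockedLineReader and the helper read_all_lines are hypothetical names; nothing like them is part of the file object.

```python
import io
import threading


class LockedLineReader:
    """Wrap a file object so each readline() call is atomic across threads."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Hold the lock for the whole call, so no other thread can
        # pick up a fragment of the same line.
        with self._lock:
            return self._file.readline()


def read_all_lines(reader, n_threads=4):
    """Drain reader from several threads; return every line that was seen."""
    seen = []

    def worker():
        while True:
            line = reader.readline()
            if not line:
                break
            seen.append(line)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


source = io.StringIO("".join("x" * 1000 + "\n" for _ in range(200)))
lines = read_all_lines(LockedLineReader(source))
print(len(lines), all(len(line) == 1001 for line in lines))
```

The lock lives in the application-level wrapper, so code that never shares a file between threads pays nothing for it.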

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan  9 16:48:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 9 Jan 2001 10:48:13 -0500
Subject: [Python-Dev] Python 2.1 release schedule (PEP 226)
In-Reply-To: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 05, 2001 at 10:29:05AM -0500
References: <200101051529.KAA19100@cj20424-a.reston1.va.home.com>
Message-ID: <20010109104813.D6203@kronos.cnri.reston.va.us>

On Fri, Jan 05, 2001 at 10:29:05AM -0500, Guido van Rossum wrote:
> S   222  pep-0222.txt  Web Library Enhancements               Kuchling
>
>	  This is really up to Andrew.  It seems he plans to create
>	  new modules, so he won't be introducing incompatibilities in
>	  existing APIs.

I don't think PEP 222 will be worked on for 2.1; there have only been
a few reactions, and none at all on the python-web-modules mailing
list, so I don't think anyone really cares very much at this point.
Maybe for 2.2, or maybe I'll just write new classes for Quixote.

That leaves PEP 229 as the only PEP I need to work on for 2.1.

--amk



From tim.one at home.com  Tue Jan  9 22:12:42 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 16:12:42 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101091341.IAA09132@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>

[Guido]
> I'm much more confident about the getc_unlocked() approach than about
> fgets() -- with the latter we need much more faith in the C library
> implementers.  (E.g. that fgets() never writes beyond the null bytes
> it promises, and that it locks/unlocks only once.)  Also, you're
> relying on blindingly fast memchr() and memset() implementations.

Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
bit quicker on Solaris, despite that it's paying an extra layer of function
call per line, to keep it out of get_line proper).  That tells me the
assumptions are indeed mild.  The business about not writing beyond the null
byte is a concern only I would have raised:  the possibility is an
aggressively paranoid reading of the std (I do *lots* of things with libc
I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
things, it's hard to imagine any other vendor exploding ...

Still, I'd rather get rid of ms_getline_hack if I could, because the code is
so much more complicated.

>> Both methods lack a refinement I would like to see, but can't
>> achieve in "the Windows way":  ensure that consistency is on no
>> worse than a per-line basis.  [Example omitted]

> The only portable way to ensure this that I can see, is to have a
> separate mutex in the Python file object.  Since this is hardly a
> common thing to do, I think it's better to let the application manage
> that lock if they need it.

Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
file locked until the line was complete, and I wouldn't be opposed to making
life saner on platforms that allow it.  But there's another problem here:
part of the reason we release Python threads around the fgets is in case
some other thread is trying to write the data we're trying to read, yes?
But since FLOCKFILE is in effect, other threads *trying* to write to the
stream we're reading will get blocked anyway.  Seems to give us potential
for deadlocks.

> (Then why are we bothering with flockfile(), you may ask?

I wouldn't ask that, no <wink>.

> Because otherwise, accidental multithreaded reading from the same
> file could cause core dumps.)

Ugh ... turns out that on my box I can provoke core dumps anyway, with this
program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
anything new):

import thread

def read(f):
    import time
    time.sleep(.01)
    n = 0
    while n < 1000000:
        x = f.readline()
        n += len(x)
        print "r",
    print "read " + `n`
    m.release()

m = thread.allocate_lock()
f = open("ga", "w+")
print "opened"
m.acquire()
thread.start_new_thread(read, (f,))
n = 0
x = "x" * 113 + "\n"
while n < 1000000:
    f.write(x)
    print "w",
    n += len(x)
m.acquire()
print "done"

Typical run:

C:\Python20>\code\python\dist\src\pcbuild\python temp.py
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w
w w w w w w w w w w w w w w w w w w w w w w w w w w w w w w r w r
w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w r w
r r w r w r w r w r w r

and then it dies in msvcrt.dll with a bad pointer.  Also dies under the
debugger (yay!) ... always dies like so:

+ We (Python) call the MS fwrite, from fileobject.c file_write.
+ MS fwrite succeeds with its _lock_str(stream) call.
+ MS fwrite then calls MS _fwrite_lk.
+ MS _fwrite_lk calls memcpy, which blows up for a non-obvious reason.

Looks like the stream's _cnt member has gone mildly negative, which
_fwrite_lk casts to unsigned and so treats like a giant positive count, and
so memcpy eventually runs off the end of the process address space.
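
To put numbers on "giant positive count", here is a small arithmetic sketch. It assumes the count is reinterpreted as a 32-bit unsigned value, which is an assumption about the Win32 C runtime rather than something stated above; the helper name as_unsigned32 is made up.

```python
def as_unsigned32(n):
    """Reinterpret a signed integer the way a C cast to a 32-bit unsigned type would."""
    return n & 0xFFFFFFFF


for cnt in (-1, -100):
    # A mildly negative count becomes a value just under 2**32.
    print(cnt, "->", as_unsigned32(cnt))
```

Under that assumption, a count of -1 turns into 4294967295, so a copy loop driven by it would try to move roughly 4 GB.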

Only thing I can conclude from this is that MS's internal stream-locking
implementation is buggy.  At least on W98SE.  Other flavors of Windows?
Other platforms?

Note that I don't claim the program above is *sensible*, just that it
shouldn't blow up.  Alas, short of indeed adding a separate mutex in Python
file objects-- or writing our own stdio --I don't believe I can fix this.

the-best-thing-to-do-with-threads-is-don't-ly y'rs  - tim




From fdrake at acm.org  Tue Jan  9 23:58:49 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 9 Jan 2001 17:58:49 -0500 (EST)
Subject: [Python-Dev] Updated development documentation
Message-ID: <14939.38825.218757.535010@cj42289-a.reston1.va.home.com>

  I've just updated the development version of the documentation, but
am not sure the automated notice got sent.
  This version contains a wide variety of smaller updates, plus added
documentation on the fpectl and xreadlines modules.


        http://python.sourceforge.net/devel-docs/


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From MarkH at ActiveState.com  Wed Jan 10 01:00:03 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Tue, 9 Jan 2001 16:00:03 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>

> Only thing I can conclude from this is that MS's internal stream-locking
> implementation is buggy.  At least on W98SE.  Other flavors of Windows?
> Other platforms?

Same behaviour on Win2k for me.

Mark.



From tim.one at home.com  Wed Jan 10 01:55:11 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 9 Jan 2001 19:55:11 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEHPIGAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com>

Final report (I've spent way more time on this than I can afford already, so
it's "final" by defn <0.3 wink>).  We started here (on my Win98SE box, using
Guido's test program):

total 117615824 chars and 3237568 lines
count_chars_lines    14.780 14.772
readlines_sizehint    9.390  9.375
using_fileinput      66.130 66.157
while_readline       30.380 30.337

Here's where we are today:

total 117615824 chars and 3237568 lines
count_chars_lines    14.670 14.667
readlines_sizehint    9.500  9.506
using_fileinput      28.670 28.708
while_readline       13.680 13.676
for_xreadlines        7.630  7.635

Same box, same input file, same test program except for this addition:

def for_xreadlines(fn):
    f = open(fn, MODE)
    for line in xreadlines.xreadlines(f):
        pass
    f.close()

This last is within 25% of Perl "while (<>)" speed, but-- unlike Perl --is
thread-safe.  Good show!  The other speedups are nothing to snort at either.

The strangest thing left to my eye is why xreadlines enjoys a significant
advantage over the double-loop buffering method (readlines_sizehint) on my
box; reducing the very large (1Mb) buffer in Guido's test program made no
material difference to that.
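
Guido's test program is not reproduced in this thread, so here is a minimal sketch of the two approaches being compared, assuming present-day Python 3. The function names are hypothetical. The xreadlines module no longer exists in Python 3; iterating over the file object directly plays its role.

```python
import os
import tempfile


def count_lines_sizehint(path, bufsize=64 * 1024):
    """Double-loop buffering: read batches of lines, then loop over each batch."""
    n = 0
    with open(path, "rb") as f:
        while True:
            lines = f.readlines(bufsize)  # roughly bufsize bytes of whole lines
            if not lines:
                break
            for line in lines:
                n += 1
    return n


def count_lines_iter(path):
    """Single loop: let the file object hand out one line at a time."""
    n = 0
    with open(path, "rb") as f:
        for line in f:
            n += 1
    return n


# Small self-contained demo on a throwaway file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"some text\n" * 1000)
n_sizehint = count_lines_sizehint(demo_path, bufsize=256)
n_iter = count_lines_iter(demo_path)
os.remove(demo_path)
print(n_sizehint, n_iter)
```

The sketch only shows the shape of the two loops; it makes no claim about their relative speed on any platform.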

nothing's-ever-finished-but-everything-ends-ly y'rs  - tim




From tim.one at home.com  Wed Jan 10 06:46:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 00:46:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPOEFJCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKELFIHAA.tim.one@home.com>

[Tim]
> Only thing I can conclude from this is that MS's internal stream-
> locking implementation is buggy.  At least on W98SE.  Other flavors
> of Windows?  Other platforms?

[Mark Hammond]
> Same behaviour on Win2k for me.

Thanks, Mark!  I opened a bug on SF to record more clues:

http://sourceforge.net/bugs/?func=detailbug&bug_id=128210&group_id=5470

I didn't assign it to anyone because-- best I can tell --there's nothing
realistic we can do about it.  Probably won't happen in practice anyway
<wink>.

there's-a-reason-thread-problems-pop-up-on-windows-first-but-
    ms-isn't-it-ly y'rs  - tim




From billtut at microsoft.com  Wed Jan 10 10:10:51 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Wed, 10 Jan 2001 01:10:51 -0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com>

With a nice simple C test case from Tim, I've submitted this one to internal
support.
I'll let everybody know what happens when I know more.

Bill




From m.favas at per.dem.csiro.au  Wed Jan 10 12:57:56 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Wed, 10 Jan 2001 19:57:56 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5C4E44.23B593E9@per.dem.csiro.au>

Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
behaviour as Tim's WinBox wrt the new xreadlines and the double-loop
readlines (so it's not just something funny with MS (not that there's
not anything funny with MS...)):

total 131426612 chars and 514216 lines
count_chars_lines     5.450  5.066
readlines_sizehint    4.112  4.083
using_fileinput      10.928 10.916
while_readline       11.766 11.733
for_xreadlines        3.569  3.533

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From tismer at tismer.com  Wed Jan 10 12:06:42 2001
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 10 Jan 2001 13:06:42 +0200
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101081749580.5156-100000@skuld.kingmanhall.org>
Message-ID: <3A5C4242.E445C3A1@tismer.com>


Ka-Ping Yee wrote:
> 
> On Mon, 8 Jan 2001, Skip Montanaro wrote:
> > Okay, how about this as a compromise first step?  Allow programmers to put
> > __exports__ lists in their modules but don't do anything with them *except*
> > modify dir() to respect that if it exists?
> 
> I'd say: Just have dir() and import * pay attention to __exports__.
> Don't mess with getattr or __dict__.

quadruple-nodd - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From mal at lemburg.com  Wed Jan 10 14:21:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 14:21:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C61D8.2E5D098C@lemburg.com>

Guido van Rossum wrote:
> 
> Please have a look at this SF patch:
> 
> http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> 
> This implements control over which names defined in a module are
> externally visible: if there's a variable __exports__ in the module,
> it is a list of identifiers, and any access from outside the module to
> names not in the list is disallowed.  This affects access using the
> getattr and setattr protocols (which raise AttributeError for
> disallowed names), as well as "from M import v" (which raises
> ImportError).

Can't we use the existing attribute __all__ (this is currently
only used for packages) for this kind of thing? As others have already
remarked: I would rather like to see this attribute being used
as basis for 'from M import *' rather than enforce the access
restrictions like the patch suggests.

Access control mechanisms should be treated in different ways
such as wrapping objects using access-control proxies (see mx.Proxy
for an example of such an implementation) and on-demand only.
I wouldn't want to pay the performance hit for each and every
lookup in all my Python applications just because someone out
there feels that "from M import *" has a meaning in life
apart from being useful in interactive sessions to ease typing ;-)
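
A minimal sketch of that suggestion as it behaves in present-day Python 3, where __all__ on a plain module does govern "from M import *" without restricting ordinary attribute access. The module name exports_demo and the functions in it are made up for the demonstration.

```python
import sys
import types

# Build a throwaway module in memory and register it so it can be imported.
demo = types.ModuleType("exports_demo")
exec(
    "__all__ = ['public']\n"
    "def public(): return 'public'\n"
    "def helper(): return 'helper'\n",
    demo.__dict__,
)
sys.modules["exports_demo"] = demo

# 'import *' copies only the names listed in __all__ ...
namespace = {}
exec("from exports_demo import *", namespace)
star_names = sorted(k for k in namespace if not k.startswith("__"))
print(star_names)

# ... while explicit attribute access is left alone.
print(demo.helper())
```

No per-lookup cost is involved: the list is consulted only when the star import runs.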
 
> I like it.  This has been asked for many times.  Does anybody see a
> reason why this should *not* be added?
> 
> Tim remarked that introducing this will prompt demands for a similar
> feature on classes and instances, where it will be hard to implement
> without causing a bit of a slowdown.  It causes a slight slowdown (an
> extra dictionary lookup for each use of "M.v") even when it is not
> used, but for accessing module variables that's acceptable.  I'm not
> so sure about instance variable references.

Again, I'd rather see these implemented using different
techniques which are under programmer control and made
explicit and visible in the program flow. Proxies are ideal
for these things, since they allow great flexibility while
still providing reasonable security at Python level.
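
A minimal pure-Python sketch of the proxy idea, assuming present-day Python 3. It is not the mx.Proxy API; the names AccessProxy and Account are hypothetical. A Python-level wrapper like this makes the permitted interface explicit, but it is a convention rather than real protection, since the wrapped object is still reachable through the proxy's own _target attribute.

```python
class AccessProxy:
    """Forward only a whitelisted set of public attributes to a wrapped object."""

    def __init__(self, target, allowed):
        self._target = target
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # Called only when normal lookup on the proxy itself fails.
        if name.startswith("_") or name not in self._allowed:
            raise AttributeError(f"access to {name!r} is not permitted")
        return getattr(self._target, name)


class Account:
    def __init__(self):
        self.balance = 10
        self.pin = "1234"

    def deposit(self, amount):
        self.balance += amount
        return self.balance


proxy = AccessProxy(Account(), allowed=["deposit", "balance"])
print(proxy.deposit(5))
try:
    proxy.pin
except AttributeError as exc:
    print("blocked:", exc)
```

Only code that goes through the proxy pays for the check; direct users of Account see no slowdown, which is the "on-demand only" property argued for above.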

I have been using the proxy approach for years now and 
so far with great success. What's even better is that
weak references and garbage finalization aids come along with
it for free.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Wed Jan 10 16:12:56 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 10:12:56 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 19:55:11 EST."
             <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEKMIHAA.tim.one@home.com> 
Message-ID: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>

> The strangest thing left to my eye is why xreadlines enjoys a significant
> advantage over the double-loop buffering method (readlines_sizehint) on my
> box; reducing the very large (1Mb) buffer in Guido's test program made no
> material difference to that.

I was baffled at this too (same difference on my box), until I
discovered that the buffer size is specified *twice*: once as a
default in the arg list of readlines_sizehint(), then *again* in the
call to timer() near the bottom of the file.

Take the latter one out and the times are comparable, in fact
readlines_sizehint() is a few percent quicker.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jim at interet.com  Wed Jan 10 16:19:01 2001
From: jim at interet.com (James C. Ahlstrom)
Date: Wed, 10 Jan 2001 10:19:01 -0500
Subject: [Python-Dev] Create a synthetic stdout for Windows?
References: <LCEPIIGDJPKCOIHOBJEPEEEGCOAA.MarkH@ActiveState.com>
Message-ID: <3A5C7D65.780065C6@interet.com>

Mark Hammond wrote:

> Sometimes _no_ screen at all is wanted - ie, no main GUI window, and no
> console window.  pythonw is used in this case.  COM uses pythonw.exe in just
> this way, and when executed by DCOM, it will be executed in a context where
> the user can not see any such dialog.
> 
> However, I would be happy to ensure the correct command-line is used to
> prevent this behaviour in this case.
> 
> Indeed, in _every_ case I use pythonw.exe I would disable this - but I
> accept that other users have simpler requirements.

It would be easier to have a pythonw2.exe where this feature is
built in, rather than a command line option.  But see below.
 
> > I do not view winstdout as a "newbie" feature, but rather a
> > generally useful C-language addition to Python.
> 
> Hrm.  I don't believe a commercial app, for example, would find this
> suitable - they would roll their own solution.
...
> > I guess I am saying, perhaps incorrectly, that the mechanism provided
> > will make further redirection of sys.stdout unnecessary 99% of the
> > time.
> 
> Yes, I disagree here.  IMO it is no good for a commercial, real app.  As I
...
> > If someone out there knows of a different example of sys.stdout
> > redirection in use in the real world, it would be helpful if
> > they would describe it.  Maybe it could be incorporated.
> 
> Sure.  Komodo to a file with a friendly dialog (sometimes ;-).
...
> I don't believe I have worked on 2 projects with the same requirement
> here!!!

Well, that is the problem.  Is this feature "generally useful"?
I am writing Windows programs in which Python is the "main"
and provides the GUI, so I find this useful.  And I do show
my users tracebacks.  But perhaps this is unique to me.  I
don't see users of wxPython nor tkinter replying "great idea"
so maybe they don't use pythonw.

Absent more support, I don't think this idea has enough
merit to justify a patch.
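
For readers following along, the kind of sys.stdout redirection being
discussed can be sketched in plain Python.  This is only an illustration
and not the winstdout C patch itself; LogStdout and
run_with_captured_stdout are hypothetical names invented for the sketch.

```python
import io
import sys


class LogStdout:
    """Hypothetical file-like stand-in that collects writes in memory."""

    def __init__(self):
        self._buf = io.StringIO()

    def write(self, text):
        return self._buf.write(text)

    def flush(self):
        # Nothing to flush for an in-memory buffer.
        pass

    def getvalue(self):
        return self._buf.getvalue()


def run_with_captured_stdout(func):
    """Call func() with sys.stdout temporarily replaced, return the output."""
    saved = sys.stdout
    capture = LogStdout()
    sys.stdout = capture
    try:
        func()
    finally:
        # Always restore the original stream, even if func() raises.
        sys.stdout = saved
    return capture.getvalue()


def demo():
    print("hello from a GUI-only process")


captured = run_with_captured_stdout(demo)
```

An application could then show the captured text in a dialog or append it
to a log file, which is where the requirements seem to differ per project.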

JimA



From guido at python.org  Wed Jan 10 17:39:34 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:39:34 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 01:10:51 PST."
             <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com> 
References: <58C671173DB6174A93E9ED88DCB0883DB863A8@red-msg-07.redmond.corp.microsoft.com> 
Message-ID: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>

> With a nice simple C test case from Tim, I've submitted this one to internal
> support.
> I'll let everybody know what happens when I know more.

I bet you it's rejected on the basis of "the docs tell you not to mix
reading and writing on the same stream without intervening seek or
flush."  If I were on the support line I would do that.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan 10 17:38:16 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:38:16 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Tue, 09 Jan 2001 16:12:42 EST."
             <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEKAIHAA.tim.one@home.com> 
Message-ID: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>

> [Guido]
> > I'm much more confident about the getc_unlocked() approach than about
> > fgets() -- with the latter we need much more faith in the C library
> > implementers.  (E.g. that fgets() never writes beyond the null bytes
> > it promises, and that it locks/unlocks only once.)  Also, you're
> > relying on blindingly fast memchr() and memset() implementations.

[Tim]
> Yet Andrew's timings say it's a wash on Linux and Solaris (perhaps even a
> bit quicker on Solaris, despite that it's paying an extra layer of function
> call per line, to keep it out of get_line proper).  That tells me the
> assumptions are indeed mild.  The business about not writing beyond the null
> byte is a concern only I would have raised:  the possibility is an
> aggressively paranoid reading of the std (I do *lots* of things with libc
> I'm paranoid about <0.9 wink>).  If even *Microsoft* didn't blow these
> things, it's hard to imagine any other vendor exploding ...
> 
> Still, I'd rather get rid of ms_getline_hack if I could, because the code is
> so much more complicated.

Which is another argument to prefer the getc_unlocked() code when it
works -- it's obviously correct. :-)

> >> Both methods lack a refinement I would like to see, but can't
> >> achieve in "the Windows way":  ensure that consistency is on no
> >> worse than a per-line basis.  [Example omitted]
> 
> > The only portable way to ensure this that I can see, is to have a
> > separate mutex in the Python file object.  Since this is hardly a
> > common thing to do, I think it's better to let the application manage
> > that lock if they need it.
> 
> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method to keep the
> file locked until the line was complete, and I wouldn't be opposed to making
> life saner on platforms that allow it.

Hm...  That would be possible, except for one unfortunate detail:
_PyString_Resize() may call PyErr_BadInternalCall() which touches
thread state.

> But there's another problem here:
> part of the reason we release Python threads around the fgets is in case
> some other thread is trying to write the data we're trying to read, yes?

NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
are locking against at all.  (As you've found out, it doesn't even
work.)  We're only trying to protect against concurrent *reads*.

> But since FLOCKFILE is in effect, other threads *trying* to write to the
> stream we're reading will get blocked anyway.  Seems to give us potential
> for deadlocks.

Only if they are holding other locks at the same time.  I haven't done
a thorough survey of fileobject.c, but I've skimmed it, and I believe it's
religious about releasing the Global Interpreter Lock around I/O
calls.  But, of course, 3rd party C code might not be.

> > (Then why are we bothering with flockfile(), you may ask?
> 
> I wouldn't ask that, no <wink>.
> 
> > Because otherwise, accidental multithreaded reading from the same
> > file could cause core dumps.)
> 
> Ugh ... turns out that on my box I can provoke core dumps anyway, with this
> program.  Blows up under released 2.0 and CVS Pythons (so it's not due to
> anything new):

Yeah.  But this is insane use -- see my comments on SF.  It's only
worth fixing because it could be used to intentionally crash Python --
but there are easier ways...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Wed Jan 10 17:41:47 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 10:41:47 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
Message-ID: <14940.37067.893679.750918@beluga.mojam.com>

I just noticed that the "Environment" options for Python on the SF site are
listed as

     Console (Text Based), Win32 (MS Windows), X11 Applications

Shouldn't something Macintosh-related be in that list as well?

Skip



From guido at python.org  Wed Jan 10 17:53:16 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 11:53:16 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 14:21:28 +0100."
             <3A5C61D8.2E5D098C@lemburg.com> 
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
            <3A5C61D8.2E5D098C@lemburg.com> 
Message-ID: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > 
> > Please have a look at this SF patch:
> > 
> > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > 
> > This implements control over which names defined in a module are
> > externally visible: if there's a variable __exports__ in the module,
> > it is a list of identifiers, and any access from outside the module to
> > names not in the list is disallowed.  This affects access using the
> > getattr and setattr protocols (which raise AttributeError for
> > disallowed names), as well as "from M import v" (which raises
> > ImportError).

[Marc-Andre]
> Can't we use the existing attribute __all__ (this is currently
> only used for packages) for this kind of thing? As others have already
> remarked: I would rather like to see this attribute being used
> as basis for 'from M import *' rather than enforce the access
> restrictions like the patch suggests.

Yes -- I came up with the same thought.

So here's a plan: somebody please submit a patch that does only one
thing: from...import * looks for __all__ and if it exists, imports
exactly those names.  No changes to dir(), or anything.
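
A minimal sketch of the behaviour this plan describes, as it can be
observed in current Python.  The module name "fake_mod" and its
attributes are made up for the illustration; the module is built in
memory so the example is self-contained.

```python
import sys
import types

# Build a throwaway module named "fake_mod" (hypothetical) in memory.
fake = types.ModuleType("fake_mod")
fake.spam = 1
fake.eggs = 2
fake._private = 3
fake.__all__ = ["spam"]          # only this name should be star-imported
sys.modules["fake_mod"] = fake

# Run the star-import in a fresh namespace so we can inspect the result.
namespace = {}
exec("from fake_mod import *", namespace)

# exec() adds __builtins__ to the namespace; ignore dunder entries.
imported = sorted(name for name in namespace if not name.startswith("__"))
```

Note that fake.eggs is still reachable as an attribute; __all__ only
limits what the star-import copies, it is not access control.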

> Access control mechanisms should be treated in different ways
> such as wrapping objects using access-control proxies (see mx.Proxy
> for an example of such an implementation) and on-demand only.
> I wouldn't want to pay the performance hit for each and every
> lookup in all my Python applications just because someone out
> there feels that "from M import *" has a meaning in life
> apart from being useful in interactive sessions to ease typing ;-)

In the process of looking into Zope internals I've noticed that
proxies are indeed very useful!

I note that the IMPORT opcodes in ceval.c require that the imported
module (as found in sys.modules[name] or returned by __import__()) is
a real module object.  I think this is unnecessary -- at least
IMPORT_FROM should work even if the module is a proxy or some other
thing (I've been known to smuggle class instances into sys.modules :-)
and IMPORT_STAR should work with a non-module at least if it has an
__all__ attribute.
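
The "smuggling" trick can be sketched as follows, assuming an
interpreter where the import statement accepts non-module objects found
in sys.modules (current CPython does).  The Settings class and the name
"fake_settings" are hypothetical.

```python
import sys


class Settings:
    """Hypothetical plain object posing as a module."""

    def __init__(self):
        self.debug = True

    def __getattr__(self, name):
        # Only called for names not found normally.  Refuse dunder names
        # so the import machinery's own probes see them as missing.
        if name.startswith("__"):
            raise AttributeError(name)
        return "computed:" + name


# Register the instance under a module name.
sys.modules["fake_settings"] = Settings()

# "from ... import" only needs getattr() on whatever sys.modules holds.
from fake_settings import debug, anything
```

Here debug comes from the instance dictionary and anything is produced
by the __getattr__ hook.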

> > I like it.  This has been asked for many times.  Does anybody see a
> > reason why this should *not* be added?
> > 
> > Tim remarked that introducing this will prompt demands for a similar
> > feature on classes and instances, where it will be hard to implement
> > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > extra dictionary lookup for each use of "M.v") even when it is not
> > used, but for accessing module variables that's acceptable.  I'm not
> > so sure about instance variable references.
> 
> Again, I'd rather see these implemented using different
> techniques which are under programmer control and made
> explicit and visible in the program flow. Proxies are ideal
> for these things, since they allow great flexibility while
> still providing reasonable security at Python level.
> 
> I have been using the proxy approach for years now and 
> so far with great success. What's even better is that
> weak references and garbage finalization aids come along with
> it for free.

Agreed.  Which reminds me -- would you mind reviewing Fred's new
version of PEP 205 (weak refs)?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Wed Jan 10 18:12:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 10 Jan 2001 18:12:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5C97F4.945D0C1@lemburg.com>

Guido van Rossum wrote:
> 
> > Guido van Rossum wrote:
> > >
> > > Please have a look at this SF patch:
> > >
> > > http://sourceforge.net/patch/?func=detailpatch&patch_id=102808&group_id=5470
> > >
> > > This implements control over which names defined in a module are
> > > externally visible: if there's a variable __exports__ in the module,
> > > it is a list of identifiers, and any access from outside the module to
> > > names not in the list is disallowed.  This affects access using the
> > > getattr and setattr protocols (which raise AttributeError for
> > > disallowed names), as well as "from M import v" (which raises
> > > ImportError).
> 
> [Marc-Andre]
> > Can't we use the existing attribute __all__ (this is currently
> > only used for packages) for this kind of thing? As others have already
> > remarked: I would rather like to see this attribute being used
> > as basis for 'from M import *' rather than enforce the access
> > restrictions like the patch suggests.
> 
> Yes -- I came up with the same thought.

Sorry, I didn't read the whole thread on the topic. Rereading the
above paragraph I guess I should have had some more coffee at the
time of writing ;-)
 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

+1 -- this won't be me though (at least not this week).
 
> > Access control mechanisms should be treated in different ways
> > such as wrapping objects using access-control proxies (see mx.Proxy
> > for an example of such an implementation) and on-demand only.
> > I wouldn't want to pay the performance hit for each and every
> > lookup in all my Python applications just because someone out
> > there feels that "from M import *" has a meaning in life
> > apart from being useful in interactive sessions to ease typing ;-)
> 
> In the process of looking into Zope internals I've noticed that
> proxies are indeed very useful!
> 
> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Cool.  This could make Python instances usable as "modules"
-- with full getattr() hook support !

For IMPORT_STAR I'd suggest first looking for __all__ and
then reverting to __dict__.items() in case this fails. 

BTW, is __dict__ needed by the import mechanism or would
the getattr/setattr slots suffice ? And if yes, must it
be a real Python dictionary ?
 
> > > I like it.  This has been asked for many times.  Does anybody see a
> > > reason why this should *not* be added?
> > >
> > > Tim remarked that introducing this will prompt demands for a similar
> > > feature on classes and instances, where it will be hard to implement
> > > without causing a bit of a slowdown.  It causes a slight slowdown (an
> > > extra dictionary lookup for each use of "M.v") even when it is not
> > > used, but for accessing module variables that's acceptable.  I'm not
> > > so sure about instance variable references.
> >
> > Again, I'd rather see these implemented using different
> > techniques which are under programmer control and made
> > explicit and visible in the program flow. Proxies are ideal
> > for these things, since they allow great flexibility while
> > still providing reasonable security at Python level.
> >
> > I have been using the proxy approach for years now and
> > so far with great success. What's even better is that
> > weak references and garbage finalization aids come along with
> > it for free.
> 
> Agreed.  Which reminds me -- would you mind reviewing Fred's new
> version of PEP 205 (weak refs)?

I'll have a look at it next week. Is that OK ?
 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Wed Jan 10 18:37:58 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 10 Jan 2001 12:37:58 -0500 (EST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.37067.893679.750918@beluga.mojam.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
Message-ID: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > I just noticed that the "Environment" options for Python on the SF site are
 > listed as
 > 
 >      Console (Text Based), Win32 (MS Windows), X11 Applications
 > 
 > Shouldn't something Macintosh-related be in that list as well?

  Are the maintainers of the MacOS port using the SF bug tracker or
something else?  If they're using it, then by all means we should add
it.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From thomas at xs4all.net  Wed Jan 10 19:06:06 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:06:06 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Tue, Jan 09, 2001 at 01:46:53PM -0800
References: <E14G6bV-0004nX-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010110190606.T2467@xs4all.nl>

On Tue, Jan 09, 2001 at 01:46:53PM -0800, Guido van Rossum wrote:

> static void
> xreadlines_dealloc(PyXReadlinesObject *op) {
> 	Py_XDECREF(op->file);
> 	Py_XDECREF(op->lines);
> 	PyObject_DEL(op);
> }

I'm confuzzled. Is this breach of the style guidelines intentional,
accidental, or just not cared enough about ? The style isn't even consistent
in that single module!

> void
> initxreadlines(void)
> {
> 	PyObject *m;
> 
> 	m = Py_InitModule("xreadlines", xreadlines_methods);
> }


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Wed Jan 10 19:11:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 12:11:52 -0600 (CST)
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: <14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
References: <14940.37067.893679.750918@beluga.mojam.com>
	<14940.40438.1654.487682@cj42289-a.reston1.va.home.com>
Message-ID: <14940.42472.174920.866172@beluga.mojam.com>

    Fred> Are the maintainers of the MacOS port using the SF bug tracker or
    Fred> something else?  If they're using it, then by all means we should
    Fred> add it.

Even if they aren't, I think it would be valuable to list.  There aren't all
that many tools (open source or otherwise) that run on Unix, Windows and Mac
and can be used as either a console app or a GUI.

I assume the reason Fred asks is that the Environment: list is generated
on-the-fly and somehow ties into use of the SF bug tracker.

Skip



From thomas at xs4all.net  Wed Jan 10 19:45:44 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 10 Jan 2001 19:45:44 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 10, 2001 at 11:53:16AM -0500
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010110194544.V2467@xs4all.nl>

On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:

> I note that the IMPORT opcodes in ceval.c require that the imported
> module (as found in sys.modules[name] or returned by __import__()) is
> a real module object.  I think this is unnecessary -- at least
> IMPORT_FROM should work even if the module is a proxy or some other
> thing (I've been known to smuggle class instances into sys.modules :-)
> and IMPORT_STAR should work with a non-module at least if it has an
> __all__ attribute.

Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
certain the expanding of IMPORT would make a lot of people very happy. Alex
Martelli only just discovered the fact you can populate sys.modules
yourself, with non-module objects, and was wondering about its legality and
compatibility.

I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
IMPORT_STAR case (try dict.items(), etc.)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Wed Jan 10 19:49:40 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 13:49:40 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101512.KAA26193@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com>

[Tim]
> The strangest thing left to my eye is why xreadlines enjoys a
> significant advantage over the double-loop buffering method
> (readlines_sizehint) on my box; reducing the very large
> (1Mb) buffer in Guido's test program made no material difference
> to that.

[Guido]
> I was baffled at this too (same difference on my box), until I
> discovered that the buffer size is specified *twice*: once as a
> default in the arg list of readlines_sizehint(), then *again* in
> the call to timer() near the bottom of the file.

Bingo!

> Take the latter one out and the times are comparable; in fact
> readlines_sizehint() is a few percent quicker.

They're indistinguishable then on my box (on one run xreadlines is .1
seconds  (out of around 7.6 total) quicker, on another readlines_sizehint),
*provided* that I specify the same buffer size (8192) that xreadlines uses
internally.  However, if I even double that, readlines_sizehint is uniformly
about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
size to 4096.
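
For reference, the double-loop buffering method under discussion looks
roughly like this.  It is a sketch, not Guido's actual test program: the
generator form and the 8192 default are assumptions chosen to match the
buffer size mentioned above.

```python
import io


def readlines_sizehint(fileobj, sizehint=8192):
    """Yield lines, fetching them in batches of roughly sizehint bytes."""
    while True:
        # Outer loop: one readlines() call per batch.
        chunk = fileobj.readlines(sizehint)
        if not chunk:
            break
        # Inner loop: hand out the lines of the current batch.
        for line in chunk:
            yield line


data = "".join("row %d\n" % i for i in range(5000))
lines = list(readlines_sizehint(io.StringIO(data)))
```

Passing sizehint explicitly at the call site overrides the default in
the argument list, which is how the buffer size ended up being
specified twice in the timing script.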

I'm afraid Mysteries will remain no matter how many person-decades we spend
staring at this <0.5 wink> ...




From guido at python.org  Wed Jan 10 19:50:10 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 13:50:10 -0500
Subject: [Python-Dev] Shouldn't the Mac be listed as an environment?
In-Reply-To: Your message of "Wed, 10 Jan 2001 10:41:47 CST."
             <14940.37067.893679.750918@beluga.mojam.com> 
References: <14940.37067.893679.750918@beluga.mojam.com> 
Message-ID: <200101101850.NAA29744@cj20424-a.reston1.va.home.com>

> I just noticed that the "Environment" options for Python on the SF site are
> listed as
> 
>      Console (Text Based), Win32 (MS Windows), X11 Applications
> 
> Shouldn't something Macintosh-related be in that list as well?

Yeah, except for two problems: :-)

(1) This is a selection from a drop-down menu that doesn't have a Mac
    option;

(2) There are only three slots allowed.

So this is the best we can do.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gstein at lyra.org  Wed Jan 10 19:53:32 2001
From: gstein at lyra.org (Greg Stein)
Date: Wed, 10 Jan 2001 10:53:32 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010110194544.V2467@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 10, 2001 at 07:45:44PM +0100
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com> <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com> <20010110194544.V2467@xs4all.nl>
Message-ID: <20010110105332.T4640@lyra.org>

On Wed, Jan 10, 2001 at 07:45:44PM +0100, Thomas Wouters wrote:
> On Wed, Jan 10, 2001 at 11:53:16AM -0500, Guido van Rossum wrote:
> 
> > I note that the IMPORT opcodes in ceval.c require that the imported
> > module (as found in sys.modules[name] or returned by __import__()) is
> > a real module object.  I think this is unnecessary -- at least
> > IMPORT_FROM should work even if the module is a proxy or some other
> > thing (I've been known to smuggle class instances into sys.modules :-)
> > and IMPORT_STAR should work with a non-module at least if it has an
> > __all__ attribute.
> 
> Hmm.... Have you been sneaking looks at python-list again, Guido ? :-) I'm
> certain the expanding of IMPORT would make a lot of people very happy. Alex
> Martelli only just discovered the fact you can populate sys.modules
> yourself, with non-module objects, and was wondering about its legality and
> compatibility.
> 
> I, for one, am very +1 on the idea, also on MAL's idea to do our best in the
> IMPORT_STAR case (try dict.items(), etc.)

+1 ... I'm always up for removing type restrictions. Did that with the
bytecodes in function objects a while back.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From MarkH at ActiveState.com  Wed Jan 10 19:54:34 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 10 Jan 2001 10:54:34 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,NONE,1.1 Setup.dist,1.3,1.4
In-Reply-To: <20010110190606.T2467@xs4all.nl>
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEGKCOAA.MarkH@ActiveState.com>

> I'm confuzzled. Is this breach of the style guidelines intentional,
> accidental, or just not cared enough about ?

I vote the latter!

Who-really-cares ly,

Mark.



From guido at python.org  Wed Jan 10 20:00:24 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:00:24 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: Your message of "Mon, 08 Jan 2001 11:31:09 EST."
             <20010108113109.C7563@kronos.cnri.reston.va.us> 
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com>  
            <20010108113109.C7563@kronos.cnri.reston.va.us> 
Message-ID: <200101101900.OAA30486@cj20424-a.reston1.va.home.com>

[me]
> >I expect Andrew's code to go in before 2.1 is released.  So I don't
> >see a reason why we should hurry and check in a stop-gap measure.

[Andrew]
> But it might not; the final version might be unacceptable or run into
> some intractable problem.  Assuming the patch is correct (I haven't
> looked at it), why not check it in?  The work has already been done to
> write it, after all.

OK, done.

It was more work than I had hoped for, because Eric apparently
(despite having developer privileges!) doesn't use the CVS tree -- he
sent in a diff relative to the 2.0 release.  I munged it into place,
adding the feature that readline, _curses and bsddb are built as
shared libraries by default.  You'd have to edit Setup.config.in to
change this.  Hope this doesn't break anybody's setup.  (Skip???)

Question for Eric: do you still want developer privileges?  They come
with responsibilities too.  Please check out the @#$%& CVS tree! :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Wed Jan 10 20:03:07 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 10 Jan 2001 14:03:07 -0500
Subject: [Python-Dev] Re: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Mon, 01 Jan 2001 19:49:35 CST."
             <20010101194935.19672@falcon.inetnebr.com> 
References: <20010101194935.19672@falcon.inetnebr.com> 
Message-ID: <200101101903.OAA30522@cj20424-a.reston1.va.home.com>

Hi Jeff,

I'm glad to tell you that I've accepted your xreadlines patches.  It's
all checked into the CVS tree now, except for your patch to
fileinput.py, where I had already checked in a similar change using
readlines(sizehint) directly.

Thanks again for your contribution!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From paulp at ActiveState.com  Wed Jan 10 21:08:31 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 10 Jan 2001 12:08:31 -0800
Subject: [Python-Dev] Add __exports__ to modules
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <3A5CC13F.DFB26A0B@ActiveState.com>

Guido van Rossum wrote:
> 
> ...
> 
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Why? From my point of view, the changes to dir() are much more
important. I seldom tell newbies about import * but I always tell them
how they can browse objects (especially modules) with dir. If dir() is
changed then IDEs and so forth would use that and inherit the right
behavior. If the module exporting behavior gets more sophisticated in a
future version of Python they will continue to inherit the behavior.

Also, dir() could look for an __all__ on all objects including "module
proxies", classes and "plain old instances". In other words we can
extend the convention to other objects "for free".
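
A rough sketch of the convention suggested here, written as a helper
function rather than a change to dir() itself.  The name public_names
and the fallback rule (hide underscore names) are assumptions made for
the illustration.

```python
import types


def public_names(obj):
    """Return obj.__all__ if present, else non-underscore names from dir()."""
    exported = getattr(obj, "__all__", None)
    if exported is not None:
        return sorted(exported)
    return sorted(name for name in dir(obj) if not name.startswith("_"))


# A module that declares its exports.
mod = types.ModuleType("demo_mod")
mod.visible = 1
mod.helper = 2
mod.__all__ = ["visible"]


# The same convention applied to a plain class and its instances.
class Plain:
    __all__ = ["x"]
    x = 1
    y = 2
```

With this helper, public_names(mod) lists only "visible" and
public_names(Plain()) lists only "x", so an IDE built on it would follow
whatever the object chooses to export.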

 Paul



From tim.one at home.com  Wed Jan 10 21:25:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 15:25:24 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101638.LAA26759@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com>

[Tim]
>> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
>> to keep the file locked until the line was complete, and I
>> wouldn't be opposed to making life saner on platforms that allow it.

[Guido]
> Hm...  That would be possible, except for one unfortunate detail:
> _PyString_Resize() may call PyErr_BadInternalCall() which touches
> thread state.

FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
*exit* path thereafter.  We can block/unblock Python threads as often as
desired between those *file*-locking brackets.  The only thing the repeated
FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
for multiple readers to get partial lines of the file.

> ...
> NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> are locking against at all.  (As you've found out, it doesn't even
> work.)

On Windows, yes, but that still seems to me to be a bug in MS's code.  If
anyone had reported a core dump on any other platform, I'd be more tractable
<wink> on this point.

> We're only trying to protect against concurrent *reads*.

As above, I believe that we could do a better job of that, then, on
platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
but also against .readline() not delivering an intact line from the file.
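
Where the platform can't help, the application-managed lock Guido
mentioned can be sketched in Python.  LockedReader is a hypothetical
wrapper, and this shows only the application-level approach, not a
change to fileobject.c.

```python
import io
import threading


class LockedReader:
    """Hypothetical wrapper: serializes readline() calls across threads."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._lock = threading.Lock()

    def readline(self):
        # Hold the lock for the whole call so each caller gets one
        # complete line rather than a fragment.
        with self._lock:
            return self._file.readline()


source = io.StringIO("".join("line %d\n" % i for i in range(200)))
reader = LockedReader(source)
collected = []
collected_lock = threading.Lock()


def worker():
    while True:
        line = reader.readline()
        if not line:          # empty string means end of file
            break
        with collected_lock:
            collected.append(line)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every line is delivered intact to exactly one thread; the order in which
the threads receive lines is not defined.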

>> But since FLOCKFILE is in effect, other threads *trying* to write
>> to the stream we're reading will get blocked anyway.  Seems to give us
>> potential for deadlocks.

> Only if they are holding other locks at the same time.

I'm not being clear, then.  Thread X does f.readline(), on a
HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
the end of the stdio buffer, and does its platform's version of _filbuf.
_filbuf may wait (depending on the nature of the stream) for more input to
show up.  Simultaneously, thread Y attempts to write some data to f.  But
the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
waiting for Y to write data inside platform _filbuf, but Y is waiting for X
to release the platform stream lock inside some platform stream-output
routine (if I'm being clear now, Python locks have nothing to do with this
scenario:  it's the platform stream lock).

I think this is purely the user's fault if it happens.  Just pointing it out
as another insecurity we're probably not able to protect users from.

> ...
> Yeah.  But this is insane use -- see my comments on SF.  It's only
> worth fixing because it could be used to intentionally crash Python --
> but there are easier ways...

If it's unique to MS (as I suspect), I see no reason to even consider trying
to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.





From cgw at fnal.gov  Wed Jan 10 22:57:41 2001
From: cgw at fnal.gov (Charles G Waldman)
Date: Wed, 10 Jan 2001 15:57:41 -0600 (CST)
Subject: [Python-Dev] Interning filenames of imported modules
Message-ID: <14940.56021.646147.770080@buffalo.fnal.gov>

I have a question about the following code in compile.c:jcompile (line 3678)

		filename = PyString_InternFromString(sc.c_filename); 
		name = PyString_InternFromString(sc.c_name);

In the case of a long-running server which constantly imports modules,
this causes the interned string dict to grow without bound.  Is there
a strong reason that the filename needs to be interned?  How about the
module name?

How about some way to enforce a limit on the size of the interned
strings dictionary?




From mwh21 at cam.ac.uk  Wed Jan 10 23:02:49 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Wed, 10 Jan 2001 22:02:49 +0000 (GMT)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <Pine.SOL.4.21.0101102121460.10616-100000@red.csi.cam.ac.uk>

On Wed, 10 Jan 2001, Paul Prescod wrote:

> Guido van Rossum wrote:
> > 
> > ...
> > 
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Why? From my point of view, the changes to dir() are much more
> important. I seldom tell newbies about import * but I always tell them
> how they can browse objects (especially modules) with dir. If dir() is
> changed then IDEs and so forth would use that and inherit the right
> behavior. If the module exporting behavior gets more sophisticated in a
> future version of Python they will continue to inherit the behavior.

Changing dir would also make rlcompleter nicer - it's something of a pain
to use with a module that has, eg, "from TERMIOS import *"-ed.  This might
also make "from ... import *" less of a pariah...

Sounds good to me, IOW.

Cheers,
M.




From tim.one at home.com  Wed Jan 10 23:23:14 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 17:23:14 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101101639.LAA26776@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com>

[Guido]
> I bet you it's rejected on the basis of "the docs tell you not to mix
> reading and writing on the same stream without intervening seek or
> flush."  If I were on the support line I would do that.

So would I if I were a typical first-line support idiot <wink>.  But the
*implementers*-- if they ever see it --should be very keen to figure out how
they managed to let the _iobuf get corrupted.  *I'm* not mucking with their
internals, nor doing wild pointer stores, nor anything else sneaky to
subvert their locking protection.  I wasn't even trying to break it.  The
only code reading from or storing into the _iobuf is theirs.  They're
ordinary stdio calls with ordinary arguments, and if *any* sequence of those
can cause internal corruption, they've almost certainly got a problem that
will manifest in other situations too.

Think like an implementer here <0.5 wink>:  they've lost track of how many
characters are in the buffer despite a locking scheme whose purpose is to
prevent that.  If it were my implementation, that would be a top-priority
bug no matter how silly the first program I saw that triggered it.

but-willing-to-let-them-decide-whether-they-care-ly y'rs  - tim




From skip at mojam.com  Wed Jan 10 23:52:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 10 Jan 2001 16:52:55 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5CC13F.DFB26A0B@ActiveState.com>
References: <200101052014.PAA20328@cj20424-a.reston1.va.home.com>
	<3A5C61D8.2E5D098C@lemburg.com>
	<200101101653.LAA28986@cj20424-a.reston1.va.home.com>
	<3A5CC13F.DFB26A0B@ActiveState.com>
Message-ID: <14940.59335.723701.574821@beluga.mojam.com>

    Paul> Also, dir() could look for an __all__ on all objects including
    Paul> "module proxies", classes and "plain old instances". In other
    Paul> words we can extend the convention to other objects "for free".

The __exports__/dir() patch I submitted will do this if you remove the
PyModule_Check that guards it.

Skip






From tim.one at home.com  Thu Jan 11 00:06:05 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 10 Jan 2001 18:06:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5C4E44.23B593E9@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>

[Mark Favas]
> Just Another Data Point - my box (DEC Alpha, Tru64 Unix) shows the same
> behaviour as Tim's WinBox wrt the new xreadline and the double-loop
> readlines (so it's not just something funny with MS (not that there's
> not anything funny with MS...)):
>
> total 131426612 chars and 514216 lines

You average over 255 chars/line?  Really?  What kind of file are you
reading?  I don't really want to measure the speed of line-at-a-time input
on binary files where "line" doesn't actually make sense <0.6 wink>.

> count_chars_lines     5.450  5.066
> readlines_sizehint    4.112  4.083
> using_fileinput      10.928 10.916
> while_readline       11.766 11.733
> for_xreadlines        3.569  3.533

Guido pointed out that his readlines_sizehint test forced use of a 1Mb
buffer (in the call, not only the default value).  For whatever reason, that
was significantly slower than using an 8Kb sizehint on my box.
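
For readers without the timing script, the loop shapes being compared look
roughly like the sketch below.  It is not the benchmark itself:  the
function names merely echo the labels in the tables, it is written for
present-day Python, and plain iteration over the file object is used as an
assumed stand-in for xreadlines.

```python
import io


def while_readline(f):
    """One readline() call per line."""
    chars = lines = 0
    while True:
        line = f.readline()
        if not line:
            break
        chars += len(line)
        lines += 1
    return chars, lines


def readlines_sizehint(f, sizehint=8192):
    """Double loop: fetch a batch of lines per readlines(sizehint) call."""
    chars = lines = 0
    while True:
        batch = f.readlines(sizehint)
        if not batch:
            break
        for line in batch:
            chars += len(line)
            lines += 1
    return chars, lines


def for_lines(f):
    """Iterate over the file object directly (stand-in for xreadlines)."""
    chars = lines = 0
    for line in f:
        chars += len(line)
        lines += 1
    return chars, lines


# Small in-memory sample: 1000 lines of 254 characters plus a newline.
sample = ("x" * 254 + "\n") * 1000
totals = [func(io.StringIO(sample))
          for func in (while_readline, readlines_sizehint, for_lines)]
```

All three loops must agree on the totals; only the number of calls made per
line differs, which is where the timing differences come from.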

Another oddity is that while_readline is slower than using_fileinput for
you.  From that I take it Python config does *not* #define

     HAVE_GETC_UNLOCKED

on your platform.  If that's true (or esp. if it's not!), would you do me a
favor?  Recompile fileobject.c with

     USE_MS_GETLINE_HACK

#define'd, try the timing test again (while_readline is the most interesting
test for this), and run the test_bufio.py std test to make sure you're
actually getting the right answers.

At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
whenever HAVE_GETC_UNLOCKED isn't available.  I'd be surprised if
ms_getline_hack failed to work correctly on any platform; a bigger unknown
(to me) is whether it will yield a speedup.  So far it yields a large
speedup on Windows, and looks like a speedup equal to getc_unlocked() yields
on Linux and Solaris.  Info on a platform from Mars (like Tru64 Unix <wink>)
would be valuable in deciding whether to boost +0.5.

don't-want-your-python-to-run-slower-than-possible-if-possible-ly
    y'rs  - tim




From tismer at tismer.com  Wed Jan 10 23:38:57 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 11 Jan 2001 00:38:57 +0200
Subject: [Python-Dev] [Stackless] ANN: Sourcecode for Stackless Python 2.0
Message-ID: <3A5CE481.24A7656@tismer.com>

On Monday, Jan 8th, I spake

"""
Source code and an update to the website will become available in
the next days.
"""

Now, here it is, together with a slightly updated website,
which tries to mention all the people who are helping
or sponsoring me (yes, there are sponsors!).
If somebody feels ignored by me, let me know. I'm good at
making mistakes.

Let me also know if there are problems building the code,
or if there are *no* problems understanding the code.
I don't expect either :-)

There is nearly no support for Unix, but Stackless *should*
build on Unix as it did before without problems.

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From nas at arctrix.com  Wed Jan 10 19:15:45 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 10:15:45 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 10, 2001 at 06:06:05PM -0500
References: <3A5C4E44.23B593E9@per.dem.csiro.au> <LNBBLJKPBEHFEDALKOLCGENOIHAA.tim.one@home.com>
Message-ID: <20010110101545.A21305@glacier.fnational.com>

On Wed, Jan 10, 2001 at 06:06:05PM -0500, Tim Peters wrote:
> At this point I'm +0.5 on the idea of fileobject.c using ms_getline_hack
> whenever HAVE_GETC_UNLOCKED isn't available.

Leave it to the timbot to use floating point votes. :)

Compare ms_getline_hack to what Perl does in order to speed up IO.
I think it's worth maintaining that piece of relatively portable
code given the benefit.  If the code has to be maintained then it
might as well be used.  If we find a platform that breaks we can
always disable it before the final release.

  Neil



From m.favas at per.dem.csiro.au  Thu Jan 11 02:28:59 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 09:28:59 +0800
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
Message-ID: <3A5D0C5B.162F624A@per.dem.csiro.au>

[Tim produces a warped threader that crashes on MS OS's]
>> ...
>> NO, NO NO!  Mixing reads and writes on the same stream wasn't what
>> we are locking against at all.  (As you've found out, it doesn't 
>> even work.)

>On Windows, yes, but that still seems to me to be a bug in MS's code.
>If anyone had reported a core dump on any other platform, I'd be more
>tractable <wink> on this point.

On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
'w's) to the screen (but no crashes). If I reduce the size of the loop
counters from 1000000 to 3000, I get the following output:
opened
w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
done

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From m.favas at per.dem.csiro.au  Thu Jan 11 04:40:18 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 11:40:18 +0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
Message-ID: <3A5D2B22.B8028AC@per.dem.csiro.au>

[Tim responded]
>>
>> total 131426612 chars and 514216 lines

>You average over 255 chars/line?  Really?  What kind of file are you
>reading?  I don't really want to measure the speed of line-at-a-time
>input on binary files where "line" doesn't actually make sense <0.6 wink>.

Real-life input, my boy! It's actually a syslog from my mailserver,
consisting mainly of sendmail log messages, and I have a current need to
process these things (MS Exchange, corrupted database, clobbered backup
tapes), so this thread came along at the right time...

>Guido pointed out that his readlines_sizehint test forced use of a 1Mb
>buffer (in the call, not only the default value).  For whatever
>reason, that was significantly slower than using an 8Kb sizehint on my
>box.

Removing the buffer size arg in the call to readlines_sizehint results
in this (using up-to-the-minute CVS):
total 131426612 chars and 514216 lines
count_chars_lines     4.922  4.916
readlines_sizehint    3.881  3.850
using_fileinput      10.371 10.366
while_readline       10.943 10.916
for_xreadlines        2.990  2.967

and with an 8Kb sizehint:
total 131426612 chars and 514216 lines
count_chars_lines     5.241  5.216
readlines_sizehint    2.917  2.900
using_fileinput      10.351 10.333
while_readline       10.990 10.983
for_xreadlines        2.877  2.867


>Another oddity is that while_readline is slower than using_fileinput
>for you.  From that I take it Python config does *not* #define
>
>     HAVE_GETC_UNLOCKED
>
>on your platform.  If that's true 

Nope, HAVE_GETC_UNLOCKED is indeed #define'd

>(or esp. if it's not!), would you do me a
>favor?  Recompile fileobject.c with
>
>     USE_MS_GETLINE_HACK
>
>#define'd, try the timing test again (while_readline is the most
>interesting test for this), and run the test_bufio.py std test to make
>sure you're actually getting the right answers.

Sure:
With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd (although
defining the former makes the latter def irrelevant):
(test_bufio also OK)
total 131426612 chars and 514216 lines
count_chars_lines     5.056  5.050
readlines_sizehint    3.771  3.667
using_fileinput      11.128 11.116
while_readline        8.287  8.233
for_xreadlines        3.090  3.083

With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just for
completeness):
total 131426612 chars and 514216 lines
count_chars_lines     4.916  4.900
readlines_sizehint    3.875  3.867
using_fileinput      14.404 14.383
while_readline       322.728 321.837
for_xreadlines        7.113  7.100

So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
<grin>

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From nas at arctrix.com  Wed Jan 10 22:55:23 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 10 Jan 2001 13:55:23 -0800
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 11:40:18AM +0800
References: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <20010110135523.A21894@glacier.fnational.com>

On Thu, Jan 11, 2001 at 11:40:18AM +0800, Mark Favas wrote:
[with getc_unlocked]
> while_readline       10.943 10.916

[without]
> while_readline       322.728 321.837

Holy crap.  Great work team.

  Neil



From tim.one at home.com  Thu Jan 11 06:03:51 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 00:03:51 -0500
Subject: [Python-Dev] Baffled on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>

In version 2.26 of mmapmodule.c, Guido replaced (as part of a contributed
Cygwin patch):

#ifdef MS_WIN32
__declspec(dllexport) void
#endif /* MS_WIN32 */
#ifdef UNIX
extern void
#endif

by:

DL_EXPORT(void)

before initmmap.

1. Windows Python can no longer import mmap:

>>> import mmap
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (initmmap)
>>>

This is because GetProcAddress returns NULL.

2. Everything's fine if I revert Guido's change (although I assume that
breaks Cygwin then).

3. DL_EXPORT(void) expands to "void".

4. The way mmapmodule.c is coded and built after Guido's change appears to
me to be the same as how every other non-builtin module is coded and built
on Windows.  For example, winsound.c, which uses DL_EXPORT(void) before its
initwinsound and where that macro also expands to "void".  But importing
winsound works fine.

Since what I'm seeing makes no consistent sense, I'm at a loss how to fix
it.  But then I'm punch-drunk too <0.7 wink>.

Any Windows geek got a clue?




From tim.one at home.com  Thu Jan 11 07:10:40 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 01:10:40 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D2B22.B8028AC@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>

[Tim, to MarkF]
>> You average over 255 chars/line?  [nag, nag, nag]

[Mark Favas]
> Real-life input, my boy! It's actually a syslog from my
> mailserver, consisting mainly of sendmail log messages, and I
> have a current need to process these things (MS Exchange,
> corrupted database, clobbered backup tapes), so this thread
> came along at the right time...

Hmm.  I tuned ms_getline_hack for Guido's logfiles, which he said don't
often exceed 160 chars/line.  I guess if you're on a 64-bit platform,
though, it must take about twice as many chars per line to record a log msg
<wink>.

> ...
> Removing the buffer size arg in the call to readlines_sizehint results
> in this (using up-to-the-minute CVS):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.922  4.916
> readlines_sizehint    3.881  3.850
> using_fileinput      10.371 10.366
> while_readline       10.943 10.916
> for_xreadlines        2.990  2.967
>
> and with an 8Kb sizehint:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.241  5.216
> readlines_sizehint    2.917  2.900
> using_fileinput      10.351 10.333
> while_readline       10.990 10.983
> for_xreadlines        2.877  2.867

That's sure consistent across platforms, then.  I guess we'll write it off
to "cache effects" (a catch-all explanation for any timing mystery -- go
ahead, just *try* to prove it's wrong <0.5 wink>).

[and Mark has HAVE_GETC_UNLOCKED on his Tru64 Unix box, yet
 using_fileinput is quicker than while_readline]

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #define'd
> (although defining the former makes the latter def irrelevant):
> (test_bufio also OK)
> total 131426612 chars and 514216 lines
> count_chars_lines     5.056  5.050
> readlines_sizehint    3.771  3.667
> using_fileinput      11.128 11.116
> while_readline        8.287  8.233
> for_xreadlines        3.090  3.083

So ms_getline_hack is significantly faster on your box (I'm only looking at
while_readline:  11 using getc_unlocked, 8.3 using ms_getline_hack).  There
are only two reasons I can imagine for that:

1. Your vendor optimizes the inner loop in fgets (as all vendors should, but
few do).

and/or

2. Despite the long average length of your lines, many of them are
nevertheless shorter than 200 chars, and so all the pain ms_getline_hack
endures to avoid a realloc pays off.

Unfortunately, there's not enough info to figure out if either, both, or
none of those are on-target.  It's such a large percentage speedup, though,
that my bet goes primarily to #1 -- unless realloc is really pig slow on
your box.  Which some things *are*:

> With USE_MS_GETLINE_HACK and HAVE_GETC_UNLOCKED both #undef'ed (just
> for completeness):
> total 131426612 chars and 514216 lines
> count_chars_lines     4.916  4.900
> readlines_sizehint    3.875  3.867
> using_fileinput      14.404 14.383
> while_readline       322.728 321.837
> for_xreadlines        7.113  7.100
>
> So, having HAVE_GETC_UNLOCKED #define'd does make a small improvement
> <grin>

Yes, that's the "platform from Mars" evidence I was seeking:  if
ms_getline_hack survives test_bufio on *your* crazy box, it's as close to
provably correct as any algorithm in all of Computer Science <wink>.
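
The ratios behind these remarks can be checked with a few lines of
arithmetic.  The figures are the first-column while_readline timings and the
file totals from the tables quoted in this thread; the dictionary keys are
just labels chosen for this sketch.

```python
# while_readline timings (first column) from the quoted tables.
timings = {
    "both_undefined": 322.728,   # neither macro defined
    "getc_unlocked": 10.943,     # HAVE_GETC_UNLOCKED only
    "ms_getline_hack": 8.287,    # USE_MS_GETLINE_HACK defined
}

slowdown_vs_hack = timings["both_undefined"] / timings["ms_getline_hack"]
slowdown_vs_unlocked = timings["both_undefined"] / timings["getc_unlocked"]

# Average line length of the test file, from the reported totals.
chars_per_line = 131426612 / 514216
```

The slowest configuration comes out at roughly 39 times the
ms_getline_hack time and roughly 29 times the getc_unlocked time, and the
file averages a little over 255 characters per line.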

a-factor-of-39-is-almost-big-enough-to-notice!-ly y'rs  - tim




From m.favas at per.dem.csiro.au  Thu Jan 11 08:26:37 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 15:26:37 +0800
Subject: [Python-Dev] Re: xreadline speed vs readlines_sizehint
References: <LNBBLJKPBEHFEDALKOLCIEPGIHAA.tim.one@home.com>
Message-ID: <3A5D602D.9DC991CB@per.dem.csiro.au>

[Tim speculates on getc_unlocked and his ms_getline_hack]:
> 
> So ms_getline_hack is significantly faster on your box (I'm only
> looking at while_readline:  11 using getc_unlocked, 8.3 using 
> ms_getline_hack).  There are only two reasons I can imagine for that:
> 
> 1. Your vendor optimizes the inner loop in fgets (as all vendors
> should, but few do).

Digital engineering, Compaq management/marketing <0.6 wink>
> 
> and/or
> 
> 2. Despite the long average length of your lines, many of them are
> nevertheless shorter than 200 chars, and so all the pain
> ms_getline_hack endures to avoid a realloc pays off.
> 
> Unfortunately, there's not enough info to figure out if either, both,
> or none of those are on-target.  It's such a large percentage
> speedup, though, that my bet goes primarily to #1 -- unless realloc
> is really pig slow on your box.

The lines range in length from 96 to 747 characters, with 11% @ 233, 17%
@ 252 and 52% @ 254 characters, so #1 looks promising - most lines are
long enough to trigger a realloc. Cranking up INITBUFSIZE in
ms_getline_hack to 260 from 200 improves things again, by another 25%: 
total 131426612 chars and 514216 lines
count_chars_lines     5.081  5.066
readlines_sizehint    3.743  3.717
using_fileinput      11.113 11.100
while_readline        6.100  6.083
for_xreadlines        3.027  3.033
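
A rough check of why the larger buffer helps, using only the three
line-length shares reported above (they account for 80% of the lines; the
rest of the distribution is not given).  Whether the buffer size must also
cover the newline and terminator is an assumption here, so the comparison
is strict.

```python
# Percent of lines at each reported length.
reported_shares = {233: 11, 252: 17, 254: 52}


def share_fitting(buffer_size, shares):
    """Percent of the reported lines that fit within buffer_size characters."""
    return sum(pct for length, pct in shares.items() if length < buffer_size)


fits_200 = share_fitting(200, reported_shares)
fits_260 = share_fitting(260, reported_shares)
```

None of the reported lengths fit in a 200-character initial buffer, while
all of them (at least 80% of the file) fit in 260, so most lines no longer
need a realloc.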

Apart from the name <grin>, I like ms_getline_hack...

tho'-a-factor-of-100-makes-xreadlines-a-welcome-addition!-ly y'rs

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From m.favas at per.dem.csiro.au  Thu Jan 11 10:08:29 2001
From: m.favas at per.dem.csiro.au (Mark Favas)
Date: Thu, 11 Jan 2001 17:08:29 +0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
Message-ID: <3A5D780D.62D0F473@per.dem.csiro.au>

On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
sysmodule.c produces the following errors:

cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
sysmodule.o sysmodule.c
cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
        PyObject *o, *stdout;
----------------------^
cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
(undeclared)
        if (!PyArg_ParseTuple(args, "O:displayhook", &o))
------------------------------------------------------^
cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
an lvalue, but occurs in a context that requires one. (needlvalue)
        stdout = PySys_GetObject("stdout");
--------^
cc: Warning: sysmodule.c, line 98: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        if (PyFile_WriteObject(o, stdout, 0) != 0)
----------------------------------^
cc: Warning: sysmodule.c, line 100: In this statement, the referenced
type of the pointer value "(&_iob[1])" is "struct declared without a
tag", which is not compatible with "struct _object". (ptrmismatch)
        PyFile_SoftSpace(stdout, 1);
-------------------------^

The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
(stdin and stderr also are similarly #define'd).

-- 
Mark Favas  -   m.favas at per.dem.csiro.au
CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA



From gstein at lyra.org  Thu Jan 11 10:18:44 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:18:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.216,2.217 sysmodule.c,2.80,2.81
In-Reply-To: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>; from moshez@users.sourceforge.net on Wed, Jan 10, 2001 at 09:41:29PM -0800
References: <E14GaUL-0005nd-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010111011843.W4640@lyra.org>

On Wed, Jan 10, 2001 at 09:41:29PM -0800, Moshe Zadka wrote:
> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv21213/Python
> 
> Modified Files:
> 	ceval.c sysmodule.c
>...
> --- 1246,1269 ----
>   		case PRINT_EXPR:
>   			v = POP();
> ! 			w = PySys_GetObject("displayhook");
> ! 			if (w == NULL) {
> ! 				PyErr_SetString(PyExc_RuntimeError,
> ! 						"lost sys.displayhook");
> ! 				err = -1;
>   			}
> + 			if (err == 0) {
> + 				x = Py_BuildValue("(O)", v);
> + 				if (x == NULL)
> + 					err = -1;
> + 			}
> + 			if (err == 0) {
> + 				w = PyEval_CallObject(w, x);
> + 				if (w == NULL)
> + 					err = -1;
> + 			}
>   			Py_DECREF(v);
> + 			Py_XDECREF(x);

x was never initialized to NULL. In fact, the loop sets it to Py_None. If
you get an error in the initial "w" setup case, then you could erroneously
decref None.

Further, there is no DECREF for the CallObject result ("w"). But watch out:
you don't want to DECREF the PySys_GetObject result (that is a borrowed
reference).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From gstein at lyra.org  Thu Jan 11 10:28:16 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 01:28:16 -0800
Subject: [Python-Dev] Current CVS version of sysmodule.c fails to compile
In-Reply-To: <3A5D780D.62D0F473@per.dem.csiro.au>; from m.favas@per.dem.csiro.au on Thu, Jan 11, 2001 at 05:08:29PM +0800
References: <3A5D780D.62D0F473@per.dem.csiro.au>
Message-ID: <20010111012815.X4640@lyra.org>

You're quite right! I've checked in a change, renaming it to "outf".

Cheers,
-g

On Thu, Jan 11, 2001 at 05:08:29PM +0800, Mark Favas wrote:
> On Tru64 Unix, with Compaq's C/CXX compilers, the current CVS version of
> sysmodule.c produces the following errors:
> 
> cc -O -Olimit 1500 -I./../Include -I.. -DHAVE_CONFIG_H   -c -o
> sysmodule.o sysmodule.c
> cc: Error: sysmodule.c, line 73: Invalid declarator. (declarator)
>         PyObject *o, *stdout;
> ----------------------^
> cc: Error: sysmodule.c, line 79: In this statement, "o" is not declared.
> (undeclared)
>         if (!PyArg_ParseTuple(args, "O:displayhook", &o))
> ------------------------------------------------------^
> cc: Error: sysmodule.c, line 93: In this statement, "(&_iob[1])" is not
> an lvalue, but occurs in a context that requires one. (needlvalue)
>         stdout = PySys_GetObject("stdout");
> --------^
> cc: Warning: sysmodule.c, line 98: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         if (PyFile_WriteObject(o, stdout, 0) != 0)
> ----------------------------------^
> cc: Warning: sysmodule.c, line 100: In this statement, the referenced
> type of the pointer value "(&_iob[1])" is "struct declared without a
> tag", which is not compatible with "struct _object". (ptrmismatch)
>         PyFile_SoftSpace(stdout, 1);
> -------------------------^
> 
> The problem is that stdout is a macro #define'd in stdio.h as (&_iob[1])
> (stdin and stderr also are similarly #define'd).
> 
> -- 
> Mark Favas  -   m.favas at per.dem.csiro.au
> CSIRO, Private Bag No 5, Wembley, Western Australia 6913, AUSTRALIA

-- 
Greg Stein, http://www.lyra.org/



From skip at mojam.com  Thu Jan 11 15:13:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 08:13:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
References: <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14941.49059.26189.733094@beluga.mojam.com>

    Moshe> * Did not DECREF result from displayhook function
    ...
    Moshe>   				w = PyEval_CallObject(w, x);
    Moshe> + 				Py_XDECREF(w);
    Moshe>   				if (w == NULL)
    ...


While it works, is it really kosher to test w's value after the DECREF?
Just seems like an odd construct to me.  I'm used to seeing the test
immediately after it's been set.

Skip






From guido at python.org  Thu Jan 11 15:44:58 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 09:44:58 -0500
Subject: [Python-Dev] Interning filenames of imported modules
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:57:41 CST."
             <14940.56021.646147.770080@buffalo.fnal.gov> 
References: <14940.56021.646147.770080@buffalo.fnal.gov> 
Message-ID: <200101111444.JAA14597@cj20424-a.reston1.va.home.com>

> I have a question about the following code in compile.c:jcompile (line 3678)
> 
> 		filename = PyString_InternFromString(sc.c_filename); 
> 		name = PyString_InternFromString(sc.c_name);
> 
> In the case of a long-running server which constantly imports modules,
> this causes the interned string dict to grow without bound.  Is there
> a strong reason that the filename needs to be interned?  How about the
> module name?

It's probably not *necessary* for the filename, but I know why I am
interning it: since a module typically contains a bunch of functions,
and each function has its own code object with a reference to the
filename, I'm trying to save memory (the filename is a C string
pointer in the "sc" structure, so it has to be turned into a Python
string when creating the code object).

The module name is used as an identifier elsewhere so will become
interned anyway.

> How about some way to enforce a limit on the size of the interned
> strings dictionary?

I've never thought of this -- but I suppose that a weak dictionary
could be used.  Fred's working on a PEP for weak references, so
there's a chance that we might use this eventually.

In the mean time, a possibility would be to provide a service function
that goes through the "interned" dictionary and looks for values with
a reference count of 1, and deletes them.  You could then explicitly
call this service function occasionally in your program.  I would let
it return a tuple: (number of values kept, number of values deleted).
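
A toy model of the service function described above.  This is not the
interpreter's interned dictionary:  reference counts are tracked explicitly
in a hypothetical ToyInternTable class so the sweep logic can be shown in
isolation, and a count of 1 means that only the table itself still refers
to the string.

```python
class ToyInternTable:
    """Toy stand-in for an interned-string table with explicit refcounts."""

    def __init__(self):
        # string -> reference count, including the table's own reference
        self._refs = {}

    def intern(self, s):
        # A new entry starts at 1 (the table); each caller adds one more.
        self._refs[s] = self._refs.get(s, 1) + 1
        return s

    def release(self, s):
        # A caller drops its reference; the table's reference remains.
        self._refs[s] -= 1

    def sweep(self):
        """Delete entries only the table refers to; return (kept, deleted)."""
        dead = [s for s, count in self._refs.items() if count == 1]
        for s in dead:
            del self._refs[s]
        return len(self._refs), len(dead)


table = ToyInternTable()
table.intern("a.py")
table.intern("b.py")
table.intern("a.py")       # a second code object sharing the filename
table.release("b.py")      # the module that used b.py went away
kept, deleted = table.sweep()
```

A long-running server could call such a sweep occasionally; entries still
in use elsewhere survive, so sharing of filenames between code objects is
not lost.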

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 16:08:48 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:08:48 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 13:49:40 EST."
             <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGENEIHAA.tim.one@home.com> 
Message-ID: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>

> They're indistinguishable then on my box (on one run xreadlines is .1
> seconds  (out of around 7.6 total) quicker, on another readlines_sizehint),
> *provided* that I specify the same buffer size (8192) that xreadlines uses
> internally.  However, if I even double that, readlines_sizehint is uniformly
> about 10% slower.  It's also a tiny bit slower if I cut the sizehint buffer
> size to 4096.
> 
> I'm afraid Mysteries will remain no matter how many person-decades we spend
> staring at this <0.5 wink> ...

8192 happens to be the size of the stack-allocated buffer readlines()
uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
SMALLCHUNK in fileobject.c.

Would it make sense to tie the two constants together more to tune
this optimally even when BUFSIZ is different?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Thu Jan 11 16:09:54 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Thu, 11 Jan 2001 10:09:54 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
References: <200101080416.f084GrM10912@snark.thyrsus.com>
	<20010108074411.N2467@xs4all.nl>
	<20010108014945.A19516@thyrsus.com>
	<200101081427.JAA03146@cj20424-a.reston1.va.home.com>
	<20010108113109.C7563@kronos.cnri.reston.va.us>
	<200101101900.OAA30486@cj20424-a.reston1.va.home.com>
Message-ID: <14941.52418.18484.898061@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> It was more work than I had hoped for, because Eric
    GvR> apparently (despite having developer privileges!) doesn't use
    GvR> the CVS tree -- he sent in a diff relative to the 2.0
    GvR> release.  I munged it into place, adding the feature that
    GvR> readline, _curses and bsddb are built as shared libraries by
    GvR> default.  You'd have to edit Setup.config.in to change this.
    GvR> Hope this doesn't break anybody's setup.  (Skip???)

We may need to move dbm module to Setup.config from Setup and build it
shared too.  The problem I ran into when building the pybsddb3 module
was that even though I'd built the standard bsddb shared, I was also
building in dbm statically.  This pulled in a dependency to the old
db.so module (under RH6.1) and core dumped me during the test suite
for pybsddb.  Commenting out dbm did the trick, so building it shared
should work too.

Couple of things: dbm isn't enabled by default I believe so moving it
to Setup.config may not be the right thing after all (would that imply
an autoconf test and auto-enabling if it's detected?)  Also, Andrew's
distutils-based build procedure may obviate the need for this change.

-Barry




From ping at lfw.org  Thu Jan 11 16:14:17 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 07:14:17 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>

On Wed, 10 Jan 2001, Guido van Rossum wrote:
> Yes -- I came up with the same thought.
> 
> So here's a plan: somebody please submit a patch that does only one
> thing: from...import * looks for __all__ and if it exists, imports
> exactly those names.  No changes to dir(), or anything.

Please don't use __all__.  At the moment, __all__ is the only way
to easily tell whether a particular module object really represents
a package, and the only way to get the list of submodule names.

If __all__ is overloaded to also represent exportable symbols in
modules, these two pieces of information will be impossible (or
require much ugly hackery) to obtain.


-- ?!ng




From guido at python.org  Thu Jan 11 16:23:26 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:23:26 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 15:25:24 EST."
             <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAENJIHAA.tim.one@home.com> 
Message-ID: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>

> [Tim]
> >> Well, it would be easy to fiddle the HAVE_GETC_UNLOCKED method
> >> to keep the file locked until the line was complete, and I
> >> wouldn't be opposed to making life saner on platforms that allow it.
> 
> [Guido]
> > Hm...  That would be possible, except for one unfortunate detail:
> > _PyString_Resize() may call PyErr_BadInternalCall() which touches
> > thread state.

[Tim]
> FLOCKFILE/FUNLOCKFILE are independent of Python's notion of thread state.
> IOW, do FLOCKFILE once before the for(;;), and FUNLOCKFILE once on every
> *exit* path thereafter.  We can block/unblock Python threads as often as
> desired between those *file*-locking brackets.  The only thing the repeated
> FLOCKFILE/FUNLOCKFILE calls do to my eyes now is to create the *possibility*
> for multiple readers to get partial lines of the file.

I don't want to call FLOCKFILE while holding the Python lock, as this
means that *if* we're blocked in FLOCKFILE (e.g. we're reading from a
pipe or socket), no other Python thread can run!

> > ...
> > NO, NO NO!  Mixing reads and writes on the same stream wasn't what we
> > are locking against at all.  (As you've found out, it doesn't even
> > work.)
> 
> On Windows, yes, but that still seems to me to be a bug in MS's code.  If
> anyone had reported a core dump on any other platform, I'd be more tractable
> <wink> on this point.

Yes, it's a Windows bug.

> > We're only trying to protect against concurrent *reads*.
> 
> As above, I believe that we could do a better job of that, then, on
> platforms that HAVE_GETC_UNLOCKED, by protecting not only against core dumps
> but also against .readline() not delivering an intact line from the file.

See above for a reason why I think that's not safe.  I think that
applications that want to do this can do their own locking.  (They'll
find out soon enough that readline() isn't atomic. :-)
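
A minimal sketch of the "do their own locking" approach mentioned above,
assuming several threads share one file object.  The class name
LockedLineReader is hypothetical; the lock is an ordinary application-level
lock and has nothing to do with the platform stream lock (FLOCKFILE).

```python
import threading


class LockedLineReader:
    """Wrap a file object so each readline() call runs under one lock."""

    def __init__(self, fileobj):
        self._file = fileobj
        # Application-level lock; independent of any stdio stream lock.
        self._lock = threading.Lock()

    def readline(self):
        # Only one thread at a time may be inside readline(), so a line
        # is never split between two readers sharing this wrapper.
        with self._lock:
            return self._file.readline()
```

Every thread has to go through the same wrapper instance; a thread that
calls readline() on the underlying file directly bypasses the lock.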

> >> But since FLOCKFILE is in effect, other threads *trying* to write
> >> to the stream we're reading will get blocked anyway.  Seems to give us
> >> potential for deadlocks.
> 
> > Only if they are holding other locks at the same time.
> 
> I'm not being clear, then.  Thread X does f.readline(), on a
> HAVE_GETC_UNLOCKED platform.  get_line allows other threads to run and
> invokes FLOCKFILE on f->f_fp.  get_line's GETC in thread X eventually hits
> the end of the stdio buffer, and does its platform's version of _filbuf.
> _filbuf may wait (depending on the nature of the stream) for more input to
> show up.  Simultaneously, thread Y attempts to write some data to f.  But
> the *FLOCKFILE* lock prevents it from doing anything with f.  So X is
> waiting for Y to write data inside platform _filbuf, but Y is waiting for X
> to release the platform stream lock inside some platform stream-output
> routine (if I'm being clear now, Python locks have nothing to do with this
> scenario:  it's the platform stream lock).

I don't think that _filbuf can possibly wait for another thread to
write data to the same stream object.  A single stream object doesn't
act like a pipe, even if it is open for simultaneous reading and
writing.  So if there's no more data in the file, _filbuf will simply
return with an EOF status, not wait for the data that the other thread
would write.

> I think this is purely the user's fault if it happens.  Just pointing it out
> as another insecurity we're probably not able to protect users from.

I don't think this can happen.

> > ...
> > Yeah.  But this is insane use -- see my comments on SF.  It's only
> > worth fixing because it could be used to intentionally crash Python --
> > but there are easier ways...
> 
> If it's unique to MS (as I suspect), I see no reason to even consider trying
> to fix it in Python.  Unless the Perl Mongers use it to crash Zope <wink>.

OK.  It's unique to MS.  So close the bug report with a "won't fix"
resolution.  There's no point in having bug reports remain open that
we know we can't fix.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 16:27:05 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:27:05 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Wed, 10 Jan 2001 17:23:14 EST."
             <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGENMIHAA.tim.one@home.com> 
Message-ID: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>

> Think like an implementer here <0.5 wink>:  they've lost track of how many
> characters are in the buffer despite a locking scheme whose purpose is to
> prevent that.  If it were my implementation, that would be a top-priority
> bug no matter how silly the first program I saw that triggered it.

The locking prevents concurrent threads accessing the stream.

But mixing reads and writes (without intervening fseek etc.) is
illegal use of the stream, and the C standard allows them to be lax
here, even if the program was single-threaded.

In other words: the locking is so good that it serializes the sequence
of reads and writes; but if the sequence of reads and writes is
illegal, they don't guarantee anything.
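
The rule about an intervening fseek can be illustrated with a small sketch.
It is written in Python for brevity; the helper names write_then_read and
demo are hypothetical.  Present-day Python file objects do not sit directly
on C stdio, so this only shows the pattern: reposition the stream when
switching from writing to reading.

```python
import os
import tempfile


def write_then_read(path):
    """Write a line, reposition, then read it back from the same stream."""
    with open(path, "w+") as f:
        f.write("hello\n")
        # Switching from writing to reading: reposition first.  In C this
        # would be an fflush() or fseek() between the fputs() and fgets().
        f.seek(0)
        return f.readline()


def demo():
    # Use a throwaway temporary file so the sketch leaves nothing behind.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        return write_then_read(path)
    finally:
        os.remove(path)
```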

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Thu Jan 11 16:28:23 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:28:23 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 09:28:59 +0800."
             <3A5D0C5B.162F624A@per.dem.csiro.au> 
References: <3A5D0C5B.162F624A@per.dem.csiro.au> 
Message-ID: <200101111528.KAA15021@cj20424-a.reston1.va.home.com>

> On Tru64 Unix, I get an infinite generator of 'r's (after an initial few
> 'w's) to the screen (but no crashes).

Same here on Linux.

> If I reduce the size of the loop
> counters from 1000000 to 3000, I get the following output:
> opened
> w w w w w w w w w w w w w w w w w w w w w w w w w w w r read 5114
> done

I still get an infinite amount of 'r's.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan 11 16:28:21 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:28:21 +0100
Subject: [Python-Dev] Rehabilitating fgets
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 07, 2001 at 11:13:26PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEBPIHAA.tim_one@email.msn.com> <LNBBLJKPBEHFEDALKOLCIEEDIHAA.tim.one@home.com>
Message-ID: <20010111162820.W2467@xs4all.nl>

On Sun, Jan 07, 2001 at 11:13:26PM -0500, Tim Peters wrote:

> I'm curious about how it performs (relative to the getc_unlocked hack) on
> other platforms.  If you'd like to try that, just recompile fileobject.c
> with

>     USE_MS_GETLINE_HACK

> #define'd.  It should *work* on any platform with fgets() meeting the
> assumption.  The new test_bufio.py std test gives it a pretty good
> correctness workout, if you're worried about that.

FreeBSD seems to work fine. Speed is practically the same as without
USE_MS_GETLINE_HACK (but with HAVE_GETC_UNLOCKED), though still not quite
the same as before all this hackery :-) Not by much though. For most tests
it's smaller than the margin of error, though the difference is still as
much as 20, 30% for the while_readline test. When using a second thread
somewhere in the test, the difference vanishes further.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Thu Jan 11 16:33:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 16:33:28 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A5DD248.8EE0DF63@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> >
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.

But __all__ has to be user-defined, so I don't buy that argument.
Note that the only true way to recognize a package is by looking
for an attribute "__path__" since Python adds this for packages
only.
 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Again, __all__ is not automatically generated, so trusting it
doesn't get you very far. To be able to find subpackages you will
always have to apply some hackery (based on __path__) in order
to be sure. It would be better to add a helper function to
packages to query this kind of information -- the package usually
knows best where to look and what to look for.

Note that __all__ was explicitly invented to be used by
from package import * so I think it is the right choice here.
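
A rough sketch of the kind of helper suggested above, assuming the
pkgutil module found in later Python versions.  The function name
list_submodules is hypothetical; it scans the directories on __path__
rather than trusting __all__.

```python
import pkgutil


def list_submodules(package):
    """Return sorted names of modules and subpackages on package.__path__."""
    path = getattr(package, "__path__", None)
    if path is None:
        return []  # a plain module has no __path__, so nothing to list
    # iter_modules yields (finder, name, ispkg) for each entry it finds.
    return sorted(name for _finder, name, _ispkg in pkgutil.iter_modules(path))
```

This only sees what is on disk (or reachable through the import hooks), so
a package that builds its contents dynamically would still need to answer
the question itself.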

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Thu Jan 11 16:37:19 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 11 Jan 2001 10:37:19 -0500
Subject: [Python-Dev] autoconfigure patch submitted on SourceForge
In-Reply-To: <14941.52418.18484.898061@anthem.wooz.org>; from barry@digicool.com on Thu, Jan 11, 2001 at 10:09:54AM -0500
References: <200101080416.f084GrM10912@snark.thyrsus.com> <20010108074411.N2467@xs4all.nl> <20010108014945.A19516@thyrsus.com> <200101081427.JAA03146@cj20424-a.reston1.va.home.com> <20010108113109.C7563@kronos.cnri.reston.va.us> <200101101900.OAA30486@cj20424-a.reston1.va.home.com> <14941.52418.18484.898061@anthem.wooz.org>
Message-ID: <20010111103719.A7191@thyrsus.com>

GvR> It was more work than I had hoped for, because Eric
GvR> apparently (despite having developer privileges!) doesn't use
GvR> the CVS tree -- he sent in a diff relative to the 2.0
GvR> release.

I'm using the CVS tree now.  I did that patch relative to 2.0 for
boring reasons having to do with the state of my laptop.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The IRS has become morally corrupted by the enormous power which we in
Congress have unwisely entrusted to it. Too often it acts like a
Gestapo preying upon defenseless citizens.
	-- Senator Edward V. Long



From thomas at xs4all.net  Thu Jan 11 16:48:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 16:48:32 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5DD248.8EE0DF63@lemburg.com>; from mal@lemburg.com on Thu, Jan 11, 2001 at 04:33:28PM +0100
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com>
Message-ID: <20010111164831.X2467@xs4all.nl>

On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:

> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> 
> But __all__ has to be user-defined, so I don't buy that argument.
> Note that the only true way to recognize a package is by looking
> for an attribute "__path__" since Python adds this for packages
> only.

Ehm.... What, exactly, prevents usercode from doing

__path__ = "neener, neener"

? In other words, even *that* isn't a true way to recognize a package. You
can see what isn't a package, but not what is.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan 11 16:58:55 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 10:58:55 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 07:14:17 PST."
             <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>

> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package, and the only way to get the list of submodule names.
> 
> If __all__ is overloaded to also represent exportable symbols in
> modules, these two pieces of information will be impossible (or
> require much ugly hackery) to obtain.

Marc-Andre already explained that __all__ is not to be trusted.

If you want a reasonably good test for package-ness, use the presence
of __path__.

For a really good test, check whether __file__ ends in __init__.py[c].
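
The two tests can be sketched as follows.  The function names has_path and
file_is_init are hypothetical, and neither check is proof against user
code that sets these attributes by hand.

```python
import os


def has_path(module):
    """Reasonably good test: packages get a __path__ attribute."""
    return hasattr(module, "__path__")


def file_is_init(module):
    """Stricter test: the module was loaded from an __init__.py[c] file."""
    filename = getattr(module, "__file__", None)
    if not filename:
        return False  # built-in modules may have no __file__ at all
    return os.path.basename(filename) in ("__init__.py", "__init__.pyc")
```

A module object that merely assigns __path__ = "neener, neener" passes the
first test but not the second.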

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Thu Jan 11 17:14:00 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 11:14:00 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
Message-ID: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>

I've put a new version of the setup.py script at
     http://www.mems-exchange.org/software/files/python/setup.py

(I'm at work and can't remember the password to get into
www.amk.ca. :) )

This version improves the detection of Tcl/Tk, handles the
_curses_panel module, and doesn't do a chdir().  Same drill as before:
just grab the script, drop it in the root of your Python source tree
(2.0 or current CVS), run "./python setup.py build", and look at the
modules it compiles.  I can try it on Linux, so I'm most interested in
hearing reports for other Unix versions (*BSD, HP-UX, etc.)

--amk





From ping at lfw.org  Thu Jan 11 17:36:36 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:36:36 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>

I'm pleased to announce a reasonable first pass at a documentation
utility for interactive use.  "pydoc" is usable in three ways:

1.  At the shell prompt, "pydoc <name>" displays documentation
    on <name>, very much like "man".

2.  At the shell prompt, "pydoc -k <keyword>" lists modules whose
    one-line descriptions mention the keyword, like "man -k".

3.  Within Python, "from pydoc import help" provides a "help"
    function to display documentation at the interpreter prompt.

All of them use sys.path in order to guarantee that the documentation
you see matches the modules you get.
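
For comparison, a small sketch of the same lookup done programmatically.
It assumes the pydoc module as it later shipped in the standard library,
not the script offered for download here; the wrapper name doc_text is
hypothetical.

```python
import pydoc


def doc_text(name):
    """Return plain-text documentation for a dotted name or object."""
    # pydoc.plaintext renders without the overstrike bolding meant for pagers.
    return pydoc.render_doc(name, renderer=pydoc.plaintext)
```

Calling doc_text("abs") returns the text as a string instead of sending it
through a pager.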

To try "pydoc", download:

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py
    http://www.lfw.org/python/textdoc.py
    http://www.lfw.org/python/inspect.py

I would very much appreciate your feedback, especially from testing
on non-Unix platforms.  Thank you!

I've pasted some examples from my shell below (when you actually
run pydoc, the output is piped through "less", "more", or a pager
implemented in Python, depending on what is available).



-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler



skuld[1268]% pydoc -k mail
mailbox - Classes to handle Unix style, MMDF style, and MH style mailboxes.
mailcap - Mailcap file handling.  See RFC 1524.
mimify - Mimification and unmimification of mail messages.
test.test_mailbox - (no description)

skuld[1269]% pydoc -k text
textdoc - Generate text documentation from live Python objects.
collab - Routines for collaboration, especially group editing of text documents.
gettext - Internationalization and localization support.
test.test_gettext - (no description)
curses.textpad - Simple textbox editing widget with Emacs-like keybindings.
distutils.text_file - text_file
ScrolledText - (no description)

skuld[1270]% pydoc -k html
htmldoc - Generate HTML documentation from live Python objects.
htmlentitydefs - HTML character entity references.
htmllib - HTML 2.0 parser.

skuld[1271]% pydoc md5

Python Library Documentation: built-in module md5

NAME
    md5

FILE
    (built-in)

DESCRIPTION
    This module implements the interface to RSA's MD5 message digest
    algorithm (see also Internet RFC 1321). Its use is quite
    straightforward: use the new() to create an md5 object. You can now
    feed this object with arbitrary strings using the update() method, and
    at any point you can ask it for the digest (a strong kind of 128-bit
    checksum, a.k.a. ``fingerprint'') of the contatenation of the strings
    fed to it so far using the digest() method.
    
    Functions:
    
    new([arg]) -- return a new md5 object, initialized with arg if provided
    md5([arg]) -- DEPRECATED, same as new, but for compatibility
    
    Special Objects:
    
    MD5Type -- type object for md5 objects

FUNCTIONS
    md5(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.
    
    new(no arg info)
        new([arg]) -> md5 object
        
        Return a new md5 object. If arg is present, the method call update(arg)
        is made.

skuld[1272]% pydoc types

Python Library Documentation: module types

NAME
    types

FILE
    /home/ping/sw/Python-1.5.2/Lib/types.py

DESCRIPTION
    # Define names for all type symbols known in the standard interpreter.
    # Types that are part of optional modules (e.g. array) are not listed.

skuld[1273]% pydoc abs

Python Library Documentation: built-in function abs

abs (no arg info)
    abs(number) -> number
    
    Return the absolute value of the argument.

skuld[1274]% pydoc repr             

Python Library Documentation: built-in function repr

repr (no arg info)
    repr(object) -> string
    
    Return the canonical string representation of the object.
    For most object types, eval(repr(object)) == object.


Python Library Documentation: module repr

NAME
    repr - # Redo the `...` (representation) but with limits on most sizes.

FILE
    /home/ping/sw/Python-1.5.2/Lib/repr.py

CLASSES
    Repr
    
    class Repr
        __init__(self)
        
        repr(self, x)
        
        repr1(self, x, level)
        
        repr_dictionary(self, x, level)
        
        repr_instance(self, x, level)
        
        repr_list(self, x, level)
        
        repr_long_int(self, x, level)
        
        repr_string(self, x, level)
        
        repr_tuple(self, x, level)

FUNCTIONS
    repr(no arg info)

skuld[1275]% pydoc re.MatchObject

Python Library Documentation: class MatchObject in re

class MatchObject
    __init__(self, re, string, pos, endpos, regs)
    
    end(self, g=0)
        Return the end of the substring matched by group g
    
    group(self, *groups)
        Return one or more groups of the match
    
    groupdict(self, default=None)
        Return a dictionary containing all named subgroups of the match
    
    groups(self, default=None)
        Return a tuple containing all subgroups of the match object
    
    span(self, g=0)
        Return (start, end) of the substring matched by group g
    
    start(self, g=0)
        Return the start of the substring matched by group g

skuld[1276]% pydoc xml    

Python Library Documentation: package xml

NAME
    xml - Core XML support for Python.

FILE
    /home/ping/dev/python/dist/src/Lib/xml/__init__.py

DESCRIPTION
    This package contains three sub-packages:
    
    dom -- The W3C Document Object Model.  This supports DOM Level 1 +
           Namespaces.
    
    parsers -- Python wrappers for XML parsers (currently only supports Expat).
    
    sax -- The Simple API for XML, developed by XML-Dev, led by David
           Megginson and ported to Python by Lars Marius Garshol.  This
           supports the SAX 2 API.

VERSION
    1.8

skuld[1277]% pydoc lovelyspam
no Python documentation found for lovelyspam

skuld[1278]% python
Python 1.5.2 (#1, Dec 12 2000, 02:25:44)  [GCC egcs-2.91.66 19990314/Linux (egcs- on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>>          
>>> from pydoc import help
>>> help(int)
Help on built-in function int:

int (no arg info)
    int(x) -> integer
    
    Convert a string or number to an integer, if possible.
    A floating point argument will be truncated towards zero.

>>> help("urlparse.urljoin")
Help on function urljoin in module urlparse:

urljoin(base, url, allow_fragments=1)
    # Join a base URL and a possibly relative URL to form an absolute
    # interpretation of the latter.
>>> import random
>>> help(random.generator)
Help on class generator in module random:

class generator(whrandom.whrandom)
    Random generator class.
    
    __init__(self, a=None)
        Constructor.  Seed from current time or hashable value.
    
    seed(self, a=None)
        Seed the generator from current time or hashable value.
>>> 





From moshez at zadka.site.co.il  Fri Jan 12 01:48:30 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 02:48:30 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <3A5C97F4.945D0C1@lemburg.com>
References: <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>

On Wed, 10 Jan 2001 18:12:20 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:

> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> +1 -- this won't be me though (at least not this week).

I'm working on it -- I'll have a patch ready as soon as my slow
modem will manage to finish the "cvs diff".  Guido, I'll
assign it to you, OK?

> Cool.  This could make Python instances usable as "modules"
> -- with full getattr() hook support !

My patch already does that -- if the instance supports __all__.

> For IMPORT_STAR I'd suggest first looking for __all__ and
> then reverting to __dict__.items() in case this fails. 

That's what my patch is doing.

> BTW, is __dict__ needed by the import mechanism or would
> the getattr/setattr slots suffice ? And if yes, must it
> be a real Python dictionary ?

My patch works with getattr (no setattr) as long as there
is an __all__ attribute. 
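
The lookup order discussed in this thread can be sketched in pure Python.
This is an illustration only, not the C patch itself; the names star_names
and import_star are hypothetical.  The fallback skips names starting with
an underscore, as "from module import *" does when there is no __all__.

```python
def star_names(module):
    """Names that 'from module import *' would bind under the __all__ rule."""
    try:
        names = module.__all__  # any object with attribute access will do
    except AttributeError:
        # No __all__: fall back to the namespace, skipping _private names.
        names = [n for n in vars(module) if not n.startswith("_")]
    return list(names)


def import_star(module, namespace):
    """Copy the exported names from module into the namespace dictionary."""
    for name in star_names(module):
        # Only getattr is needed here, never setattr on the source object.
        namespace[name] = getattr(module, name)
```

Because only getattr is used on the source, an instance with an __all__
attribute and a __getattr__ hook can stand in for a module.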

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From ping at lfw.org  Thu Jan 11 17:42:44 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 08:42:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.

Sorry, you're right.  I retract my comment about __all__.


-- ?!ng




From skip at mojam.com  Thu Jan 11 17:47:13 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 10:47:13 -0600 (CST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010111164831.X2467@xs4all.nl>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
	<3A5DD248.8EE0DF63@lemburg.com>
	<20010111164831.X2467@xs4all.nl>
Message-ID: <14941.58257.304339.437443@beluga.mojam.com>

    Thomas> __path__ = "neener, neener"

I believe correct English usage here is "neener, neener, neener", with a
little extra emphasis on the first syllable of the third "neener"...

does-that-help?-ly y'rs,

Skip



From MarkH at ActiveState.com  Fri Jan 12 17:55:29 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 12 Jan 2001 08:55:29 -0800
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEOGIHAA.tim.one@home.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>

> 4. The way mmapmodule.c is coded and built after Guido's change appears to
> me to be the same as how every other non-builtin module is coded and built
> on Windows.  For example, winsound.c, which uses DL_EXPORT(void)
> before its
> initwinsound and where that macro also expands to "void".  But importing
> winsound works fine.

winsound adds "/export:initwinsound" to the link line.  This is an
alternative to __declspec in the sources.

This all gets back to a discussion we had here nearly a year or so ago -
that "DL_EXPORT" isn't capturing our semantics, and that we should probably
create #defines that match the _intent_ of the definition, rather than the
implementation details - ie, replace DL_EXPORT with (say) PY_API_DECL and
PY_MODULEINIT_DECL or some such.

I'm happy to think about this and help implement it if the time is now
right...

> Any Windows geek got a clue?

Isn't that question a paradox? ;-)

Mark.




From skip at mojam.com  Thu Jan 11 18:11:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 11 Jan 2001 11:11:23 -0600 (CST)
Subject: [Python-Dev] dir()/__all__/etc
Message-ID: <14941.59707.632995.224116@beluga.mojam.com>

I know Guido has said he doesn't want to fiddle with dir(), but my sense of
things from the overall discussion of the __exports__ concept tells me that
when used interactively dir() often presents confusing output for new Python
users.

I twiddled CGIHTTPServer to have __all__ and added the following dir()
function to my PYTHONSTARTUP file:

def dir(o,showall=0):
    if not showall and hasattr(o, "__all__"):
        x = list(o.__all__)
        x.sort()
        return x
    from __builtin__ import dir as d
    return d(o)

Compare its output with and without showall set:

  >>> dir(CGIHTTPServer)
  ['CGIHTTPRequestHandler', 'test']
  >>> dir(CGIHTTPServer,1)
  ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
   '__builtins__', '__doc__', '__file__', '__name__', '__version__',
   'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
   'urllib']

I haven't demonstrated any great programming prowess with this little
function, but I rather suspect it may be beyond most brand new users.  If
Guido can't be convinced to allow dir() to change, how about adding a sample
PYTHONSTARTUP file to the distribution that contains little bits like this
and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
it does)?
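
For readers on later Python versions, where the __builtin__ module is
named builtins, a roughly equivalent sketch of the function above.  The
name short_dir is hypothetical and is chosen so that the built-in dir() is
not shadowed.

```python
import builtins


def short_dir(obj, showall=False):
    """Like dir(), but list only __all__ unless showall is true."""
    if not showall and hasattr(obj, "__all__"):
        return sorted(obj.__all__)
    # Fall back to the real built-in for everything else.
    return builtins.dir(obj)
```

A startup file could bind it with "dir = short_dir" to get the behaviour
shown in the session above.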

Skip



From mal at lemburg.com  Thu Jan 11 18:25:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 11 Jan 2001 18:25:20 +0100
Subject: [Python-Dev] Add __exports__ to modules
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <3A5DD248.8EE0DF63@lemburg.com> <20010111164831.X2467@xs4all.nl>
Message-ID: <3A5DEC80.596F0818@lemburg.com>

Thomas Wouters wrote:
> 
> On Thu, Jan 11, 2001 at 04:33:28PM +0100, M.-A. Lemburg wrote:
> 
> > > Please don't use __all__.  At the moment, __all__ is the only way
> > > to easily tell whether a particular module object really represents
> > > a package, and the only way to get the list of submodule names.
> >
> > But __all__ has to be user-defined, so I don't buy that argument.
> > Note that the only true way to recognize a package is by looking
> > for an attribute "__path__" since Python adds this for packages
> > only.
> 
> Ehm.... What, exactly, prevents usercode from doing
> 
> __path__ = "neener, neener"
> 
> ? In other words, even *that* isn't a true way to recognize a package. You
> can see what isn't a package, but not what is.

Purists.... ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Fri Jan 12 03:06:37 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:06:37 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <14941.49059.26189.733094@beluga.mojam.com>
References: <14941.49059.26189.733094@beluga.mojam.com>, <E14GgKS-0002AH-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 08:13:55 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:

> While it works, is it really kosher to test w's value after the DECREF?

Yes. It may not point to anything valid, but it won't be NULL.

> Just seems like an odd construct to me.  I'm used to seeing the test
> immediately after it's been set.

It was more convenient that way. And I'm pretty certain the _DECREF
macros do not change their arguments.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 03:09:13 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:09:13 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 07:14:17 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:
> On Wed, 10 Jan 2001, Guido van Rossum wrote:
> > Yes -- I came up with the same thought.
> > 
> > So here's a plan: somebody please submit a patch that does only one
> > thing: from...import * looks for __all__ and if it exists, imports
> > exactly those names.  No changes to dir(), or anything.
> 
> Please don't use __all__.  At the moment, __all__ is the only way
> to easily tell whether a particular module object really represents
> a package

Why not __init__? It has to be there, and is in no other module object.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 03:23:16 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 04:23:16 +0200 (IST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>
References: <20010112004830.21E10A82F@darjeeling.zadka.site.co.il>, <3A5C97F4.945D0C1@lemburg.com>, <200101052014.PAA20328@cj20424-a.reston1.va.home.com>  
	            <3A5C61D8.2E5D098C@lemburg.com> <200101101653.LAA28986@cj20424-a.reston1.va.home.com>
Message-ID: <20010112022316.BE682A82D@darjeeling.zadka.site.co.il>

On Fri, 12 Jan 2001, Moshe Zadka <moshez at zadka.site.co.il> wrote:

> I'm working on it -- I'll have a patch ready as soon as my slow
> modem will manage to finish the "cvs diff".  Guido, I'll
> assign it to you, OK?

OK, it's 103200.
Unfortunately, I couldn't assign it to Guido, since I couldn't
upload it at all (yeah, still those lynx problems). This time
I managed to get one specific person to upload for me, but someone
else will have to assign to Guido.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From nas at arctrix.com  Thu Jan 11 12:42:51 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:42:51 -0800
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 11, 2001 at 11:14:00AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us>
Message-ID: <20010111034251.A23512@glacier.fnational.com>

Here is what I get on my Debian Linux machine:

  _codecs.so        cPickle.so    imageop.so        pwd.so       termios.so
  _curses.so        cStringIO.so  linuxaudiodev.so  regex.so     time.so
  _curses_panel.so  cmath.so      math.so           resource.so  timing.so
  _locale.so        crypt.so      md5.so            rgbimg.so    ucnhash.so
  _socket.so        dbm.so        mmap.so           rotor.so     unicodedata.so
  _tkinter.so       errno.so      new.so            select.so    zlib.so
  array.so          fcntl.so      nis.so            sha.so
  audioop.so        fpectl.so     operator.so       signal.so
  binascii.so       gdbm.so       parser.so         strop.so
  bsddb.so          grp.so        pcre.so           syslog.so
  
I think that is every module which can be compiled on my machine.  Great work,
Andrew (and the distutils developers).

  Neil



From nas at arctrix.com  Thu Jan 11 12:47:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 11 Jan 2001 03:47:09 -0800
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>; from skip@mojam.com on Thu, Jan 11, 2001 at 11:11:23AM -0600
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010111034709.C23512@glacier.fnational.com>

I'm -1 on making dir() pay attention to __all__.  I'm +1 on
adding a help() function which pays attention to __all__ and
(optionally?) prints doc strings.
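
A minimal sketch of what such a helper could look like, written for present-day Python 3 rather than the interpreter under discussion. The name `brief_help` and its `show_docs` flag are hypothetical; nothing here is taken from pydoc itself. It lists the names in `__all__` when the module defines it, falls back to the non-underscore names otherwise, and optionally appends the first line of each doc string:

```python
import inspect


def brief_help(module, show_docs=True):
    """Return one line per exported name, optionally with a doc summary."""
    # Prefer __all__ when present; otherwise fall back to public-looking names.
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    lines = []
    for name in sorted(names):
        obj = getattr(module, name, None)
        doc = inspect.getdoc(obj) if show_docs else None
        summary = doc.splitlines()[0] if doc else ""
        lines.append((name + "  " + summary).rstrip())
    return "\n".join(lines)
```

Calling `print(brief_help(some_module))` leaves dir() untouched, which is the split proposed above.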

  Neil



From gstein at lyra.org  Thu Jan 11 20:38:50 2001
From: gstein at lyra.org (Greg Stein)
Date: Thu, 11 Jan 2001 11:38:50 -0800
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101111558.KAA15447@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 10:58:55AM -0500
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <200101111558.KAA15447@cj20424-a.reston1.va.home.com>
Message-ID: <20010111113850.F4640@lyra.org>

On Thu, Jan 11, 2001 at 10:58:55AM -0500, Guido van Rossum wrote:
> > Please don't use __all__.  At the moment, __all__ is the only way
> > to easily tell whether a particular module object really represents
> > a package, and the only way to get the list of submodule names.
> > 
> > If __all__ is overloaded to also represent exportable symbols in
> > modules, these two pieces of information will be impossible (or
> > require much ugly hackery) to obtain.
> 
> Marc-Andre already explained that __all__ is not to be trusted.
> 
> If you want a reasonably good test for package-ness, use the presence
> of __path__.
> 
> For a really good test, check whether __file__ ends in __init__.py[c].

Even that isn't safe: if the module was pulled from an archive, __file__
might not get set.

Determining whether something is a package is highly dependent upon how it
was brought into the system. It is entirely possible that you *can't* know
whether something represents a package.

You can get close by searching sys.modules for modules "below" the given
module. But if none have been imported yet, then you're out of luck.
If you're using imputil, then you can look for __ispkg__ in the module.
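
The three heuristics mentioned in this thread can be gathered in one place. This is only a sketch in present-day Python 3; the function name `package_evidence` and the keys of the returned dict are hypothetical. Each check is reported separately because, as noted above, none of them is conclusive on its own:

```python
import os
import sys


def package_evidence(module):
    """Collect hints that `module` is a package; no single hint is proof."""
    name = module.__name__
    # __file__ may be missing (for example, a module loaded from an archive).
    filename = getattr(module, "__file__", None) or ""
    base = os.path.splitext(os.path.basename(filename))[0]
    return {
        # Guido's "reasonably good" test.
        "has_path": hasattr(module, "__path__"),
        # Guido's stricter test: the file is __init__.py or __init__.pyc.
        "file_is_init": base == "__init__",
        # Greg's fallback: something "below" it has already been imported.
        "has_loaded_children": any(
            key.startswith(name + ".") for key in list(sys.modules)
        ),
    }
```

The last check can give false positives as well as false negatives: `os` is not a package, yet `os.path` sits in sys.modules.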

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From thomas at xs4all.net  Thu Jan 11 20:50:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 20:50:24 +0100
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 12, 2001 at 04:09:13AM +0200
References: <Pine.LNX.4.10.10101110711550.5846-100000@skuld.kingmanhall.org> <20010112020913.1FE70A82F@darjeeling.zadka.site.co.il>
Message-ID: <20010111205024.Z2467@xs4all.nl>

On Fri, Jan 12, 2001 at 04:09:13AM +0200, Moshe Zadka wrote:

> Why not __init__? It has to be there, and is in no other module object.

Wrong association... __init__ would be a method that gets executed. (At
least that's what I'd expect :)

'sides,-everyone-was-in-agreement-on-__all__-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From MarkH at ActiveState.com  Thu Jan 11 21:25:30 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 11 Jan 2001 12:25:30 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il>
Message-ID: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>

> It was more convenient that way. And I'm pretty certain the _DECREF
> macros do not change their arguments.

Pretty certain???  That doesn't inspire confidence <wink>. How certain are
you that this will be true in the future?

I think it bad style indeed - for example, I could see benefit in having
DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
builds.  What if that decision is taken in the future?

I thought rules were pretty clear with reference counting - don't assume
_anything_ about the object unless you hold a reference (or are damn sure
someone else does!)

Mark.




From thomas at xs4all.net  Thu Jan 11 22:41:57 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 11 Jan 2001 22:41:57 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Thu, Jan 11, 2001 at 12:25:30PM -0800
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010111224157.A2467@xs4all.nl>

On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:

> I thought rules were pretty clear with reference counting - don't assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

Moshe isn't breaking that rule. He isn't assuming anything about the object,
just about the value of the pointer to that object. I agree, though, that
it's bad practice to rely on it having the old value, after DECREFing it.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Thu Jan 11 22:48:46 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:48:46 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 08:42:44 PST."
             <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101110842110.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>

> Sorry, you're right.  I retract my comment about __all__.

Can you explain *why* you wanted to test for package-ness?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Thu Jan 11 22:55:24 2001
From: guido at python.org (Guido van Rossum)
Date: Thu, 11 Jan 2001 16:55:24 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 11:14:00 EST."
             <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> 
Message-ID: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>

> I've put a new version of the setup.py script at
>      http://www.mems-exchange.org/software/files/python/setup.py
> 
> (I'm at work and can't remember the password to get into
> www.amk.ca. :) )
> 
> This version improves the detection of Tcl/Tk, handles the
> _curses_panel module, and doesn't do a chdir().  Same drill as before:
> just grab the script, drop it in the root of your Python source tree
> (2.0 or current CVS), run "./python setup.py build", and look at the
> modules it compiles.  I can try it on Linux, so I'm most interested in
> hearing reports for other Unix versions (*BSD, HP-UX, etc.)

Good work -- but I still can't run this inside a platform-specific
subdirectory.  Are you planning on supporting this?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at loewis.home.cs.tu-berlin.de  Thu Jan 11 22:20:45 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 11 Jan 2001 22:20:45 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
Message-ID: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>

> I would very much appreciate your feedback

At first glance, it looks *very* promising. I really look forward
to seeing it in 2.1.

However, robustness probably needs to be improved:

>>> help()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: not enough arguments to help(); expected 1, got 0    

Wasn't there even a proposal that

>>> help

should do something meaningful (by implementing __repr__)?

>>> import string
>>> help(string)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "pydoc.py", line 183, in help
    pager('Help on %s:\n\n' % desc + textdoc.document(thing))
  File "./textdoc.py", line 171, in document
    if inspect.ismodule(object): results = document_module(object)
  File "./textdoc.py", line 87, in document_module
    if (inspect.getmodule(value) or object) is object:
  File "./inspect.py", line 190, in getmodule
    file = getsourcefile(object)
  File "./inspect.py", line 204, in getsourcefile
    filename = getfile(object)
  File "./inspect.py", line 172, in getfile
    raise TypeError, 'arg is a built-in class'
TypeError: arg is a built-in class

Also, the tools could use some command line options:

martin at mira:~/pydoc > ./pydoc.py --help
Traceback (most recent call last):
  File "./pydoc.py", line 190, in ?
    opts[args[i][1:]] = args[i+1]
IndexError: list index out of range

At a minimum, I propose -h, --help, -v, -V.
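
A rough sketch of option handling that would not fall over on `--help`, using the standard getopt module in present-day Python 3. The thread does not say what -v and -V should mean; treating them as "verbose" and "version" is an assumption here, and `parse_args` and the shape of its result are hypothetical, not pydoc's actual code:

```python
import getopt


def parse_args(argv):
    """Parse pydoc-style arguments without raising on bad input."""
    result = {"action": "document", "verbose": False, "error": None, "names": []}
    try:
        opts, names = getopt.getopt(argv, "hvV", ["help"])
    except getopt.GetoptError as err:
        # Unknown option: show usage instead of a traceback.
        result["action"] = "usage"
        result["error"] = str(err)
        return result
    result["names"] = names
    for flag, _value in opts:
        if flag in ("-h", "--help"):
            result["action"] = "usage"
        elif flag == "-V":            # assumed meaning: print version
            result["action"] = "version"
        elif flag == "-v":            # assumed meaning: verbose output
            result["verbose"] = True
    # Nothing to document and nothing else requested: fall back to usage.
    if result["action"] == "document" and not names:
        result["action"] = "usage"
    return result
```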

Regards,
Martin



From fdrake at acm.org  Thu Jan 11 23:11:24 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 17:11:24 -0500 (EST)
Subject: [Python-Dev] [PEP 205] Weak References PEP updated, patch available!
Message-ID: <14942.12172.129547.770776@cj42289-a.reston1.va.home.com>

  I've updated the Weak References PEP a little:

http://python.sourceforge.net/peps/pep-0205.html

  A preliminary version of the implementation and documentation is
available as well:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  Please send feedback on the PEP or implementation to me.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From akuchlin at mems-exchange.org  Thu Jan 11 23:26:33 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 11 Jan 2001 17:26:33 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101112155.QAA16678@cj20424-a.reston1.va.home.com>; from guido@python.org on Thu, Jan 11, 2001 at 04:55:24PM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>
Message-ID: <20010111172633.A26249@kronos.cnri.reston.va.us>

On Thu, Jan 11, 2001 at 04:55:24PM -0500, Guido van Rossum wrote:
>Good work -- but I still can't run this inside a platform-specific
>subdirectory.  Are you planning on supporting this?

I didn't really understand this when you pointed it out, but forgot to
ask for clarification.  What does your directory layout look like?

--amk




From ping at lfw.org  Thu Jan 11 23:26:53 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:26:53 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Martin v. Loewis wrote:
> 
> However, robustness probably needs to be improved:

Agreed.

> Wasn't there even a proposal that
> 
> >>> help
> 
> should do something meaningful (by implementing __repr__)?

There was.  I am planning to incorporate Paul Prescod's mechanism
for doing this; i just didn't have time to throw in that feature
yet, and wanted feedback on the man-like stuff first.

My next two targets are:
    1.  Generating text from the HTML documentation files
        using Paul Prescod's stuff in onlinehelp.py.

    2.  Running a background HTTP server that produces its
        pages using htmldoc.py.

Both are pieces we already have and only need to integrate; i just
wanted to get at least a working candidate done first.

Did using pydoc like "man" work okay for you?

> >>> import string
> >>> help(string)
> Traceback (most recent call last):
...
> TypeError: arg is a built-in class

Mine doesn't do this for me.  I think i may have left up an older version
of inspect.py by mistake.  Try downloading

    http://www.lfw.org/python/inspect.py

again -- apologies for the hassle.

> Also, the tools could use some command line options:
> 
> martin at mira:~/pydoc > ./pydoc.py --help
> Traceback (most recent call last):
>   File "./pydoc.py", line 190, in ?
>     opts[args[i][1:]] = args[i+1]
> IndexError: list index out of range
> 
> At a minimum, I propose -h, --help, -v, -V.

Okay.  There is usage help already; i just failed to make it sufficiently
robust about deciding when to show it.

    skuld[1010]% pydoc
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler




From ping at lfw.org  Thu Jan 11 23:28:44 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 14:28:44 -0800 (PST)
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: <200101112148.QAA16227@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Guido van Rossum wrote:
> > Sorry, you're right.  I retract my comment about __all__.
> 
> Can you explain *why* you wanted to test for package-ness?

Auto-generating documentation.  pydoc.py currently tests for __path__,
and looks for the presence of __init__.py in a subdirectory to mean
that the subdirectory name is a package name.  Is it safe on all platforms
to just list all .py files in the subdirectory to get all submodules?
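
For concreteness, the directory scan being asked about might look like the sketch below (present-day Python 3; `list_submodules` is a hypothetical name, not the code in pydoc.py). It also accepts compiled-only `.pyc` files and subdirectories containing `__init__.py`, since listing `.py` files alone would miss those; extension modules are still not covered:

```python
import os


def list_submodules(package_dir):
    """Guess the submodule names of the package stored in `package_dir`."""
    found = set()
    for entry in os.listdir(package_dir):
        full = os.path.join(package_dir, entry)
        stem, ext = os.path.splitext(entry)
        if os.path.isdir(full):
            # A subdirectory counts only if it is itself a package.
            if os.path.isfile(os.path.join(full, "__init__.py")):
                found.add(entry)
        elif ext in (".py", ".pyc") and stem != "__init__":
            found.add(stem)
    return sorted(found)
```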


-- ?!ng

"If I have seen farther than others, it is because I was standing on a
really big heap of midgets."
    -- K. Eric Drexler




From tim.one at home.com  Fri Jan 12 00:17:06 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 18:17:06 -0500
Subject: [Python-Dev] RE: Baffled on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPKEIHCOAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBJIIAA.tim.one@home.com>

[Mark Hammond]
> winsound adds "/export:initwinsound" to the link line.  This is an
> alternative to __declspec in the sources.

Yup/arghghghgh.  It's fixed now.  Thanks!

> This all gets back to a discussion we had here nearly a year
> or so ago -

Yup/arghghghgh.
.
> that "DL_EXPORT" isn't capturing our semantics, and that we should
> probably create #defines that match the _intent_ of the
> definition, rather than the implementation details - ie, replace
> DL_EXPORT with (say) PY_API_DECL and PY_MODULEINIT_DECL or some
> such.

Yup/noarghghghgh.

> I'm happy to think about this and help implement it if the time
> is now right...

Same here.  Now how can we tell whether the time is right?  I must say, it
hasn't gotten better by leaving it alone for a year.  I think we need a Unix
dweeb to play along, though -- if only to confirm that their compilers are
no help.

>> Any Windows geek got a clue?

> Isn't that question a paradox? ;-)

Well, nobody else will understand this, but *we* know that Windows geeks
need more clues than everyone else put together just to get the box booted
each day (or hour <0.9 wink>).




From michel at digicool.com  Fri Jan 12 02:15:52 2001
From: michel at digicool.com (Michel Pelletier)
Date: Thu, 11 Jan 2001 20:15:52 -0500
Subject: [Python-Dev] New Draft PEP: Python Interfaces
Message-ID: <web-555709@digicool.com>

Hello,

I have roughed out a draft PEP that proposes the extension
of Python to include an interface framework.  It is posted
online here:

http://www.zope.org/Members/michel/InterfacesPEP/PEP.txt

This is my first revision and stab at a PEP.  I'd like to
find out what you think about the PEP and maybe discuss it
some more offline on a different list.

Thanks!

-Michel



From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 02:15:25 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 02:15:25 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
	(message from Ka-Ping Yee on Thu, 11 Jan 2001 14:26:53 -0800 (PST))
References: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>

> Did using pydoc like "man" work okay for you?

Yes, that is very impressive.

> Mine doesn't do this for me.  I think i may have left up an older version
> of inspect.py by mistake.  Try downloading
> 
>     http://www.lfw.org/python/inspect.py
> 
> again -- apologies for the hassle.

No need to apologize. It works fine now.

Thanks,
Martin



From moshez at zadka.site.co.il  Fri Jan 12 10:53:35 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:53:35 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
References: <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com>
Message-ID: <20010112095335.E8A15A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, "Mark Hammond" <MarkH at ActiveState.com> wrote:

> I think it bad style indeed - for example, I could see benefit in having
> DECREF (or _Py_Dealloc, called by decref) set the object to NULL in debug
> builds.  What if that decision is taken in the future?
> 
> I thought rules were pretty clear with reference counting - don't assume
> _anything_ about the object unless you hold a reference (or are damn sure
> someone else does!)

I'm not assuming anything about the object -- I'm assuming something
about the pointer. And macros should not change their arguments --
DECREF is basically a wrapper around _Py_Dealloc((PyObject *)(op)).

Just like

	free(pointer);
	if (pointer == NULL)
		do_something();

is perfectly legal C.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From moshez at zadka.site.co.il  Fri Jan 12 10:57:32 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 11:57:32 +0200 (IST)
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <14941.59707.632995.224116@beluga.mojam.com>
References: <14941.59707.632995.224116@beluga.mojam.com>
Message-ID: <20010112095732.1F65BA82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001 11:11:23 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:
> 
> I know Guido has said he doesn't want to fiddle with dir(), but my sense of
> things from the overall discussion of the __exports__ concept tells me that
> when used interactively dir() often presents confusing output for new Python
> users.
> 
> I twiddled CGIHTTPServer to have __all__ and added the following dir()
> function to my PYTHONSTARTUP file:
> 
> def dir(o,showall=0):
>     if not showall and hasattr(o, "__all__"):
>         x = list(o.__all__)
>         x.sort()
>         return x
>     from __builtin__ import dir as d
>     return d(o)
> 
> Compare its output with and without showall set:
> 
>   >>> dir(CGIHTTPServer)
>   ['CGIHTTPRequestHandler', 'test']
>   >>> dir(CGIHTTPServer,1)
>   ['BaseHTTPServer', 'CGIHTTPRequestHandler', 'SimpleHTTPServer', '__all__',
>    '__builtins__', '__doc__', '__file__', '__name__', '__version__',
>    'executable', 'nobody', 'nobody_uid', 'os', 'string', 'sys', 'test',
>    'urllib']
> 
> I haven't demonstrated any great programming prowess with this little
> function, but I rather suspect it may be beyond most brand new users.  If
> Guido can't be convinced to allow dir() to change, how about adding a sample
> PYTHONSTARTUP file to the distribution that contains little bits like this
> and Ping's pydoc.help stuff (assuming it gets into the distro, which I hope
> it does)?

And, while we're at it, the following bit too can be in the PYTHONSTARTUP:

def display(x):
	import __builtin__
	__builtin__._ = None
	if type(x) == type(''):
		print `x`
	else:
		print x
	__builtin__._ = x

import sys
sys.displayhook = display
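
The same hook in present-day Python 3 spelling, as a rough sketch: `__builtin__` is now `builtins`, backticks are `repr()`, and `print` is a function. Like the version above, and unlike the default hook, it prints None as well:

```python
import builtins
import sys


def display(value):
    """Show strings via repr() and everything else via print()."""
    # Clear _ first so a failing print does not leave a stale value behind.
    builtins._ = None
    if isinstance(value, str):
        print(repr(value))
    else:
        print(value)
    builtins._ = value


sys.displayhook = display
```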

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From tim.one at home.com  Fri Jan 12 03:33:59 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 21:33:59 -0500
Subject: [Python-Dev] dir()/__all__/etc
In-Reply-To: <20010111034709.C23512@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBPIIAA.tim.one@home.com>

[Neil Schemenauer]
> I'm -1 on making dir() pay attention to __all__.

Me too.  The original __exports__ idea was an ironclad guarantee about which
names were externally visible for *any* purpose.  Then it made sense to
restrict dir() accordingly.  But if __all__ is just "a hint" (to be ignored
or honored at whim, by whoever chooses), the introspective uses of dir()
must be served too.
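
The difference between the two uses can be seen in a small self-contained sketch (present-day Python 3; the module name `all_demo` and its contents are made up for illustration). `from ... import *` honors `__all__`, while dir() still reports everything for introspection:

```python
import sys
import types

# Build a throwaway module in memory; "all_demo" is a hypothetical name.
demo = types.ModuleType("all_demo")
exec(
    "__all__ = ['public']\n"
    "def public(): return 'yes'\n"
    "def helper(): return 'no'\n",
    demo.__dict__,
)
sys.modules["all_demo"] = demo

# Star-import into a fresh namespace and see which names arrived.
namespace = {}
exec("from all_demo import *", namespace)
imported = sorted(n for n in namespace if not n.startswith("__"))

# dir() is unaffected by __all__ and still shows the helper.
visible = sorted(n for n in dir(demo) if not n.startswith("__"))

print(imported)   # ['public']
print(visible)    # ['helper', 'public']
```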

> I'm +1 on adding a help() function which pays attention to
> __all__ and (optionally?) prints doc strings.

I can't be +1 on anything that vague -- although I'm +1 on each part of it
if done in exactly the way I envision <wink>.




From ping at lfw.org  Fri Jan 12 03:51:54 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 11 Jan 2001 18:51:54 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <200101120115.f0C1FPx03702@mira.informatik.hu-berlin.de>
Message-ID: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>

On Fri, 12 Jan 2001, Martin v. Loewis wrote:
> > Did using pydoc like "man" work okay for you?
> 
> Yes, that is very impressive.

Good.  What platform did you try it on?

I have updated the scripts now to provide a very rudimentary HTTP server
feature:

    skuld[1316]% pydoc -p 8080
    starting server on port 8080

This starts a server on port 8080 that generates HTML documentation for
modules on the fly.  The root page (http://localhost:8080/) shows an
index of modules -- it badly needs some cleaning up, but at least it
provides access to all the documentation.

    http://www.lfw.org/python/pydoc.py
    http://www.lfw.org/python/htmldoc.py

Also, as you requested:

    skuld[1324]% pydoc -h
    /home/ping/bin/pydoc <name> ...
        Show documentation on something.
        <name> may be the name of a Python function, module,
        package, or a dotted reference to a class or function
        within a module or module in a package.

    /home/ping/bin/pydoc -k <keyword>
        Search for a keyword in the short descriptions of modules.

    /home/ping/bin/pydoc -p <port>
        Start an HTTP server on the given port on the local machine.


More to come.


-- ?!ng




From fdrake at acm.org  Fri Jan 12 04:02:00 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 11 Jan 2001 22:02:00 -0500 (EST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
References: <200101112120.f0BLKjc01982@mira.informatik.hu-berlin.de>
	<Pine.LNX.4.10.10101111420430.5846-100000@skuld.kingmanhall.org>
Message-ID: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>

Ka-Ping Yee writes:
 > My next two targets are:
 >     1.  Generating text from the HTML documentation files
 >         using Paul Prescod's stuff in onlinehelp.py.

  You mean the ones I publish as the standard documentation?  Relying
on the structure of that HTML is pure folly!  I don't think I can make
any guarantees that the HTML structures won't change as the
processing evolves.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Fri Jan 12 04:49:47 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 22:49:47 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111523.KAA14982@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com>

[Guido]
> I don't want to call FLOCKFILE while holding the Python lock, as
> this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> from a pipe or socket), no other Python thread can run!

Ah, good point!  Doesn't appear an essential point, though:  the
HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
release before the (dynamically only) FLOCKFILE and the last thread grab
after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
since that's lacking I'll drop it.

> ...
> I don't think that _filbuf can possibly wait for another thread to
> write data to the same stream object.

OK, I'll buy that.  Dropped too.

> ...
> OK.  It's unique to MS.  So close the bug report with a "won't fix"
> resolution.  There's no point in having bug reports remain open that
> we know we can't fix.

We don't really have a policy about that.  Perhaps you're articulating one
here, though!  I've always left bugs open if they're (a) bugs, and (b) open
<wink>.  For example, I left the Norton Blue-Screen crash bug open (although
I see now you eventually closed that).  Ditto the "Rare hangs in
w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
Just other examples of things we'll almost certainly never fix ourselves (we
have no handle on them, and all evidence says the OS is screwing up).

My view has been that if a user comes to the bug site, it's most helpful for
them if active (== "still happens") crashes and hangs appear among the open
problems.  Now that your view of it is clearer, I'll switch to yours.

too-easy<wink>-ly y'rs  - tim








From tim.one at home.com  Fri Jan 12 05:22:40 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 11 Jan 2001 23:22:40 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111527.KAA15005@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECGIIAA.tim.one@home.com>

[Guido]
> The locking prevents concurrent threads accessing the stream.
>
> But mixing reads and writes (without intervening fseek etc.) is
> illegal use of the stream, and the C standard allows them to be lax
> here, even if the program was single-threaded.
>
> In other words: the locking is so good that it serializes the
> sequence of reads and writes; but if the sequence of reads and
> writes is illegal, they don't guarantee anything.

We're never going to agree on this one, you know.

My definition of "bug" here has nothing to do with the std:  something's "a
bug" if it's not functioning as designed.  That's all.  So if the
implementers would say "oops!  that should not have happened!", then to me
it's "a bug".  It so happens I believe the MS implementers would consider
this to be a bug under that defn.  Multi-threaded libraries have to be
written to a much higher level than the C std guarantees (been there, done
that, and so have you), and this is specifically corruption in a crucial
area vulnerable to races.  They have a timing hole!  That's clear.  If the
MS implementers don't believe that's "a bug", then I'd say they're too
unprofessional to be allowed in the same country as a multithreaded library
<0.1 wink>.

Your definition of "bug" seems to be more "I don't want it in Python's open
bug list, so I'll do what Tim usually does and appeal to the std in a
transparent effort to convince someone that it's not really 'a bug' -- then
maybe I'll get it off of Python's bug list".

I'm sure you'll agree that's a fair summary of both sides <wink>.

it's-a-bug-and-it's-no-longer-on-python's-open-bug-list-ly y'rs
    - tim




From tim.one at home.com  Fri Jan 12 07:54:47 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 12 Jan 2001 01:54:47 -0500
Subject: [Python-Dev] RE: [Patches] [Patch #102915] xreadlines : readlines :: xrange : range
In-Reply-To: <200101111508.KAA14870@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECMIIAA.tim.one@home.com>

[Tim, on for_xreadlines vs readlines_sizehint, after disabling the
 default 1Mb buffer size in the latter]
> They're indistinguishable then on my box (on one run xreadlines
> is .1 seconds  (out of around 7.6 total) quicker, on another
> readlines_sizehint), *provided* that I specify the same buffer
> size (8192) that xreadlines uses internally.  However, if I even
> double that, readlines_sizehint is uniformly about 10% slower.  It's
> also a tiny bit slower if I cut the sizehint buffer size to 4096.

[Guido]
> 8192 happens to be the size of the stack-allocated buffer readlines()
> uses, and also the stdio BUFSIZ parameter, on many systems.  Look for
> SMALLCHUNK in fileobject.c.
>
> Would it make sense to tie the two constants together more to tune
> this optimally even when BUFSIZ is different?

Have to repeat what I first said:

> I'm afraid Mysteries will remain no matter how many
> person-decades we spend staring at this <0.5 wink> ...

I'm repeating that because BUFSIZ is 4096 on WinTel, but SMALLCHUNK (8192)
worked best for me.  Now we're in some complex balancing act among how often
the outer loop needs to refill the readlines_sizehint buffer; how out of
whack the latter is with the platform stdio buffer; whether platform malloc
takes only twice as long to allocate space for 2*N strings as for N; and, if
the readlines buffer is too large, at exactly which point the known Win9x
eventually-quadratic-time behavior of PyList_Append starts to kick in.  I
can't out-think all that.  Indeed, I can't out-think any of it <frown>.

After staring at the code, I expect my "only a tiny bit slower" was an
illusion:  if 0 < sizehint <= SMALLCHUNK, sizehint appears to have no effect
on the operation of file_readline.

BTW, changing fileobject.c's SMALLCHUNK to a copy of BUFSIZ didn't make any
difference on Windows.
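
For readers who have not seen it, the readlines_sizehint idiom being timed here is an outer loop that refills a batch of lines and an inner loop that consumes it. A minimal sketch in present-day Python 3 follows; the generator name `iter_lines` is hypothetical, and the 8192 default merely mirrors the buffer size quoted above:

```python
def iter_lines(stream, sizehint=8192):
    """Yield lines from `stream`, fetching roughly `sizehint` bytes per batch."""
    while True:
        # readlines(hint) returns whole lines totalling about `hint` bytes.
        batch = stream.readlines(sizehint)
        if not batch:
            break          # an empty batch means end of file
        for line in batch:
            yield line
```

The balancing act described above is all in the choice of `sizehint`: too small and the outer loop runs often, too large and the per-batch list grows.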




From moshez at zadka.site.co.il  Fri Jan 12 17:03:58 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 12 Jan 2001 18:03:58 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
In-Reply-To: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>

On Thu, 11 Jan 2001, Thomas Wouters <twouters at users.sourceforge.net> wrote:

> Noone but me cares, but Guido said to go ahead and fix it if it bothered me.

I think you meant no one. Noone is an archaic spelling of noon.

quid-pro-quo-ly y'rs, Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From fredrik at effbot.org  Fri Jan 12 09:17:11 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 12 Jan 2001 09:17:11 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules xreadlinesmodule.c,1.2,1.3
References: <E14GjqE-0003qi-00@usw-pr-cvs1.sourceforge.net> <20010112160358.B0AC0A82D@darjeeling.zadka.site.co.il>
Message-ID: <012a01c07c70$11aac700$e46940d5@hagrid>

> > Noone but me cares, but Guido said to go ahead and fix it if it bothered me.
> 
> I think you meant no one. Noone is an archaic spelling of noon.

no, he meant me.  I care.

</F>




From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 09:09:00 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 09:09:00 +0100
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
	(message from Ka-Ping Yee on Thu, 11 Jan 2001 18:51:54 -0800 (PST))
References: <Pine.LNX.4.10.10101111846240.5846-100000@skuld.kingmanhall.org>
Message-ID: <200101120809.f0C890B00802@mira.informatik.hu-berlin.de>

> Good.  What platform did you try it on?

Linux, in a Konsole. I guess that is an environment you'd been using
as well :-)

Martin




From jack at oratrix.nl  Fri Jan 12 10:57:27 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 12 Jan 2001 10:57:27 +0100
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of 
 Python)
In-Reply-To: Message by Ka-Ping Yee <ping@lfw.org> ,
	     Thu, 11 Jan 2001 08:36:36 -0800 (PST) , <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org> 
Message-ID: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl>

> I'm pleased to announce a reasonable first pass at a documentation
> utility for interactive use.  "pydoc" is usable in three ways:
[...]
> I would very much appreciate your feedback, especially from testing
> on non-Unix platforms.  Thank you!

Wow, I'm impressed!

To make it run on the mac I had to add tests for the existence of os.system 
only. (So all statements "if os.system(...) > 0:" got to be "if hasattr(os, 
"system") and os.system(...) > 0:").

There are however various other niceties that could be added to make it more 
useful, can this be put into the repository or something?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From gstein at lyra.org  Fri Jan 12 11:31:53 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 12 Jan 2001 02:31:53 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python ceval.c,2.217,2.218
In-Reply-To: <20010111224157.A2467@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 11, 2001 at 10:41:57PM +0100
References: <20010112020637.EF4D5A82F@darjeeling.zadka.site.co.il> <LCEPIIGDJPKCOIHOBJEPAEJBCOAA.MarkH@ActiveState.com> <20010111224157.A2467@xs4all.nl>
Message-ID: <20010112023153.Q4640@lyra.org>

On Thu, Jan 11, 2001 at 10:41:57PM +0100, Thomas Wouters wrote:
> On Thu, Jan 11, 2001 at 12:25:30PM -0800, Mark Hammond wrote:
> 
> > I thought rules were pretty clear with reference counting - don't assume
> > _anything_ about the object unless you hold a reference (or are damn sure
> > someone else does!)
> 
> Moshe isn't breaking that rule. He isn't assuming anything about the object,
> just about the value of the pointer to that object. I agree, though, that
> it's bad practice to rely on it having the old value, after DECREFing it.

Oh, that is just so much baloney.

If I said Py_DECREF(&ptr), *then* I'd be worried. But if I ever call
Py_DECREF(foo) and it modifies foo, then I'd be quite upset. "functions"
just aren't supposed to do that.

-g

-- 
Greg Stein, http://www.lyra.org/



From guido at python.org  Fri Jan 12 14:51:51 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:51:51 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Thu, 11 Jan 2001 17:26:33 EST."
             <20010111172633.A26249@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com>  
            <20010111172633.A26249@kronos.cnri.reston.va.us> 
Message-ID: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>

> >Good work -- but I still can't run this inside a platform-specific
> >subdirectory.  Are you planning on supporting this?
> 
> I didn't really understand this when you pointed it out, but forgot to
> ask for clarification.  What does your directory layout look like?

Ah.  It's very simple.  I create a directory "linux" as a subdirectory
of the Python source tree (i.e. at the same level as Lib, Objects,
etc.).  Then I chdir into that directory, and I say "../configure".
The configure script creates subdirectories to hold the object files
for me: Grammar, Parser, Objects, Python, Modules, and sticks
Makefiles in them.  The "srcdir" variable in the Makefiles is set to
"..".  Then I say "make" and it builds Python.  The source directories
are used but no files are created or modified there: all files are
created in the "linux" directory.  This lets me have several separate
configurations: the feature used to be intended for sharing a source
tree between multiple platforms, but now I use it to have threaded,
nonthreaded, debugging, and regular builds under a single source tree.

This also works where the build directory is completely outside the
source tree (some people apparently mount the source tree read-only).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan 12 14:54:12 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 08:54:12 -0500
Subject: [Python-Dev] Add __exports__ to modules
In-Reply-To: Your message of "Thu, 11 Jan 2001 14:28:44 PST."
             <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101111427060.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101121354.IAA19700@cj20424-a.reston1.va.home.com>

> > Can you explain *why* you wanted to test for package-ness?
> 
> Auto-generating documentation.  pydoc.py currently tests for __path__,
> and looks for the presence of __init__.py in a subdirectory to mean
> that the subdirectory name is a package name.  Is it safe on all platforms
> to just list all .py files in the subdirectory to get all submodules?

Yes, that should work.  Of course there could also be extension
modules or .pyc-only files there -- you could use imp.get_suffixes()
to find out all modules (even if that means you don't always have the
source code available).

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Fri Jan 12 15:07:30 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:07:30 -0500
Subject: [Python-Dev] xreadlines : readlines :: xrange : range
In-Reply-To: Your message of "Thu, 11 Jan 2001 22:49:47 EST."
             <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCOECEIIAA.tim.one@home.com> 
Message-ID: <200101121407.JAA19781@cj20424-a.reston1.va.home.com>

> [Guido]
> > I don't want to call FLOCKFILE while holding the Python lock, as
> > this means that *if* we're blocked in FLOCKFILE (e.g. we're reading
> > from a pipe or socket), no other Python thread can run!

[Tim]
> Ah, good point!  Doesn't appear an essential point, though:  the
> HAVE_GETC_UNLOCKED code could still be fiddled easily enough to call
> FLOCKFILE and FUNLOCKFILE exactly once per line, but with the first thread
> release before the (dynamically only) FLOCKFILE and the last thread grab
> after the (dynamically only) FUNLOCKFILE.  It's just a question of will, but
> since that's lacking I'll drop it.

Yes, but if the line is very long, you'd have to use malloc() -- you
can't use _PyString_Resize() since that can access the thread state.
You're right that I don't want to do this.

> > OK.  It's unique to MS.  So close the bug report with a "won't fix"
> > resolution.  There's no point in having bug reports remain open that
> > we know we can't fix.
> 
> We don't really have a policy about that.  Perhaps you're articulating one
> here, though!  I've always left bugs open if they're (a) bugs, and (b) open
> <wink>.  For example, I left the Norton Blue-Screen crash bug open (although
> I see now you eventually closed that).  Ditto the "Rare hangs in
> w9xpopen.exe" bug (which is still open, but will never be fixed by *us*).
> Just other examples of things we'll almost certainly never fix ourselves (we
> have no handle on them, and all evidence says the OS is screwing up).

Yes, as I was thinking about this I realized that that was the policy
I wanted.  So, yes, the w9xpopen popen bug can be closed as WontFix too.

> My view has been that if a user comes to the bug site, it's most helpful for
> them if active (== "still happens") crashes and hangs appear among the open
> problems.  Now that your view of it is clearer, I'll switch to yours.

I find it more important that the bug list gives us developers an
overview of tasks to be tackled.  The problems that won't go away can
be listed in the Python 2.0 MoinMoin web!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Fri Jan 12 15:27:43 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 09:27:43 -0500
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:57:27 +0100."
             <20010112095727.C56D13BD8B0@snelboot.oratrix.nl> 
References: <20010112095727.C56D13BD8B0@snelboot.oratrix.nl> 
Message-ID: <200101121427.JAA20034@cj20424-a.reston1.va.home.com>

> There are however various other niceties that could be added to make it more 
> useful, can this be put into the repository or something?

Ping, do you think you could check this into the nondist tree?
nondist/sandbox/help would seem a good name (next to Paul's
nondist/sandbox/doctools).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Fri Jan 12 17:37:57 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 12 Jan 2001 10:37:57 -0600 (CST)
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>
Message-ID: <14943.13029.103771.261362@beluga.mojam.com>

    Guido> Summary: Cygwin Check Import Case Patch
    ...
    Guido> But I believe the solution is that the TERMIOS module should be
    Guido> renamed.

Isn't this a general problem?  As I recall, the convention when generating
Python modules from C header files is to simply convert the base name to
upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:

    # Without filename arguments, acts as a filter.
    # If one or more filenames are given, output is written to corresponding
    # filenames in the local directory, translated to all uppercase, with
    # the extension replaced by ".py".

Perhaps the convention should be instead to append "d" or "data" to the base
name (errno.h -> errnodata.py).
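
To make the two naming schemes concrete, here is a small sketch.  Both
function names are hypothetical; the first only imitates the rule quoted from
the h2py.py comment and is not h2py's actual code.

```python
import os


def h2py_module_name(header):
    """Current convention: upper-case the base name, replace .h with .py."""
    base, _ext = os.path.splitext(os.path.basename(header))
    return base.upper() + ".py"


def data_module_name(header, tag="data"):
    """Proposed convention: keep the base name and append a tag such as 'data'."""
    base, _ext = os.path.splitext(os.path.basename(header))
    return base.lower() + tag + ".py"
```

With the first rule errno.h gives ERRNO.py, which differs from errno.py only
by case; with the second it gives errnodata.py, which stays distinct even on a
case-insensitive file system.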

Skip



From guido at python.org  Fri Jan 12 18:47:46 2001
From: guido at python.org (Guido van Rossum)
Date: Fri, 12 Jan 2001 12:47:46 -0500
Subject: [Python-Dev] [Patch #103154] Cygwin Check Import Case Patch
In-Reply-To: Your message of "Fri, 12 Jan 2001 10:37:57 CST."
             <14943.13029.103771.261362@beluga.mojam.com> 
References: <E14Gpl0-00016l-00@usw-sf-web3.sourceforge.net>  
            <14943.13029.103771.261362@beluga.mojam.com> 
Message-ID: <200101121747.MAA27504@cj20424-a.reston1.va.home.com>

>     Guido> Summary: Cygwin Check Import Case Patch
>     ...
>     Guido> But I believe the solution is that the TERMIOS module should be
>     Guido> renamed.
> 
> Isn't this a general problem?  As I recall, the convention when generating
> Python modules from C header files is to simply convert the base name to
> upper case and replace ".h" with ".py" (errno.h -> ERRNO.py).  From h2py.py:
> 
>     # Without filename arguments, acts as a filter.
>     # If one or more filenames are given, output is written to corresponding
>     # filenames in the local directory, translated to all uppercase, with
>     # the extension replaced by ".py".
> 
> Perhaps the convention should be instead to append "d" or "data" to the base
> name (errno.h -> errnodata.py).

An even better solution is to get rid of those generated headers and
incorporate the desired symbols directly in the C extension modules.
That's happened for errno and socket, for example; maybe it's time to
do that for termios, too!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Fri Jan 12 19:54:47 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 12 Jan 2001 13:54:47 -0500
Subject: [Python-Dev] Patch 103216 - dbmmodule Setup changes
Message-ID: <14943.21239.382891.661026@anthem.wooz.org>

I've just uploaded patch 103216 to the Python project at SF.  This
does a couple of things.  First, it auto-detects (in configure)
whether dbmmodule can be built, and if so whether the -lndbm library
needs to be specified.  Second, it moves the entry for dbmmodule to
Setup.conf, after the *shared* key so that it'll be built as a dynamic
library by default.

This should fix the problem where compiling in dbmmodule sets up a
dependency to libdb which later hoses pybsddb3.

I'd have just checked it in, but I'd like someone else to just proof
it first.  I've only tested this with the current CVS tree on a fairly
stock RH6.1.

BTW, I didn't include the changes to configure in the patch, because
it's large and made SF's patch manager cough.  Besides it can be
generated from configure.in and config.h.in which are included in the
patch.

Cheers,
-Barry




From martin at loewis.home.cs.tu-berlin.de  Fri Jan 12 23:19:57 2001
From: martin at loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 12 Jan 2001 23:19:57 +0100
Subject: [Python-Dev] PEP 205 comments
Message-ID: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de>

Before commenting on the patch itself, I'd like to comment on the
PEP describing it.

I'm missing a discussion as to why weak references don't act as
proxies (or why they do now). A weak proxy would provide the same
attributes as the object which it encapsulates, so it could be used
transparently in place of the original object. I can think of a number
of reasons why it is not done this way (e.g. complete transparency is
impossible to achieve); now that a revision of the patch provides
proxies, the documentation should state which features are forwarded
to the proxy and which aren't (it lists the type() as a difference,
but I doubt that is the only difference - repr is also different).

Next, I wonder whether weakref.new is allowed to return an existing
weak reference to the same object. If that is not acceptable, I'd like
to know why - if it was acceptable, then weakref.new(instance)
(i.e. without callback) could return the same weak reference all the
time. A smart implementation might choose to put the weak reference
with no callback in the start of the list, so creation of additional
weak references to the same object would be inexpensive.

Likewise, I'd like to know the rationale for the clear method. Why is
it desirable to drop the object, yet keep the weak reference? Isn't it
easier for the application to either ignore clearing altogether, or
dropping the reference to the weak reference? So I'd propose to kill
the clear method.

Again on proxies, there is no discussion or documentation of the
ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
AttributeError seem to be just as fine or better.
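
The difference between a plain weak reference and a proxy can be illustrated
with the sketch below.  It assumes the weakref module as found in current
Python releases (weakref.ref and weakref.proxy), not the weakref.new interface
of the patch under discussion; Node and demo are made-up names.

```python
import gc
import weakref


class Node:
    def __init__(self, label):
        self.label = label


def demo():
    node = Node("spam")
    ref = weakref.ref(node)      # must be called to reach the object
    proxy = weakref.proxy(node)  # forwards attribute access transparently
    seen = {
        "ref_label": ref().label,
        "proxy_label": proxy.label,
        # type() is one place where the proxy is not transparent.
        "proxy_is_node_type": type(proxy) is Node,
    }
    del node      # drop the only strong reference
    gc.collect()  # not needed with pure reference counting; harmless elsewhere
    seen["ref_after"] = ref()  # a dead reference yields None when called
    try:
        proxy.label
    except ReferenceError:
        seen["proxy_after"] = "ReferenceError"
    return seen
```

A dead plain reference simply returns None when called, while a dead proxy has
nothing to return from an attribute access and so has to raise, which is where
the question of the exception's base class comes from.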

On to the type type extensions: Should there be a type flag indicating
presence of tp_weaklistoffset? It appears that the type structure had
tp_xxx7 for a long time, so likely all in-use binary modules have
that field set to zero. Is that sufficient?

Thanks for reading all of this message,

Martin



From skip at mojam.com  Sat Jan 13 16:37:55 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 13 Jan 2001 09:37:55 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tempfile.py,1.23,1.24
In-Reply-To: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
References: <E14HGz6-0005Fh-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14944.30291.658931.489979@beluga.mojam.com>

    Tim> On Linux, someone please run that standalone with more files and/or
    Tim> more threads; e.g.,

    Tim>     python lib/test/test_threadedtempfile.py -f 1000 -t 10

    Tim> to run with 10 threads each creating (and deleting) 1000 temp files.

After capitalizing "Lib", it worked fine for me:

    % ./python Lib/test/test_threadedtempfile.py -f 1000 -t 10
    Creating
    Starting
    Reaping
    Done: errors 0 ok 10000
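
For readers without the test file at hand, the kind of stress run described
above can be approximated as follows.  This is an independent sketch, not the
contents of test_threadedtempfile.py: stress_tempfiles is a hypothetical name,
and it uses tempfile.mkstemp, which is an assumption about a later tempfile
API than the one being tested in this thread.

```python
import os
import tempfile
import threading


def stress_tempfiles(nthreads=10, nfiles=100):
    """Have nthreads threads each create and delete nfiles temp files."""
    counts = {"ok": 0, "errors": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(nfiles):
            try:
                fd, path = tempfile.mkstemp()
                os.close(fd)
                os.remove(path)
                key = "ok"
            except OSError:
                key = "errors"
            # Guard the shared tallies so concurrent updates are not lost.
            with lock:
                counts[key] += 1

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts
```
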

Skip



From dkwolfe at pacbell.net  Sat Jan 13 19:48:21 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 10:48:21 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G740027Q6Q1KL@mta6.snfc21.pbi.net>

Howdy Folks,

I need some help here. I'd like to see Python build out of the box with a 
./configure, make, make test, and make install on Darwin and Mac OS X.  
Having it build out of the box will make it easier to be incorporated 
into both Darwin and the base Mac OS X distribution - although not for 
the initial release of the latter but definitely doable for subsequent 
releases. In order to do this, I need to have it build cleanly on HFS and 
UFS filesystems.

Under HFS system, I've got a name conflict due to case insensitivity 
between the build target and the "Python" directory that forces me to 
build with a --with-suffix option on HFS and manually change the name 
after install - which is an automatic knockout factor when it comes to 
incorporating it in an automatic build system. Not to mention a problem 
with unix newbies trying to build from source...

Last night, I did some quick investigation to determine the best way to 
fix this problem as documented in PEP-42 in the build section and 
Sourceforge bug 122215 and determined that the easiest and least error 
prone way was to change the directory name Python to PyCore.

It's apparent from the comments that I'm missing something here as the 
reaction has been negative so far - to the point where Guido has rejected 
the patch. Can someone explain what I'm missing that's causing such 
strong feelings?

My second question is how do I resolve the name conflict in an approved 
way?  It's been suggested that a build directory be created (/src/build 
?) and that the target be placed here. The problem that I had with this 
suggestion is that it would require an additional layer to execute the 
target and I wasn't sure what impact it would have on running python 
from a new directory... which is the reason I took the more known path. 
:-)

Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
Python be easily built "out of the box" on these machines - rather than come 
with a haphazard list of instructions or commands as currently needed 
for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
into the base Mac OS X installation...

- Dan Wolfe



From esr at thyrsus.com  Sat Jan 13 21:23:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 15:23:50 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
Message-ID: <20010113152350.A17338@thyrsus.com>

I have a new goodie for the 2.1 standard library, a module called
"simil" that supports computation of similarity indices between
strings such as one might use for recovery-matching of misspellings
against a dictionary.

The three methods supported are stemming, normalized Hamming
similarity, and (the star of the show) Ratcliff-Obershelp gestalt
subpattern matching.  The latter is spookily effective for detecting
not just substitution typos but insertions and deletions.  The module is
a C extension (my first!) for speed and because the Ratcliff-Obershelp
implementation uses pointer arithmetic heavily.

It's documented, tested, and ready to go.  But having written it, I
now have a question: why is soundex marked obsolete?  Is there
something wrong with the algorithm or implementation?  If not, then
it would be natural for simil to absorb the existing soundex 
implementation as a fourth entry point.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Whether the authorities be invaders or merely local tyrants, the
effect of such [gun control] laws is to place the individual at the 
mercy of the state, unable to resist.
        -- Robert Anson Heinlein, 1949

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Americans have the right and advantage of being armed - unlike the citizens
of other countries whose governments are afraid to trust the people with arms.
	-- James Madison, The Federalist Papers



From tim.one at home.com  Sat Jan 13 22:34:10 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 16:34:10 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113152350.A17338@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>

[Eric S. Raymond]
> I have a new goodie for the 2.1 standard library, a module called
> "simil" that supports computation of similarity indices between
> strings such as one might use for recovery-matching of misspellings
> against a dictionary.

My guess is that Guido won't accept it.

> The three methods supported are stemming, normalized Hamming
> similarity, and (the star of the show) Ratcliff-Obershelp gestalt
> subpattern matching.  The latter is spookily effective for detecting
> not just substitution typos but insertions and deletions.  The module is
> a C extension (my first!) for speed and because the Ratcliff-Obershelp
> implementation uses pointer arithmetic heavily.

Never heard of R-O, so tracked down some C code via google.  It appears I
invented the same algorithm at Cray Research in the early 80's for a diff
generator, which later got reincarnated in my ndiff.py (in the
Tools/scripts/ directory).  ndiff generates "human-friendly" diffs between
text files, at both the "file is a sequence of lines" and "line is a
sequence of characters" levels.  I didn't have the hyperbolic marketing
genius to call it "gestalt subpattern matching", though <wink> -- I thought
of it as what Unix diff *would* do if it constrained itself to matching
*contiguous* subsequences, and under the theory people would find that more
natural because contiguity is something the human visual system naturally
latches on to.  ndiff can be spookily natural in practice too.
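
The "contiguous subsequences" idea can be seen directly with SequenceMatcher.
The sketch assumes the difflib module of current Python releases, which is
where ndiff's SequenceMatcher class lives today; matching_runs is a
hypothetical helper name.

```python
from difflib import SequenceMatcher


def matching_runs(a, b):
    """Return the contiguous pieces of a that SequenceMatcher pairs with b."""
    matcher = SequenceMatcher(None, a, b)
    # get_matching_blocks() ends with a zero-length sentinel; skip it.
    return [a[m.a:m.a + m.size]
            for m in matcher.get_matching_blocks() if m.size]
```

For "abxcd" against "abcd" this reports the runs "ab" and "cd", i.e. the
unmatched "x" is isolated between two contiguous matches rather than being
spread over single-character alignments.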

> It's documented, tested, and ready to go.  But having written it, I
> now have a question: why is soundex marked obsolete?  Is there
> something wrong with the algorithm or implementation?

What is the soundex algorithm?  Not joking.  Skip Montanaro and I were
unable to find the algorithm implemented by soundex.c anywhere in the
literature, and I never found *any* two definitions that were the same.
Even Knuth changed his description of Soundex between editions 2 and 3 of
volume 3.  Skip eventually merged my and Fred Drake's Python implementations
of Knuth Vol 3 Ed 3 Soundex (see the Vaults of Parnassus).

> If not, then it would be natural for simil to absorb the existing
> soundex implementation as a fourth entry point.

Well, soundex.c doesn't match any other Soundex on earth, so it's not worth
reproducing in new code.  Guido doesn't want to be in the middle of fighting
over ill-defined algorithms, so booted Soundex entirely.  Another candidate
for inclusion is the NYSIIS algorithm, which is probably in more "serious"
use than Soundex anyway.  Same thing with NYSIIS, though (i.e., what--
exactly --is "the NYSIIS algorithm"?), except that Knuth didn't do us the
favor of making up his own variation that will *become* "the std" via force
of reputation.  Sean True implemented *a* NYSIIS in Python (and again see
the Vaults for a link to that).

So that's why the module is unlikely to make it into the core:

+ There are any number of algorithms people may want to see (I don't know
what "normalized Hamming similarity" means, but if it's not the same as
Levenshtein edit distance then add the latter to the pot too).

+ Each algorithm on its own is likely controversial.

+ Computing string similarity is something few apps need anyway.

Lots of hassle + little demand == not a natural for the core.  ndiff is in
the core only because many people found the *app* useful; its
SequenceMatcher class isn't even advertised.

may-never-understand-how-bigints-got-into-python<wink>-ly
    y'rs  - tim




From fdrake at acm.org  Sat Jan 13 22:45:12 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 13 Jan 2001 16:45:12 -0500 (EST)
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
References: <20010113152350.A17338@thyrsus.com>
	<LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > + Computing string similarity is something few apps need anyway.

  And this is a biggie.

 > Lots of hassle + little demand == not a natural for the core.  ndiff is in

  But it *is* an excellent type of thing to have around -- Eric: just
post it on your Web site and register it with the Vaults.

 > the core only because many people found the *app* useful; its
 > SequenceMatcher class isn't even advertised.

  Did you ever write documentation for it?  ;-)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Sat Jan 13 16:17:58 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 13 Jan 2001 07:17:58 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 02:06:07PM -0800
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010113071758.C28643@glacier.fnational.com>

[Guido van Rossum on Demo/embed/loop]
> (Except it still leaks, but that's probably a separate issue.)

Could this be caused by modules adding things to their dict and
then forgetting to decref them?  I know I've been guilty of that.

  Neil



From esr at thyrsus.com  Sat Jan 13 23:15:28 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 17:15:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 04:34:10PM -0500
References: <20010113152350.A17338@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <20010113171528.A17480@thyrsus.com>

OK, now I understand why soundex isn't in the core -- there's no canonical 
version.

Tim Peters <tim.one at home.com>:
> + There are any number of algorithms people may want to see (I don't know
> what "normalized Hamming similarity" means, but if it's not the same as
> Levenshtein edit distance then add the latter to the pot too).

Normalized Hamming similarity: it's an inversion of Hamming distance
-- number of pairwise matches in two strings of the same length,
divided by the common string length.  Gives a measure in [0.0, 1.0].
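
Taken literally, that definition comes out as the few lines of Python below.
This is a sketch of the definition as stated, not the C code in simil; the
function name, the ValueError for unequal lengths, and the choice of 1.0 for
two empty strings are all assumptions.

```python
def hamming_similarity(s, t):
    """Fraction of positions at which equal-length strings s and t agree."""
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    if not s:
        return 1.0  # assumption: two empty strings count as identical
    matches = sum(1 for a, b in zip(s, t) if a == b)
    return matches / len(s)
```

For example, "karolin" and "kathrin" agree at four of their seven positions,
so the similarity is 4/7.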

I've looked up "Levenshtein edit distance" and you're right.  I'll add it
as a fourth entry point as soon as I can find C source to crib.  (Would
you happen to have a pointer?)
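
Not C, but the usual dynamic-programming formulation of Levenshtein distance
is short enough to sketch in Python and translate from there.  This is the
textbook two-row version with unit costs for insertion, deletion and
substitution; it is not taken from any particular C source.

```python
def levenshtein(s, t):
    """Minimum number of single-character edits turning s into t."""
    if len(s) < len(t):
        s, t = t, s  # keep the rows as short as possible
    previous = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        current = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```

The classic check is "kitten" against "sitting", which needs three edits.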

> + Each algorithm on its own is likely controversial.

Not these.  There *are* canonical versions of all these, and exact
equivalents are all heavily used in commercial OCR software.

> + Computing string similarity is something few apps need anyway.

Tim, this isn't true.  Any time you need to validate user input
against a controlled vocabulary and give feedback on probable right
choices, R/O similarity is *very* useful.  I've had it in my personal
toolkit for a decade and used it heavily for this -- you take your
unknown input, check it against a dictionary and kick "maybe you meant
foo?" to the user for every foo with an R/O similarity above 0.6 or so.

The effects look like black magic.  Users love it.
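
The "maybe you meant foo?" workflow can be sketched with difflib from current
Python releases.  Note the assumptions: this is not the simil module, and
difflib's documentation describes its matcher as similar to, but not identical
with, Ratcliff-Obershelp; suggest is a hypothetical wrapper, and 0.6 is both
the threshold mentioned above and difflib's own default cutoff.

```python
from difflib import get_close_matches


def suggest(word, vocabulary, cutoff=0.6):
    """Return up to three vocabulary entries that score at least cutoff."""
    return get_close_matches(word, vocabulary, n=3, cutoff=cutoff)
```

Given the vocabulary ["ape", "apple", "peach", "puppy"], the input "appel"
yields ["apple", "ape"], best match first, and an input that resembles nothing
in the vocabulary yields an empty list.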
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"I hold it, that a little rebellion, now and then, is a good thing, and as 
necessary in the political world as storms in the physical."
	-- Thomas Jefferson, Letter to James Madison, January 30, 1787



From guido at python.org  Sat Jan 13 23:25:12 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:25:12 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: Your message of "Sat, 13 Jan 2001 07:17:58 PST."
             <20010113071758.C28643@glacier.fnational.com> 
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net>  
            <20010113071758.C28643@glacier.fnational.com> 
Message-ID: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>

> [Guido van Rossum on Demo/embed/loop]
> > (Except it still leaks, but that's probably a separate issue.)
> 
> Could this be caused by modules adding things to their dict and
> then forgetting to decref them?  I know I've been guilty of that.

Do you have a tool that detects leaks?  Barry has one: Insure++.  It's
expensive and we don't have a site license, so I'll ask Barry to
investigate this.

(Barry: go to Demo/embed and do "make looptest".  Then in another
shell window use "top" to watch the "loop" process grow slowly.  I'd
love to find out what's the problem here.  It's not dependent on what
you ask it to loop over; "./loop pass" also grows.  Of course it could
be one of the modules loaded during initialization...)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Sat Jan 13 23:33:34 2001
From: guido at python.org (Guido van Rossum)
Date: Sat, 13 Jan 2001 17:33:34 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: Your message of "Sat, 13 Jan 2001 10:48:21 PST."
             <0G740027Q6Q1KL@mta6.snfc21.pbi.net> 
References: <0G740027Q6Q1KL@mta6.snfc21.pbi.net> 
Message-ID: <200101132233.RAA03229@cj20424-a.reston1.va.home.com>

> Howdy Folks,
> 
> I need some help here. I'd like to see Python build out of the box with a 
> ./configure, make, make test, and make install on Darwin and Mac OS X.  
> Having it build out of the box will make it easier to be incorporated 
> into both Darwin and the base Mac OS X distribution - although not for 
> the initial release of the latter but definitely doable for subsequent 
> releases. In order to do this, I need to have it build cleanly on HFS and 
> UFS filesystems.
> 
> Under HFS system, I've got a name conflict due to case insensitivity 
> between the build target and the "Python" directory that forces me to 
> build with a --with-suffix option on HFS and manually change the name 
> after install - which is an automatic knockout factor when it comes to 
> incorporating it in an automatic build system. Not to mention a problem 
> with unix newbies trying to build from source...
> 
> Last night, I did some quick investigation to determine the best way to 
> fix this problem as documented in PEP-42 in the build section and 
> Sourceforge bug 122215 and determined that the easiest and least error 
> prone way was to change the directory name Python to PyCore.
> 
> It's apparent from the comments that I'm missing something here as the 
> reaction has been negative so far - to the point where Guido has rejected 
> the patch. Can someone explain what I'm missing that's causing such 
> strong feelings?

We use CVS to manage the sources.  CVS makes it very hard to move a
directory; it doesn't have a command for this, so you have to do the
move directly in the repository, which will then break checkouts for
everyone who has a work directory linked to the CVS repository.  Using
SourceForge makes it a bit harder still: we have to ask the SF
sysadmins to do the move for us.

And if we did the move, it would be much harder to reproduce old
versions of the source tree with a single CVS command.  A way around
that would be to do a copy instead of a move, but that would cause the
directory "PyCore" to pop up in all old versions, too.

I just don't want to go through this hassle in order to make building
easier for one relatively little-used platform.

> My second question is how do I resolve the name conflict in an approved 
> way?  It's been suggested that a build directory be created (/src/build 
> ?) and that the target be placed here. The problem that I had with this 
> suggestion is that it would require an additional layer to execute the 
> target and I wasn't sure what impact it would have on running python 
> from a new directory... which is the reason I took the more known path. 
> :-)

I don't understand what you are proposing here; I can't imagine that
an extra directory level could cause a slowdown.

A suggestion I would be open to: change the executable name during
build (currently a .exe suffix is added), but change it back (removing
the .exe suffix) during the install.  That should be a small change to
the Makefile.

> Bottom line, come March 24th, Mac OS X 1.0 will be released and as of 
> July 2001 all Macintoshes  will come with Mac OS X.  I'd like to see 
> Python be easily built "out of the box" on these machines - rather than come 
> with a haphazard list of instructions or commands as currently needed 
> for 1.5.2 and 2.0 releases. And hopefully, at some point be incorporated 
> into the base Mac OS X installation...

Just get Apple to include Python with their standard distribution and
nobody will *have* to build Python on Mac OSX. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jan 14 00:59:44 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 18:59:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113171528.A17480@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>

[Eric]
> OK, now I understand why soundex isn't in the core -- there's no
> canonical version.

Actually, I think Knuth Vol 3 Ed 3 is canonical *now* -- nobody would dare
to oppose him <0.5 wink>.

> Normalized Hamming similarity: it's an inversion of Hamming distance
> -- number of pairwise matches in two strings of the same length,
> divided by the common string length.  Gives a measure in [0.0, 1.0].
>
> I've looked up "Levenshtein edit distance" and you're right.  I'll add
> it as a fourth entry point as soon as I can find C source to crib.
> (Would you happen to have a pointer?)

If you throw almost everything out of Unix diff, that's what you'll be left
with.  Offhand I don't know of unencumbered, industrial-strength C source; a
problem is that writing a program to compute this is a std homework exercise
(it's a common first "dynamic programming" example), so you can find tons of
bad C source.

Caution:  many people want small variations of "edit distance", usually via
assigning different weights to insertions, replacements and deletions.  A
less common but still popular variant is to say that a transposition ("xy"
vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
is really a family of algorithms.
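
To make the "family of algorithms" point concrete, here is a minimal Python
sketch of a weighted edit distance.  The function name, the parameter names
and the optional transposition cost are assumptions chosen for illustration;
they are not taken from any module discussed in this thread.  The
transposition rule is the restricted "adjacent swap" variant, and leaving it
as None gives plain weighted Levenshtein distance.

```python
def edit_distance(a, b, insert=1, replace=1, delete=1, transpose=None):
    """Weighted edit distance between sequences a and b.

    transpose=None disables the adjacent-transposition rule; otherwise it
    is the cost charged for swapping two neighbouring symbols.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # cost[i][j] is the cheapest way to turn a[:i] into b[:j].
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i * delete
    for j in range(1, cols):
        cost[0][j] = j * insert
    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            best = min(
                cost[i - 1][j] + delete,
                cost[i][j - 1] + insert,
                cost[i - 1][j - 1] + (0 if same else replace),
            )
            if (transpose is not None and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                best = min(best, cost[i - 2][j - 2] + transpose)
            cost[i][j] = best
    return cost[-1][-1]


print(edit_distance("xy", "yx"))               # unit weights, no transposition rule
print(edit_distance("xy", "yx", transpose=1))  # a swap counted as one edit
```

With unit weights the swap "xy" vs "yx" costs 2; passing transpose=1 brings
it down to 1, which is exactly the kind of small variation that makes
different users want different functions.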

>> + Each algorithm on its own is likely controversial.

> Not these.  There *are* canonical versions of all these,

See the "edit distance" gloss above.

> and exact equivalents are all heavily used in commercial OCR
> software.

God forbid that core Python may lose the commercial OCR developer market
<wink>.  It's not accepted that for every field F, core Python needs to
supply the algorithms F uses heavily.  Heck, core Python doesn't even ship
with an FFT!  Doesn't bother the folks working in signal processing.

>> + Computing string similarity is something few apps need anyway.

> Tim, this isn't true.  Any time you need to validate user input
> against a controlled vocabulary and give feedback on probable right
> choices,

Which is something few apps need anyway -- in my experience, but more so in
my *primary* role here of trying to channel for you (& Guido) what Guido
will say.  It should be clear that I've got some familiarity with these
schemes, so it should also be clear that Guido is likely to ask me about
them whenever they pop up.  But Guido has hardly ever asked me about them
over the past decade, with the exception of the short-lived Soundex
brouhaha.  From that I guess hardly anyone ever asks *him* about them, and
that's how channeling works:  if this were an area where Guido felt core
Python needed beefier libraries, I'm pretty sure I would have heard about it
by now.

But now Guido can speak for himself.  There's no conceivable argument that
could change what I *predict* he'll say.

> R/O similarity is *very* useful.  I've had it in my personal
> toolkit for a decade and used it heavily for this -- you take your
> unknown input, check it against a dictionary and kick "maybe you meant
> foo?" to the user for every foo with an R/O similarity above 0.6 or so.
>
> The effects look like black magic.  Users love it.

I believe that.  And I'd guess we all have things in our personal toolkits
our users love.  That isn't enough to get into the core, as I expect Guido
will belabor on the next iteration of this <wink>.

doesn't-mean-the-code-isn't-mondo-cool-ly y'rs  - tim




From dkwolfe at pacbell.net  Sun Jan 14 01:19:56 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 16:19:56 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
Message-ID: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>

>CVS makes it very hard to move a directory...
>which will then break checkouts for everyone...

with the potential to cause development code to be lost

>Using SourceForge...have to ask the SF sysadmins

I understand... we also use CVS and periodically (usually pre alpha) 
reorganize the source... going thru SF sysadmin makes it doubly hard... 
yuck!

However, since you have "released" tarball archives, it seems to me that 
the loss of the diffs and log notes is more troubling than the need to 
create an old version.... at least that's been my experience when 
building software. ;-)

>I just don't want to go through this hassle in order to make building
>easier for one relatively little-used platform.

humph. Ok, I'll accept that for now as we've only sold 100,000 Beta 
copies of Mac OS X... but if we're not over 1 million users by this time 
next year... I'll eat my words. ;-)

>> It's been suggested that a build directory be created (/src/build ?) 
>> and that the target be placed here. 

>I don't understand what you are proposing here; I can't imagine that
>an extra directory level could cause a slowdown.

moshez suggested this in his comment on the patch - moving the target to 
a separate directory. I'm not sure of the implications of doing this 
however, and wondered if it might affect the running of the regression 
suite and the executable before it was installed.

>A suggestion I would be open to: change the executable name during
>build (currently a .exe suffix is added), but change it back (removing
>the .exe suffix) during the install.  That should be a small change to
>the Makefile.

You mean without using the -with-suffix command? That can probably be 
done... but based on my readings, I'd thought you'd reject it as not being 
"clean" and complicating the build process more than it should - not to 
mention renaming the executable behind the builder's back...  Lesser of 
two evils I guess - I'll investigate this however...

>> I'd like to see Python be easily built on "out of the box"...
>> [and] incorporated into the base Mac OS X installation...
>
>Just get Apple to include Python with their standard distribution and
>nobody will *have* to build Python on Mac OSX. :-)

Easier said than done as they already have the other P language 
installed. ;-) But then on the other hand, there are quite a few 
Pythonatics, including me, who use it in daily work at Apple. 

As I mentioned, the road to getting it in Mac OS X begins with getting it 
to build cleanly with the automated build system... so I've got to get 
this problem fixed before I start working on getting it in the build.

- Dan
  (yes, I work for Apple, but this is something that I'm doing on my own!)




From mwh21 at cam.ac.uk  Sun Jan 14 01:41:35 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 14 Jan 2001 00:41:35 +0000
Subject: [Python-Dev] a readline replacement?
In-Reply-To: Michael Hudson's message of "17 Dec 2000 18:18:24 +0000"
References: <200012142319.MAA02140@s454.cosc.canterbury.ac.nz> <3A39ED07.6B3EE68E@lemburg.com> <14906.17412.221040.895357@anthem.concentric.net> <20001215040304.A22056@glacier.fnational.com> <20001215235425.A29681@xs4all.nl> <m3hf42q5cf.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <m3snmn3qyo.fsf_-_@atrus.jesus.cam.ac.uk>

Michael Hudson <mwh21 at cam.ac.uk> writes:

> It wouldn't be particularly hard to rewrite editline in Python (we
> have termios & the terminal handling functions in curses - and even
> ioctl if we get really keen).
> 
> I've been hacking on my own Python line reader on and off for a while;
> it's still pretty buggy, but if you're feeling brave you could look at:
> 
> http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.0.0.tar.gz

As I secretly planned <wink>, the embarrassment of having code that
full of holes publicly accessible spurred me to writing a much better
version, to be found at:

  http://www-jcsu.jesus.cam.ac.uk/~mwh21/hacks/pyrl-0.2.0.tar.gz

(or, now rsync works there again, in the equivalent place on the
starship...).

If you unpack it and execute

$ python python_reader.py

you should get something that closely mimics the current interpreter
top level.  It supports a wide range of cursor motion commands,
built-in support for multiple line input and history (including
incremental search).  It doesn't do completion, basically because I
haven't got round to it yet, and it will get into severe trouble if
you enter an input that is taller than your terminal (I think this
should be surmountable, but I haven't gotten round to this either).
Another thing that I haven't gotten round to yet is documentation.
After I've tackled these points I'll probably stick it up on
parnassus.

I've been using it as my standard python shell for a week or so, and
quite like it, though the lack of completion is a drag.

It is probably staggeringly unportable, so I'd appreciate finding out
how it breaks on systems other than Linux with terminals other than
xterms...

Have the changes to enable use of editline been checked in yet?  I
worry that the licensing situation around the readline module is grey
at best...

Cheers,
M.

-- 
  That's why the smartest companies use Common Lisp, but lie about it
  so all their competitors think Lisp is slow and C++ is fast.  (This
  rumor has, however, gotten a little out of hand. :)
                                        -- Erik Naggum, comp.lang.lisp




From esr at thyrsus.com  Sun Jan 14 01:58:08 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 13 Jan 2001 19:58:08 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 06:59:44PM -0500
References: <20010113171528.A17480@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEHHIIAA.tim.one@home.com>
Message-ID: <20010113195808.B17712@thyrsus.com>

Tim Peters <tim.one at home.com>:
> If you throw almost everything out of Unix diff, that's what you'll be left
> with.  Offhand I don't know of unencumbered, industrial-strength C source; a
> problem is that writing a program to compute this is a std homework exercise
> (it's a common first "dynamic programming" example), so you can find tons of
> bad C source.

I found some formal descriptions of the algorithm and some unencumbered 
Oberon source.  I'm coding up C now.  It's not complicated if you're willing 
to hold the cost matrix in memory, which is reasonable for a string comparator
in a way it wouldn't be for a file diff.
 
> Caution:  many people want small variations of "edit distance", usually via
> assigning different weights to insertions, replacements and deletions.  A
> less common but still popular variant is to say that a transposition ("xy"
> vs "yx") is less costly than a delete plus an insert.  Etc.  "edit distance"
> is really a family of algorithms.

Which about collapse into one if your function has three weight
arguments for insert/replace/delete weights, as mine does.  It don't
get more general than that -- I can see that by looking at the formal
description.  

OK, so I'll give you that I don't weight transpositions separately,
but neither does any other variant I found on the web nor the formal
descriptions.  A fourth optional weight argument someday, maybe :-).

> God forbid that core Python may lose the commercial OCR developer market
> <wink>.  It's not accepted that for every field F, core Python needs to
> supply the algorithms F uses heavily.

That's not my point -- I don't see OCR as a big Python market either.
My point in observing that OCR uses Ratcliff/Obershelp heavily was
simply to show that it's a well-established algorithm, not
`controversial'.

>                      Heck, core Python doesn't even ship
> with an FFT!  Doesn't bother the folks working in signal processing.

It probably won't surprise you that I considered writing an FFT extension
module at one point :-).  

> > Tim, this isn't true.  Any time you need to validate user input
> > against a controlled vocabulary and give feedback on probable right
> > choices,
> 
> Which is something few apps need anyway

I fundamentally disagree.  Few application designers *know* they need
it, but user interfaces would get a hell of a lot better if the
technique were more commonly applied -- and that's why I want it in
the Python library, so doing the right thing in Python will be a
minimum-effort proposition.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What if you were an idiot, and what if you were a member of Congress?
But I repeat myself.
        -- Mark Twain



From tim.one at home.com  Sun Jan 14 04:17:34 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 22:17:34 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <14944.52328.558763.46161@cj42289-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHNIIAA.tim.one@home.com>

[Fred]
>   Did you ever write documentation for it?  ;-)

A lot more than you did <wink>.

just-show-me-"write-docs"-in-my-job-description-ly y'rs  - tim




From tim.one at home.com  Sun Jan 14 05:39:59 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 13 Jan 2001 23:39:59 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010113195808.B17712@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>

[Eric, on "edit distance"]
> I found some formal descriptions of the algorithm and some
> unencumbered Oberon source.  I'm coding up C now.  It's not
> complicated if you're willing to hold the cost matrix in memory,
> which is reasonable for a string comparator in a way it wouldn't
> be for a file diff.

All agreed, and it should be a straightforward task then.  I'm assuming it
will work with Unicode strings too <wink>.

[on differing weights]
> Which about collapse into one if your function has three weight
> arguments for insert/replace/delete weights, as mine does.  It don't
> get more general than that -- I can see that by looking at the formal
> description.
>
> OK, so I'll give you that I don't weight transpositions separately,
> but neither does any other variant I found on the web nor the formal
> descriptions.  A fourth optional weight argument someday, maybe :-).
> ...
> and that's why I want it in the Python library, so doing the right
> thing in Python will be a minimum-effort proposition.

Guido will depart from you at a different point.  I depart here:  it's not
"the right thing".  It's a bunch of hacks that appeal not because they solve
a problem, but because they're cute algorithms that are pretty easy to
implement and kinda solve part of a problem.   "The right thing"-- which you
can buy --at least involves capturing a large base of knowledge about
phonetics and spelling.  In high school, one of my buddies was Dan
Pryzbylski.  If anyone who knew him (other than me <wink>) were to type his
name into the class reunion guy's web page, they'd probably spell it the way
they remember him pronouncing it:  sha-bill-skey (and that's how he
pronounced "Dan" <wink>).  If that hit on the text string "Pryzbylski",
*then* it would be "the right thing" in a way that makes sense to real
people, not just to implementers.

Working six years in commercial speech recog really hammered that home to
me:  95% solutions are on the margin of unsellable, because an error one try
in 20 is intolerable for real people.  Developers writing for developers get
"whoa! cool!" where my sisters walk away going "what good is that?".  Edit
distance doesn't get within screaming range of 95% in real life.

Even for most developers, it would be better to package up the single best
approach you've got (f(list, word) -> list of possible matches sorted in
confidence order), instead of a module with 6 (or so) functions they don't
understand and a pile of equally mysterious knobs.  Then it may actually get
used!  Developers of the breed who would actually take the time to
understand what you've done are, I suggest, similar to us:  they'd skim the
docs, ignore the code, and write their own variations.  Or, IOW:

> so doing the right thing in Python will be a minimum-effort
> proposition.

Make someone think first, and 95% of developers will just skip over it too.
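
As an illustration of the single-entry-point shape suggested above
(f(list, word) -> list of possible matches sorted in confidence order), here
is a minimal sketch built on the standard library's difflib module.  The
wrapper name suggest and its defaults are assumptions made for this example.
difflib's matcher is similar in spirit to Ratcliff/Obershelp but is not
claimed to be identical to it, and it is not the module under discussion.

```python
import difflib


def suggest(vocabulary, word, limit=3, cutoff=0.6):
    """Return up to `limit` entries of `vocabulary` that resemble `word`.

    Results are ordered best match first; entries scoring below `cutoff`
    (a similarity in [0.0, 1.0]) are dropped.
    """
    return difflib.get_close_matches(word, vocabulary, n=limit, cutoff=cutoff)


commands = ["commit", "checkout", "update", "diff"]
print(suggest(commands, "comit"))
```

The caller sees one function and two optional knobs, which is about as much
as a developer in a hurry is likely to read.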

BTW, the theoretical literature ignored transposition at first, because it
didn't fit well in the machinery.  IIRC, I first read about it in an issue
of SP&E (Software Practice & Experience), where the authors were forced into
it because the "traditional" edit sequence measure sucked in their practice.
They were much happier after taking transposition into account.  The
theoreticians have more than caught up since, and research is still active;
e.g., 1997's

    PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
    DELETIONS AND GENERALIZED TRANSPOSITIONS
    B. J. Oommen and R. K. S. Loke
    http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

is a good read.  As they say there,

    If one views the elements of the confusion matrices as
    probabilities, this [treating each character independent
    of all others, as "edit distance" does] is equivalent to
    assuming that the transformation probabilities at each
    position in the string are statistically independent and
    possess first-order Markovian characteristics. This model
    is usually assumed for simplicity rather it [sic] having
    any statistical significance.

IOW, because it's easy to analyze, not because it solves a real problem --
and they're complaining about an earlier generalization of edit distance
that makes the weights depend on the individual symbols involved as well as
on the edit/delete/insert distinction (another variation trying to make this
approach genuinely useful in real life).  The Oommen-Loke algorithm appears
much more realistic, taking into account the observed probabilities of
mistyping specific letter pairs (although it still ignores phonetics), and
they report accuracies approaching 98% in correctly identifying mangled
words.

98% (more than twice as good as 95% -- the error rate is actually more
useful to think about, 2% vs 5%) is truly useful for non-geek end users, and
the state of the art here is far beyond what's easy to find and dead easy to
implement.

> ...
> It probably won't surprise you that I considered writing an FFT
> extension module at one point :-).

Nope!  More power to you, Eric.  At least FFTs *are* state of the art,
although *coding* them optimally is likely beyond human ability on modern
machines:

    http://www.fftw.org/

(short course:  they've generally got the fastest FFTs available, and their
code is generated by program, systematically *trying* every trick in the
book, timing it on a given box, and synthesizing a complete strategy out of
the quickest pieces).
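
The "time it on a given box and keep the quickest piece" idea can be shown
in miniature.  This is only a toy sketch of empirical selection among
candidate implementations, not a description of how FFTW itself works; the
helper pick_fastest and the two summing functions are hypothetical names
invented for the example.

```python
import timeit


def pick_fastest(candidates, sample, repeats=3, number=50):
    """Return the candidate callable with the lowest measured time on `sample`."""
    def best_time(func):
        # Keep the minimum over several runs to damp scheduling noise.
        return min(timeit.repeat(lambda: func(sample), repeat=repeats, number=number))
    return min(candidates, key=best_time)


def total_builtin(values):
    return sum(values)


def total_loop(values):
    result = 0
    for value in values:
        result += value
    return result


winner = pick_fastest([total_builtin, total_loop], list(range(1000)))
print(winner.__name__)  # whichever variant was quicker on this machine
```

Which candidate wins depends on the machine and interpreter, which is the
whole point of measuring rather than guessing.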

sooner-or-later-the-only-code-real-people-will-use-won't-be-written-
    by-people-at-all-ly y'rs  - tim




From tim.one at home.com  Sun Jan 14 06:38:52 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 00:38:52 -0500
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <0G7400EZQM2TXD@mta5.snfc21.pbi.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>

[Dan Wolfe]
> ...
> As I mentioned, the road to getting it in Mac OS X begins with
> getting it to build cleanly with the automated build system... so
> I've got to get  this problem fixed before I start working on
> getting it in the build.
>
> - Dan
>   (yes, I work for Apple, but this is something that I'm doing
>    on my own!)

Hang in there, Dan!  I did the first Python port to the KSR-1 on my own time
too, despite working for the visionless bastards at the time.  The rest is
history:  the glory, the fame, the riches, the groupies, the adulation of my
peers.  We won't mention the financial scandal and subsequent bankruptcy
lest it discourage you for no good reason <wink>.

BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
little ugly.  Better that than force hundreds of Python-builders to get
divorced from a decade-old directory naming scheme.




From esr at thyrsus.com  Sun Jan 14 08:08:57 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 02:08:57 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 13, 2001 at 11:39:59PM -0500
References: <20010113195808.B17712@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEHPIIAA.tim.one@home.com>
Message-ID: <20010114020857.E19782@thyrsus.com>

Tim Peters <tim.one at home.com>:
> All agreed, and it should be a straightforward task then.  I'm assuming it
> will work with Unicode strings too <wink>.

Thought about that.  Want to get it working for 8 bits first.
 
> Guido will depart from you at a different point.  I depart here:  it's not
> "the right thing".  It's a bunch of hacks that appeal not because they solve
> a problem, but because they're cute algorithms that are pretty easy to
> implement and kinda solve part of a problem.

Again, my experience says differently.  I have actually *used*
Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
I Mean) -- and had it work very well for non-geek users.  That's why I
want other Python programmers to have easy access to the capability.

> Working six years in commercial speech recog really hammered that home to
> me:  95% solutions are on the margin of unsellable, because an error one try
> in 20 is intolerable for real people.  Developers writing for developers get
> "whoa! cool!" where my sisters walk away going "what good is that?".  Edit
> distance doesn't get within screaming range of 95% in real life.

I suspect your speech recognition experience has given you an
unhelpful bias.  For English, what you say is certainly true -- but
that's a gross worst-case application of R/O and Levenshtein that I'm
not interested in pursuing.  Nor do I expect Python hackers to use
my module for that.

Where techniques like Ratcliff-Obershelp really shine (and what I
expect the module to be used for) is with controlled vocabularies such
as command interfaces.  These tend to have better orthogonality than
NL, so antinoise filtering by R/O or Levenshtein distance (a kindred
technique I somehow didn't learn until today -- there are
disadvantages to being an autodidact) can really go to town on them.

(Actually, my gut after thinking about both algorithms hard is that
R/O is still a better technique than Levenshtein for the kind of
application I have in mind.  But I also suspect the difference is
marginal.)

(Other good uses for algorithms in this class include cladistics and
genomic analysis.)

> Even for most developers, it would be better to package up the single best
> approach you've got (f(list, word) -> list of possible matches sorted in
> confidence order), instead of a module with 6 (or so) functions they don't
> understand and a pile of equally mysterious knobs.

That's why good documentation, with motivating usage hints, is important.
I write good documentation, Tim.

>     PATTERN RECOGNITION OF STRINGS WITH SUBSTITUTIONS, INSERTIONS,
>     DELETIONS AND GENERALIZED TRANSPOSITIONS
>     B. J. Oommen and R. K. S. Loke
>     http://www.scs.carleton.ca/~oommen/papers/GnTrnsJ2.PDF

Thanks for the pointer; I've downloaded it and will read it.  If the 
description of Oommen's algorithm is good enough, I'll implement it and
add it to the module.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857



From dkwolfe at pacbell.net  Sun Jan 14 08:48:51 2001
From: dkwolfe at pacbell.net (Dan Wolfe)
Date: Sat, 13 Jan 2001 23:48:51 -0800
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEIAIIAA.tim.one@home.com>
Message-ID: <0G75009ZD6UYYE@mta5.snfc21.pbi.net>

On Saturday, January 13, 2001, at 09:38 PM, Tim Peters wrote:

> [Dan Wolfe]
>> ...
>> As I mentioned, the road to getting it in Mac OS X begins with
>> getting it to build cleanly with the automated build system... so
>> I've got to get  this problem fixed before I start working on
>> getting it in the build.
>>
>> - Dan
>> (yes, I work for Apple, but this is something that I'm doing
>> on my own!)
>
> Hang in there, Dan!  I did the first Python port to the KSR-1 on my own 
> time
> too, despite working for the visionless bastards at the time.

Well, I won't go that far..... some of them are quite visionaries (I 
can't stop drooling over a Ti portable....).

> The rest is
> history:  the glory, the fame, the riches, the groupies, the adulation 
> of my
> peers.  We won't mention the financial scandal and subsequent bankruptcy
> lest it discourage you for no good reason <wink>.

You left out the part where they turn ya into a timbot... <wink><wink>

> BTW, "do the simplest thing that can possibly work"!  It's OK if it's a
> little ugly.  Better that than force hundreds of Python-builders to get
> divorced from a decade-old directory naming scheme.

Well the mv Python to PyCore was the simplest... but obviously the most 
painful.... The longer ugly fix is working but it's such a hack that I'd 
rather not show it off...I need to fix it so that it allows nice things 
such as allowing the -with-suffix to be used...and then testing all the 
edge cases such as clobber, etc so that I don't break anything. :-)

appreciating-your-note-after-attempting-to-understand-makefiles-on-Saturday-night'
ly yours,

- Dan










From tim.one at home.com  Sun Jan 14 11:45:53 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 05:45:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114020857.E19782@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>

[Tim]
>> ...It's a bunch of hacks that appeal not because they solve
>> a problem, but because they're cute algorithms that are pretty
>> easy to implement and kinda solve part of a problem.

[Eric]
> Again, my experience says differently.  I have actually *used*
> Ratcliff-Obershelp to implement Do What I Mean (actually, Tell Me What
> I Mean) -- and had it work very well for non-geek users.  That's why I
> want other Python programmers to have easy access to the capability.
> ...
> Where techniques like Ratcliff-Obershelp really shine (and what I
> expect the module to be used for) is with controlled vocabularies
> such as command interfaces.

Yet the narrower the domain, the less call for a library with multiple
approaches.  If R-O really shone for you, why bother with anything else?
Seriously.  You haven't used some (most?) of these.  The core isn't a place
for research modules either (note that I have no objection whatsoever to
writing any module you like -- the only question here is what belongs in the
core, and any algorithm *nobody* here has experience with in your target
domain is plainly a poor *core* candidate for that reason alone -- we have
to maintain, justify and explain it for years to come).

> I suspect your speech recognition experience has given you an
> unhelpful bias.

Try to think of it as a helpfully different perspective <0.5 wink>.  It's in
favor of measuring error rate by controlled experiments, skeptical of
intuition, and dismissive of anecdotal evidence.  I may well agree you don't
need all that heavy machinery if I had a clear definition of what problem it
is you're trying to solve (I've learned it's not the kinds of problems *I*
had in mind when I first read your description!).

BTW, telephone speech recog requires controlled vocabularies because phone
acoustics are too poor for the customary close-talking microphone approaches
to work well enough.  A std technique there is to build a "confusability
matrix" of the words *in* the vocabulary, to spot trouble before it happens:
if two words are acoustically confusable, it flags them and bounces that
info back to the vocabulary designer.  A similar approach should work well
in your domain:  if you get to define the cmd interface, run all the words
in it pairwise through your similarity measure of choice, and dream up new
words whenever a pair is "too close".  That all but ensures that even a
naive similarity algorithm will perform well (in telephone speech recog, the
unconstrained error rate is up to 70% on cell phones; by constraining the
vocabulary with the aid of confusability measures, we cut that to under 1%).
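
The pairwise check described above is easy to sketch.  The version below
uses difflib.SequenceMatcher as the similarity measure and an arbitrary
threshold of 0.8; both choices, and the function name confusable_pairs, are
assumptions made for illustration.  Any similarity function returning a
value in [0.0, 1.0] could be substituted.

```python
import difflib
from itertools import combinations


def confusable_pairs(vocabulary, threshold=0.8):
    """Return (similarity, word1, word2) for every pair scoring at or
    above `threshold`, most similar pair first."""
    flagged = []
    for first, second in combinations(sorted(set(vocabulary)), 2):
        score = difflib.SequenceMatcher(None, first, second).ratio()
        if score >= threshold:
            flagged.append((score, first, second))
    flagged.sort(reverse=True)
    return flagged


for score, first, second in confusable_pairs(["start", "stat", "stop", "quit"]):
    print("%s / %s look too close (%.2f)" % (first, second, score))
```

Here only "start" and "stat" are flagged, so the vocabulary designer would
rename one of them before any user ever mistypes it.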

> ...
> (Actually, my gut after thinking about both algorithms hard is that
> R/O is still a better technique than Levenshtein for the kind of
> application I have in mind.  But I also suspect the difference is
> marginal.)

So drop Levenshtein -- go with your best shot.  Do note that they both
(usually) consider a single transposition to be as much a mutation as two
replacements (or an insert plus a delete -- "pure" Levenshtein treats those
the same).

What happens when the user doesn't enter an exact match?  Does the kind of
app you have in mind then just present them with a list of choices?  If
that's all (as opposed to, e.g., substituting its best guess for what the
user actually typed and proceeding as if the user had given that from the
start), then the evidence from studies says users are almost as pleased when
the correct choice appears somewhere in the first three choices as when it
appears as *the* top choice.  A well-designed vocabulary can almost
guarantee that happy result (note that most of the current research is aimed
at the much harder job of getting the intended word into the #1 slot on the
choice list).

> (Other good uses for algorithms in this class include cladistics and
> genomic analysis.)

I believe you'll find current work in those fields has moved far beyond
these simplest algorithms too, although they remain inspirational (for
example, see
"Protein Sequence Alignment and Database Scanning" at

    http://barton.ebi.ac.uk/papers/rev93_1/rev93_1.html

Much as in typing, some mutations are more likely than others for *physical*
reasons, so treating all pairs of symbols in the alphabet alike is too gross
a simplification.).

>> Even for most developers, it would be better to package up the
>> single best approach you've got (f(list, word) -> list of possible
>> matches sorted in confidence order), instead of a module with 6
>> (or so) functions they don't understand and a pile of equally
>> mysterious knobs.

> That's why good documentation, with motivating usage hints, is
> important.  I write good documentation, Tim.

You're not going to find offense here even if you look for it, Eric <wink>:
while only a small percentage of developers don't read docs at all, everyone
else spaces out at least in linear proportion to the length of the docs.
Most people will be looking for "a solution", not for "a toolkit".  If the
docs read like a toolkit, it doesn't matter how good they are, the bulk of
the people you're trying to reach will pass on it.  If you really want this
to be *used*, supply one class that does *all* the work, including making
the expert-level choices of which algorithm is used under the covers and how
it's tuned.  That's good advice.

I still expect Guido won't want it in the core before wide use is a
demonstrated fact, though (and no, that's not a chicken-vs-egg thing:  "wide
use" for a thing outside the core is narrower than "wide use" for a thing in
the core).  An exception would likely get made if he tried it and liked it a
lot.  But to get it under his radar, it's again much easier if the usage
docs are no longer than a couple paragraphs.

I'll attach a tiny program that uses ndiff's SequenceMatcher to guess which
of the 147 std 2.0 top-level library modules a user may be thinking of (and
best I can tell, these are the same results case-folding R/O would yield):

Module name? random
Hmm.  My best guesses are random, whrandom, anydbm
(BTW, the first choice was an exact match)
Module name? disect
Hmm.  My best guesses are bisect, dis, UserDict
Module name? password
Hmm.  My best guesses are keyword, getpass, asyncore
Module name? chitchat
Hmm.  My best guesses are whichdb, stat, asynchat
Module name? xml
Hmm.  My best guesses are xmllib, mhlib, xdrlib

[So far so good]

Module name? http
Hmm.  My best guesses are httplib, tty, stat

[I was thinking of httplib, but note that it missed
 SimpleHTTPServer:  a name that long just isn't going to score
 high when the input is that short]

Module name? dictionary
Hmm.  My best guesses are Bastion, ConfigParser, tabnanny

[darn, I *think* I was thinking of UserDict there]

Module name? uuencode
Hmm.  My best guesses are code, codeop, codecs

[Missed uu]

Module name? parse
Hmm.  My best guesses are tzparse, urlparse, pre
Module name? browser
Hmm.  My best guesses are webbrowser, robotparser, user
Module name? brower
Hmm.  My best guesses are webbrowser, repr, reconvert
Module name? Thread
Hmm.  My best guesses are threading, whrandom, sched
Module name? pickle
Hmm.  My best guesses are pickle, profile, tempfile
(BTW, the first choice was an exact match)
Module name? shelf
Hmm.  My best guesses are shelve, shlex, sched
Module name? katmandu
Hmm.  My best guesses are commands, random, anydbm

[I really was thinking of "commands"!]

Module name? temporary
Hmm.  My best guesses are tzparse, tempfile, fpformat

So it gets what I was thinking of into the top 3 very often, and despite
some wildly poor guesses at the correct spelling -- you'd *almost* think it
was doing a keyword search, except the *unintended* choices on the list are
so often insane <wink>.

Something like that may be a nice addition to Paul/Ping's help facility
someday too.

Hard question:  is that "good enough" for what you want?  Checking against
147 things took no perceptible time, because SequenceMatcher is already
optimized for "compare one thing against N", doing preprocessing work on the
"one thing" that greatly speeds the N similarity computations (I suspect
you're not -- yet).  It's been tuned and tested in practice for years; it
works for any sequence type with hashable elements (so Unicode strings are
already covered); it works for long sequences too.  And if R-O is the best
trick we've got, I believe it already does it.  Do we need more?  Of course
*I'm* not convinced we even need *it* in the core, but packaging a
match-1-against-N class is just a few minutes' editing of what follows.

something-to-play-with-anyway-ly y'rs  - tim


NDIFFPATH = "/Python20/Tools/Scripts"
LIBPATH = "/Python20/Lib"

import sys, os

sys.path.append(NDIFFPATH)
from ndiff import SequenceMatcher

modules = {}  # map lowercase module stem to module name
for f in os.listdir(LIBPATH):
    if f.endswith(".py"):
        f = f[:-3]
        modules[f.lower()] = f

def match(fname, numchoices=3):
    lower = fname.lower()
    s = SequenceMatcher()
    s.set_seq2(lower)
    scores = []
    for lowermod, mod in modules.items():
        s.set_seq1(lowermod)
        scores.append((s.ratio(), mod))
    scores.sort()
    scores.reverse()
    return modules.has_key(lower), [x[1] for x in scores[:numchoices]]

while 1:
    name = raw_input("Module name? ")
    is_exact, choices = match(name)
    print "Hmm.  My best guesses are", ", ".join(choices)
    if is_exact:
        print "(BTW, the first choice was an exact match)"
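
For readers on a current interpreter, a rough sketch of the match-1-against-N class alluded to above, assuming Python 3 and the standard library's difflib.SequenceMatcher standing in for the one imported from ndiff.  The class name BestGuesser and the sample name list are hypothetical:

```python
from difflib import SequenceMatcher

class BestGuesser:
    """Rank a fixed list of names by similarity to a user-typed word."""

    def __init__(self, names):
        # Map case-folded name -> original spelling, as the driver does.
        self.by_lower = {name.lower(): name for name in names}

    def match(self, word, numchoices=3):
        lower = word.lower()
        s = SequenceMatcher()
        s.set_seq2(lower)               # preprocess the "one thing" once
        scores = []
        for lowername, name in self.by_lower.items():
            s.set_seq1(lowername)       # cheap to swap for each candidate
            scores.append((s.ratio(), name))
        scores.sort(reverse=True)
        return lower in self.by_lower, [n for _, n in scores[:numchoices]]

guesser = BestGuesser(["webbrowser", "robotparser", "user", "pickle", "shelve"])
is_exact, choices = guesser.match("browser")
print(is_exact, choices)
```

Setting the typed word as the second sequence and swapping only the first follows the driver above, so the per-word preprocessing is paid once rather than once per candidate.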




From esr at thyrsus.com  Sun Jan 14 13:15:33 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 07:15:33 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 05:45:53AM -0500
References: <20010114020857.E19782@thyrsus.com> <LNBBLJKPBEHFEDALKOLCEEIGIIAA.tim.one@home.com>
Message-ID: <20010114071533.A5812@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Yet the narrower the domain, the less call for a library with multiple
> approaches.  If R-O really shone for you, why bother with anything else?

Well, I was bothering with Levenshtein because *you* suggested it. :-)

I put in Hamming similarity and stemming because they're O(n) where
R/O is quadratic, and both widely used in situations where a fast sloppy
job is preferable to a good but slow one.  My documentation page is explicit
about the tradeoff.

> Seriously.  You haven't used some (most?) of these. 

I've used stemming and R-O.  Haven't used Hamming or Levenshtein.

>                                   The core isn't a place
> for research modules either (note that I have no objection whatsoever to
> writing any module you like -- the only question here is what belongs in the
> core, and any algorithm *nobody* here has experience with in your target
> domain is plainly a poor *core* candidate for that reason alone -- we have
> to maintain, justify and explain it for years to come).

Fair point.  I read it, in this context, as good advice to drop the Hamming 
entry point and forget about the Levenshtein implementation -- stick to what
I've used and know is useful as opposed to what I think might be useful.

>                                                I may well agree you don't
> need all that heavy machinery if I had a clear definition of what problem it
> is you're trying to solve (I've learned it's not the kinds of problems *I*
> had in mind when I first read your description!).

I think you have it by now, judging by the following...

> What happens when the user doesn't enter an exact match?  Does the kind of
> app you have in mind then just present them with a list of choices? 

Yes.  I've used this technique a lot.  It gives users not just guidance 
but warm fuzzy feelings -- they react as though there's a friendly 
homunculus inside the software looking out for them.  Actually, in my
experience, the less techie they are the more they like this.

> If that's all (as opposed to, e.g., substituting its best guess for what the
> user actually typed and proceeding as if the user had given that from the
> start), then the evidence from studies says users are almost as pleased when
> the correct choice appears somewhere in the first three choices as when it
> appears as *the* top choice.

Interesting.  That does fit what I've seen.

>                    A well-designed vocabulary can almost
> guarantee that happy result (note that most of the current research is aimed
> at the much harder job of getting the intended word into the #1 slot on the
> choice list).

Yes.  One of my other tricks is to design command vocabularies so the
first three characters are close to unique.  This means R/O will almost
always nail the right thing.

> Much as in typing, some mutations are more likely than others for *physical*
> reasons, so treating all pairs of symbols in the alphabet alike is too gross
> a simplification.).

Indeed.  Couple weeks ago I was a speaker at a conference called "After the
Genome 6" at which one of the most interesting papers was given by a lady
mathematician who designs algorithms for DNA sequence matching.  She made
exactly this point.

> > That's why good documentation, with motivating usage hints, is
> > important.  I write good documentation, Tim.
> 
> You're not going to find offense here even if you look for it, Eric <wink>:

No worries, I wasn't looking. :-)

> Most people will be looking for "a solution", not for "a toolkit".  If the
> docs read like a toolkit, it doesn't matter how good they are, the bulk of
> the people you're trying to reach will pass on it.  If you really want this
> to be *used*, supply one class that does *all* the work, including making
> the expert-level choices of which algorithm is used under the covers and how
> it's tuned.  That's good advice.

I don't think that's possible in this case -- the proper domains for
stemming and R-O are too different.  But maybe this is another nudge to drop
the Hamming code.

>       But to get it under his radar, it's again much easier if the usage
> docs are no longer than a couple paragraphs.

How's this?

\section{\module{simil} -- 
         String similarity metrics}

\declaremodule{standard}{simil}
\moduleauthor{Eric S. Raymond}{esr at thyrsus.com}
\modulesynopsis{String similarity metrics.}

\sectionauthor{Eric S. Raymond}

The \module{simil} module provides similarity functions for
approximate word or string matching.  One important application is for
checking input words against a dictionary to match possible
misspellings with the right terms in a controlled vocabulary.

The entry points provide different tradeoffs ranging from crude and
fast (stemming) to effective but slow (Ratcliff-Obershelp gestalt
subpattern matching).  The latter is one of the standard techniques
used in commercial OCR software.

The \module{simil} module defines the following functions:

\begin{funcdesc}{stem}{}
Returns the length of the longest common prefix of two strings divided
by the length of the longer.  Similarity scores range from 0.0 (no
common prefix) to 1.0 (identity).  Running time is linear in string
length.
\end{funcdesc}

\begin{funcdesc}{hamming}{}
Computes a normalized Hamming similarity between two strings of equal
length -- the number of pairwise matches in the strings, divided by
their common length.  It returns None if the strings are of unequal
length.  Similarity scores range from 0.0 (no positions equal) to 1.0
(identity).  Running time is linear in string length.
\end{funcdesc}

\begin{funcdesc}{ratcliff}{}
Returns a Ratcliff/Obershelp gestalt similarity score based on
co-occurrence of subpatterns.  Similarity scores range from 0.0 (no
common subpatterns) to 1.0 (identity).  Running time is best-case
linear, worst-case quadratic in string length.
\end{funcdesc}
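
The two linear-time measures are simple enough to sketch in pure Python.  This is an illustration of the descriptions above, assuming Python 3; it is not the module's C source, and the handling of two empty strings (scored 1.0 here) is an assumption the text does not settle:

```python
def stem(a, b):
    """Length of the longest common prefix divided by the longer length."""
    longer = max(len(a), len(b))
    if longer == 0:
        return 1.0                  # assumption: two empty strings are identical
    common = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        common += 1
    return common / longer

def hamming(a, b):
    """Fraction of positions that agree; None for unequal lengths."""
    if len(a) != len(b):
        return None
    if not a:
        return 1.0                  # assumption, as in stem()
    return sum(ca == cb for ca, cb in zip(a, b)) / len(a)

print(stem("shelf", "shelve"))      # shares the prefix "shel"
print(hamming("form", "from"))      # first and last letters agree
print(hamming("form", "forms"))     # unequal lengths
```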

> Module name? http
> Hmm.  My best guesses are httplib, tty, stat
> 
> [I was thinking of httplib, but note that it missed
>  SimpleHTTPServer:  a name that long just isn't going to score
>  high when the input is that short]

>>> simil.ratcliff("http", "httplib")
0.72727274894714355
>>> simil.ratcliff("http", "tty")
0.57142859697341919
>>> simil.ratcliff("http", "stat")
0.5
>>> simil.ratcliff("http", "simplehttpserver")
0.40000000596046448

So with the 0.6 threshold I normally use R-O does better at eliminating
the false matches but doesn't catch SimpleHTTPServer (case is, I'm
sure you'll agree, an irrelevant detail here).
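
The effect of such a cutoff can be reproduced with difflib.get_close_matches from the Python 3 standard library, which takes the same kind of threshold.  This is a sketch under the assumption that difflib's ratio agrees with simil.ratcliff on these inputs; the candidate list is just the four names scored above:

```python
from difflib import get_close_matches

candidates = ["httplib", "tty", "stat", "simplehttpserver"]

# Keep at most three candidates whose similarity to "http" is >= 0.6.
print(get_close_matches("http", candidates, n=3, cutoff=0.6))

# Lowering the cutoff lets the weaker matches back in, best first.
print(get_close_matches("http", candidates, n=3, cutoff=0.5))
```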
 
> Module name? dictionary
> Hmm.  My best guesses are Bastion, ConfigParser, tabnanny
> 
> [darn, I *think* I was thinking of UserDict there]

>>> simil.ratcliff("dictionary", "bastion")
0.47058823704719543
>>> simil.ratcliff("dictionary", "configparser")
0.45454546809196472
>>> simil.ratcliff("dictionary", "tabnanny")
0.4444444477558136
>>> simil.ratcliff("dictionary", "userdict")
0.4444444477558136

R-O would have booted all of these.  Highest score to configparser.
Interesting -- I'm beginning to think R-O overweights lots of small
subpattern matches relative to a few big ones, something I didn't notice
before because the statistics of my vocabularies masked it.

> Module name? uuencode
> Hmm.  My best guesses are code, codeop, codecs

>>> simil.ratcliff("uuencode", "code")
0.66666668653488159
>>> simil.ratcliff("uuencode", "codeops")
0.53333336114883423
>>> simil.ratcliff("uuencode", "codecs")
0.57142859697341919
>>> simil.ratcliff("uuencode", "uu")
0.40000000596046448

R-O would pick "code" and boot the rest.

> [Missed uu]
> 
> Module name? parse
> Hmm.  My best guesses are tzparse, urlparse, pre

>>> simil.ratcliff("parse", "tzparse")
0.83333331346511841
>>> simil.ratcliff("parse", "urlparse")
0.76923078298568726
>>> simil.ratcliff("parse", "pre")
0.75

Same result.

> Module name? browser
> Hmm.  My best guesses are webbrowser, robotparser, user

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

Big win for R-O.  Picks the right one, boots the wrong two.

> Module name? brower
> Hmm.  My best guesses are webbrowser, repr, reconvert

>>> simil.ratcliff("brower", "webbrowser")
0.75
>>> simil.ratcliff("brower", "repr")
0.60000002384185791
>>> simil.ratcliff("brower", "reconvert")
0.53333336114883423

Small win for R/O -- boots reconvert, and repr squeaks in under the wire.

> Module name? Thread
> Hmm.  My best guesses are threading, whrandom, sched

>>> simil.ratcliff("thread", "threading")
0.80000001192092896
>>> simil.ratcliff("thread", "whrandom")
0.57142859697341919
>>> simil.ratcliff("thread", "sched")
0.54545456171035767

Big win for R-O.

> Module name? pickle
> Hmm.  My best guesses are pickle, profile, tempfile

>>> simil.ratcliff("pickle", "pickle")
1.0
>>> simil.ratcliff("pickle", "profile")
0.61538463830947876
>>> simil.ratcliff("pickle", "tempfile")
0.57142859697341919

R-O wins again.

> (BTW, the first choice was an exact match)
> Module name? shelf
> Hmm.  My best guesses are shelve, shlex, sched

>>> simil.ratcliff("shelf", "shelve")
0.72727274894714355
>>> simil.ratcliff("shelf", "shlex")
0.60000002384185791
>>> simil.ratcliff("shelf", "sched")
0.60000002384185791

Interesting.  Shelve scores highest, both the others squeak in.

> Module name? katmandu
> Hmm.  My best guesses are commands, random, anydbm
>
> [I really was thinking of "commands"!]

>>> simil.ratcliff("commands", "commands")
1.0
>>> simil.ratcliff("commands", "random")
0.4285714328289032
>>> simil.ratcliff("commands", "anydbm")
0.4285714328289032

R-O wins big.
 
> Module name? temporary
> Hmm.  My best guesses are tzparse, tempfile, fpformat

>>> simil.ratcliff("temporary", "tzparse")
0.5
>>> simil.ratcliff("temporary", "tempfile")
0.47058823704719543
>>> simil.ratcliff("temporary", "fpformat")
0.47058823704719543

R-O boots all of these.  

> Hard question:  is that "good enough" for what you want?

Um...notice that R-O filtering, even though it seems to be
underweighting large matches, did a rather better job on your examples!
With a 0.66 threshold it would have done *much* better.

I think you've just made an argument for replacing your SequenceMatcher
with simil.ratcliff.  Mine's even documented. :-).
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Militias, when properly formed, are in fact the people themselves and
include all men capable of bearing arms. [...] To preserve liberty it is
essential that the whole body of the people always possess arms and be
taught alike, especially when young, how to use them.
        -- Senator Richard Henry Lee, 1788, on "militia" in the 2nd Amendment



From ping at lfw.org  Sun Jan 14 13:38:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 04:38:42 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>

Sorry i'm being forgetful -- could someone please refresh my memory:

Was there a good reason for allowing both lowercase and capital 'r'
as a prefix for raw-strings?  I assume that the availability of both
r'' and R'' is what led to having both u'' and U''.  Is there any
good reason for that either?

This just seems to lead to ambiguity and unneeded complexity:
more cases in tokenize.py, more cases in tokenize.c, more work
for IDLE, more annoying when searching for u' in your editor.
(I was about to fix the lack of u'' support in tokenize.py and
that made me think about this.)

What happened to TOOWTDI?

Would you believe we now have 36 different ways of starting a string:

    '      "      '''    """
    r'     r"     r'''   r"""
    u'     u"     u'''   u"""
    ur'    ur"    ur'''  ur"""
    R'     R"     R'''   R"""
    U'     U"     U'''   U"""
    uR'    uR"    uR'''  uR"""
    Ur'    Ur"    Ur'''  Ur"""
    UR'    UR"    UR'''  UR"""

Would it be outrageous to suggest deprecating the last five rows?
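
The count is easy to check by enumeration.  A small sketch, assuming Python 3 purely as a calculator (it builds the openers as ordinary strings and says nothing about which prefixes any given interpreter accepts):

```python
from itertools import product

unicode_part = ["", "u", "U"]       # optional Unicode marker
raw_part = ["", "r", "R"]           # optional raw marker
quotes = ["'", '"', "'''", '"""']   # the four quoting styles

openers = [u + r + q for u, r, q in product(unicode_part, raw_part, quotes)]
print(len(openers))                 # 3 * 3 * 4 combinations

# What would remain if the uppercase variants (the last five rows) went away.
lowercase_only = [o for o in openers if not any(c.isupper() for c in o)]
print(len(lowercase_only))          # 2 * 2 * 4 combinations
```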


-- ?!ng

[1] We started with 4.  Perl has (by my count) 381 ways of starting
    a string literal, so we're halfway there, logarithmically speaking.
    Perl has 757 if you count the fancier operators qx, qw, s, and tr.




From mal at lemburg.com  Sun Jan 14 14:33:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:33:29 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCIEHCIIAA.tim.one@home.com>
Message-ID: <3A61AAA9.F6F1EA9F@lemburg.com>

[Lots of talk about interesting algorithms for "human" pattern matching]

I just want to add my 2 cents to the discussion:

* Eric's package seems very useful for pattern matching, but that
  is a very specific domain -- not mainstream

* I would opt to create a neat distutils style package for it
  for people to install at their own liking (I would certainly
  like it :)

* If wrapped up as a separate package, I'd suggest adding all
  known algorithms to the package and also making it Unicode
  aware. There are similar packages for e.g. RNGs on Parnassus.

BTW, are there less English-centric "sounds alike" matchers
around? The NIST soundex algorithm as published on the internet:

    http://physics.nist.gov/cuu/Reference/soundex.html

works fine for English texts, but other languages of course
have different letter coding requirements (or even different
alphabets).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Sun Jan 14 14:53:03 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sun, 14 Jan 2001 14:53:03 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <3A61AF3F.EE6DAB88@lemburg.com>

Ka-Ping Yee wrote:
> 
> Sorry i'm being forgetful -- could someone please refresh my memory:
> 
> Was there a good reason for allowing both lowercase and capital 'r'
> as a prefix for raw-strings?  I assume that the availability of both
> r'' and R'' is what led to having both u'' and U''. 

Right.

> Is there any
> good reason for that either?

No idea... I have never used anything other than the lowercase
versions.
 
> This just seems to lead to ambiguity and unneeded complexity:
> more cases in tokenize.py, more cases in tokenize.c, more work
> for IDLE, more annoying when searching for u' in your editor.
> (I was about to fix the lack of u'' support in tokenize.py and
> that made me think about this.)
> 
> What happened to TOOWTDI?
> 
> Would you believe we now have 36 different ways of starting a string:
> 
>     '      "      '''    """
>     r'     r"     r'''   r"""
>     u'     u"     u'''   u"""
>     ur'    ur"    ur'''  ur"""
>     R'     R"     R'''   R"""
>     U'     U"     U'''   U"""
>     uR'    uR"    uR'''  uR"""
>     Ur'    Ur"    Ur'''  Ur"""
>     UR'    UR"    UR'''  UR"""
>
> Would it be outrageous to suggest deprecating the last five rows?

No. + 1 on the idea.
 
> -- ?!ng
> 
> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.
> 
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://www.python.org/mailman/listinfo/python-dev

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Sun Jan 14 15:24:08 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 14 Jan 2001 15:24:08 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Sun, Jan 14, 2001 at 04:38:42AM -0800
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010114152408.G1005@xs4all.nl>

On Sun, Jan 14, 2001 at 04:38:42AM -0800, Ka-Ping Yee wrote:

> [1] We started with 4.  Perl has (by my count) 381 ways of starting
>     a string literal, so we're halfway there, logarithmically speaking.
>     Perl has 757 if you count the fancier operators qx, qw, s, and tr.

Don't forget 'qr//', which is quite like a raw string, except that Perl uses
it to 'precompile' regular expressions as a side effect. 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Sun Jan 14 18:08:28 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 12:08:28 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:53:03 +0100."
             <3A61AF3F.EE6DAB88@lemburg.com> 
References: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>  
            <3A61AF3F.EE6DAB88@lemburg.com> 
Message-ID: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > Sorry i'm being forgetful -- could someone please refresh my memory:
> > 
> > Was there a good reason for allowing both lowercase and capital 'r'
> > as a prefix for raw-strings?  I assume that the availability of both
> > r'' and R'' is what led to having both u'' and U''. 
> 
> Right.
> 
> > Is there any
> > good reason for that either?
> 
> No idea... I have never used anything other than the lowercase
> versions.

It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
well as 0l.  So does Python (and also 0j == 0J).

> > This just seems to lead to ambiguity and unneeded complexity:
> > more cases in tokenize.py, more cases in tokenize.c, more work
> > for IDLE, more annoying when searching for u' in your editor.
> > (I was about to fix the lack of u'' support in tokenize.py and
> > that made me think about this.)
> > 
> > What happened to TOOWTDI?
> > 
> > Would you believe we now have 36 different ways of starting a string:
> > 
> >     '      "      '''    """
> >     r'     r"     r'''   r"""
> >     u'     u"     u'''   u"""
> >     ur'    ur"    ur'''  ur"""
> >     R'     R"     R'''   R"""
> >     U'     U"     U'''   U"""
> >     uR'    uR"    uR'''  uR"""
> >     Ur'    Ur"    Ur'''  Ur"""
> >     UR'    UR"    UR'''  UR"""
> >
> > Would it be outrageous to suggest deprecating the last five rows?
> 
> No. + 1 on the idea.

Why bother?  All that does is outdate a bunch of documentation.  I
don't see the extra effort in various parsers as a big deal.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sun Jan 14 18:53:32 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 18:53:32 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
Message-ID: <010f01c07e52$e9801fc0$e46940d5@hagrid>

The name database portions of SF task 17335 ("add
compressed unicode database") were postponed to
2.1.

My current patch replaces the ~450k large ucnhash
module with a new ~160k large module.  (See earlier
posts for more info on how the new database works).

Should I check it in?

</F>




From skip at mojam.com  Sun Jan 14 18:51:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 14 Jan 2001 11:51:52 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
Message-ID: <14945.59192.400783.403810@beluga.mojam.com>

Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
standard distribution.

Biggest hook for me:

   1. execute "pydoc -p 3200"
   2. visit "http://localhost:3200/"
   3. knock yourself out

Skip



From martin at mira.cs.tu-berlin.de  Sun Jan 14 18:57:57 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 14 Jan 2001 18:57:57 +0100
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
Message-ID: <200101141757.f0EHvvt01407@mira.informatik.hu-berlin.de>

> > Would it be outrageous to suggest deprecating the last five rows?
> Why bother?  All that does is outdate a bunch of documentation.

He suggested deprecating it, not removing it. By the time it is
removed, the documentation still mentioning it should be outdated for
other reasons (e.g. the string module might have disappeared).

In general, the rationale for deprecating things would be that the
simplification will make everybody's life easier in the long run. In
the case of a small change (such as this one), that advantage would be
small. OTOH, the hassle for users that rely on the then-removed
feature will also be small; I see it as quite unlikely that anybody
uses that feature actively (although I do think that people use 0X10
and 100L; the latter is common since 100l is oft confused with 1001).

Regards,
Martin



From tim.one at home.com  Sun Jan 14 20:00:21 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:00:21 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114071533.A5812@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>

Very quick (swamped):

> I think you've just made an argument for replacing your
> SequenceMatcher with simil.ratcliff.

Actually, I'm certain they're the same algorithm now, except the C is
showing through in ratcliff to the floating-point eye <wink>.  For
demonstration, I *always* printed the top three scorers (that's logic in the
little driver I posted, not in SequenceMatcher), without any notion of
cutoff (ndiff does use a cutoff).  Add this line before the return (in the
posted driver) to see the actual scores:

    print scores[:numchoices]

For example:

Module name? browser
[(0.82352941176470584, 'webbrowser'),
 (0.55555555555555558, 'robotparser'),
 (0.54545454545454541, 'user')]
Hmm.  My best guesses are webbrowser, robotparser, user
Module name?

On this example you reported:

>>> simil.ratcliff("browser", "webbrowser")
0.82352942228317261
>>> simil.ratcliff("browser", "robotparser")
0.55555558204650879
>>> simil.ratcliff("browser", "user")
0.54545456171035767

which strongly suggests you're using C floats instead of Python floats to
compute the final score.  I didn't try every example in your email, but it's
the same story on the three I did try (scores identical modulo
simil.ratcliff dropping about 30 of the low-order result bits -- which is
about the difference between a C double and a C float on most boxes).
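
The float-versus-double effect is easy to reproduce without any C.  A minimal sketch, assuming Python 3 and the standard struct module; the helper name as_c_float is made up for illustration, and the score 14/17 is inferred from the "browser" vs "webbrowser" figures (7 matching characters out of 17 total):

```python
import struct

def as_c_float(x):
    """Round a Python float (a C double) through a 4-byte C float."""
    return struct.unpack("f", struct.pack("f", x))[0]

exact = 2.0 * 7 / 17            # score computed in double precision
rounded = as_c_float(exact)     # same score squeezed through a C float

print(repr(exact))
print(repr(rounded))
print(abs(exact - rounded))     # error introduced by the narrower type
```

The two values agree only in their leading digits and then diverge, the same pattern as in the two sets of scores quoted above.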

> Mine's even documented. :-).

Which I appreciate!  I dreamt up the SequenceMatcher algorithm going on 20
years ago for a friendly diff generator, and never even considered using it
for other purposes.  But then I may have mentioned that these other purposes
never come up in my apps <wink>.

or-at-least-they-haven't-in-contexts-where-r/o-would-have-been-
    strong-enough-ly y'rs  - tim




From bckfnn at worldonline.dk  Sun Jan 14 20:00:33 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 19:00:33 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3a61f12a.36601630@smtp.worldonline.dk>

On Sun, 14 Jan 2001 18:53:32 +0100, you wrote:

>The name database portions of SF task 17335 ("add
>compressed unicode database") were postponed to
>2.1.
>
>My current patch replaces the ~450k large ucnhash
>module with a new ~160k large module.  (See earlier
>posts for more info on how the new database works).

Do you have a link or an approximate date for these earlier posts? I must have
missed it. The patch on sourceforge seems a bit empty:

https://sourceforge.net/patch/index.php?func=detailpatch&patch_id=100899&group_id=5470

As a result I invented my own compression format for the ucnhash for
jython. I managed to achieve ~100k, but that probably has different
performance properties.

regards,
finn



From esr at thyrsus.com  Sun Jan 14 20:09:01 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 14:09:01 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:00:21PM -0500
References: <20010114071533.A5812@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEJAIIAA.tim.one@home.com>
Message-ID: <20010114140901.A6431@thyrsus.com>

Tim Peters <tim.one at home.com>:
> > I think you've just made an argument for replacing your
> > SequenceMatcher with simil.ratcliff.
> 
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

Take a look:

/*****************************************************************************
 *
 * Ratcliff-Obershelp common-subpattern similarity.
 *
 * This code first appeared in a letter to the editor in Dr.
 * Dobb's Journal, 11/1988.  The original article on the algorithm,
 * "Pattern Matching by Gestalt" by John Ratcliff, had appeared in the
 * July 1988 issue (#181) but the algorithm was presented in assembly.
 * The main drawback of the Ratcliff-Obershelp algorithm is the cost
 * of the pairwise comparisons.  It is significantly more expensive
 * than stemming, Hamming distance, soundex, and the like.
 *
 * Running time quadratic in the data size, memory usage constant.
 *
 *****************************************************************************/

static int RatcliffObershelp(char *st1, char *end1, char *st2, char *end2)
{
    register char *a1, *a2;
    char *b1, *b2; 
    char *s1 = st1, *s2 = st2;	/* initializations are just to pacify GCC */
    short max, i;

    if (end1 <= st1 || end2 <= st2)
	return(0);
    if (end1 == st1 + 1 && end2 == st2 + 1)
	return(0);
		
    max = 0;
    b1 = end1; b2 = end2;
	
    for (a1 = st1; a1 < b1; a1++)
    {
	for (a2 = st2; a2 < b2; a2++)
	{
	    if (*a1 == *a2)
	    {
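		/* determine length of common substring; stay inside
		   [st1,end1) and [st2,end2) so a recursive call on a
		   sub-range cannot count characters outside it */
		for (i = 1; a1 + i < end1 && a2 + i < end2
			 && a1[i] == a2[i]; i++)
		    continue;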
		if (i > max)
		{
		    max = i; s1 = a1; s2 = a2;
		    b1 = end1 - max; b2 = end2 - max;
		}
	    }
	}
    }
    if (!max)
	return(0);
    max += RatcliffObershelp(s1 + max, end1, s2 + max, end2);	/* rhs */
    max += RatcliffObershelp(st1, s1, st2, s2);			/* lhs */
    return max;
}

static float ratcliff(char *s1, char *s2)
/* compute Ratcliff-Obershelp similarity of two strings */
{
    short l1, l2;

    l1 = strlen(s1);
    l2 = strlen(s2);
	
    /* exact match end-case */
    if (l1 == 1 && l2 == 1 && *s1 == *s2)
	return(1.0);
			
    return 2.0 * RatcliffObershelp(s1, s1 + l1, s2, s2 + l2) / (l1 + l2);
}

static PyObject *
simil_ratcliff(PyObject *self, PyObject *args)
{
    char *str1, *str2;
    
    if(!PyArg_ParseTuple(args, "ss:ratcliff", &str1, &str2))
        return NULL;

    return Py_BuildValue("f", ratcliff(str1, str2));
}
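
For comparison, here is a rough Python rendering of the recursive idea described in the header comment. It is a sketch, not a line-by-line port of the C: the names matched_chars and ratcliff_py are made up for illustration, and the one-character special cases of the C version are left out. The usage lines check it against today's difflib.SequenceMatcher on one pair of strings only.

```python
import difflib


def matched_chars(a, b):
    """Count characters matched by recursive longest-common-substring splitting."""
    best_len = best_i = best_j = 0
    for i in range(len(a)):
        for j in range(len(b)):
            k = 0
            # Extend the match while both strings agree and neither runs out.
            while i + k < len(a) and j + k < len(b) and a[i + k] == b[j + k]:
                k += 1
            if k > best_len:
                best_len, best_i, best_j = k, i, j
    if best_len == 0:
        return 0
    # Recurse on the pieces to the left and to the right of the best block.
    left = matched_chars(a[:best_i], b[:best_j])
    right = matched_chars(a[best_i + best_len:], b[best_j + best_len:])
    return left + best_len + right


def ratcliff_py(a, b):
    """Similarity in [0, 1]: twice the matched characters over the total length."""
    if not a and not b:
        return 1.0
    return 2.0 * matched_chars(a, b) / (len(a) + len(b))


score = ratcliff_py("WIKIMEDIA", "WIKIMANIA")
reference = difflib.SequenceMatcher(
    None, "WIKIMEDIA", "WIKIMANIA", autojunk=False).ratio()
print(score, reference)
```

Like the C, this is quadratic (or worse) in the input size, which is fine for short names and hopeless for diffing files.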
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Taking my gun away because I might shoot someone is like cutting my tongue
out because I might yell `Fire!' in a crowded theater."
        -- Peter Venetoklis



From fredrik at effbot.org  Sun Jan 14 20:31:06 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sun, 14 Jan 2001 20:31:06 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk>
Message-ID: <040e01c07e60$8c74d100$e46940d5@hagrid>

finn wrote:
> As a result I invented my own compression format for the ucnhash for
> jython. I managed to achieve ~100k but that probably has different
> performance properties.

here's the description:

---

From: "Fredrik Lundh" <effbot at telia.com>
Date: Sun, 16 Jul 2000 20:40:46 +0200

/.../

    The unicodenames database consists of two parts: a name
    database which maps character codes to names, and a code
    database, mapping names to codes.

* The Name Database (getname)

    First, the 10538 text strings are split into 42193 words,
    and combined into a 4949-word lexicon (a 29k array).

    Each word is given a unique index number (common words get
    lower numbers), and there's a "lexicon offset" table mapping
    from numbers to words (10k).

    To get back to the original text strings, I use a "phrase
    book".  For each original string, the phrase book stores a
    list of word numbers.  Numbers 0-127 are stored in one byte,
    higher numbers (less common words) use two bytes.  At this
    time, about 65% of the words can be represented by a single
    byte.  The result is a 56k array.
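
The lexicon and phrase book can be sketched in Python as follows. This is only an illustration under stated assumptions: the description above does not say how the two-byte form is laid out, so the sketch assumes the high bit of the first byte marks a two-byte number; build_lexicon, encode_phrase and decode_phrase are hypothetical names, not functions from the patch.

```python
from collections import Counter


def build_lexicon(names):
    """Number the words so that more common words get lower numbers."""
    counts = Counter(word for name in names for word in name.split())
    return {word: n for n, (word, _) in enumerate(counts.most_common())}


def encode_phrase(word_numbers):
    """Pack word numbers: 0-127 in one byte, larger numbers in two bytes."""
    out = bytearray()
    for n in word_numbers:
        if n < 128:
            out.append(n)
        else:
            if n >= 1 << 15:
                raise ValueError("word number too large for two bytes")
            # Assumed layout: high bit set on the first byte, 15 bits of payload.
            out.append(0x80 | (n >> 8))
            out.append(n & 0xFF)
    return bytes(out)


def decode_phrase(data):
    """Inverse of encode_phrase."""
    numbers = []
    i = 0
    while i < len(data):
        first = data[i]
        if first < 128:
            numbers.append(first)
            i += 1
        else:
            numbers.append(((first & 0x7F) << 8) | data[i + 1])
            i += 2
    return numbers


names = ["LATIN SMALL LETTER A", "LATIN CAPITAL LETTER A",
         "GREEK SMALL LETTER ALPHA"]
lexicon = build_lexicon(names)
phrase = encode_phrase([lexicon[w] for w in names[0].split()])
print(lexicon, phrase)
```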

    The final data structure is an offset table, which maps code
    points to phrase book offsets.  Instead of using one big
    table, I split each code point into a "page number" and a
    "line number" on that page.

      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]

    Since the unicode space is sparsely populated, it's possible
    to split the code so that lots of pages get no contents.  I
    use a brute force search to find the optimal SHIFT value.

    In the current database, the page table has 1024 entries
    (SHIFT is 6), and there are 199 unique pages in the line
    table.  The total size of the offset table is 26k.
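
A minimal Python sketch of the page/line split, assuming the flat table is simply a list indexed by code point; build_tables and lookup are hypothetical names. With SHIFT = 6, 1024 pages of 64 entries cover 65536 code points, and identical pages (for example the all-empty ones) are stored only once in the line table.

```python
def build_tables(offsets, shift):
    """Split a flat code->offset list into a page table and a line table.

    len(offsets) is assumed to be a multiple of 1 << shift.
    """
    size = 1 << shift
    page, line, seen = [], [], {}
    for start in range(0, len(offsets), size):
        chunk = tuple(offsets[start:start + size])
        if chunk not in seen:
            # First time this page content appears: append it to the line table.
            seen[chunk] = len(seen)
            line.extend(chunk)
        page.append(seen[chunk])
    return page, line


def lookup(page, line, code, shift):
    """offset = line[(page[code >> SHIFT] << SHIFT) + (code & MASK)]"""
    mask = (1 << shift) - 1
    return line[(page[code >> shift] << shift) + (code & mask)]


# Toy data: 256 code points, only the second block of 64 is populated.
flat = [0] * 64 + list(range(1, 65)) + [0] * 128
page, line = build_tables(flat, 6)
print(page, len(line))
```

The brute-force search for SHIFT mentioned above would amount to calling build_tables for each candidate value and keeping the one with the smallest combined table size.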

* The code database (getcode)

    For the code table, I use a straight-forward hash table to store
    name to code mappings.  It's basically the same implementation
    as in Python's dictionary type, but a different hash algorithm.
    The table lookup loop simply uses the name database to check
    for hits.

    In the current database, the hash table is 32k.

/.../

</F>




From tim.one at home.com  Sun Jan 14 20:46:44 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 14:46:44 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A61AAA9.F6F1EA9F@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>

[M.-A. Lemburg]
> BTW, are there less English centric "sounds alike" matchers
> around ?

Yes, but if anything there are far too many of them:  like Soundex, they're
just heuristics, and *everybody* who cares adds their own unique twists,
while proper studies are almost non-existent.  Few variants appear to be in
use much beyond their inventor's friends; one notable exception in the
Jewish community is the Daitch-Mokotoff variation, originally tailored to
their unique needs but later generalized; a brief description here:

    http://www.avotaynu.com/soundex.html

The similarly involved NYSIIS algorithm (New York State Identification
Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
field of about two dozen competing algorithms, after measuring their
effectiveness on assorted databases maintained by the state of New York.
Since New York has a large immigrant population, NYSIIS isn't as
Anglocentric as Soundex either.

But state-of-the-art has given up on purely computational algorithms for
these purposes:  proper names are simply too much a mess.  For example, if I
search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
actually use just aren't reducible to pure computation -- it takes a large
knowledge base to capture what people "just know".  You may enjoy visiting
this commercial site (AFAIK, nobody is giving away state-of-the-art for
free):

    http://www.las-inc.com/

> ...
>     http://physics.nist.gov/cuu/Reference/soundex.html
>
> works fine for English texts,

If that were true, the English-speaking researchers would have declared
victory 120 years ago <wink>.  But English pronunciation is *notoriously*
difficult to predict from spelling, partly because English is the Perl of
human languages.

or-maybe-the-borg-assuming-there's-a-difference<wink>-ly y'rs  - tim




From esr at thyrsus.com  Sun Jan 14 21:17:53 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:17:53 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 02:46:44PM -0500
References: <3A61AAA9.F6F1EA9F@lemburg.com> <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <20010114151753.A6671@thyrsus.com>

Tim Peters <tim.one at home.com>:
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Actually, according to the Oxford Encyclopedia of Linguistics, this is
an urban myth.  The orthography of English is, in fact, quite
consistent; it looks much more wacked out than it is because the
maddening irregularities are concentrated in the 400 most commonly
used words.

The situation is much like that with French verb forms -- most French
verbs have a very regular inflection pattern, but the twenty or so
exceptions are the most commonly used ones.  In fact it's a general
rule in language evolution that irregularities are preserved in common
forms and not rare ones -- in the rare ones they get forgotten.

American personal names are a problem precisely because they sometimes
do *not* have English orthography.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

  "...quemadmodum gladius neminem occidit, occidentis telum est."
[...a sword never kills anybody; it's a tool in the killer's hand.]
        -- (Lucius Annaeus) Seneca "the Younger" (ca. 4 BC-65 AD),



From tim.one at home.com  Sun Jan 14 21:31:06 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 15:31:06 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <20010114140901.A6431@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>

[Tim]
> Actually, I'm certain they're the same algorithm now, except the C is
> showing through in ratcliff to the floating-point eye <wink>.

[Eric]
> Take a look:

Yup, same thing, except:

> static float ratcliff(char *s1, char *s2)

accounts for the numeric differences (change "float"->"double" and they'd be
the same; Python has to convert it to a double anyway, lacking any internal
support for C's floats; and the C code is *computing* in double regardless,
cutting it back to a float upon return just because of the "float" decl).
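
The effect is easy to reproduce from Python with the struct module, which can round a double to a C float and back. A small sketch; as_c_float is a made-up helper name, and 14/18 is just an example ratio.

```python
import struct


def as_c_float(x):
    """Round a Python float (a C double) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


ratio = 2.0 * 7 / 18          # computed in double precision
narrowed = as_c_float(ratio)  # what a "float" return type hands back
print(repr(ratio), repr(narrowed))
```

The two values agree to roughly seven significant digits and differ after that.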

The code in SequenceMatcher doesn't *look* anything like it, though, due to
years of dreaming up faster ways to do this (in its original role as a diff
generator, it routinely had to deal with sequences containing 10s of
thousands of elements, and code very much like the code you posted was just
too slow for that).

One simple trick that can enormously speed the worst cases:  the "find the
longest match starting here" innermost loop is guarded by

> 	    if (*a1 == *a2)

However, it can't possibly find a *bigger* max unless it's also the case
that

    a1[max] == a2[max]

That's usually false in real life, so by adding that test to the guard you
usually get to skip the innermost loop entirely.  Probably more important in
a diff-generator role, though.

SequenceMatcher's prime trick is to preprocess one of the strings, in linear
time building up a hash table mapping each character in the string to a list
of the indices at which it appears.  Then the second-innermost loop is saved
from needing to do any search:  when we get to, e.g., 'x' in the other
string, the precomputed hash table tells us directly where to find all the
x's in the original string.  And in the match-1-against-N case, this hash
table can be computed once & reused N times.  That's a monster win.
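
A stripped-down Python sketch of that trick, for illustration only: build_index and longest_match are hypothetical names, and the real SequenceMatcher does more (junk handling, sub-ranges).

```python
from collections import defaultdict


def build_index(b):
    """Map each element of b to the list of indices where it occurs."""
    index = defaultdict(list)
    for j, ch in enumerate(b):
        index[ch].append(j)
    return index


def longest_match(a, index):
    """Return (i, j, size) with a[i:i+size] == b[j:j+size] and size maximal."""
    best_i = best_j = best_size = 0
    run = {}  # run[j]: length of the match ending at a[i-1] and b[j]
    for i, ch in enumerate(a):
        new_run = {}
        # No search through b: the index says where ch occurs.
        for j in index.get(ch, ()):
            size = run.get(j - 1, 0) + 1
            new_run[j] = size
            if size > best_size:
                best_i, best_j, best_size = i - size + 1, j - size + 1, size
        run = new_run
    return best_i, best_j, best_size


index = build_index("WIKIMANIA")        # built once ...
for candidate in ("WIKIMEDIA", "MANIAC"):
    print(candidate, longest_match(candidate, index))  # ... reused N times
```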

However, I never had the patience to code that in C, so I never *did* that
before I reimplemented my stuff in Python.  Now the Python ndiff runs
circles around the old Pascal and C versions.  I'm sure that has nothing to
do with machines having gotten 100x faster in the meantime <wink>.

for-short-1-against-1-matches-yours-will-certainly-be-quicker-ly
    y'rs  - tim




From guido at python.org  Sun Jan 14 21:55:21 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 15:55:21 -0500
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: Your message of "Sun, 14 Jan 2001 11:51:52 CST."
             <14945.59192.400783.403810@beluga.mojam.com> 
References: <14945.59192.400783.403810@beluga.mojam.com> 
Message-ID: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>

> Ping's pydoc is awesome!  Move it out of the sandbox and put it in the
> standard distribution.
> 
> Biggest hook for me:
> 
>    1. execute "pydoc -p 3200"
>    2. visit "http://localhost:3200/"
>    3. knock yourself out

Yes, wow!

Now, if we could somehow get this to show both the docs that Fred
maintains and the stuff that Ping extracts from the source code, that
would be even better!  (I think that Ping's stuff should also run on
the python.org site, by the way.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Sun Jan 14 21:59:28 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 14 Jan 2001 15:59:28 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 14, 2001 at 03:31:06PM -0500
References: <20010114140901.A6431@thyrsus.com> <LNBBLJKPBEHFEDALKOLCIEJCIIAA.tim.one@home.com>
Message-ID: <20010114155928.A6793@thyrsus.com>

Tim Peters <tim.one at home.com>:
> [Tim]
> > Actually, I'm certain they're the same algorithm now, except the C is
> > showing through in ratcliff to the floating-point eye <wink>.
> 
> [Eric]
> > Take a look:
> 
> Yup, same thing, except:
> 
> > static float ratcliff(char *s1, char *s2)
> 
> accounts for the numeric differences (change "float"->"double" and they'd be
> the same; Python has to convert it to a double anyway, lacking any internal
> support for C's floats; and the C code is *computing* in double regardless,
> cutting it back to a float upon return just because of the "float" decl).

OK, so the right answer is to make your version visible and documented
in the library.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256



From tim.one at home.com  Sun Jan 14 22:01:19 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:01:19 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101140418050.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEJDIIAA.tim.one@home.com>

[?!ng]
> [1] We started with 4.

Na, *we* started with two, just ' and ".  And at the time, I thought that
was arguably one too many already <wink>.  Allowing the modifiers to be
case-insensitive seems to me much more Pythonic than the original sin of
making ' and " mean the same thing.  OTOH, if only " had been allowed at the
start, we'd probably spell raw strings with ' today, and that doesn't really
scream that they're so very different from " strings.

leaving-this-one-be-ly y'rs  - tim




From barry at digicool.com  Sun Jan 14 22:02:07 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sun, 14 Jan 2001 16:02:07 -0500
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
Message-ID: <14946.5071.92879.789400@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> Ping's pydoc is awesome!  Move it out of the sandbox and put
    SM> it in the standard distribution.

    SM> Biggest hook for me:

    |    1. execute "pydoc -p 3200"
    |    2. visit "http://localhost:3200/"
    |    3. knock yourself out

Whoa.  Awesome.




From ping at lfw.org  Sun Jan 14 22:01:45 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:01:45 -0800 (PST)
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <200101141708.MAA11161@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Guido van Rossum wrote:
> 
> It comes from the numeric literals.  C allows 0x0 and 0X0, and 0L as
> well as 0l.  So does Python (and also 0j == 0J).

I just did a little test.  Neither Python, Perl, nor Tcl support
"\X66", only "\x66".  Perl doesn't support 0X1234, only 0x1234.
Tcl's "expr" routine does support 0X1234.  Javascript supports
0X1234, but not "\X66".  I'd bet that no one really relies on or
expects the uppercase forms except L.


-- ?!ng




From ping at lfw.org  Sun Jan 14 22:14:34 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:14:34 -0800 (PST)
Subject: [Python-Dev] Re: pydoc.py (show docs both inside and outside of
 Python)
In-Reply-To: <14942.29609.19618.534613@cj42289-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101141309320.5846-100000@skuld.kingmanhall.org>

On Thu, 11 Jan 2001, Fred L. Drake, Jr. wrote:
> Ka-Ping Yee writes:
>  > My next two targets are:
>  >     1.  Generating text from the HTML documentation files
>  >         using Paul Prescod's stuff in onlinehelp.py.
> 
> You mean the ones I publish as the standard documentation?  Relying
> on the structure of that HTML is pure folly!

Paul's onlinehelp.py is using the HTMLParser and AbstractFormatter
to turn HTML into text.  It also contains paths to specific files,
e.g. help('assert') looks for "ref/assert.html".  Are you okay with
this technique?  Have you tried onlinehelp.py?  I was planning to
do the same to provide help on the language in pydoc.


-- ?!ng




From skip at mojam.com  Sun Jan 14 22:26:48 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 14 Jan 2001 15:26:48 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
	<200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <14946.6552.542015.620760@beluga.mojam.com>

    Guido> Now, if we could somehow get this to show both the docs that Fred
    Guido> maintains and the stuff that Ping extracts from the source code,
    Guido> that would be even better!

I had exactly the same thought.  I suspect that if the install target were
modified to install the html-ized sections of the lib reference manual pydoc
could grovel around in sys and find the root of the library reference manual
pretty easily.  If not, it could simply redirect to the relevant section of
http://www.python.org/doc/current/lib/.

Skip




From tim.one at home.com  Sun Jan 14 22:45:48 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 16:45:48 -0500
Subject: [Python-Dev] Why both r'' and R'', u'' and U''?
In-Reply-To: <Pine.LNX.4.10.10101141235520.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEJGIIAA.tim.one@home.com>

[?!ng]
> ...
> I'd bet that no one really relies on or expects the uppercase
> forms except L.

And 0X.  I don't think it's in the std library, but I've certainly seen
Python code do stuff like

    magic = 0XFEEDFACE

Plus it's always good for a language to be able to parse the stuff it prints,
and "0X..." is generated by Python's %#X format code.
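
A quick round trip shows the point; the snippet below uses current Python, where int() with base 0 accepts the same prefixes as the literal syntax.

```python
magic = 0XFEEDFACE

# The "#" flag adds the prefix and "X" makes both prefix and digits uppercase.
text = "%#X" % magic
print(text)

# Base 0 tells int() to honour the 0x/0X prefix, like the compiler does.
parsed = int(text, 0)
print(parsed == magic)
```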

Don't believe I've ever seen the "u" or "r" string modifiers in uppercase,
though, but really don't see the harm in allowing that.




From ping at lfw.org  Sun Jan 14 22:50:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 13:50:43 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <14946.5071.92879.789400@anthem.wooz.org>
Message-ID: <Pine.LNX.4.10.10101141349040.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Barry A. Warsaw wrote:
> Whoa.  Awesome.

Thanks!

Two things added recently: constants (any numbers, lists, tuples,
strings, or types) in modules are shown; and packages are listed
in the index as they should be.


-- ?!ng




From bckfnn at worldonline.dk  Sun Jan 14 23:20:51 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sun, 14 Jan 2001 22:20:51 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <040e01c07e60$8c74d100$e46940d5@hagrid>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3a61f12a.36601630@smtp.worldonline.dk> <040e01c07e60$8c74d100$e46940d5@hagrid>
Message-ID: <3a622615.50148579@smtp.worldonline.dk>

[/F]

>here's the description:

Thanks.

>From: "Fredrik Lundh" <effbot at telia.com>
>Date: Sun, 16 Jul 2000 20:40:46 +0200
>
>/.../
>
>    The unicodenames database consists of two parts: a name
>    database which maps character codes to names, and a code
>    database, mapping names to codes.
>
>* The Name Database (getname)
>
>    First, the 10538 text strings are split into 42193 words,
>    and combined into a 4949-word lexicon (a 29k array).

I only added a word to the lexicon if it was used more than once and if
the length was larger than the lexicon index. I ended up with 1385
entries in the lexicon. (a 7k array)

>    Each word is given a unique index number (common words get
>    lower numbers), and there's a "lexicon offset" table mapping
>    from numbers to words (10k).

My lexicon offset table is 3k and I also use 4k on a perfect hash of the
words.

>    To get back to the original text strings, I use a "phrase
>    book".  For each original string, the phrase book stores a
>    list of word numbers.  Numbers 0-127 are stored in one byte,
>    higher numbers (less common words) use two bytes.  At this
>    time, about 65% of the words can be represented by a single
>    byte.  The result is a 56k array.

Because not all words are looked up in the lexicon, I used the values
0-38 for the letters and numbers, 39-250 are used for a one-byte lexicon
index, and 251-255 are combined with the following byte to form a two-byte
index. This also results in a 57k array.

So far it is only minor variations.

>    The final data structure is an offset table, which maps code
>    points to phrase book offsets.  Instead of using one big
>    table, I split each code point into a "page number" and a
>    "line number" on that page.
>
>      offset = line[ (page[code>>SHIFT]<<SHIFT) + (code&MASK) ]
>
>    Since the unicode space is sparsely populated, it's possible
>    to split the code so that lots of pages get no contents.  I
>    use a brute force search to find the optimal SHIFT value.
>
>    In the current database, the page table has 1024 entries
>    (SHIFT is 6), and there are 199 unique pages in the line
>    table.  The total size of the offset table is 26k.
>
>* The code database (getcode)
>
>    For the code table, I use a straight-forward hash table to store
>    name to code mappings.  It's basically the same implementation
>    as in Python's dictionary type, but a different hash algorithm.
>    The table lookup loop simply uses the name database to check
>    for hits.
>
>    In the current database, the hash table is 32k.

I chose to split a unicode name into words even when looking up a
unicode name. Each word is hashed to a lexicon index and a "phrase book
string" is created. The sorted phrase book is then searched with a binary
search among 858 entries that can be addressed directly, followed by a
sequential search among 12 entries. The phrase book search index is 8k
and a table that maps phrase book indexes to codepoints is another 20k.

The searching I do makes jython slower than the direct calculation you
do. I'll take another look at this after jython 2.0 to see if I can
improve performance with your page/line number scheme and a total
hashing of all the unicode names.

regards,
finn



From ping at lfw.org  Sun Jan 14 23:44:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Sun, 14 Jan 2001 14:44:47 -0800 (PST)
Subject: [Python-Dev] SourceForge and long patches
Message-ID: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org>

Okay, this is getting really annoying.  SourceForge won't accept
any patches > 16k.  Why not?  Is there a way around this?

    SourceForge: Exiting with Error

    ERROR

    Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

I'm trying to submit the update to tokenize.py, but it's too long
because i've changed test/output/test_tokenize and that's a big file.


-- ?!ng




From guido at python.org  Sun Jan 14 23:58:03 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 17:58:03 -0500
Subject: [Python-Dev] SourceForge and long patches
In-Reply-To: Your message of "Sun, 14 Jan 2001 14:44:47 PST."
             <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101141443200.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101142258.RAA13606@cj20424-a.reston1.va.home.com>

> Okay, this is getting really annoying.  SourceForge won't accept
> any patches > 16k.  Why not?  Is there a way around this?

I have no idea why; can only assume it's a limitation in the database
package they use.

The standard workaround is to upload a URL pointing to the patch. :-(

>     SourceForge: Exiting with Error
> 
>     ERROR
> 
>     Patch Uploaded ERROR - Submission failed PQsendQuery() -- query is too long. Maximum length is 16382 

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan 15 00:35:51 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 00:35:51 +0100
Subject: [Python-Dev] Where's Greg Ward ?
Message-ID: <3A6237D7.673BBB30@lemburg.com>

He seems to be offline and the people on the distutils list have some
patches and other things which would be nice to have in distutils 
for 2.1.

I suppose we could simply check in the patches, but we still want
to get his OK on things before applying patches to the distutils
tree.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 15 00:57:45 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 14 Jan 2001 18:57:45 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>

[MAL]
> He seems to be offline and the people on the distutils list have
> some patches and other things which would be nice to have in
> distutils for 2.1.

Greg's somewhere near the end of the process of moving from Virginia to
Canada; I expect he'll become visible again Real Soon.

> I suppose we could simply check in the patches, but we still want
> to get his OK on things before applying patches to the distutils
> tree.

The distutils SIG could elect a Shadow Dictator in his place; if everyone
agrees to vote for Andrew, you save the effort of counting votes <wink>.




From tismer at tismer.com  Mon Jan 15 02:35:57 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 15 Jan 2001 02:35:57 +0100
Subject: [Python-Dev] Minor Bug-fix release for Stackless Python 2.0
Message-ID: <3A6253FD.E9B30462@tismer.com>

Wolfgang Lipp reported that Microthreads were executing
sequentially with SLP 2.0 .

The bug fix is available on the website.
Please use this new version, or microthreads will not
give you much fun.

http://www.stackless.com/spc20-win32.exe
http://www.stackless.com/spc-src-010115.zip

enjoy - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tommy at ilm.com  Mon Jan 15 03:18:20 2001
From: tommy at ilm.com (Captain Senorita)
Date: Sun, 14 Jan 2001 18:18:20 -0800 (PST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <14923.31238.65155.496546@buffalo.fnal.gov>
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com>
	<14923.31238.65155.496546@buffalo.fnal.gov>
Message-ID: <14946.23981.694472.406438@mace.lucasdigital.com>

Charles G Waldman writes:
| 
|              P=NP (Python is not Perl)

Is it too late to suggest this for the SPAM9 t-shirt? :)



From guido at python.org  Mon Jan 15 03:24:36 2001
From: guido at python.org (Guido van Rossum)
Date: Sun, 14 Jan 2001 21:24:36 -0500
Subject: [Python-Dev] chomp()?
In-Reply-To: Your message of "Sun, 14 Jan 2001 18:18:20 PST."
             <14946.23981.694472.406438@mace.lucasdigital.com> 
References: <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>  
            <14946.23981.694472.406438@mace.lucasdigital.com> 
Message-ID: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>

> Charles G Waldman writes:
> | 
> |              P=NP (Python is not Perl)
> 
> Is it too late to suggest this for the SPAM9 t-shirt? :)

By just about a day -- I haven't seen the new design yet, but Just &
Eric were supposed to design it today and hand in the final proofs
tomorrow.  I believe the slogan will be "it fits your brain" (or "it
fits my brain").

But if you print a bunch of P=NP shirts, I'm sure you can sell them
with a profit, both in Long Beach and in San Diego (at the O'Reilly
Open Source conference)...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan 15 07:35:05 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 01:35:05 -0500
Subject: [Python-Dev] xreadline speed vs readlines_sizehint
In-Reply-To: <20010110101545.A21305@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEKGIIAA.tim.one@home.com>

[Timmy]
> At this point I'm +0.5 on the idea of fileobject.c using
> ms_getline_hack whenever HAVE_GETC_UNLOCKED isn't available.

[NeilS, from Wednesday]
> Compare ms_getline_hack to what Perl does in order to speed up IO.

Believe me, I have <wink>.

> I think it's worth maintaining that piece of relatively portable
> code given the benefit.  If the code has to be maintained then it
> might as well be used.  If we find a platform that breaks we can
> always disable it before the final release.

Given that hearty encouragement, and the utterly non-scary results so far, I
just checked in a new scheme:

On a platform with getc_unlocked():
    By default, use getc_unlocked().
    If you want to use fgets() instead, #define USE_FGETS_IN_GETLINE.
        [so motivated people can use fgets() instead if it's faster
         on their platform]
On a platform without getc_unlocked():
    By default, use fgets().
    If you don't want to use fgets(), #define DONT_USE_FGETS_IN_GETLINE.
        [so if we stumble into a platform it fails on between
         releases, the user will have an easy time turning it off
         themself]




From gstein at lyra.org  Mon Jan 15 08:18:20 2001
From: gstein at lyra.org (Greg Stein)
Date: Sun, 14 Jan 2001 23:18:20 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Sat, Jan 13, 2001 at 08:55:35AM -0800
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010114231820.C6081@lyra.org>

On Sat, Jan 13, 2001 at 08:55:35AM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Lib
> In directory usw-pr-cvs1:/tmp/cvs-serv14586
> 
> Modified Files:
> 	httplib.py 
> Log Message:
> SF Patch #103225 by Ping: httplib: smallest Python patch ever
>...

Not so small:

>...
> *** 333,337 ****
>               i = host.find(':')
>               if i >= 0:
> !                 port = int(host[i+1:])
>                   host = host[:i]
>               else:
> --- 333,340 ----
>               i = host.find(':')
>               if i >= 0:
> !                 try:
> !                     port = int(host[i+1:])
> !                 except ValueError, msg:
> !                     raise socket.error, str(msg)
>                   host = host[:i]
>               else:


Did you intend to commit this?
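
In current Python syntax, the behaviour the diff introduces looks roughly like the sketch below. split_host_port and default_port are hypothetical names used for illustration, not part of httplib; socket.error is kept because the diff raises it (in Python 3 it is an alias of OSError).

```python
import socket


def split_host_port(host, default_port=80):
    """Split "host:port"; a non-numeric port is reported as a socket error."""
    i = host.find(":")
    if i >= 0:
        try:
            port = int(host[i + 1:])
        except ValueError as msg:
            # Same conversion as in the diff: ValueError becomes socket.error.
            raise socket.error(str(msg))
        host = host[:i]
    else:
        port = default_port
    return host, port


print(split_host_port("www.python.org:8080"))
print(split_host_port("www.python.org"))
```

The point of the change is that callers who already catch socket.error around connection setup do not need a second except clause for a malformed port.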

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From moshez at zadka.site.co.il  Mon Jan 15 16:53:58 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 17:53:58 +0200 (IST)
Subject: [Python-Dev] chomp()?
In-Reply-To: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>
References: <200101150224.VAA15254@cj20424-a.reston1.va.home.com>, <200012281504.KAA25892@cj20424-a.reston1.va.home.com> <14923.31238.65155.496546@buffalo.fnal.gov>  
            <14946.23981.694472.406438@mace.lucasdigital.com>
Message-ID: <20010115155358.86E5AA828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 21:24:36 -0500, Guido van Rossum <guido at python.org> wrote:

> But if you print a bunch of P=NP shirts, I'm sure you can sell them
> with a profit, both in Long Beach and in San Diego (at the O'Reilly
> Open Source conference)...

And the Libre Software Meeting (http://lsm.abul.org), which has a Python
subtopic too.
(Since it's in France, no one is calling it "free", so it's probable you
can sell those T-shirts there...)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan 15 10:44:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:44:14 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <3A62C66E.2BB69E61@lemburg.com>

Fredrik Lundh wrote:
> 
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

Since the Unicode character names are probably
not used for performance-sensitive tasks, I suggest
checking in the smallest version possible.

If it is too much work to get Finn's version recoded in C
(presuming it's written in Java), then I'd suggest checking
in your version until someone comes up with a yet smaller
edition.

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 15 10:48:49 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:48:49 +0100
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com>
		<200101142055.PAA13041@cj20424-a.reston1.va.home.com> <14946.6552.542015.620760@beluga.mojam.com>
Message-ID: <3A62C781.22240D3C@lemburg.com>

Skip Montanaro wrote:
> 
>     Guido> Now, if we could somehow get this to show both the docs that Fred
>     Guido> maintains and the stuff that Ping extracts from the source code,
>     Guido> that would be even better!
> 
> I had exactly the same thought.  I suspect that if the install target were
> modified to install the html-ized sections of the lib reference manual pydoc
> could grovel around in sys and find the root of the library reference manual
> pretty easily.  If not, it could simply redirect to the relevant section of
> http://www.python.org/doc/current/lib/.

Since Fred remarked that the URLs for the different docs are
not fixed, how about adding a __onlinedocs__ attribute to the
standard Python modules providing the correct URL ?

Or, alternatively, pass the module's name through some Google
like "I feel lucky" documentation search engine...
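
A minimal sketch of the __onlinedocs__ idea, assuming a module would
simply carry the attribute as a string. The attribute is only a proposal
in this thread, and the helper name online_docs_url is hypothetical. The
fallback URL is the library reference root quoted above.

```python
import types

# Fallback taken from the quoted message; everything else is illustrative.
DEFAULT_DOCS_ROOT = "http://www.python.org/doc/current/lib/"

def online_docs_url(module):
    """Return the module's own docs URL if it declares one, else the fallback."""
    # __onlinedocs__ is the attribute proposed in this thread, not an
    # existing Python feature.
    return getattr(module, "__onlinedocs__", DEFAULT_DOCS_ROOT)

# Usage with a throwaway module object and a placeholder URL:
demo = types.ModuleType("demo")
print(online_docs_url(demo))
demo.__onlinedocs__ = "http://example.invalid/demo.html"
print(online_docs_url(demo))
```

A tool like pydoc could call such a helper and redirect to the result,
which keeps the URL layout a per-module detail.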

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 15 10:51:40 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 10:51:40 +0100
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
Message-ID: <3A62C82C.EA25AAF5@lemburg.com>

[CCed to distutils, since it matters there]
Tim Peters wrote:
> 
> [MAL]
> > He seems to be offline and the people on the distutils list have
> > some patches and other things which would be nice to have in
> > distutils for 2.1.
> 
> Greg's somewhere near the end of the process of moving from Virginia to
> Canada; I expect he'll become visible again Real Soon.

Great :)
 
> > I suppose we could simply check in the patches, but we still want
> > to get his OK on things before applying patches to the distutils
> > tree.
> 
> The distutils SIG could elect a Shadow Dictator in his place; if everyone
> agrees to vote for Andrew, you save the effort of counting votes <wink>.

Ok, let's agree to vote for Andrew :)

Andrew, is that OK with you ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 15 11:52:09 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 05:52:09 -0500
Subject: [Python-Dev] RE: xreadline speed vs readlines_sizehint
In-Reply-To: <3A5D602D.9DC991CB@per.dem.csiro.au>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELAIIAA.tim.one@home.com>

[Mark Favas]
> ...
> The lines range in length from 96 to 747 characters, with
> 11% @ 233, 17% @ 252 and 52% @ 254 characters, so #1 [a vendor
> who actually optimized fgets()] looks promising - most lines are
> long enough to trigger a realloc.

Plus as soon as you spill over the stack buffer, I make you pay for filling
1024 new bytes with newlines before the next fgets() call, and almost all of
those are irrelevant to you.  It doesn't degrade gracefully.  Alas, I tried
several "adaptive" schemes (adjusting how much of the initial segment of a
larger stack buffer they would use, based on the actual line lengths seen in
the past), but the costs always exceeded the savings on my box.

> Cranking up INITBUFSIZE in ms_getline_hack to 260 from 200
> improves thing again, by another 25%:
> total 131426612 chars and 514216 lines
> count_chars_lines     5.081  5.066
> readlines_sizehint    3.743  3.717
> using_fileinput      11.113 11.100
> while_readline        6.100  6.083
> for_xreadlines        3.027  3.033

Well, I couldn't let you forego *all* of 25%.  The current fileobject.c has
a stack buffer of 300 bytes, but only uses 100 of them on the first fgets()
call.  On a very quiet machine, that saved 3-4% of the runtime on *my* test
case, whose line lengths are typical of the text files I crunch over, so I'm
happy for me.  If 100 bytes aren't enough, it must call fgets() again, but
just appends the next call into the full 300-byte buffer.  So it saves the
realloc for lines under 300 chars.
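
The two-stage read described above can be modelled roughly in Python.
This is a sketch of the idea only, not the C code in fileobject.c:
readline(size) stands in for fgets(), the 100 and 300 figures come from
the text, and the function name and the 1024-byte growth step used after
a spill are assumptions.

```python
import io

FIRST_CHUNK = 100   # bytes asked for on the first fgets()-like call
STACK_BUF = 300     # full size of the fixed "stack" buffer
GROW_STEP = 1024    # assumed chunk size once the line spills to the heap

def read_line_two_stage(f):
    """Return (line, spilled); spilled is True if the stack buffer overflowed."""
    line = f.readline(FIRST_CHUNK)
    # Done if we saw the newline, or hit EOF before filling the first chunk.
    if line.endswith(b"\n") or len(line) < FIRST_CHUNK:
        return line, False
    # Second call appends into the rest of the 300-byte buffer.
    line += f.readline(STACK_BUF - len(line))
    spilled = False
    # Only lines longer than the stack buffer pay for heap growth.
    while not line.endswith(b"\n") and len(line) >= STACK_BUF:
        more = f.readline(GROW_STEP)
        if not more:
            break
        line += more
        spilled = True
    return line, spilled

data = io.BytesIO(b"a" * 50 + b"\n" + b"b" * 253 + b"\n" + b"c" * 400 + b"\n")
for _ in range(3):
    line, spilled = read_line_two_stage(data)
    print(len(line), spilled)
```

Short lines finish in one call, lines under 300 bytes finish in two
without leaving the fixed buffer, and only longer lines take the slow
path.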

> Apart from the name <grin>, I like ms_getline_hack...

Ya, it's now the non-pejorative getline_via_fgets().  I hate that I became a
grown-up <0.9 wink>.

time-to-pick-wings-off-of-flies-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 12:11:16 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 03:11:16 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib
 httplib.py,1.26,1.27
In-Reply-To: <20010114231820.C6081@lyra.org>
Message-ID: <Pine.LNX.4.10.10101150310100.5846-100000@skuld.kingmanhall.org>

On Sun, 14 Jan 2001, Greg Stein wrote:
> Not so small:
> 
> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:

The above changes were not part of the patch i submitted;
the patch i submitted was exactly a one-character change.
Guido has already edited the file, so there's no need to
commit anything further here.



-- ?!ng




From mal at lemburg.com  Mon Jan 15 12:56:37 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 12:56:37 +0100
Subject: [Python-Dev] Why is soundex marked obsolete?
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com>
Message-ID: <3A62E575.9A584108@lemburg.com>

Tim Peters wrote:
> 
> [M.-A. Lemburg]
> > BTW, are there less English centric "sounds alike" matchers
> > around ?
> 
> Yes, but if anything there are far too many of them:  like Soundex, they're
> just heuristics, and *everybody* who cares adds their own unique twists,
> while proper studies are almost non-existent.  Few variants appear to be in
> use much beyond their inventor's friends; one notable exception in the
> Jewish community is the Daitch-Mokotoff variation, originally tailored to
> their unique needs but later generalized; a brief description here:
> 
>     http://www.avotaynu.com/soundex.html
> 
> The similarly involved NYSIIS algorithm (New York State Identification
> Intelligence System -- look for NYSIIS on Parnassus) was the winner from a
> field of about two dozen competing algorithms, after measuring their
> effectiveness on assorted databases maintained by the state of New York.
> Since New York has a large immigrant population, NYSIIS isn't as
> Anglocentric as Soundex either.

Thanks for the pointer. I'll add that module to my lib :)

       http://metagram.webreply.com/downloads/nysiis.py

Perhaps Eric ought to add this one to his package as well  ?!
BTW, where can I find your package on the web, Eric ? I'd like
to give it a ride under German language conditions ;)
 
> But state-of-the-art has given up on purely computational algorithms for
> these purposes:  proper names are simply too much a mess.  For example, if I
> search for "Richard", it *ought* to match on "Dick"; if my Arab buddy
> searches on "Mohammed", it *ought* to match on "Mhd"; "the rules" people
> actually use just aren't reducible to pure computation -- it takes a large
> knowledge base to capture what people "just know".  You may enjoy visiting
> this commercial site (AFAIK, nobody is giving away state-of-the-art for
> free):
> 
>     http://www.las-inc.com/

Sad -- "patent pending" algorithms don't help anyone on this
planet :(
 
> > ...
> >     http://physics.nist.gov/cuu/Reference/soundex.html
> >
> > works fine for English texts,
> 
> If that were true, the English-speaking researchers would have declared
> victory 120 years ago <wink>.  But English pronunciation is *notoriously*
> difficult to predict from spelling, partly because English is the Perl of
> human languages.

Then Dutch must be the Python of human languages... ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:13:18 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:13:18 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>

On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one at users.sourceforge.net> wrote:
> Modified Files:
> 	tabnanny.py 
> Log Message:
> Whitespace normalization.

hmmmmmm.......
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From mal at lemburg.com  Mon Jan 15 13:10:30 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:10:30 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib 
 tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 
 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 
 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
References: <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62E8B6.3DFC1FA2@lemburg.com>

Moshe Zadka wrote:
> 
> On Sun, 14 Jan 2001 19:26:38 -0800, Tim Peters <tim_one at users.sourceforge.net> wrote:
> > Modified Files:
> >       tabnanny.py
> > Log Message:
> > Whitespace normalization.
> 
> hmmmmmm.......

Perhaps you ought to make this a CRON job ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:24:48 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:24:48 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <3A62E8B6.3DFC1FA2@lemburg.com>
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115202448.38F60A828@darjeeling.zadka.site.co.il>

I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
Of course, the real culprit is the person who fixed up the reply-to in
the checkin messages to point to python-dev. Why was it done, and
isn't there a better way? This makes it painful to personally comment
on people's checkin messages. I suggest adding a mail-followup-to
header instead.

(Didn't anyone read "Reply-To Munging Considered Harmful"?)
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From esr at thyrsus.com  Mon Jan 15 13:23:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 07:23:25 -0500
Subject: [Python-Dev] Why is soundex marked obsolete?
In-Reply-To: <3A62E575.9A584108@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:56:37PM +0100
References: <LNBBLJKPBEHFEDALKOLCEEJBIIAA.tim.one@home.com> <3A62E575.9A584108@lemburg.com>
Message-ID: <20010115072325.A10377@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Perhaps Eric ought to add this one to his package as well  ?!

Actually, at this point, my plan is to give Tim a decent interval to
refactor ndiff so his SequenceMatcher class is exposed and documented --
otherwise *I'll* go in and do it (har! waving a bloody knife!).

His turns out to be the same as the Ratcliff-Obershelp technique I was
using, except Tim had his bullshit threshold set too low (:-)) and let
through matches I wouldn't have.
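
For readers who want to try the technique: later Python releases ship a
SequenceMatcher class in the standard difflib module. The sketch below
assumes that class; the helper name and the 0.6 cutoff are illustrative
choices, not the threshold either author used.

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.6):
    """Return True if the two strings score at or above the cutoff."""
    # ratio() is 2*M/T: M matched characters, T total length of both strings.
    return SequenceMatcher(None, a, b).ratio() >= threshold

print(SequenceMatcher(None, "abcd", "bcde").ratio())
print(similar("abcd", "bcde"), similar("abcd", "wxyz"))
```

Raising the threshold makes the matcher stricter, which is the knob the
two implementations apparently set differently.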
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The only purpose for which power can be rightfully exercised over any
member of a civilized community, against his will, is to prevent harm
to others. His own good, either physical or moral, is not a sufficient
warrant
	-- John Stuart Mill, "On Liberty", 1859



From mal at lemburg.com  Mon Jan 15 13:26:59 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 13:26:59 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <3A62EC93.9AA60ABA@lemburg.com>

Moshe Zadka wrote:
> 
> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead to add a mail-followup-to
> header
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Naa, no one needs to be shot in the foot ;)

In fact, I like that replies go to python-dev ... after all,
that's where these things should be discussed.

BTW, in case you misunderstood my reply: it would indeed make
sense to automate these kinds of checks (tabnanny et al).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From moshez at zadka.site.co.il  Mon Jan 15 21:42:15 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 15 Jan 2001 22:42:15 +0200 (IST)
Subject: [Python-Dev] Re: Someone should be shot
In-Reply-To: <3A62EC93.9AA60ABA@lemburg.com>
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
 
> In fact I like it, that replies go to python-dev ... after all,
> that's where these things should be discussed.

Well, that's the mailing list where things should be discussed.
But when I press the "Reply" button (as opposed to "Reply to List" button)
I expect my e-mail to go to the person originating the e-mail. 
Reply-To: means "I'd like to get replies to some other address".
What if, say, a checkin message relates to some private topic
I'd discussed with someone: I'd like to reply to him personally.

I agree that responses to Python-Checkins should be handled on Python-Dev:
that's what the mail-followup-to header is for.
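
The difference between the two headers can be sketched with the standard
email package from later Python versions. This is an illustration of the
behaviour argued for here, not how any particular mailer works: the
function names and the example.org addresses are made up, and
Mail-Followup-To is a non-standard header that a mailer may ignore.

```python
from email.message import EmailMessage
from email.utils import parseaddr

def reply_address(msg):
    """Where a plain "Reply" goes: Reply-To if present, else From."""
    target = msg.get("Reply-To") or msg.get("From")
    return parseaddr(target)[1] if target else ""

def followup_address(msg):
    """Where a list-style follow-up goes, honouring Mail-Followup-To first."""
    target = msg.get("Mail-Followup-To")
    return parseaddr(target)[1] if target else reply_address(msg)

checkin = EmailMessage()
checkin["From"] = "Committer <committer@example.org>"
checkin["Mail-Followup-To"] = "python-dev@example.org"

print(reply_address(checkin))     # private reply reaches the committer
print(followup_address(checkin))  # follow-up goes to the list
```

With a munged Reply-To instead, reply_address() would return the list
address, which is the accident described above.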

> BTW, in case you misunderstood my reply: it would indeed make
> sense to automate these kinds of check (tabnanny et al).

Oh, ok. The "cron" part threw me off (why cron?) 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From barry at digicool.com  Mon Jan 15 14:15:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:15:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
	<3A62C82C.EA25AAF5@lemburg.com>
Message-ID: <14946.63472.282750.828218@anthem.wooz.org>

>>>>> "M" == M  <mal at lemburg.com> writes:

    >>  The distutils SIG could elect a Shadow Dictator in his place;
    >> if everyone agrees to vote for Andrew, you save the effort of
    >> counting votes <wink>.

    M> Ok, let's agree to vote for Andrew :)

    M> Andrew, is that OK with you ?

He's got my vote.  I've been experiencing some weird problems with the
distutils installation of pybsddb3 out of the current Python cvs
tree.  It'd be nice if the outstanding distutils patches are
integrated before I dive in.  I don't see anything relevant in patches
or bugs, but I don't know if there are other repositories of distutils
fixes (like the archives?).

-Barry




From barry at digicool.com  Mon Jan 15 14:27:02 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 08:27:02 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <14946.64166.348139.425223@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> I'm sorry! I meant to reply to tim alone, and ended up
    MZ> spamming python-dev!  Of course, the real culprit is the
    MZ> person who fixed up the reply-to in the checkin messages to
    MZ> point to python-dev. Why was it done, and isn't there a better
    MZ> way? This makes it painful to personally comment on people's
    MZ> checkin messages. I suggest instead to add a mail-followup-to
    MZ> header

    MZ> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

Or how about

    http://www.metasystema.org/essays/reply-to-useful.mhtml

for a dissenting view.  Of course Mail-Followup-To is completely
non-standard, but even if it were, having the mailing list munge it in
isn't recommended:

    http://cr.yp.to/proto/replyto.html

Bottom line (IMHO), this is just something about email that is and
will forever remain broken.  Given that, it was voted a long while
back to make Reply-To for checkins point to python-dev so until
there's a hue and cry to change it back, I'll leave it as is.  And
yeah, it bites me sometimes too!

-Barry




From tony at lsl.co.uk  Mon Jan 15 15:18:36 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Mon, 15 Jan 2001 14:18:36 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101110803400.5846-100000@skuld.kingmanhall.org>
Message-ID: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>

<fx: jumps up and down in glee>

Neat stuff. Ka-Ping Yee strikes again. And it works with Python 1.5.2.

<fx: more of the same>

Running on NT (4.00.1381) in an "MS-DOS" window, using Python 1.5.2
installed in the effbot manner, it works, with the slight strangeness
that if I do:

	python pydoc.py <name>

I get the documentation for <name> OK, but it is preceded with a line
claiming that:

	The system cannot find the path specified.

I don't have the time to pursue this at the moment - it's possibly an
artefact of our system?

(one minor "prettiness" hack - those of us who have been tainted by
Emacs Lisp programming tend to start module documentation off with a
line of the form:

	<name>.py -- information about the module

which, when pydoc'ed, results in a NAME line which starts with <name>
twice...
Of course, if I'm the only person doing this, I'll just have to, well,
stop...)
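
One way the doubled name could be avoided is to strip the leading
"<name>.py -- " before the first docstring line is shown. A small sketch
under that assumption; strip_leading_name is a hypothetical helper, not
part of pydoc.

```python
import re

def strip_leading_name(module_name, first_line):
    """Drop a leading '<name>.py -- ' (or '<name> - ') from a docstring line."""
    pattern = r"^%s(\.py)?\s+-+\s+" % re.escape(module_name)
    return re.sub(pattern, "", first_line)

print(strip_leading_name("spam", "spam.py -- information about the module"))
print(strip_leading_name("spam", "Utilities for handling spam"))
```

Lines that do not start with the module name pass through unchanged.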

A request - a "-f" switch to allow the user to specify a particular
Python file (i.e., something not on the PYTHONPATH).

Tibs

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"How fleeting are all human passions compared with the massive
continuity of ducks." - Dorothy L. Sayers, "Gaudy Night"
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)




From jack at oratrix.nl  Mon Jan 15 15:32:02 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 15 Jan 2001 15:32:02 +0100
Subject: [Python-Dev] Regarding Patch #103222: mv Python to PyCore 
In-Reply-To: Message by Guido van Rossum <guido@python.org> ,
	     Sat, 13 Jan 2001 17:33:34 -0500 , <200101132233.RAA03229@cj20424-a.reston1.va.home.com> 
Message-ID: <20010115143203.A44B63C2031@snelboot.oratrix.nl>

Also note that the problem only occurs when trying to build a unix-Python 
out-of-the-box on MacOSX. If you're building a Carbon Python from the 
MacPython sources (something very few people can do right now:-) the 
executable isn't called "python". And once a real MacOSX-Python is done 
it'll have all the nifty packaging stuff that will also make sure that there's 
nothing called "python" in the toplevel folder.

And the two workarounds (1-Use a UFS filesystem, 2-Put a ".exe" extension in 
the Makefile) work fine in the meantime.
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 





From guido at python.org  Mon Jan 15 15:33:23 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:33:23 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib httplib.py,1.26,1.27
In-Reply-To: Your message of "Sun, 14 Jan 2001 23:18:20 PST."
             <20010114231820.C6081@lyra.org> 
References: <E14HTxn-0003nR-00@usw-pr-cvs1.sourceforge.net>  
            <20010114231820.C6081@lyra.org> 
Message-ID: <200101151433.JAA17944@cj20424-a.reston1.va.home.com>

> >...
> > *** 333,337 ****
> >               i = host.find(':')
> >               if i >= 0:
> > !                 port = int(host[i+1:])
> >                   host = host[:i]
> >               else:
> > --- 333,340 ----
> >               i = host.find(':')
> >               if i >= 0:
> > !                 try:
> > !                     port = int(host[i+1:])
> > !                 except ValueError, msg:
> > !                     raise socket.error, str(msg)
> >                   host = host[:i]
> >               else:
> 
> Did you intend to commit this?

Oops.  That was a patch submitted a while ago that I applied as an
experiment but then decided I didn't like (argument: why bother).
I've reverted it.
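
For reference, the reverted variant from the diff looks roughly like this
in present-day Python syntax. This is a standalone sketch: the function
name split_host_port and the default_port parameter are assumptions
filling in for the surrounding httplib code, which the diff does not
show.

```python
import socket

def split_host_port(host, default_port=80):
    """Split 'host:port' and report a bad port as socket.error."""
    i = host.find(':')
    if i >= 0:
        try:
            port = int(host[i+1:])
        except ValueError as msg:
            # The experimental patch converted the ValueError so callers
            # only had to catch socket.error.
            raise socket.error(str(msg))
        host = host[:i]
    else:
        port = default_port  # assumed fallback; elided in the diff
    return host, port

print(split_host_port("www.python.org:8080"))
print(split_host_port("www.python.org"))
```

Without the try/except, a malformed port simply surfaces as the original
ValueError, which is the behaviour the revert restores.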

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan 15 15:40:30 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 09:40:30 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:24:48 +0200."
             <20010115202448.38F60A828@darjeeling.zadka.site.co.il> 
References: <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>  
            <20010115202448.38F60A828@darjeeling.zadka.site.co.il> 
Message-ID: <200101151440.JAA18045@cj20424-a.reston1.va.home.com>

> I'm sorry! I meant to reply to tim alone, and ended up spamming python-dev!
> Of course, the real culprit is the person who fixed up the reply-to in
> the checkin messages to point to python-dev. Why was it done, and
> isn't there a better way? This makes it painful to personally comment
> on people's checkin messages. I suggest instead to add a mail-followup-to
> header
> 
> (Didn't anyone read "Reply-To Munging Considered Harmful"?)

I agree with you, but Barry (who set this up) seems to believe that
there's a good reason to do it this way.  Barry, do you still feel
that way?  The auto-reply-all has probably tripped me up more than
anyone.  Anyone else have a strong reason why this should be set?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Tue Jan 16 00:03:25 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 01:03:25 +0200 (IST)
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>
References: <14946.64166.348139.425223@anthem.wooz.org>, <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
Message-ID: <20010115230325.1C7F5A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001 08:27:02 -0500, barry at digicool.com (Barry A. Warsaw) wrote:
> 
> Or how about
> 
>     http://www.metasystema.org/essays/reply-to-useful.mhtml

     If your mailer doesn't have this option, you should request it from
     its development team. Any mailer, whose development team refuses
     this simple request due to some ideological position, cannot be
     said to be reasonable.

As some people here know, I'm my mailer's "development team". I refuse to add
it due to an ideological position. Anyone who knows me knows I'm quite
unreasonable. Hmmm....I'm not making much headway, am I ;-)

> for a dissenting view.  Of course Mail-Followup-To is completely
> non-standard, but even if it were, having the mailing list munge it in
> isn't recommended:
> 
>     http://cr.yp.to/proto/replyto.html

This has no relevance to the current case, since python-checkin 
messages are machine-generated -- so this is closer to doing this in
the script generating the checkin message, and only differs in
implementation.

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I won't continue this thread, but remember that my vote is "no".
I simply shudder at the thought that I might send someone e-mail
with something like "nice bugfix. Didn't know you were back from
the sex-change operation", and it would be broadcast out to all
Python-Dev *and* the archives, for posterity.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From thomas at xs4all.net  Mon Jan 15 16:31:22 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 16:31:22 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14946.64166.348139.425223@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:27:02AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org>
Message-ID: <20010115163122.I1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:27:02AM -0500, Barry A. Warsaw wrote:

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

I've said this before, on the Mailman-devel list, but I'll repeat it here
for the record (in case this issue ever comes up for vote again :)

The main bite (for me) is that to reply to a person in private, you have to
cut&paste the 'From' header from the original mail, and edit your new mail's
headers, in order to reply to a specific person. My mailer is mature enough
to have a 'reply', 'reply-group' and 'reply-list' keybinding, so the
'Reply-To' only interferes. There probably is a
'reply-to-from-ignoring-replyto' keybinding in there, too, somewhere, or it
could be added, but remembering to type that different key is almost as much
trouble as typing the email address by hand ;P

So, my vote, like Moshe's, is just back from a sex change, and reads 'no'.

Recount-recount-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan 15 16:38:01 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 10:38:01 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 08:27:02 EST."
             <14946.64166.348139.425223@anthem.wooz.org> 
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il>  
            <14946.64166.348139.425223@anthem.wooz.org> 
Message-ID: <200101151538.KAA21937@cj20424-a.reston1.va.home.com>

> Bottom line (IMHO), this is just something about email that is and
> will forever remain broken.  Given that, it was voted a long while
> back to make Reply-To for checkins point to python-dev so until
> there's a hue and cry to change it back, I'll leave it as is.  And
> yeah, it bites me sometimes too!

It sounds like a hue and cry to change it to me!  It looks like it's
time for a BDFL Pronouncement.  I pronounce:

Given that:

- we all know how to mail to python-dev;

- replying to the sender is by far the most common kind of reply;

- the mistake of replying to the sender when a reply-all was intended
  does much less potential harm than the mistake of replying to all
  when reply-to-sender was intended,

the reply-to header shall be removed.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Mon Jan 15 17:57:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 11:57:19 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <14946.63472.282750.828218@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 08:15:28AM -0500
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com> <3A62C82C.EA25AAF5@lemburg.com> <14946.63472.282750.828218@anthem.wooz.org>
Message-ID: <20010115115719.B919@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 08:15:28AM -0500, Barry A. Warsaw wrote:
>tree.  It'd be nice if the outstanding distutils patches are
>integrated before I dive in.  I don't see anything relevant in patches
>or bugs, but I don't know if there are other repositories of distutils
>fixes (like the archives?).

There are a few patches buried in the back archives, but I don't know
of any outstanding bugfixes, so please report whatever problem you're
seeing.

Oh, and Barry, did the issue holding up your patch for adding shar
support (#102313) ever get resolved?

--amk



From guido at python.org  Mon Jan 15 17:02:39 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:02:39 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 08 Jan 2001 18:20:56 PST."
             <20010108182056.C4640@lyra.org> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net>  
            <20010108182056.C4640@lyra.org> 
Message-ID: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>

Greg Stein noticed me checking in *yet* another system that needs
the fallback TELL64() definition in fileobject.c, and wrote:

> All of those #ifdefs could be tossed and it would be more robust (long term)
> if an autoconf macro were used to specify when TELL64 should be defined.
> 
> [ I've looked thru fileobject.c and am a bit confused: the conditions for
>   defining TELL64 do not match the conditions for *using* it. that would
>   seem to imply a semantic error somewhere and/or a potential gotcha when
>   they get skewed (like I assume what happened to FreeBSD). simplifying with
>   an autoconf macro may help to rationalize it. ]

I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
universal fallback, why not just define TELL64 to be that if it's not
previously defined (currently only MS_WIN64 has a different
definition)?  It isn't always *used* (the conditions under which
_portable_fseek() uses it are quite complex), but *when* it is used,
this seems to be the most common definition...

Patch:

*** fileobject.c	2001/01/15 10:36:56	2.106
--- fileobject.c	2001/01/15 16:02:06
***************
*** 58,66 ****
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
! /* NOTE: this is only used on older
!    NetBSD prior to f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif
  
--- 58,65 ----
  /* define the appropriate 64-bit capable tell() function */
  #if defined(MS_WIN64)
  #define TELL64 _telli64
! #else
! /* Fallback for older systems that don't have the f*o() funcions */
  #define TELL64(fd) lseek((fd),0,SEEK_CUR)
  #endif


I'll check this in after 24 hours unless a better idea comes up.
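
For readers who have not met the idiom: a relative seek of zero bytes
returns the current offset without moving it, which is why
lseek((fd),0,SEEK_CUR) can stand in for a tell() call.  A minimal sketch
of the same trick from Python, using os.lseek; the helper names
tell_via_lseek and demo are made up for illustration and are not part of
fileobject.c:

```python
import os
import tempfile


def tell_via_lseek(fd):
    """Return the current offset of fd; a zero-byte relative seek moves nothing."""
    return os.lseek(fd, 0, os.SEEK_CUR)


def demo():
    """Write 11 bytes, seek back to offset 5, and report both offsets."""
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.write(fd, b"hello world")   # unbuffered write straight to the descriptor
        after_write = tell_via_lseek(fd)
        os.lseek(fd, 5, os.SEEK_SET)   # absolute seek to byte 5
        after_seek = tell_via_lseek(fd)
    return after_write, after_seek
```

Calling demo() reports the offset after the write and the offset after
the explicit seek, both obtained through the zero-byte seek.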

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Mon Jan 15 17:17:07 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 11:17:07 -0500
Subject: [Python-Dev] PEP 205 comments
In-Reply-To: Your message of "Fri, 12 Jan 2001 23:19:57 +0100."
             <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de> 
References: <200101122219.f0CMJvp01376@mira.informatik.hu-berlin.de> 
Message-ID: <200101151617.LAA22359@cj20424-a.reston1.va.home.com>

I'll leave most of this to Fred, but I'll reply to two items (Fred can
add these replies to the PEP):

> Again on proxies, there is no discussion or documentation of the
> ReferenceError. Why is it a RuntimeError? LookupError, ValueError, and
> AttributeError seem to be just as fine or better.

RuntimeError was my suggestion.  The error doesn't really qualify as a
LookupError in my view (there's no key that could be valid or invalid)
and ValueError seems too general (that's typically used for
out-of-range arguments and unparseable strings and the like).  Do you
have a reason why RuntimeError is inappropriate?
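
To make the case under discussion concrete, here is a small sketch of a
proxy outliving its referent, written against the weakref module as it
eventually shipped.  The Node class and use_after_free function are
hypothetical names, and the code catches ReferenceError itself rather
than assuming anything about its base class:

```python
import gc
import weakref


class Node:
    """Trivial class; instances of user-defined classes can be weakly referenced."""

    def __init__(self, label):
        self.label = label


def use_after_free():
    """Touch a proxy before and after its referent disappears."""
    node = Node("spam")
    proxy = weakref.proxy(node)
    seen = proxy.label        # fine while a strong reference exists
    del node                  # drop the only strong reference
    gc.collect()              # redundant with reference counting, harmless otherwise
    try:
        proxy.label
    except ReferenceError as exc:
        return seen, type(exc).__name__
    return seen, None
```

There is no key or argument value at fault here; the object behind the
proxy is simply gone, which is the situation the exception has to name.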

> On to the type type extensions: Should there be a type flag indicating
> presence of tp_weaklistoffset? It appears that the type structure had
> tp_xxx7 for a long time, so likely all in-use binary modules have
> that field set to zero. Is that sufficient?

Yes, that should be sufficient.  (I'm also going to claim tp_xxx7 for
the rich comparison function slot, but either patch can be modified to
use tp_xxx8 instead.)  Maybe it's time to add a bunch of new spares?

> Thanks for reading all of this message,

You're welcome.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 17:39:03 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:39:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
Message-ID: <14947.10151.575008.869188@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    GvR> the reply-to header shall be removed.

I'm more than happy to do this (I remember adding the reply-to munging
reluctantly).  Understand one thing: anybody who naively replies to
the whole list will send those replies to python-checkins, not
python-dev.

Still want it?

-Barry




From barry at digicool.com  Mon Jan 15 17:46:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 11:46:28 -0500
Subject: [Python-Dev] Where's Greg Ward ?
References: <LNBBLJKPBEHFEDALKOLCCEJMIIAA.tim.one@home.com>
	<3A62C82C.EA25AAF5@lemburg.com>
	<14946.63472.282750.828218@anthem.wooz.org>
	<20010115115719.B919@kronos.cnri.reston.va.us>
Message-ID: <14947.10596.733726.995351@anthem.wooz.org>

>>>>> "AK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

    AK> There are a few patches buried in the back archives, but I
    AK> don't know of any outstanding bugfixes, so please report
    AK> whatever problem you're seeing.

Okay, will do.

    AK> Oh, and Barry, did the issue holding up your patch for adding
    AK> shar support (#102313) ever get resolved?

No, but I'll try to take another poke at it.

-Barry




From moshez at zadka.site.co.il  Tue Jan 16 02:07:48 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Tue, 16 Jan 2001 03:07:48 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010116010748.41869A828@darjeeling.zadka.site.co.il>

On Mon, 15 Jan 2001, Guido van Rossum <gvanrossum at users.sourceforge.net> wrote:

> Modified Files:
> 	Meta.py 
> Log Message:
> Geoffrey Gerrietts discovered that a KeyError was caught that probably
> should have been a NameError.  I'm checking in a change that catches
> both, just to be sure -- I can't be bothered trying to understand this
> code any more. :-)
...
> !             except (KeyError, AttributeError):

Ummmm....can you be bothered to make sure you really meant AttributeError
when you said NameError? <wink>
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From guido at python.org  Mon Jan 15 18:06:07 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 12:06:07 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: Your message of "Mon, 15 Jan 2001 11:39:03 EST."
             <14947.10151.575008.869188@anthem.wooz.org> 
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com>  
            <14947.10151.575008.869188@anthem.wooz.org> 
Message-ID: <200101151706.MAA22884@cj20424-a.reston1.va.home.com>

> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.
> 
> Still want it?

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 18:11:29 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 12:11:29 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
	<14947.10151.575008.869188@anthem.wooz.org>
	<200101151706.MAA22884@cj20424-a.reston1.va.home.com>
Message-ID: <14947.12097.613433.580928@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

    >> I'm more than happy to do this (I remember adding the reply-to
    >> munging reluctantly).  Understand one thing: anybody who
    >> naively replies to the whole list will send those replies to
    >> python-checkins, not python-dev.  Still want it?

    GvR> Yes.

Done.




From thomas at xs4all.net  Mon Jan 15 18:34:37 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 18:34:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib ftplib.py,1.47,1.48
In-Reply-To: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Mon, Jan 15, 2001 at 08:32:52AM -0800
References: <E14ICYu-000781-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010115183437.J1005@xs4all.nl>

On Mon, Jan 15, 2001 at 08:32:52AM -0800, Guido van Rossum wrote:

> This is slightly controversial, but after reading the argumentation in
> the bug tracker for and against, I believe this is the right solution.

It's really only slightly controversial. 'mfisk' convinced me too, and I
used to use ftp to a server behind a firewall :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Mon Jan 15 19:21:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 15 Jan 2001 19:21:54 +0100
Subject: [Python-Dev] Re: Someone should be shot
References: <3A62EC93.9AA60ABA@lemburg.com>, <3A62E8B6.3DFC1FA2@lemburg.com>, <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <20010115204215.84F0CA828@darjeeling.zadka.site.co.il>
Message-ID: <3A633FC2.11F90E94@lemburg.com>

Moshe Zadka wrote:
> 
> On Mon, 15 Jan 2001 13:26:59 +0100, "M.-A. Lemburg" <mal at lemburg.com> wrote:
> 
> > In fact I like it, that replies go to python-dev ... after all,
> > that's where these things should be discussed.
> 
> Well, that's the mailing list where things should be discussed.
> But when I press the "Reply" button (as opposed to "Reply to List" button)
> I expect my e-mail to go to the person originating the e-mail.
> Reply-To: means "I'd like to get replies to some other address".
> What if, say, a checkin message relates to some private topic
> I'd discussed with someone: I'd like to reply to him personally.
> 
> I agree that responses to Python-Checkins should be handled on Python-Dev:
> that's what the mail-followup-to header is for.

Ah, ok. I thought you pressed Reply-All and then wondered why
your message got copied to python-dev...
 
> > BTW, in case you misunderstood my reply: it would indeed make
> > sense to automate these kinds of check (tabnanny et al).
> 
> Oh, ok. The "cron" part threw me off (why cron?)

CRON is what's used on Unix to implement jobs which run
on a regular basis... perhaps we just need to set up the
CRON job in timbot though ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at python.org  Mon Jan 15 19:35:54 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 13:35:54 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses Meta.py,1.3,1.4
In-Reply-To: Your message of "Tue, 16 Jan 2001 03:07:48 +0200."
             <20010116010748.41869A828@darjeeling.zadka.site.co.il> 
References: <E14ICtM-00083b-00@usw-pr-cvs1.sourceforge.net>  
            <20010116010748.41869A828@darjeeling.zadka.site.co.il> 
Message-ID: <200101151835.NAA26712@cj20424-a.reston1.va.home.com>

> > Modified Files:
> > 	Meta.py 
> > Log Message:
> > Geoffrey Gerrietts discovered that a KeyError was caught that probably
> > should have been a NameError.  I'm checking in a change that catches
> > both, just to be sure -- I can't be bothered trying to understand this
> > code any more. :-)
> ...
> > !             except (KeyError, AttributeError):
> 
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

The code is correct.  Ignore the comment. :-(
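
For anyone following along, the three exceptions are easy to mix up:
KeyError comes from a failed mapping lookup, AttributeError from a failed
attribute lookup, and NameError from an unbound bare name.  The sketch
below is not the Meta.py code; fetch is a made-up helper that only shows
one except clause covering the first two:

```python
def fetch(source, name, default=None):
    """Look name up in a dict or, failing that, as an attribute of source."""
    try:
        if isinstance(source, dict):
            return source[name]           # may raise KeyError
        return getattr(source, name)      # may raise AttributeError
    except (KeyError, AttributeError):
        # One clause handles both lookup failures; NameError is not involved,
        # because no bare name is being resolved here.
        return default
```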

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 15 12:55:51 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 03:55:51 -0800
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 15, 2001 at 11:39:03AM -0500
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <20010115035550.B4336@glacier.fnational.com>

[Barry on removing the reply-to header on python-checkins messages]
> I'm more than happy to do this (I remember adding the reply-to munging
> reluctantly).  Understand one thing: anybody who naively replies to
> the whole list will send those replies to python-checkins, not
> python-dev.

Could you make the script generate mail-followup-to instead of
reply-to?  I know it's not a standard header but some MUA
understand it and it is exactly what is needed to solve this
problem.  I think promoting it is a good thing.

  Neil



From thomas at xs4all.net  Mon Jan 15 19:59:12 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 15 Jan 2001 19:59:12 +0100
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <20010115035550.B4336@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 15, 2001 at 03:55:51AM -0800
References: <3A62E8B6.3DFC1FA2@lemburg.com> <E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net> <20010115201318.A2E73A828@darjeeling.zadka.site.co.il> <20010115202448.38F60A828@darjeeling.zadka.site.co.il> <14946.64166.348139.425223@anthem.wooz.org> <200101151538.KAA21937@cj20424-a.reston1.va.home.com> <14947.10151.575008.869188@anthem.wooz.org> <20010115035550.B4336@glacier.fnational.com>
Message-ID: <20010115195912.K1005@xs4all.nl>

On Mon, Jan 15, 2001 at 03:55:51AM -0800, Neil Schemenauer wrote:
> [Barry on removing the reply-to header on python-checkins messages]
> > I'm more than happy to do this (I remember adding the reply-to munging
> > reluctantly).  Understand one thing: anybody who naively replies to
> > the whole list will send those replies to python-checkins, not
> > python-dev.

> Could you make the script generate mail-followup-to instead of
> reply-to?  I know it's not a standard header but some MUA
> understand it and it is exactly what is needed to solve this
> problem.  I think promoting it is a good thing.

The script just calls '/bin/mail'. The Reply-To munging is done by Mailman,
which is slightly more than 'a script'. syncmail could do it, but that would
mean using sendmail instead of mail, and writing all headers itself.
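
"Writing all headers itself" could look roughly like the sketch below,
which uses today's standard email package rather than anything syncmail
actually contains.  The function name and the example.org addresses are
placeholders, and the message is only built, not sent:

```python
from email.message import EmailMessage


def build_checkin_mail(sender, checkins_list, discussion_list, subject, body):
    """Build a checkin notice whose follow-ups are steered to another list."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = checkins_list
    msg["Subject"] = subject
    # Non-standard header; only some mail clients honour it.
    msg["Mail-Followup-To"] = discussion_list
    msg.set_content(body)
    return msg


mail = build_checkin_mail(
    "committer@example.org",
    "checkins@example.org",
    "dev@example.org",
    "CVS: example/module.py,1.1,1.2",
    "Log Message:\nWhitespace normalization.\n",
)
```

The resulting message could then be handed to whatever delivery
mechanism is available; a plain Reply still goes to the From address.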

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at python.org  Mon Jan 15 20:17:27 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 14:17:27 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: Your message of "Fri, 05 Jan 2001 14:14:49 EST."
             <14934.7465.360749.199433@localhost.localdomain> 
References: <14934.7465.360749.199433@localhost.localdomain> 
Message-ID: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>

There doesn't seem to be a lot of enthusiasm for a Unittest
bakeoff...  Certainly I don't think I'll get to this myself before the
conference.

How about the following though: talking of low-hanging fruit, Tim's
doctest module is an excellent thing even if it isn't a unit testing
framework!  (I found this out when I played with it -- it's real easy
to get used to...)

Would anyone object against Tim checking this in?  Since it isn't a
contender in the unit test bake-off, it shouldn't affect the outcome
there at all.
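
For those who have not tried it, the idea behind doctest is that
interpreter sessions pasted into a docstring are re-run and compared with
the output shown.  A minimal sketch using the doctest module as found in
current Python; average and run_examples are invented names, and
run_examples is just one way of driving the parser and runner by hand:

```python
import doctest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / len(values)


def run_examples(func):
    """Run the examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # The globals dict gives the examples access to the function under test.
    test = parser.get_doctest(func.__doc__, {func.__name__: func},
                              func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_examples(average)
```

In everyday use one would more likely call doctest.testmod() at the
bottom of the module and let it find every docstring on its own.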

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Mon Jan 15 20:40:03 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 14:40:03 -0500
Subject: [Python-Dev] Someone should be shot
References: <3A62E8B6.3DFC1FA2@lemburg.com>
	<E14I0I2-0004AS-00@usw-pr-cvs1.sourceforge.net>
	<20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
	<20010115202448.38F60A828@darjeeling.zadka.site.co.il>
	<14946.64166.348139.425223@anthem.wooz.org>
	<200101151538.KAA21937@cj20424-a.reston1.va.home.com>
	<14947.10151.575008.869188@anthem.wooz.org>
	<20010115035550.B4336@glacier.fnational.com>
	<20010115195912.K1005@xs4all.nl>
Message-ID: <14947.21011.310090.686632@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

    >> Could you make the script generate mail-followup-to instead of
    >> reply-to?  I know it's not a standard header but some MUA
    >> understand it and it is exactly what is needed to solve this
    >> problem.  I think promoting it is a good thing.

    TW> The script just calls '/bin/mail'. The Reply-To munging is
    TW> done by Mailman, which is slightly more than 'a
    TW> script'. syncmail could do it, but that would mean using
    TW> sendmail instead of mail, and writing all headers itself.

I'm sure Fred or I would be happy to review such a patch to syncmail
<wink>.

-Barry



From jeremy at alum.mit.edu  Mon Jan 15 20:31:44 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 14:31:44 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
References: <14934.7465.360749.199433@localhost.localdomain>
	<200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <14947.20512.140859.119597@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido at python.org> writes:

  GvR> There doesn't seem to be a lot of enthusiasm for a Unittest
  GvR> bakeoff...  Certainly I don't think I'll get to this myself
  GvR> before the conference.

Let's have all the interested parties vote now, then.  It would
certainly be helpful to have the new unittest module in the alpha
release of 2.1.  I'd like to write some new tests and I'd rather use
the new stuff than the old stuff.

Jeremy



From tim.one at home.com  Mon Jan 15 21:01:52 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:01:52 -0500
Subject: [Python-Dev] Someone should be shot
In-Reply-To: <14947.10151.575008.869188@anthem.wooz.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEMOIIAA.tim.one@home.com>

[Barry]
> ...
> Understand one thing: anybody who naively replies to the whole
> list will send those replies to python-checkins, not python-dev.

IIRC, that's why the redirect to python-dev was added to begin with:  of
course people will reply to python-checkins, and then the next guy x-posts
to python-dev too, and the next three in turn variously remove one or the
other groups, or keep both or add c.l.py too.  In the end, no single archive
contains a coherent record on its own, and the random mix of "[Python-Dev]"
and "[Python-checkins]" Subject tags even make it impossible to sort by
(true) subject easily in your own mail client.

> Still want it?

Don't care <wink -- but simple tech approaches to human carelessness (the
true problem here!) don't work no matter which way you flip the switch>.




From tim.one at home.com  Mon Jan 15 21:08:15 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:08:15 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib tabnanny.py,1.10,1.11 telnetlib.py,1.8,1.9 tempfile.py,1.26,1.27 threading.py,1.10,1.11 toaiff.py,1.8,1.9 tokenize.py,1.15,1.16 traceback.py,1.18,1.19 tty.py,1.2,1.3 tzparse.py,1.8,1.9
In-Reply-To: <20010115201318.A2E73A828@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMOIIAA.tim.one@home.com>

[<tim_one at users.sourceforge.net>]
> Modified Files:
> 	tabnanny.py
> Log Message:
> Whitespace normalization.

[Moshe]
> hmmmmmm.......

LOL!  I was hoping nobody would notice that <0.7 wink>.  The appalling truth
is that late in tabnanny's development I deliberately indented a large block
of code by one column, and actually thought it was a good idea at the time.
I'm as delighted to see that finally fixed as I am embarrassed by the
necessity.

although-perhaps-more-appalled-that-there-was-followup-
    debate-about-followups-containing-more-msgs-than-there-
    were-characters-in-moshe's-followup-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 21:10:10 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 12:10:10 -0800 (PST)
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
 outside of Python)
In-Reply-To: <002801c07efe$0c728a80$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Tony J Ibbs (Tibs) wrote:
> I get the documentation for <name> OK, but it is preceded with a line
> claiming that:
> 
> 	The system cannot find the path specified.

Thanks for the NT testing.  That's funny -- i put in a special case
for Windows to avoid messages like the above a couple of days ago.
How recently did you download pydoc.py?  Does your copy contain:

    if hasattr(sys, 'winver'):
        return lambda text: tempfilepager(text, 'more')

?

> 	<name>.py -- information about the module
> 
> which, when pydoc'ed, results in a NAME line which starts with <name>
> twice...
> Of course, if I'm the only person doing this, I'll just have to, well,
> stop...)

I think i'm going to ask you to stop, unless Guido prefers
otherwise.  Guido, do you have a style pronouncement for module
docstrings?

> A request - a "-f" switch to allow the user to specify a particular
> Python file (i.e., something not on the PYTHONPATH).

Yes, it's on my to-do list.

So you can see what i'm up to, here's my current to-do list:

    make boldness optional (only if using more/less?  only Unix?)
    document a .py file given on the command line
  + webserver in background
    help should have a repr
    write a better htmlrepr (\n should look special, max length limit, etc.)
    generate docs from lib HTML
    generate HTML index from precis and __path__ and package contents list
    have help(...) produce a directory of available things to ask for help on
    curses.wrapper is broken: both function and package
    respect package __all__
    coherent answer to .py vs .pyc: do we show .pyc?
    fix getcomments() bug: last two lines stuck together
  + grey out shadowed modules/packages
    refactor .py/.pyc/.module.so/.module.so.1 listers in htmldoc, textdoc
    skip __main__ module
  + index built-in modules too
    Windows and Mac testing
    default to HTTP mode on GUI platforms?  (win, mac)

The ones marked with + i consider done.  Feel free to comment on
or suggest priorities for the others; in particular, what do you
think of the last one?  The idea is that double-clicking on
pydoc.py in Windows or MacOS could launch the server and then open
the localhost URL using webbrowser.py to display the documentation
index.  Should it do this by default?


-- ?!ng




From guido at python.org  Mon Jan 15 21:41:25 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 15:41:25 -0500
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: Your message of "Mon, 15 Jan 2001 12:10:10 PST."
             <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>

> > 	<name>.py -- information about the module
> > 
> > which, when pydoc'ed, results in a NAME line which starts with <name>
> > twice...
> > Of course, if I'm the only person doing this, I'll just have to, well,
> > stop...)
> 
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

I'm with Ping.  None of the examples in the style guide start the
docstring with the function name.  Almost none of the standard library
modules start their module docstring with the module name (codecs is
an exception, but I didn't write it :-).
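
The duplication Tibs saw follows from how a "name - synopsis" heading is
put together.  The sketch below is a deliberate simplification and not
pydoc's actual code; name_line and the module name spam are made up:

```python
def name_line(module_name, docstring):
    """Imitate a 'name - synopsis' heading built from the first docstring line."""
    synopsis = docstring.strip().splitlines()[0]
    return f"{module_name} - {synopsis}"


# Docstring that repeats the file name: the name shows up twice.
redundant = name_line("spam", "spam.py -- information about the module")

# Docstring that starts with the description: the name shows up once.
preferred = name_line("spam", "Information about the module.")
```

Since the tool already supplies the module name, a docstring that opens
with the description reads cleanly in both the source and the output.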

--Guido van Rossum (home page: http://www.python.org/~guido/)



From bckfnn at worldonline.dk  Mon Jan 15 21:45:02 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 15 Jan 2001 20:45:02 GMT
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <3A62C66E.2BB69E61@lemburg.com>
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com>
Message-ID: <3a636122.45847835@smtp.worldonline.dk>

[Fredrik Lundh]

> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
> 
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
> 
> Should I check it in?

[M.-A. Lemburg]

>Since the Unicode character names are probably
>not used for performance sensitive tasks, I suggest to
>checkin the smallest version possible.
>
>If it is too much work to get Finn's version recoded in C
>(presuming it's written in Java), then I'd suggest checking
>in your version until someone comes up with a yet smaller
>edition.

FWIW, I agree that the 160k module should be used. Please, nobody should
use the jython compression as an argument to delay any improvements in
CPython. 

I certainly didn't post because I wanted to complicate your processes. I
just wanted to show off <wink>.

regards,
finn



From fredrik at effbot.org  Mon Jan 15 21:58:11 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 15 Jan 2001 21:58:11 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <010f01c07e52$e9801fc0$e46940d5@hagrid> <3A62C66E.2BB69E61@lemburg.com> <3a636122.45847835@smtp.worldonline.dk>
Message-ID: <001f01c07f35$e2c09500$e46940d5@hagrid>

mal, finn:
> >If it is too much work to get Finn's version recoded in C
> >(presuming it's written in Java), then I'd suggest checking
> >in your version until someone comes up with a yet smaller
> >edition.
> 
> FWIW, I agree that the 160k module should be used. Please, nobody should
> use the jython compression as an argument to delay any improvements in
> CPython. 

okay, unless someone throws in a -1 vote, I'll check
this in tomorrow.

Cheers /F




From tim.one at home.com  Mon Jan 15 21:57:26 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 15:57:26 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <010f01c07e52$e9801fc0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>

[Fredrik Lundh]
> The name database portions of SF task 17335 ("add
> compressed unicode database") were postponed to
> 2.1.
>
> My current patch replaces the ~450k large ucnhash
> module with a new ~160k large module.  (See earlier
> posts for more info on how the new database works).
>
> Should I check it in?

Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
days to deal with surprises before the alpha release.  With 300K sitting on
the table waiting to be taken, it's not worth delaying one hour to worry
about 60K additional that may or may not be achievable later.
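
As a reminder of what the name database is for, here is a short sketch
using the unicodedata module of current Python, which maps characters to
their names and back; the same data backs \N{...} escapes in string
literals.  The helper name round_trip is invented for the example:

```python
import unicodedata


def round_trip(ch):
    """Map a character to its Unicode name and back again."""
    name = unicodedata.name(ch)
    return name, unicodedata.lookup(name)


alpha_name, alpha_char = round_trip("\u03b1")

# A \N{...} escape resolves a character name at compile time.
same_char = "\N{GREEK SMALL LETTER ALPHA}"
```

Lookups like these are rarely on a hot path, which is the argument for
preferring the smaller table over the faster one.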




From ping at lfw.org  Mon Jan 15 22:02:38 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 13:02:38 -0800 (PST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Demo/metaclasses
 Meta.py,1.3,1.4
In-Reply-To: <20010116010748.41869A828@darjeeling.zadka.site.co.il>
Message-ID: <Pine.LNX.4.10.10101151302070.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Moshe Zadka wrote:
> Ummmm....can you be bothered to make sure you really meant AttributeError
> when you said NameError? <wink>

Nice bugfix.  Didn't know you were back from the sex-change operation.


-- ?!ng




From tim.one at home.com  Mon Jan 15 22:15:54 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 16:15:54 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <200101151917.OAA29687@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENFIIAA.tim.one@home.com>

[Guido]
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...

I'm enthusiastic, but ...

> Certainly I don't think I'll get to this myself before the
> conference.

Ditto.  Takes time that's not there.

> ...
> Would anyone object against Tim checking [doctest] in?

You suggested that before, and so it was already on my 2.1a1 todo list.
Hoped to get to it over the weekend but didn't.  Hope to get to it today,
but won't <wink - I hope>.  On the chance that I do, anyone inclined to
object should do so before the sun sets in Reston.

or-if-it-never-sets-the-world-ends-anyway-ly y'rs  - tim




From akuchlin at mems-exchange.org  Mon Jan 15 22:26:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 16:26:19 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.20512.140859.119597@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 15, 2001 at 02:31:44PM -0500
References: <14934.7465.360749.199433@localhost.localdomain> <200101151917.OAA29687@cj20424-a.reston1.va.home.com> <14947.20512.140859.119597@localhost.localdomain>
Message-ID: <20010115162619.A19484@kronos.cnri.reston.va.us>

On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>Let's have all the interested parties vote now, then.  It would
>certainly be helpful to have the new unittest module in the alpha
>release of 2.1.  I'd like to write some new tests and I'd rather use
>the new stuff than the old stuff.

Huh?  If no one has tried the different modules, what's the point of
having a vote?  (Given that doctest is going to be added, though, it 
should be checked in ASAP.)

--amk



From trentm at ActiveState.com  Mon Jan 15 23:10:26 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 14:10:26 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101151602.LAA22272@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:02:39AM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com>
Message-ID: <20010115141026.I29870@ActiveState.com>

On Mon, Jan 15, 2001 at 11:02:39AM -0500, Guido van Rossum wrote:
> Greg Stein noticed me checking in *yet* another system that needs
> the fallback TELL64() definition in fileobject.c, and wrote:
> 
> > All of those #ifdefs could be tossed and it would be more robust (long term)
> > if an autoconf macro were used to specify when TELL64 should be defined.
> > 
> > [ I've looked thru fileobject.c and am a bit confused: the conditions for
> >   defining TELL64 do not match the conditions for *using* it. that would
> >   seem to imply a semantic error somewhere and/or a potential gotcha when
> >   they get skewed (like I assume what happened to FreeBSD). simplifying with
> >   an autoconf macro may help to rationalize it. ]

The problem is that these systems lie when they "say" (according to Python's
configure tests for HAVE_LARGEFILE_SUPPORT) that they have largefile support.
This seems to have happened for a particular release of BSD (which has since
been fixed). I think that the Right(tm) (meaning the cleanest solution where
the tests and definitions in the code actually represent the truth) answer is
a proper configure test (sort of as Greg suggests). I don't really feel
comfortable writing that patch (because (1) lack of time and (2) inability to
test, I don't have any access to any of these BSD machines).

[Guido]
> 
> I have a better idea.  Since "lseek((fd),0,SEEK_CUR)" seems to be the
> universal fallback, why not just define TELL64 to be that if it's not
> previously defined (currently only MS_WIN64 has a different
> definition)?  It isn't always *used* (the conditions under which
> _portable_fseek() uses it are quite complex), but *when* it is used,
> this seems to be the most common definition...
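
The fallback quoted above, lseek((fd),0,SEEK_CUR), reports the current
offset by seeking zero bytes from the current position.  A minimal Python
sketch of the same idiom, using os.lseek on a raw descriptor; the names
tell_fd and demo_tell are made up for illustration and do not appear in
fileobject.c:

```python
import os
import tempfile


def tell_fd(fd):
    """Return the current offset of fd by seeking 0 bytes from SEEK_CUR."""
    return os.lseek(fd, 0, os.SEEK_CUR)


def demo_tell():
    # Write five bytes through the raw descriptor, then ask where we are.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello")
        after_write = tell_fd(fd)
        # Move to an absolute position and ask again.
        os.lseek(fd, 2, os.SEEK_SET)
        after_seek = tell_fd(fd)
    finally:
        os.close(fd)
        os.remove(path)
    return after_write, after_seek
```

This only shows why a zero-byte relative seek can stand in for tell(); it
says nothing about whether a given platform's lseek() is 64-bit capable,
which is the real question in this thread.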

While I agree that it is annoying that the build breaks for these platforms I
think that it is appropriate that the build breaks. Having to put these:
    #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
definitions here gives a nice list of those platforms that *do* lie. I would
prefer that to having an "#else" block that just captures all other cases,
but that is just my opinion.

Options (in order of preference):

(1) Update the configure test for HAVE_LARGEFILE_SUPPORT such that the proper
    versions of these OSes do *not* #define it.
(2) Guido's suggestion.
(2) Keep extending the "#elif" list.

 ^---- using (2) twice was intentional


Trent

> 
> *** fileobject.c	2001/01/15 10:36:56	2.106
> --- fileobject.c	2001/01/15 16:02:06
> ***************
> *** 58,66 ****
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #elif defined(__NetBSD__) || defined(__OpenBSD__) || defined(__FreeBSD__) || defined(_HAVE_BSDI) || defined(__APPLE__)
> ! /* NOTE: this is only used on older
> !    NetBSD prior to f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
>   
> --- 58,65 ----
>   /* define the appropriate 64-bit capable tell() function */
>   #if defined(MS_WIN64)
>   #define TELL64 _telli64
> ! #else
> ! /* Fallback for older systems that don't have the f*o() funcions */
>   #define TELL64(fd) lseek((fd),0,SEEK_CUR)
>   #endif
> 
> 
> I'll check this in after 24 hours unless a better idea comes up.
> 

Better idea but no patch. :(

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From skip at mojam.com  Mon Jan 15 23:10:36 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 15 Jan 2001 16:10:36 -0600 (CST)
Subject: [Python-Dev] should we start instrumenting modules with __all__?
Message-ID: <14947.30044.934204.951564@beluga.mojam.com>

I see the from-import-* patch for __all__ has been checked in.  Should we
make an effort to add __all__ to at least some modules before 2.1a1?
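
For reference, a small sketch of what __all__ does to from-import-*, as it
behaves in current Python.  The module name demo_all and its two functions
are hypothetical, built in memory only so the example is self-contained:

```python
import sys
import types

# Build a throwaway module in memory (the name "demo_all" is hypothetical).
source = (
    "__all__ = ['public']\n"
    "def public():\n"
    "    return 'exported'\n"
    "def helper():\n"
    "    return 'not exported'\n"
)
demo = types.ModuleType("demo_all")
exec(source, demo.__dict__)
sys.modules["demo_all"] = demo

# Run the star-import in a fresh namespace and see which names arrive.
namespace = {}
exec("from demo_all import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
```

Without the __all__ line, helper would be imported as well, since only
names starting with an underscore are skipped by default.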

Skip



From akuchlin at mems-exchange.org  Mon Jan 15 23:13:03 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 15 Jan 2001 17:13:03 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: <200101121351.IAA19676@cj20424-a.reston1.va.home.com>; from guido@python.org on Fri, Jan 12, 2001 at 08:51:51AM -0500
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>
Message-ID: <20010115171303.A23626@kronos.cnri.reston.va.us>

On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
>Ah.  It's very simple.  I create a directory "linux" as a subdirectory
>of the Python source tree (i.e. at the same level as Lib, Objects,
>etc.).  Then I chdir into that directory, and I say "../configure".
>The configure script creates subdirectories to hold the object files ...
>Then I say "make" and it builds Python.  

This doesn't work at all for me in my copy of the CVS tree.  Are there
other steps or requirements to make this work?  (Transcript available
upon request, but I suspect I'm missing something simple.)

--amk




From tim.one at home.com  Mon Jan 15 23:32:51 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 17:32:51 -0500
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>

[Jeremy]
> Let's have all the interested parties vote now, then.  It would
> certainly be helpful to have the new unittest module in the alpha
> release of 2.1.  I'd like to write some new tests and I'd rather use
> the new stuff than the old stuff.

[Andrew]
> Huh?  If no one has tried the different modules, what's the point of
> having a vote?

Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
would agree this is not an ideal decision procedure.

the-question-is-whether-it's-better-than-paralysis-ly y'rs  - tim




From ping at lfw.org  Mon Jan 15 23:35:47 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 14:35:47 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
Message-ID: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>

I don't know whether this is going to be obvious or controversial,
but here goes.  Most of the time we're used to seeing a newline as
'\n', not as '\012', and newlines are typed in as '\n'.

A newcomer to Python is likely to do

    >>> 'hello\n'
    'hello\012'

and ask "what's \012?" -- whereupon one has to explain that it's an
octal escape, that 012 in octal equals 10, and that chr(10) is
newline, which is the same as '\n'.  You're bound to run into this,
and you'll see \012 a lot, because \n is such a common character.
Aside from being slightly more frightening, '\012' also takes up
twice as many characters as necessary.

So... i'm submitting a patch that causes the three most common
special whitespace characters, '\n', '\r', and '\t', to appear in
their natural form rather than as octal escapes when strings are
printed and repr()ed.

Mm?
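
The equivalences in the explanation above can be checked directly.  This
is a sketch against current Python 3, where repr() already shows the three
common whitespace characters symbolically; it is not the patch itself, and
the variable names are illustrative only:

```python
# The octal escape, the symbolic escape, and chr(10) all name one character.
newline_forms = ("\012", "\n", chr(10))
same_character = len(set(newline_forms)) == 1

# 012 read as an octal number is decimal 10.
octal_value = int("012", 8)

# repr() of strings containing newline, carriage return and tab.
shown = repr("hello\n")
whitespace_shown = repr("\r\t")
```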


-- ?!ng




From esr at thyrsus.com  Tue Jan 16 00:15:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 18:15:50 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>; from ping@lfw.org on Mon, Jan 15, 2001 at 02:35:47PM -0800
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <20010115181550.A11566@thyrsus.com>

Ka-Ping Yee <ping at lfw.org>:
> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
standard escape set (hmmm...am I missing any?)
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Live free or die; death is not the worst of evils.
	-- General George Stark.



From thomas at xs4all.net  Tue Jan 16 00:49:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:49:30 +0100
Subject: [Python-Dev] time functions
Message-ID: <20010116004930.L1005@xs4all.nl>

Maybe this is a dead and buried subject, but I'm going to try anyway, since
everyone's been in such a wonderful 'let's fix ugly but harmless nits' mood
lately :)

Why do we need the following atrocity <wink>:

  timestr = time.strftime("<format>", time.localtime(time.time()))

To do the simple task of 'date +<format>' ?  I never really understood why
there isn't a way to get a timetuple directly from C, rather than converting
a float that we got from C a bytecode before, even though the higher level
almost always deals with timetuples. How about making the float-to-tuple
functions (time.localtime, time.gmtime) accept 0 arguments as well, and
defaulting to time.time() in that case ? Even better, how about doing the
same for the other functions, too ? (where it makes sense, of course :) 

Actually, I'll split it up in three proposals:

- Making the time in time.strftime default to 'now', so that the above
  becomes the ever so slightly confusing:

  timestr = time.strftime("<format>")
  (confusing because it looks a bit like a regexp constructor...)

- Making the time in time.asctime and time.ctime optional, defaulting to
  'now', so you can just call 'time.ctime()' without having to pass
  time.time() (which are about half the calls in my own code :)

- Making the time in time.localtime and time.gmtime default to 'now'.

I'm 0/+1/+1 myself :)
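
A sketch of the "default to now" idea in Python.  Current Python releases
accept an omitted time argument in time.localtime(), time.gmtime(),
time.ctime(), time.asctime() and time.strftime(); the wrapper name
format_time below is hypothetical and only makes the defaulting explicit:

```python
import time


def format_time(fmt, when=None):
    """Format a time tuple, defaulting to the current local time."""
    if when is None:
        # No argument given: fall back to 'now'.
        when = time.localtime()
    return time.strftime(fmt, when)


# An explicit time tuple still works as before.
epoch = format_time("%Y-%m-%d", time.gmtime(0))

# With the time omitted, the current local date is used.
today = format_time("%Y-%m-%d")
```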

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 16 00:55:36 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 00:55:36 +0100
Subject: [Python-Dev] TELL64
In-Reply-To: <20010115141026.I29870@ActiveState.com>; from trentm@ActiveState.com on Mon, Jan 15, 2001 at 02:10:26PM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com>
Message-ID: <20010116005536.M1005@xs4all.nl>

On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:

> > > [ I've looked thru fileobject.c and am a bit confused: the conditions
> > >   for defining TELL64 do not match the conditions for *using* it. that
> > >   would seem to imply a semantic error somewhere and/or a potential
> > >   gotcha when they get skewed (like I assume what happened to
> > >   FreeBSD). simplifying with an autoconf macro may help to rationalize
> > >   it. ]

> The problem is that these systems lie when they "say" (according to
> Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> largefile support. This seems to have happened for a particular release of
> BSD (which has since been fixed). I think that the Right(tm) (meaning the
> cleanest solution where the tests and definitions in the code actually
> represent the truth) answer is a proper configure test (sort of as Greg
> suggests). I don't really feel comfortable writing that patch (because (1)
> lack of time and (2) inability to test, I don't have any access to any of
> these BSD machines).

There is no (longer any) 'single BSD release', so I doubt it has 'since been
fixed' :) We should consider the different BSD derived OSes as separate, if
slightly related, systems (much like SunOS <-> BSD.) The problem in the BSDI
case is really simple: the autoconf test doesn't test whether the fs really
supports large files, but rather whether the system has an off_t type that
is 64 bits. BSDI has that type, but does not actually use it in any of the
seek/tell functions. This has not been 'fixed' as far as I know, precisely
because it isn't 'broken' :)

I tried to fix the test, but I have been completely unable to find a proper
test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
out what, say, 'zsh' uses -- black autoconf magic, for sure.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From trentm at ActiveState.com  Tue Jan 16 01:24:54 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Mon, 15 Jan 2001 16:24:54 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116005536.M1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:55:36AM +0100
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>
Message-ID: <20010115162454.D3864@ActiveState.com>

On Tue, Jan 16, 2001 at 12:55:36AM +0100, Thomas Wouters wrote:
> On Mon, Jan 15, 2001 at 02:10:26PM -0800, Trent Mick wrote:
> 
> > The problem is that these systems lie when they "say" (according to
> > Python's configure tests for HAVE_LARGEFILE_SUPPORT) that they have
> > largefile support. This seems to have happened for a particular release of
> > BSD (which has since been fixed). I think that the Right(tm) (meaning the
> > cleanest solution where the tests and definitions in the code actually
> > represent the truth) answer is a proper configure test (sort of as Greg
> > suggests). I don't really feel comfortable writing that patch (because (1)
> > lack of time and (2) inability to test, I don't have any access to any of
> > these BSD machines).
> 
> There is no (longer any) 'single BSD release', so I doubt it has 'since been
> fixed' :) 

Okay sure (showing my ignorance). My only understanding was that this
"lying" was the case for some unspecified BSDs a while ago but that the
latest releases of any of them *did* have largefile support.

> 
> I tried to fix the test, but I have been completely unable to find a proper
> test. There doesn't seem to be a 'standard' one, and I wasn't able to figure
> out what, say, 'zsh' uses -- black autoconf magic, for sure.

Hmmm... if one could encode whether or not a 64-bit fseek could be
implemented (either using fseek, fseeko, fseek64, _fseek, fsetpos/fgetpos,
etc.) in a short C program then that would be the test (or at least most of
the test, might have to see if ftell could be implemented as well). Or are
there other requirements?


Trent

-- 
Trent Mick
TrentM at ActiveState.com



From esr at thyrsus.com  Tue Jan 16 02:26:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 20:26:14 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 12:49:30AM +0100
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <20010115202614.A11732@thyrsus.com>

Thomas Wouters <thomas at xs4all.net>:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Likewise.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Never trust a man who praises compassion while pointing a gun at you.



From barry at digicool.com  Tue Jan 16 03:14:33 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 15 Jan 2001 21:14:33 -0500
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <14947.44681.254332.976234@anthem.wooz.org>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

    TW> I'm 0/+1/+1 myself :)

Maybe I'm an inch on the +0/+1/+1 side. :)



From jeremy at alum.mit.edu  Tue Jan 16 01:11:59 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 15 Jan 2001 19:11:59 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115162619.A19484@kronos.cnri.reston.va.us>
References: <14934.7465.360749.199433@localhost.localdomain>
	<200101151917.OAA29687@cj20424-a.reston1.va.home.com>
	<14947.20512.140859.119597@localhost.localdomain>
	<20010115162619.A19484@kronos.cnri.reston.va.us>
Message-ID: <14947.37327.395622.66435@localhost.localdomain>

>>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:

  AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
  >> Let's have all the interested parties vote now, then.  It would
  >> certainly be helpful to have the new unittest module in the alpha
  >> release of 2.1.  I'd like to write some new tests and I'd rather
  >> use the new stuff than the old stuff.

  AMK> Huh?  If no one has tried the different modules, what's the
  AMK> point of having a vote?  (Given that doctest is going to be
  AMK> added, though, it should be checked in ASAP.)

Guido is the only person that said he hadn't tried anything.  If
others have given it a whirl, they ought to chime in now.  If very few
people have given them a try, we should decide whether we wait for
them or proceed without them.  We can't wait indefinitely.  I'm not
sure when we need to decide.

Jeremy



From nas at arctrix.com  Mon Jan 15 20:40:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 11:40:55 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Python sysmodule.c,2.82,2.83
In-Reply-To: <200101132225.RAA03197@cj20424-a.reston1.va.home.com>; from guido@python.org on Sat, Jan 13, 2001 at 05:25:12PM -0500
References: <E14HYoJ-0002n3-00@usw-pr-cvs1.sourceforge.net> <20010113071758.C28643@glacier.fnational.com> <200101132225.RAA03197@cj20424-a.reston1.va.home.com>
Message-ID: <20010115114055.A5879@glacier.fnational.com>

On Sat, Jan 13, 2001 at 05:25:12PM -0500, Guido van Rossum wrote:
> Do you have a tool that detects leaks?

debauch is showing promise although it is still pretty rough
around the edges.  memprof is another option.  It looks like
init_exceptions may be leaking memory.  Some debauch output:

 1      Leaked Memory 0x0849cf98, size 44 (from 0x0) AllocTime: 79269 FreeTime: 43436      
        return stack:
                ???:?? (0x40016005) 
                classobject.c:84 (0x805c16d) <PyClass_New+631>
                exceptions.c:337 (0x8088594) <make_Exception+250>
                exceptions.c:1061 (0x80898dc) <init_exceptions+232>
                pythonrun.c:151 (0x8053581) <Py_Initialize+573>
                loop.c:23 (0x8053305) <main+101>

I haven't figured out if this is a real leak yet.

  Neil



From michel at digicool.com  Tue Jan 16 07:33:00 2001
From: michel at digicool.com (Michel Pelletier)
Date: Mon, 15 Jan 2001 22:33:00 -0800 (PST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <14947.37327.395622.66435@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101152216200.2373-100000@localhost.localdomain>

On Mon, 15 Jan 2001, Jeremy Hylton wrote:

> >>>>> "AMK" == Andrew Kuchling <akuchlin at mems-exchange.org> writes:
> 
>   AMK> On Mon, Jan 15, 2001 at 02:31:44PM -0500, Jeremy Hylton wrote:
>   >> Let's have all the interested parties vote now, then.  It would
>   >> certainly be helpful to have the new unittest module in the alpha
>   >> release of 2.1.  I'd like to write some new tests and I'd rather
>   >> use the new stuff than the old stuff.
> 
>   AMK> Huh?  If no one has tried the different modules, what's the
>   AMK> point of having a vote?  (Given that doctest is going to be
>   AMK> added, though, it should be checked in ASAP.)
> 
> Guido is the only person that said he hadn't tried anything.  If
> others have given it a whirl, they ought to chime in now.  

I have used pyunit to create a simple set of tests.  It seemed to do the
job well and it was very easy. I'd never done it before and the docs were
fat and A+.

I can only give a one-sided opinion.  I know of AMK's work but I have not
used it.  Are there others?

-Michel




From akuchlin at mems-exchange.org  Tue Jan 16 04:03:31 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Mon, 15 Jan 2001 22:03:31 -0500
Subject: [Python-Dev] Detecting install time
Message-ID: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com>

For PEP 229, the setup.py script needs to figure out if it's running
from the build directory, because then distutils.sysconfig needs to
look at different config files; ./Modules/Makefile instead of
/usr/lib/python2.0/config/Makefile, and so forth.  Is there a
simple/clean way to do this?

--amk






From guido at python.org  Tue Jan 16 04:21:43 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:21:43 -0500
Subject: [Python-Dev] PEP 229: setup.py revised
In-Reply-To: Your message of "Mon, 15 Jan 2001 17:13:03 EST."
             <20010115171303.A23626@kronos.cnri.reston.va.us> 
References: <E14GkMS-0006DF-00@kronos.cnri.reston.va.us> <200101112155.QAA16678@cj20424-a.reston1.va.home.com> <20010111172633.A26249@kronos.cnri.reston.va.us> <200101121351.IAA19676@cj20424-a.reston1.va.home.com>  
            <20010115171303.A23626@kronos.cnri.reston.va.us> 
Message-ID: <200101160321.WAA00648@cj20424-a.reston1.va.home.com>

> On Fri, Jan 12, 2001 at 08:51:51AM -0500, Guido van Rossum wrote:
> >Ah.  It's very simple.  I create a directory "linux" as a subdirectory
> >of the Python source tree (i.e. at the same level as Lib, Objects,
> >etc.).  Then I chdir into that directory, and I say "../configure".
> >The configure script creates subdirectories to hold the object files ...
> >Then I say "make" and it builds Python.  
> 
> This doesn't work at all for me in my copy of the CVS tree.  Are there
> other steps or requirements to make this work?  (Transcript available
> upon request, but I suspect I'm missing something simple.)

You can't start doing this in a tree where you have already built
Python using the default way -- you have to use a pristine tree.  The
reason is the funny way Make's VPATH feature works, it sees the .o
files in the source directory and then thinks it doesn't have to create
the .o file in the build directory.  I think a "make clobber" at the
top level would probably eradicate everything that confuses Make.

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Tue Jan 16 04:24:04 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:24:04 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 14:35:47 PST."
             <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101160324.WAA00677@cj20424-a.reston1.va.home.com>

> I don't know whether this is going to be obvious or controversial,
> but here goes.  Most of the time we're used to seeing a newline as
> '\n', not as '\012', and newlines are typed in as '\n'.
> 
> A newcomer to Python is likely to do
> 
>     >>> 'hello\n'
>     'hello\012'
> 
> and ask "what's \012?" -- whereupon one has to explain that it's an
> octal escape, that 012 in octal equals 10, and that chr(10) is
> newline, which is the same as '\n'.  You're bound to run into this,
> and you'll see \012 a lot, because \n is such a common character.
> Aside from being slightly more frightening, '\012' also takes up
> twice as many characters as necessary.
> 
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

+1 on the idea; no time to study the patch tonight.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:28:38 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:28:38 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 18:15:50 EST."
             <20010115181550.A11566@thyrsus.com> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>  
            <20010115181550.A11566@thyrsus.com> 
Message-ID: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>

> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> standard escape set (hmmm...am I missing any?)

You missed \f [*].  Unclear to me whether it's a good idea to add the
lesser-known ones; they are just as likely binary gobbledegook rather
than what their escapes stand for.

[*] http://www.python.org/doc/current/ref/strings.html

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:31:19 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:31:19 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 00:49:30 +0100."
             <20010116004930.L1005@xs4all.nl> 
References: <20010116004930.L1005@xs4all.nl> 
Message-ID: <200101160331.WAA00780@cj20424-a.reston1.va.home.com>

> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'let's fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :) 
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

I don't see the confusion.

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

Yes, I've wondered this myself too.  I guess the current API is based
too much on the C API...

+1/+1/+1.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 04:47:32 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 22:47:32 -0500
Subject: [Python-Dev] Detecting install time
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:03:31 EST."
             <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> 
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> 
Message-ID: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>

> For PEP 229, the setup.py script needs to figure out if it's running
> from the build directory, because then distutils.sysconfig needs to
> look at different config files; ./Modules/Makefile instead of
> /usr/lib/python2.0/config/Makefile, and so forth.  Is there a
> simple/clean way to do this?

You could check for the presence of config.status -- that file is not
installed.
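
A minimal sketch of that check, assuming the script is handed (or started
in) the directory where configure was run.  The function names are
hypothetical and this is not the code that setup.py or distutils.sysconfig
actually uses:

```python
import os
import tempfile


def running_from_build_dir(directory="."):
    """Guess whether directory is a build tree: configure leaves
    config.status behind there, and that file is never installed."""
    return os.path.exists(os.path.join(directory, "config.status"))


def demo():
    # An empty directory looks installed; adding config.status flips it.
    with tempfile.TemporaryDirectory() as d:
        before = running_from_build_dir(d)
        open(os.path.join(d, "config.status"), "w").close()
        after = running_from_build_dir(d)
    return before, after
```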

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 16 04:53:16 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 22:53:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>

[?!ng]
> So... i'm submitting a patch that causes the three most common
> special whitespace characters, '\n', '\r', and '\t', to appear in
> their natural form rather than as octal escapes when strings are
> printed and repr()ed.

-1 on doing that when they're printed (although I probably misunderstand
what you mean there).

+1 for changing repr() as suggested.

-0 on generalizing to \a \b \f \v too (I've never used one of those in a
string literal in my life, so would be more baffled by seeing one come back
than I would the octal equivalent).

I would also be +1 on using hex escapes instead of octal (I grew up on 36-
and 60-bit machines, but that was the last time octal looked *natural*!).
Octal and hex escapes both consume 4 characters, so I can't imagine what
octal has going for it in the 21st century <wink>.
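
The octal-versus-hex point can be checked in current Python 3, which does
use hex escapes in repr().  A small sketch, with illustrative variable
names:

```python
# The same character spelled with an octal and with a hex escape.
octal_spelling = "\377"
hex_spelling = "\xff"
same = octal_spelling == hex_spelling

# Both escapes take four characters in source form.
source_lengths = (len(r"\377"), len(r"\xff"))

# repr() of a byte string written with the octal escape shows hex.
shown = repr(b"\377")
```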

377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim


PS:  Note that C doesn't define what numerical values \a etc have, just
that:

    Each of these escape sequences shall produce a unique
    implementation-defined value which can be stored in a single
    char object. The external representations in a text file need
    not be identical to the internal representations, and are
    outside the scope of this International Standard.

The current method does have the advantage of extreme clarity.




From guido at python.org  Tue Jan 16 05:08:46 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:08:46 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Mon, 15 Jan 2001 16:24:54 PST."
             <20010115162454.D3864@ActiveState.com> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl>  
            <20010115162454.D3864@ActiveState.com> 
Message-ID: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>

Looking at the code (in _portable_fseek()) that uses TELL64, I don't
understand why it can't use fgetpos().  That code is used only when
fpos_t -- the type used by fgetpos() and fsetpos() -- is 64-bit.

Trent, you wrote that code.  Why wouldn't this work just as well?

(your code):
			if ((pos = TELL64(fileno(fp))) == -1L)
				return -1;
(my suggestion):
			if (fgetpos(fp, &pos) != 0)
				return -1;

It can't be because fgetpos() doesn't exist or is otherwise unusable,
because the SEEK_CUR case uses it.

We also know that offset is 64-bit capable (the #if around the
declaration of _portable_fseek() ensures that).

I would even go as far as to collapse the entire switch as follows:

	fpos_t pos;
	switch (whence) {
	case SEEK_END:
		/* do a "no-op" seek first to sync the buffering so that
		   the low-level tell() can be used correctly */
		if (fseek(fp, 0, SEEK_END) != 0)
			return -1;
		/* fall through */
	case SEEK_CUR:
		if (fgetpos(fp, &pos) != 0)
			return -1;
		offset += pos;
		break;
	/* case SEEK_SET: break; */
	}
	return fsetpos(fp, &offset);
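
The collapsed switch reduces every seek to an absolute one.  A rough
Python analogue of that logic, offered only as an illustration (the
helper name absolute_seek is made up here, and Python file objects stand
in for FILE *, so none of the fpos_t portability questions arise):

```python
import io
import os

def absolute_seek(fp, offset, whence=os.SEEK_SET):
    """Hypothetical helper: reduce any seek to an absolute seek."""
    if whence == os.SEEK_END:
        # Move to the end first so tell() reports the file size.
        fp.seek(0, os.SEEK_END)
        offset += fp.tell()
    elif whence == os.SEEK_CUR:
        offset += fp.tell()
    # SEEK_SET needs no adjustment.
    fp.seek(offset, os.SEEK_SET)
    return fp.tell()

buf = io.BytesIO(b"0123456789")
buf.seek(4)
print(absolute_seek(buf, 2, os.SEEK_CUR))   # position 6
print(absolute_seek(buf, -3, os.SEEK_END))  # position 7
```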

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 05:13:40 2001
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Jan 2001 23:13:40 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 22:53:16 EST."
             <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> 
Message-ID: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>

> [?!ng]
> > So... i'm submitting a patch that causes the three most common
> > special whitespace characters, '\n', '\r', and '\t', to appear in
> > their natural form rather than as octal escapes when strings are
> > printed and repr()ed.
> 
> -1 on doing that when they're printed (although I probably misunderstand
> what you mean there).

Ping was using imprecise language here -- he meant repr() and "printed
at the command line prompt."

> +1 for changing repr() as suggested.
> 
> -0 on generalizing to \a \b \f \v too (I've never used one of those in a
> string literal in my life, so would be more baffled by seeing one come back
> than I would the octal equivalent).
> 
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).

Me too.  One summer vacation while in college I had nothing better to
do than decode the Pascal runtime system for the University's CDC-6600
from an octal dump into assembly.  Learned lots!

> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Originally, using \x for these was impractical (at least) because of
the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
of the \x escape.  Now we've fixed this, I agree.
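
For reference, a small sketch of the fixed-width behaviour as it works in
current Python.  It shows only the post-fix semantics, not the old greedy
parsing, and the sample strings are arbitrary:

```python
# A \x escape consumes exactly two hex digits, so the letters that
# follow are ordinary characters rather than part of the escape.
s = "\x41BC"
print(len(s), s)        # 3 ABC

# Octal escapes stop after at most three digits.
t = "\1014"
print(len(t), t)        # 2 A4
```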

> 377-is-an-irritating-way-to-spell-ff-ly y'rs  - tim
> 
> 
> PS:  Note that C doesn't define what numerical values \a etc have, just
> that:
> 
>     Each of these escape sequences shall produce a unique
>     implementation-defined value which can be stored in a single
>     char object. The external representations in a text file need
>     not be identical to the internal representations, and are
>     outside the scope of this International Standard.
> 
> The current method does have the advantage of extreme clarity.

Python doesn't support non-ASCII machines, like the C standard
(pretends to).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 16 05:26:13 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:26:13 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160328.WAA00723@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:28:38PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com>
Message-ID: <20010115232613.B12166@thyrsus.com>

Guido van Rossum <guido at python.org>:
> > > So... i'm submitting a patch that causes the three most common
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > Works for me.  I'd add \v, \b and \a to cover the whole ANSI C 
> > standard escape set (hmmm...am I missing any?)
> 
> You missed \f [*].  Unclear to me whether it's a good idea to add the
> lesser-known ones; they are just as likely binary gobbledegook rather
> than what their escapes stand for.
> 
> [*] http://www.python.org/doc/current/ref/strings.html

Truth is, Guido, I'm kind of iffy about whether there'd be a gain in
clarity myself.  But I find I'm rather attached to the idea of
maintaining strictest possible symmetry between what Python handles on
input and what it emits on output.

So unless we think adding \f, \v, \b, and \a to the special set would
actually produce a *loss* of clarity relative to octal gibberish (!),
I say do 'em all.  Aesthetically, that feels to me like the right thing, 
and the *Pythonic* thing, to do here.
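
The symmetry in question can be checked mechanically.  A minimal sketch,
assuming current Python 3 and using bytes objects as a stand-in for the
8-bit strings under discussion:

```python
import ast

for value in range(256):
    original = bytes([value])
    text = repr(original)                       # what Python emits
    assert ast.literal_eval(text) == original   # what Python accepts back

# Newline comes back as a named escape, form feed as a hex escape.
print(repr(b"\n\x0c"))
```

Whatever set of named escapes repr() uses, the round trip holds as long
as the parser accepts every form that repr() can produce.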

Have I erred in my intuition, O BDFL?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862



From nas at arctrix.com  Mon Jan 15 22:45:28 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 15 Jan 2001 13:45:28 -0800
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115232613.B12166@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 15, 2001 at 11:26:13PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <20010115181550.A11566@thyrsus.com> <200101160328.WAA00723@cj20424-a.reston1.va.home.com> <20010115232613.B12166@thyrsus.com>
Message-ID: <20010115134528.B6193@glacier.fnational.com>

On Mon, Jan 15, 2001 at 11:26:13PM -0500, Eric S. Raymond wrote:
> [...] I find I'm rather attached to the idea of maintaining
> strictest possible symmetry between what Python handles on
> input and what it emits on output.
> 
> So unless we think adding \f, \v, \b, and \a to the special set would
> actually produce a *loss* of clarity relative to octal gibberish (!),
> I say do 'em all.

Symmetry is good but I bet most people who would see \f, \v, \b,
\a wouldn't have entered those characters using escapes.  Most
likely those characters would have been read from a binary file.

That said, I don't really mind either way.

  Neil



From tim.one at home.com  Tue Jan 16 05:43:06 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 15 Jan 2001 23:43:06 -0500
Subject: [Python-Dev] Whitespace normalization
Message-ID: <LNBBLJKPBEHFEDALKOLCCEOOIIAA.tim.one@home.com>

You may have noticed that I checked in changes to most of the modules in the
top level of Lib yesterday (Sunday).  This is part of a Crusade that was
supposed to happen before 2.0a1, but got dropped on the floor then due to
misunderstandings:  make the Python code we distribute adhere to Guido's
style guide (4-space indents, no hard tabs), + clean up minor whitespace
nits (no stray blank lines at the ends of files, no trailing whitespace on
lines, last line of the file should end with a newline).
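
The whitespace nits listed above are easy to detect.  The following is a
simplified sketch, not how reindent.py works: whitespace_nits is a
hypothetical helper that only reports problems, flags a tab anywhere in
a line, and does not check indentation width at all.

```python
def whitespace_nits(text):
    """Return a list of (line_number, problem) pairs; hypothetical helper."""
    nits = []
    lines = text.split("\n")
    if text and not text.endswith("\n"):
        nits.append((len(lines), "last line lacks a newline"))
    if text.endswith("\n"):
        lines.pop()   # drop the empty piece after the final newline
    for number, line in enumerate(lines, 1):
        if "\t" in line:
            nits.append((number, "hard tab"))
        if line != line.rstrip():
            nits.append((number, "trailing whitespace"))
    if text and lines and lines[-1].strip() == "":
        nits.append((len(lines), "blank line at end of file"))
    return nits

sample = "def f():\n\treturn 1 \n\n"
for number, problem in whitespace_nits(sample):
    print(number, problem)
```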

It would be nice if people cleaned up their code this way too; I'm not going
to go thru the entire distribution doing this.  So, if you give a rip, pick
a directory or some modules you're fond of, and clean 'em up.

The program Tools/scripts/reindent.py does all of the above for you, so it's
not hard.  But it takes some care in two areas, which is why I did the top
level of Lib one file at a time by hand, and studied diffs by eyeball before
checking in any changes:

+ It's unlikely but possible that some program file *depends* on trailing
whitespace.  That plain sucks (it's *going* to break sooner or later), but
reindent.py can't help you there.

+ While reindent should never otherwise damage program logic, very strange
commenting or docstring styles may get mangled by it, making code and/or
docs hard to read.  reindent works very hard to do a good job on that, and
indeed I found no need to make manual changes to anything it did in the top
level of Lib.  But check anyway.  Especially some of the very oldest modules
are littered with ugly stuff like

    #

all over the place, from back when nobody had an editor smart enough to skip
over preceding blank lines when suggesting indentation for the current line.
Then again, maybe we should just drop the Irix5 directory <wink>.

voice-in-the-wilderness-ly y'rs  - tim




From esr at thyrsus.com  Tue Jan 16 05:43:24 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 15 Jan 2001 23:43:24 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 15, 2001 at 10:53:16PM -0500
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com>
Message-ID: <20010115234324.C12166@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I would also be +1 on using hex escapes instead of octal (I grew up on 36-
> and 60-bit machines, but that was the last time octal looked *natural*!).
> Octal and hex escapes both consume 4 characters, so I can't imagine what
> octal has going for it in the 21st century <wink>.

Tim, on the level of aesthetic preference I'm totally with you.  I've always
found octal really ugly myself.  Hex fits my brain better; somehow I find it
easier to visualize the bit patterns from.

Sadly, there are so many other related ways in which Python
intelligently follows C/Unix conventions that I think changing to a default
of hex escapes rather than octal would violate the Rule of Least
Surprise.

One of the things I like about Python is precisely its conservatism in
areas like string escapes, that Guido refrained from inventing new OS
APIs or new conventions for things like string escapes in places where
Unix and C did them in a well-established and reasonable way.  He didn't
make the mistake, all too typical in academic languages, of confusing
novelty with value...

This conservatism is valuable because it frees the C-experienced
programmer's mind from having to think about where the language is
trivially different, so he can concentrate on where it's importantly
different.  It's worth maintaining.

On the other hand, the change would mesh well with the Unicode support.
Hmm.  Tough call.  I could go either way, I guess.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat



From tim.one at home.com  Tue Jan 16 06:07:16 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 16 Jan 2001 00:07:16 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <20010115234324.C12166@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>

[Eric]
> Tim, on the level of aesthetic preference I'm totally with you.
> I've always found octal really ugly myself.  Hex fits my brain
> better;  somehow I find it easier to visualize the bit patterns from.
>
> Sadly, there are so many other related ways in which Python
> intelligently follows C/Unix conventions that I think changing to
> a default of hex escapes rather than octal would violate the Rule
> of Least Surprise.
>
> ... [and skipping nice stuff I *do* agree with <wink>] ...

The saving grace here is that repr() is a form of ASCII dump.  C has nothing
to say about that, while last time I used Unix it was real easy to get dumps
in hex (and indeed that's what everyone I knew routinely did).  I expect
that od retains both its name and its octal defaults on most systems simply
due to inertia.  An octal dump would be infinitely surprising on Windows
(I'm not sure I can even get one without writing it myself).

Do people actually use octal dumps on Unices anymore?  I'd be surprised, if
they're running on power-of-2 boxes.  Defaults aren't conventions when
*everyone* overrides them, they're just old and in the way.

takes-one-to-know-one<wink>-ly y'rs  - tim




From ping at lfw.org  Tue Jan 16 06:27:33 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:27:33 -0800 (PST)
Subject: [Python-Dev] time functions
In-Reply-To: <20010116004930.L1005@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101152126120.5846-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Thomas Wouters wrote:
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.

I like all of these suggestions.  Go for it!
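
For comparison, this is what the three defaults look like in the time
module of current Python.  Whether that matches the patch discussed here
in every detail is an assumption; the format string is arbitrary.

```python
import time

now = time.time()

# Explicit form from the original complaint.
explicit = time.strftime("%Y-%m-%d", time.localtime(now))

# Shorter forms: each omitted argument defaults to the current time.
implicit = time.strftime("%Y-%m-%d")
print(explicit, implicit)
print(time.ctime())            # same idea as time.ctime(time.time())
print(time.localtime().tm_year, time.gmtime().tm_year)

# With a fixed timestamp the relationship is exact.
assert time.ctime(now) == time.asctime(time.localtime(now))
```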


-- ?!ng




From esr at thyrsus.com  Tue Jan 16 06:31:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 00:31:14 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 16, 2001 at 12:07:16AM -0500
References: <20010115234324.C12166@thyrsus.com> <LNBBLJKPBEHFEDALKOLCCEPAIIAA.tim.one@home.com>
Message-ID: <20010116003114.A12365@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Do people actually use octal dumps on Unices anymore? 

Well, we do when we momentarily forget to give od(1) the -x escape :-)

This so annoyed me that back around 1983 I wrote my own hex dumper
specifically to emulate the 16-hex-bytes-with-midpage-gutter-and-ASCII-
over-on-the-right-side format that CP/M used and DOS inherited.  It's
still available at <http://www.tuxedo.org/~esr/hex/>.
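
A minimal Python sketch of that style of dump, for readers who have not
seen it.  This is not a reproduction of the tool at the URL above: the
function name hexdump and the exact column layout are assumptions made
for illustration.

```python
def hexdump(data, width=16):
    """Return dump lines: offset, hex bytes with a middle gutter, ASCII."""
    half = width // 2
    column = half * 3 - 1          # width of one group of hex bytes
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        left = " ".join(f"{byte:02x}" for byte in chunk[:half])
        right = " ".join(f"{byte:02x}" for byte in chunk[half:])
        # Non-printable bytes are shown as dots in the ASCII column.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(
            f"{offset:08x}  {left.ljust(column)}  {right.ljust(column)}  {text}"
        )
    return lines

for line in hexdump(b"Hello, world!\n" * 2):
    print(line)
```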

Do you know the history on this?  C speaks octal because a bunch of 
mode fields in the PDP-11 instruction word were three bits wide.
Time was it was actually useful to have the output from (say)
core files chunk that way. But I haven't seen an octal code dump 
in over a decade, probably pushing fifteen years now.  
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In the absence of any evidence tending to show that possession 
or use of a 'shotgun having a barrel of less than eighteen inches 
in length' at this time has some reasonable relationship to the 
preservation or efficiency of a well regulated militia, we cannot 
say that the Second Amendment guarantees the right to keep and bear 
such an instrument. [...] The Militia comprised all males 
physically capable of acting in concert for the common defense.  
        -- Majority Supreme Court opinion in "U.S. vs. Miller" (1939)



From ping at lfw.org  Tue Jan 16 06:33:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 15 Jan 2001 21:33:42 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > > special whitespace characters, '\n', '\r', and '\t', to appear in
> > > their natural form rather than as octal escapes when strings are
> > > printed and repr()ed.
> > 
> > -1 on doing that when they're printed (although I probably misunderstand
> > what you mean there).
> 
> Ping was using imprecise language here -- he meant repr() and "printed
> at the command line prompt."

Yes, i referred to "when strings are printed and repr()ed" as two cases
because both string_print() and string_repr() have to be changed.

(Side question: when are *_print() and *_repr() ever different, and why?)

> Originally, using \x for these was impractical (at least) because of
> the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> of the \x escape.  Now we've fixed this, I agree.

Oh, now i understand.  Good point.  I'll update the patch to do hex.
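
As a point of comparison, current Python 3 behaves the way this thread
converges: named escapes for newline, tab and carriage return, hex for
the rest.  The lines below show present-day behaviour, not the patch
itself.

```python
print(repr("a\nb\tc\r"))        # 'a\nb\tc\r'
print(repr("\a\b\f\v"))         # '\x07\x08\x0c\x0b'
print(repr(b"\377"))            # b'\xff'
```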

0xdeadbeef-ly yours,


-- ?!ng




From fredrik at effbot.org  Tue Jan 16 08:11:38 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 08:11:38 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <00b201c07f8b$93996820$e46940d5@hagrid>

thomas wrote:
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)

where "now" is local time, I assume?

since you're assuming a time zone, you could make it accept
an integer as well...

> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)

same here.

</F>




From thomas at xs4all.net  Tue Jan 16 08:18:38 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 08:18:38 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <00b201c07f8b$93996820$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 16, 2001 at 08:11:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid>
Message-ID: <20010116081838.N1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:
> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)

> where "now" is local time, I assume?

Yes. See the patch I'll upload later today (meetings first, grrr)

> since you're assuming a time zone, you could make it accept
> an integer as well...

Could, yes... I'll include it in the 2nd revision of the patch, it can be
rejected (or accepted) separately.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 16 09:22:11 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 09:22:11 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <20010116081838.N1005@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 16, 2001 at 08:18:38AM +0100
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>
Message-ID: <20010116092211.O1005@xs4all.nl>

On Tue, Jan 16, 2001 at 08:18:38AM +0100, Thomas Wouters wrote:
> On Tue, Jan 16, 2001 at 08:11:38AM +0100, Fredrik Lundh wrote:

> > >   timestr = time.strftime("<format>")

> > since you're assuming a time zone, you could make it accept
> > an integer as well...

> Could, yes... 

Actually, on second thought, let's not, not just yet anyway. Doing that for
all functions in the time module would continue to pollute the already toxic
waters of a C API translated into Python :P Who knows what 'ctime' stands
for, anyway ? And 'asctime' ? How can we expect Python programmers who think
'C' is a high note or average grade, to understand how the time module is
supposed to be used ? :)

We now have:
time() -- return current time in seconds since the Epoch as a float
gmtime() -- convert seconds since Epoch to UTC tuple
localtime() -- convert seconds since Epoch to local time tuple
asctime() -- convert time tuple to string
ctime() -- convert time in seconds to string
mktime() -- convert local time tuple to seconds since Epoch
strftime() -- convert time tuple to string according to format specification

where asctime and ctime are basically wrappers around strftime, and would do
the exact same thing if they both accepted tuples and floats. 
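
A small sketch of how the three string functions relate, using the
current time module.  The strftime format is an approximation: asctime
pads a one-digit day with a space where %d uses a zero, and %a and %b
follow the locale, so the two lines are only expected to agree for a
date like this one in the default C locale.

```python
import time

# Tuesday 16 January 2001, 09:22:11, as a 9-field time tuple.
stamp = (2001, 1, 16, 9, 22, 11, 1, 16, 0)

print(time.asctime(stamp))                           # Tue Jan 16 09:22:11 2001
print(time.strftime("%a %b %d %H:%M:%S %Y", stamp))

# ctime() takes seconds instead of a tuple, but formats the same way.
seconds = 979636931   # 2001-01-16 09:22:11 UTC
assert time.ctime(seconds) == time.asctime(time.localtime(seconds))
```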

I think we should have something like:
time() -- current time in float
timetuple() -- current (local) time in timetuple
tuple2time(tuple) -- tuple -> float
time2tuple(float, tz=local) -- float -> tuple using timezone tz
stringtime(time=now, format="ctimeformat") -- convert time value to string

Those are just working names, to make the point, I don't have time to think
up better ones :) I'm not sure if the timezone support in the above list is
extensive enough, mostly because I hardly use timezones myself. Also,
tuple2time() could be merged with time(), and likewise for time2tuple() and
timetuple(). I think keeping strftime() and maybe ctime() for ease-of-use is
a good idea, but the rest could eventually be deprecated.
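
To make the working names concrete, here is a rough sketch of them as
thin wrappers over the existing functions.  Everything in it is
hypothetical: the signatures are guesses, the tz argument is reduced to
a local/UTC flag, and the default format only approximates ctime.

```python
import time

def timetuple():
    """Current local time as a time tuple (working name from the list above)."""
    return time.localtime()

def tuple2time(tup):
    """Local time tuple -> seconds since the Epoch."""
    return time.mktime(tup)

def time2tuple(seconds, utc=False):
    """Seconds since the Epoch -> time tuple; 'utc' is a stand-in for tz."""
    return time.gmtime(seconds) if utc else time.localtime(seconds)

def stringtime(when=None, format="%a %b %d %H:%M:%S %Y"):
    """Format a float or a time tuple; defaults to the current time."""
    if when is None:
        when = time.time()
    if isinstance(when, (int, float)):
        when = time.localtime(when)
    return time.strftime(format, when)

print(stringtime())
print(stringtime((2001, 1, 16, 9, 22, 11, 1, 16, 0), "%Y-%m-%d"))
```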

Off-to-important-meetings-*cough*-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Tue Jan 16 09:30:28 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 16 Jan 2001 09:30:28 +0100
Subject: [Python-Dev] unit testing bake-off
References: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <01ba01c07f96$967b7870$e46940d5@hagrid>

Tim Peters wrote:
> At least you, Jeremy and Fredrik have tried them, and
> if that's all there can't be a tie <wink>.

let me guess:

    Jeremy: PyUnit
    Andrew: unittest
    Fredrik: unittest

(I find pyunit a bit unpythonic, and both overengineered
and underengineered at the same time...  hard to explain,
but I strongly prefer unittest)

> I would agree this is not an ideal decision procedure.

well, any decision procedure that comes up with what I
want just has to be ideal ;-)

</F>




From andy at reportlab.com  Tue Jan 16 10:20:45 2001
From: andy at reportlab.com (Andy Robinson)
Date: Tue, 16 Jan 2001 09:20:45 -0000
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <20010115204701.11972EA6B@mail.python.org>
Message-ID: <PGECLPOBGNBNKHNAGIJHAEELCGAA.andy@reportlab.com>

> Subject: Re: [Python-Dev] unit testing bake-off
> From: Guido van Rossum <guido at python.org>
> Date: Mon, 15 Jan 2001 14:17:27 -0500
> 
> There doesn't seem to be a lot of enthusiasm for a Unittest
> bakeoff...  Certainly I don't think I'll get to this myself before the
> conference.
> 
> How about the following though: talking of low-hanging fruit, Tim's
> doctest module is an excellent thing even if it isn't a unit testing
> framework!  (I found this out when I played with it -- it's real easy
> to get used to...)
> 
> Would anyone object against Tim checking this in?  Since it isn't a
> contender in the unit test bake-off, it shouldn't affect the outcome
> there at all.
> 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

I think it should definitely go in.  Ditto with whatever testing
framework and documentation tools (pydoc etc.) shortly emerge
as "best of breed".  I spend my time on corporate consulting
projects, and saying things like "Python has standard tools for
unit testing and documentation" is even better than saying 
"We have standard tools for unit testing and documentation".

BTW, ReportLab has recently adopted PyUnit's unittest.py.
It feels a bit Java-like to me - a few more lines of code
than needed - but it certainly works.   One key feature is
aggregating test suites; a big app we installed on a
customer site can run the test suite for itself, the ReportLab
library (whose test suite we are just getting to work on)
and four or five dependent utilities; another is that
people have heard of JUnit.
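
A minimal sketch of the suite aggregation mentioned above, written
against the unittest module in today's standard library.  The test case
names are invented for illustration and say nothing about how the
ReportLab suites are actually organized.

```python
import io
import unittest

class AppTests(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)

class LibraryTests(unittest.TestCase):
    def test_upper(self):
        self.assertEqual("pdf".upper(), "PDF")

def combined_suite():
    """Collect the tests of several components into one suite."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (AppTests, LibraryTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return suite

# Run everything in one go, capturing the report instead of printing it.
stream = io.StringIO()
result = unittest.TextTestRunner(stream=stream).run(combined_suite())
print(result.testsRun, result.wasSuccessful())
```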

Just my 2p worth,
Andy Robinson




From tony at lsl.co.uk  Tue Jan 16 10:47:01 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 09:47:01 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
         outside of Python)
In-Reply-To: <200101152041.PAA32298@cj20424-a.reston1.va.home.com>
Message-ID: <003901c07fa1$46e10c70$f05aa8c0@lslp7o.int.lsl.co.uk>

In the context of my starting doc strings in an Emacs Lisp manner,
Ka-Ping Yee said:
> I think i'm going to ask you to stop, unless Guido prefers
> otherwise.  Guido, do you have a style pronouncement for module
> docstrings?

and since Guido replied
> I'm with Ping.  None of the examples in the style guide start the
> docstring with the function name.  Almost none of the standard library
> modules start their module docstring with the module name (codecs is
> an exception, but I didn't write it :-).

I shall indeed stop (of course, my habit started before we HAD
documentation tools, and if we're going to browse things with pydoc, et
al, then there's no need for it). To be honest, it's the answer I
expected.

Oh dear, another item for my TO DO list (i.e., remove the offending
nits). Still, if it's only me it's hardly high impact!

Tibs
--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
Which is safer, driving or cycling?
Cycling - it's harder to kill people with a bike...
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)




From tony at lsl.co.uk  Tue Jan 16 11:13:31 2001
From: tony at lsl.co.uk (Tony J Ibbs (Tibs))
Date: Tue, 16 Jan 2001 10:13:31 -0000
Subject: [Python-Dev] RE: [Doc-SIG] pydoc.py (show docs both inside and
         outside of Python)
In-Reply-To: <Pine.LNX.4.10.10101151155270.5846-100000@skuld.kingmanhall.org>
Message-ID: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>

I mentioned a "spurious"
>	The system cannot find the path specified.

on NT, and Ka-Ping Yee said:
> Thanks for the NT testing.  That's funny -- i put in a special case
> for Windows to avoid messages like the above a couple of days ago.
> How recently did you download pydoc.py?  Does your copy contain:
>
>     if hasattr(sys, 'winver'):
>         return lambda text: tempfilepager(text, 'more')

Hmm. I downloaded it when I read the email message announcing it, which
was yesterday some time. But it doesn't look like the lines you mention
are there - I'll try re-downloading...

...I've redownloaded the files from http://www.lfw.org/python/pydoc.py,
etc., and done a grep for hasattr within them. There's no check such as
the one you mention, so I guess it's "download impedance".

> So you can see what i'm up to, here's my current to-do list:
>
>     make boldness optional (only if using more/less?  only Unix?)

probably sensible. By the way, I don't get boldness on the NT box - any
chance (he says, not intending to help *at all* in doing it!) of it
happening there as well? (or would that depend on what curses support is
built into the Python?)

>     document a .py file given on the command line

also allow for a directory module (i.e., something with __init__.py in
it) given on the command line?

>     write a better htmlrepr (\n should look special, max
>     length limit, etc.)

yes, but these things can always get better - the fact it's working
allows for improoooovement down the line.

>     generate HTML index from precis and __path__ and package

a neat idea - definitely Good Stuff!

>     contents list

well, I always do these, so I'm for this one as well

>     have help(...) produce a directory of available things to
>     ask for help on

bouncy fun!

>     Windows and Mac testing

I'm running Windows 98 with Python 1.5.2 at home, and will willingly try
it out on that (after all, it's not a very big download) - although it
might sometimes take a day or two to get round to it (for instance, I
haven't yet done so!). But I suspect I shan't be a very demanding
user...

>     default to HTTP mode on GUI platforms?  (win, mac)
>
> The ones marked with + i consider done.  Feel free to comment on
> or suggest priorities for the others; in particular, what do you
> think of the last one?  The idea is that double-clicking on
> pydoc.py in Windows or MacOS could launch the server and then open
> the localhost URL using webbrowser.py to display the documentation
> index.  Should it do this by default?

I'll leave that to better designers than myself (although if one is to
*have* a double click action, that seems sensible to me).

(looks up webbrowser.py - ah, a 2.0 module). Personally, I'd also like
to have the option of having a "mini-browser" supported directly,
perhaps in Tkinter, so I don't need to start up a whole web browser. But
again I may be odd in that wish (I can't remember what IDLE does).

Oh - that also means "integrate into IDLE" presumably goes on at least a
WishList as well...

Other ideas:
* command line switch to *output* HTML to a file (i.e., documentation
generation) (presumably something like "-o <name>.html", where the
"html" indicates the output format - an alternative being "txt")
* if I ever finish the docutils effort (I should be getting back to it
soon) then use that to format the texts (this would mean I need not
worry about the "frontend" to docutils too much, since pydoc is already
doing so much). Or maybe the docutils tool should be importing pydoc...

Tibs (must do some (paid) work now!)

--
Tony J Ibbs (Tibs)      http://www.tibsnjoan.co.uk/
"Bounce with the bunny. Strut with the duck.
 Spin with the chickens now - CLUCK CLUCK CLUCK!"
BARNYARD DANCE! by Sandra Boynton
My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)





From mal at lemburg.com  Tue Jan 16 11:18:44 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:18:44 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl>
Message-ID: <3A642004.F6197E86@lemburg.com>

Thomas Wouters wrote:
> 
> Maybe this is a dead and buried subject, but I'm going to try anyway, since
> everyone's been in such a wonderful 'lets fix ugly but harmless nits' mood
> lately :)
> 
> Why do we need the following atrocity <wink>:
> 
>   timestr = time.strftime("<format>", time.localtime(time.time()))
> 
> To do the simple task of 'date +<format>' ?  I never really understood why
> there isn't a way to get a timetuple directly from C, rather than converting
> a float that we got from C a bytecode before, even though the higher level
> almost always deals with timetuples. How about making the float-to-tuple
> functions (time.localtime, time.gmtime) accept 0 arguments as well, and
> defaulting to time.time() in that case ? Even better, how about doing the
> same for the other functions, too ? (where it makes sense, of course :)
> 
> Actually, I'll split it up in three proposals:
> 
> - Making the time in time.strftime default to 'now', so that the above
>   becomes the ever so slightly confusing:
> 
>   timestr = time.strftime("<format>")
>   (confusing because it looks a bit like a regexp constructor...)
> 
> - Making the time in time.asctime and time.ctime optional, defaulting to
>   'now', so you can just call 'time.ctime()' without having to pass
>   time.time() (which are about half the calls in my own code :)
> 
> - Making the time in time.localtime and time.gmtime default to 'now'.
> 
> I'm 0/+1/+1 myself :)

+1 all the way -- though these days I tend not to use the
time module anymore. mxDateTime already does everything I want
and there date/time values are objects rather than Python integers
or tuples... ok, I'm just showing off a little :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Tue Jan 16 11:32:21 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 11:32:21 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>
Message-ID: <3A642335.82358B02@lemburg.com>

Minor nit about this idea: it makes decoding repr() style
strings harder for external tools and it could cause breakage
(e.g. if "\n" is used by the encoding for some other purpose).

BTW, since there are a gazillion ways to encode strings into
7-bit ASCII, why not use the new codec design to add additional
output schemes for 8-bit strings ?!

Strings have an .encode() method as well...
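
A short sketch of what codec-based output schemes look like, using codec
names from current Python 3.  These names are an assumption about where
the idea ended up, not something the codec design of the time is known
to have offered.

```python
import codecs

text = "tab\there\n\xff"

# Escape-style rendering through a codec rather than through repr().
print(text.encode("unicode_escape"))     # b'tab\\there\\n\\xff'

# A bytes-to-bytes codec giving a plain hex rendering.
print(codecs.encode(b"\n\xff", "hex"))   # b'0aff'
```

Each scheme is reversible through the matching decode call, which keeps
the choice of output format separate from repr() itself.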

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From ping at lfw.org  Tue Jan 16 11:37:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:37:42 -0800 (PST)
Subject: [Python-Dev] pydoc.py (show docs both inside and outside of Python)
In-Reply-To: <003a01c07fa4$fa0883c0$f05aa8c0@lslp7o.int.lsl.co.uk>
Message-ID: <Pine.LNX.4.10.10101160236330.5846-100000@skuld.kingmanhall.org>

Before somebody decides to shoot us for spamming both lists,
i'm taking this thread off of python-dev and solely to doc-sig.
Please continue further discussion there...


-- ?!ng




From ping at lfw.org  Tue Jan 16 11:47:02 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 16 Jan 2001 02:47:02 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>

On Mon, 15 Jan 2001, Ka-Ping Yee wrote:
> On Mon, 15 Jan 2001, Guido van Rossum wrote:
> > Originally, using \x for these was impractical (at least) because of
> > the stupid gobble-up-everything-that-looks-like-a-hex-digit semantics
> > of the \x escape.  Now we've fixed this, I agree.
> 
> Oh, now i understand.  Good point.  I'll update the patch to do hex.

I assume you would like Unicode strings to do the same (\n, \t, \r,
and \xff rather than \377).

Guido, do you have a Pronouncement on \v, \f, \b, \a?

By the way, why do Unicode escapes appear in capitals?

    >>> u'\uface'
    u'\uFACE'

(If someone tells me that there happens to be a picture of a face at
that code point, i'll laugh.  Is there a cow at \uBEEF?)

Does anyone care that \x will be followed by lowercase and \u by uppercase?

I noticed that the tutorial claims Unicode strings can be str()-ified
and will encode themselves using UTF-8 as default.  But this doesn't
actually work for me:

    >>> us = u'\uface'
    >>> us
    u'\uFACE'
    >>> str(us)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode()
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeError: ASCII encoding error: ordinal not in range(128)
    >>> us.encode('UTF-8')
    '\xef\xab\x8e'

Assuming i have understood this correctly, i have submitted a patch
to correct tut.tex.
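
For comparison, a minimal sketch of the same experiment, assuming a
current Python 3 interpreter (where every str is Unicode) rather than
the 2.1 development build used in the session above.  The point is the
same: the target codec has to be able to represent the character, and
ASCII cannot.

```python
us = '\uface'

# Naming UTF-8 explicitly always works for a valid code point.
utf8 = us.encode('utf-8')

# Asking for ASCII fails, because U+FACE is outside range(128).
try:
    us.encode('ascii')
except UnicodeEncodeError as exc:
    ascii_error = str(exc)
else:
    ascii_error = None

# Decoding with the same codec round-trips to the original string.
round_trip = utf8.decode('utf-8')
```

Here utf8 holds the three bytes shown in the session, and ascii_error
holds the text of the encoding error.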


-- ?!ng





From bckfnn at worldonline.dk  Tue Jan 16 11:52:10 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 16 Jan 2001 10:52:10 GMT
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org>
Message-ID: <3a642768.6426631@smtp.worldonline.dk>

[Ping]

>I don't know whether this is going to be obvious or controversial,
>but here goes.  Most of the time we're used to seeing a newline as
>'\n', not as '\012', and newlines are typed in as '\n'.
>
>A newcomer to Python is likely to do
>
>    >>> 'hello\n'
>    'hello\012'
>
>and ask "what's \012?" -- whereupon one has to explain that it's an
>octal escape, that 012 in octal equals 10, and that chr(10) is
>newline, which is the same as '\n'.  You're bound to run into this,
>and you'll see \012 a lot, because \n is such a common character.
>Aside from being slightly more frightening, '\012' also takes up
>twice as many characters as necessary.
>
>So... i'm submitting a patch that causes the three most common
>special whitespace characters, '\n', '\r', and '\t', to appear in
>their natural form rather than as octal escapes when strings are
>printed and repr()ed.

I like it, because it removes yet another difference between Python and
Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
and \r.
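
The arithmetic behind the octal escape in Ping's explanation can be
checked directly.  This is only a sketch, and it assumes a current
Python 3 interpreter, whose repr() already shows the three common
whitespace characters in their short form.

```python
# '\012' is an octal escape: 12 in base 8 is 10 in decimal,
# and character 10 is the newline.
octal_value = int('12', 8)
newline = chr(octal_value)

# Both spellings denote the same one-character string.
same_char = ('\012' == '\n' == newline)

# What repr() shows for the three characters named in the patch.
shown = [repr(c) for c in ('\n', '\r', '\t')]
hello = repr('hello\n')
```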

regards,
finn



From esr at thyrsus.com  Tue Jan 16 11:53:00 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 05:53:00 -0500
Subject: [Python-Dev] time functions
In-Reply-To: <3A642004.F6197E86@lemburg.com>; from mal@lemburg.com on Tue, Jan 16, 2001 at 11:18:44AM +0100
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com>
Message-ID: <20010116055300.C12847@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> +1 all the way -- though these days I tend not to use the
> time module anymore. mxDateTime already does everything I want
> and there date/time values are objects rather than Python integers
> or tuples... ok, I'm just showing off a little :)

mxDateTime is on my short list of "why isn't this in the Python library
already?"  Has it ever been discussed?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

You need only reflect that one of the best ways to get yourself 
a reputation as a dangerous citizen these days is to go about 
repeating the very phrases which our founding fathers used in the 
great struggle for independence.
	-- Attributed to Charles Austin Beard (1874-1948)



From mal at lemburg.com  Tue Jan 16 12:18:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 16 Jan 2001 12:18:24 +0100
Subject: [Python-Dev] time functions
References: <20010116004930.L1005@xs4all.nl> <3A642004.F6197E86@lemburg.com> <20010116055300.C12847@thyrsus.com>
Message-ID: <3A642E00.BD330647@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > +1 all the way -- though these days I tend not to use the
> > time module anymore. mxDateTime already does everything I want
> > and there date/time values are objects rather than Python integers
> > or tuples... ok, I'm just showing off a little :)
> 
> mxDateTime is on my short list of "why isn't this in the Python library
> already?"  Has it ever been discussed?

Yes. I'd rather keep it separate from the standard dist for
various reasons. One of these reasons is that I will be moving
the mx tools into a new packaging scheme built on distutils --
installing it should then boil down to a simple RPM install
or maybe a "python setup.py install" thanks to distutils. The
package will then become a subpackage of the mx package.

BTW, I see distutils as strong argument for *not* including
more exotic packages in Python's stdlib. If this catches on,
I expect that together with the Vaults we are not far away
from having our own CPAN style archive of add-on packages.
I also expect the commercial vendors like ActiveState et al.
to take care of wrapping SUMO distributions of Python and
the existing add-ons.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 16 12:20:18 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 16 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <3a642768.6426631@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 16, 2001 at 10:52:10AM +0000
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>
Message-ID: <20010116062018.A12935@thyrsus.com>

Finn Bock <bckfnn at worldonline.dk>:
> I like it, because it removes yet another difference between Python and
> Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> and \r.

This is an argument for adding \b and \f to the special set in
CPython.  If the BDFL looks benignly on adding \v and \a, those
should go into Jython's special set too.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address



From fredrik at pythonware.com  Tue Jan 16 12:37:10 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 16 Jan 2001 12:37:10 +0100
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org>
Message-ID: <03eb01c07fb0$aaaa19e0$0900a8c0@SPIFF>

ping wrote:
> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'
> 
> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

iirc, 0xFACE and 0xBEEF are part of the CJK and
Hangul spaces.  not sure 0xFACE is assigned, but
0xBEEF glyph looks like a ribcage with four legs...

you'll find faces at 0x263A etc.

</F>




From skip at mojam.com  Tue Jan 16 14:09:51 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 07:09:51 -0600 (CST)
Subject: [Python-Dev] bummer - regsub/regex no longer in module index
Message-ID: <14948.18463.971334.401426@beluga.mojam.com>

I am now getting deprecation warnings about regsub so I decided to start
replacing it with more zeal than I had previously.  First thing I wanted to
replace were some regsub.split calls.  I went to the module index to look up
the description but regsub was nowhere to be found.  (I know, I know.  I can
use pydoc.)

Still... how about continuing to include deprecated modules in the library
reference manual but in a separate Deprecated Modules section and annotate
them as such in the module index?
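
For anyone making the same replacement, a minimal sketch of the re
equivalent.  It assumes the old call had the form
regsub.split(string, pattern); note that re.split takes the pattern
first.  The sample line and pattern are made up for illustration.

```python
import re

line = "spam  eggs\tham"

# Old style (assumed):  regsub.split(line, "[ \t]+")
# re.split puts the pattern first and the string second.
fields = re.split(r"[ \t]+", line)

# maxsplit limits the number of splits, leaving the rest intact.
first, rest = re.split(r"[ \t]+", line, maxsplit=1)
```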

Skip



From guido at python.org  Tue Jan 16 14:44:01 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:44:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 08:11:38 +0100."
             <00b201c07f8b$93996820$e46940d5@hagrid> 
References: <20010116004930.L1005@xs4all.nl>  
            <00b201c07f8b$93996820$e46940d5@hagrid> 
Message-ID: <200101161344.IAA04513@cj20424-a.reston1.va.home.com>

> thomas wrote:
> > - Making the time in time.strftime default to 'now', so that the above
> >   becomes the ever so slightly confusing:
> > 
> >   timestr = time.strftime("<format>")
> >   (confusing because it looks a bit like a regexp constructor...)
> 
> where "now" is local time, I assume?
> 
> since you're assuming a time zone, you could make it accept
> an integer as well...

What would the integer mean?

> > - Making the time in time.asctime and time.ctime optional, defaulting to
> >   'now', so you can just call 'time.ctime()' without having to pass
> >   time.time() (which are about half the calls in my own code :)
> 
> same here.

Same what here?  "now" == local time, sure.  But accept an integer?
It already accepts an integer!
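
A minimal sketch of what the proposed defaults look like in use.  It
assumes an interpreter in which the optional arguments exist (current
Python 3 has them); the format string is only an example.

```python
import time

# With no time argument, each call uses the current time.
stamp = time.strftime("%Y-%m-%d %H:%M:%S")   # local time, now
banner = time.ctime()                         # like time.ctime(time.time())
local_now = time.localtime()
utc_now = time.gmtime()

# The explicit forms still work; ctime() takes seconds since the epoch.
epoch_text = time.ctime(0)
epoch_same = time.asctime(time.localtime(0))
```

The last two lines give the same string, since ctime(secs) is
asctime(localtime(secs)).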

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 14:55:01 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 08:55:01 -0500
Subject: [Python-Dev] time functions
In-Reply-To: Your message of "Tue, 16 Jan 2001 09:22:11 +0100."
             <20010116092211.O1005@xs4all.nl> 
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl>  
            <20010116092211.O1005@xs4all.nl> 
Message-ID: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>

Let's not redesign the time module API too much.  I'm all for adding
the default argument values that Thomas proposes.  Then, instead of
changing the API, we should look into a higher-level Python module.
That's how those things typically go.

Digital Creations has its own time extension type somewhere in Zope, a
bit similar to mxDateTime.  I looked into making this a standard
Python extension but quickly gave up.  The problem with these things
seems to be that it's hard to come up with a design that makes
everyone happy: some people want small objects (because they have a
lot of them around, e.g. a timestamp on almost every other object);
others want timezone support; yet others want microsecond resolution;
leap-second support; pre-Christian era support; support for
nonstandard calendars; interval arithmetic; support for dates without
times or times without dates...

Python could use a better time type, but we'll have to look into which
requirements make sense for a generalized type, and which don't.  I
fear that a committee could easily pee away years designing an
interface to satisfy absolutely every wish.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:02:29 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:02:29 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Mon, 15 Jan 2001 21:33:42 PST."
             <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101152130090.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>

> Yes, i referred to "when strings are printed and repr()ed" as two cases
> because both string_print() and string_repr() have to be changed.
> 
> (Side question: when are *_print() and *_repr() ever different, and why?)

You mean the tp_print and tp_str function slots in type objects,
right?  tp_print *should* always render exactly the same as tp_str.
tp_print is used by the print statement, not by value display at the
interactive prompt.

tp_print and tp_str have differed historically for 3rd party extension
types by accident.

So, string_print most definitely should *not* be changed -- only
string_repr!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:06:23 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:06:23 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 02:47:02 PST."
             <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101160240520.5846-100000@skuld.kingmanhall.org> 
Message-ID: <200101161406.JAA05153@cj20424-a.reston1.va.home.com>

> I assume you would like Unicode strings to do the same (\n, \t, \r,
> and \xff rather than \377).

Yeah.

> Guido, do you have a Pronouncement on \v, \f, \b, \a?

Practicality beats purity: these will remain octal.

> By the way, why do Unicode escapes appear in capitals?
> 
>     >>> u'\uface'
>     u'\uFACE'

Could it be just that that's what Unicode folks are expecting?

> (If someone tells me that there happens to be a picture of a face at
> that code point, i'll laugh.  Is there a cow at \uBEEF?)

I'm laughing even though I don't see pictures. :-)

> Does anyone care that \x will be followed by lowercase and \u by uppercase?

It's mildly weird, and I think hex escapes in lowercase are more
Pythonic than in upper case.

> I noticed that the tutorial claims Unicode strings can be str()-ified
> and will encode themselves using UTF-8 as default.  But this doesn't
> actually work for me:
> 
>     >>> us = u'\uface'
>     >>> us
>     u'\uFACE'
>     >>> str(us)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode()
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     UnicodeError: ASCII encoding error: ordinal not in range(128)
>     >>> us.encode('UTF-8')
>     '\xef\xab\x8e'
> 
> Assuming i have understood this correctly, i have submitted a patch
> to correct tut.tex.

Yeah, I guess that part of the tutorial was written before we changed
our minds about this. :-(

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Tue Jan 16 15:09:56 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:09:56 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 11:32:21 +0100."
             <3A642335.82358B02@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCEEOKIIAA.tim.one@home.com> <200101160413.XAA01404@cj20424-a.reston1.va.home.com>  
            <3A642335.82358B02@lemburg.com> 
Message-ID: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>

> Minor nit about this idea: it makes decoding repr() style
> strings harder for external tools and it could cause breakage
> (e.g. if "\n" is used by the encoding for some other purpose).

Such a tool would be broken.  If it accepts string literals it should
accept all forms of escapes.

> BTW, since there are a gazillion ways to encode strings into
> 7-bit ASCII, why not use the new codec design to add additional
> output schemes for 8-bit strings ?!
> 
> Strings have an .encode() method as well...

Good idea!  This could also be used to "hexify" a string, for which
currently one of the quickest ways is still the hack

    "%02x"*len(s) % tuple(s)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at python.org  Tue Jan 16 15:11:53 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 09:11:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Tue, 16 Jan 2001 06:20:18 EST."
             <20010116062018.A12935@thyrsus.com> 
References: <Pine.LNX.4.10.10101151429120.5846-100000@skuld.kingmanhall.org> <3a642768.6426631@smtp.worldonline.dk>  
            <20010116062018.A12935@thyrsus.com> 
Message-ID: <200101161411.JAA05336@cj20424-a.reston1.va.home.com>

> Finn Bock <bckfnn at worldonline.dk>:
> > I like it, because it removes yet another difference between Python and
> > Jython. Jython happens to handle these chars specially: \n, \t, \b, \f
> > and \r.

[ESR]
> This is an argument for adding \b and \f to the special set in
> CPython.  If the BDFL looks benignly on adding \v and \a, those
> should go into Jython's special set too.

No, I think Jython should remove \b and \f.  Or the language standard
could allow implementations some freedom here (as long as the output
is a string literal).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fdrake at acm.org  Tue Jan 16 16:06:34 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Tue, 16 Jan 2001 10:06:34 -0500 (EST)
Subject: [Python-Dev] unit testing bake-off
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
References: <20010115162619.A19484@kronos.cnri.reston.va.us>
	<LNBBLJKPBEHFEDALKOLCKENMIIAA.tim.one@home.com>
Message-ID: <14948.25466.698063.240902@cj42289-a.reston1.va.home.com>

Tim Peters writes:
 > Presumably so that *something* gets into 2.1a1.  At least you, Jeremy and
 > Fredrik have tried them, and if that's all there can't be a tie <wink>.  I
 > would agree this is not an ideal decision procedure.

  I've been using PyUNIT some, but haven't tried the Quixote unittest
module, which tells me I can't make a particularly informed
recommendation (vote, whatever).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From thomas at xs4all.net  Tue Jan 16 16:23:52 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 16:23:52 +0100
Subject: [Python-Dev] time functions
In-Reply-To: <200101161355.IAA04802@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:55:01AM -0500
References: <20010116004930.L1005@xs4all.nl> <00b201c07f8b$93996820$e46940d5@hagrid> <20010116081838.N1005@xs4all.nl> <20010116092211.O1005@xs4all.nl> <200101161355.IAA04802@cj20424-a.reston1.va.home.com>
Message-ID: <20010116162350.A21010@xs4all.nl>

On Tue, Jan 16, 2001 at 08:55:01AM -0500, Guido van Rossum wrote:

> Let's not redesign the time module API too much.

[snip]

Agreed.

> I fear that a committee could easily pee away years designing an
> interface to satisfy absolutely every wish.

A committee is a life form with six or more legs and no brain.
    Lazarus Long in "Time Enough For Love", by R. A. Heinlein.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Tue Jan 16 18:23:56 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 16 Jan 2001 11:23:56 -0600 (CST)
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net>
	<m366jf4esw.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <14948.33708.332464.107009@beluga.mojam.com>

    Michael> ... (or I'll just call it pyttyinput)

Which, like "Guido", when properly pronounced should leave your monitor
slightly moist... ;-)

Skip




From thomas at xs4all.net  Tue Jan 16 18:36:03 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 16 Jan 2001 18:36:03 +0100
Subject: [Python-Dev] Re: [Patches] [Patch #102891] Alternative readline module
In-Reply-To: <14948.33708.332464.107009@beluga.mojam.com>; from skip@mojam.com on Tue, Jan 16, 2001 at 11:23:56AM -0600
References: <E14IXZj-0007Cc-00@usw-sf-web1.sourceforge.net> <m366jf4esw.fsf@atrus.jesus.cam.ac.uk> <14948.33708.332464.107009@beluga.mojam.com>
Message-ID: <20010116183603.B2776@xs4all.nl>

On Tue, Jan 16, 2001 at 11:23:56AM -0600, Skip Montanaro wrote:

> Which, like "Guido", when properly pronounced should leave your monitor
> slightly moist... ;-)

Nono, 'Guido' should be pronounced using a hard, back-of-your-throat 'G',
more like a growl than a hiss. The less moisture the better :)

You-were-thinking-of-Centraal-Wiskunde-Instituut-(cwi.nl)-ly y'rs,

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From trentm at ActiveState.com  Tue Jan 16 19:36:29 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 10:36:29 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <200101160408.XAA01368@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 11:08:46PM -0500
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>
Message-ID: <20010116103626.D30209@ActiveState.com>

On Mon, Jan 15, 2001 at 11:08:46PM -0500, Guido van Rossum wrote:
> 
> Trent, you wrote that code.  Why wouldn't this work just as well?
> 
> (your code):
> 			if ((pos = TELL64(fileno(fp))) == -1L)
> 				return -1;
> (my suggestion):
> 			if (fgetpos(fp, &pos) != 0)
> 				return -1;

I agree, that looks to me like it would. I guess I just missed that when I
wrote it.

> 
> I would even go as far as to collapse the entire switch as follows:
> 
> 	fpos_t pos;
> 	switch (whence) {
> 	case SEEK_END:
> 		/* do a "no-op" seek first to sync the buffering so that
> 		   the low-level tell() can be used correctly */
> 		if (fseek(fp, 0, SEEK_END) != 0)
> 			return -1;
> 		/* fall through */
> 	case SEEK_CUR:
> 		if (fgetpos(fp, &pos) != 0)
> 			return -1;
> 		offset += pos;
> 		break;
> 	/* case SEEK_SET: break; */
> 	}
> 	return fsetpos(fp, &offset);

Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
longer applicable. I am not setup to test this on Win64 right and I don't
suppose there are a lot of you out there with your own Win64 setups. I will
be able to test this before the scheduled 2.1 beta (late Feb), though.
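
The whence arithmetic in the collapsed switch can be sketched in
Python.  This is an illustration only: absolute_offset is a
hypothetical helper, it uses an in-memory file, and it says nothing
about the 64-bit fpos_t question itself.

```python
import io
import os

def absolute_offset(f, offset, whence):
    """Turn (offset, whence) into an absolute position, as the switch does."""
    if whence == os.SEEK_END:
        # Move to the end first so tell() reports the file size.
        f.seek(0, os.SEEK_END)
        return offset + f.tell()
    if whence == os.SEEK_CUR:
        return offset + f.tell()
    return offset  # SEEK_SET: the offset is already absolute

f = io.BytesIO(b"0123456789")
f.seek(3)
from_cur = absolute_offset(f, 2, os.SEEK_CUR)    # 3 + 2
from_end = absolute_offset(f, -4, os.SEEK_END)   # 10 - 4
from_set = absolute_offset(f, 7, os.SEEK_SET)
```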

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From trentm at ActiveState.com  Tue Jan 16 20:34:17 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Tue, 16 Jan 2001 11:34:17 -0800
Subject: [Python-Dev] TELL64
In-Reply-To: <20010116103626.D30209@ActiveState.com>; from trentm@ActiveState.com on Tue, Jan 16, 2001 at 10:36:29AM -0800
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com> <20010116103626.D30209@ActiveState.com>
Message-ID: <20010116113417.I30209@ActiveState.com>

On Tue, Jan 16, 2001 at 10:36:29AM -0800, Trent Mick wrote:
> Sure. Just get rid of the """do a "no-op" seek...""" comment because it is no
> longer applicable. I am not setup to test this on Win64 right and I don't

s/right/right now/


Trent

-- 
Trent Mick
TrentM at ActiveState.com



From cgw at fnal.gov  Tue Jan 16 21:19:09 2001
From: cgw at fnal.gov (Charles G Waldman)
Date: Tue, 16 Jan 2001 14:19:09 -0600 (CST)
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
Message-ID: <14948.44221.876681.838046@buffalo.fnal.gov>

Fredrik - I noticed that you chose to check in a slightly different
patch than the one I submitted.

I wonder why you chose to do this?  In particular at line 1238 I had:

    if (PyErr_Occurred()) {
        Py_DECREF(self);
        return NULL;
    }

and you changed this to 

    if (PyErr_Occurred()) {
        PyObject_DEL(self);
        return NULL;
    }

Can you explain why you made this (seemingly arbitrary) change? 

I think that since "self" was created via:

 self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);

which calls PyObject_INIT, which in turn calls _Py_NewReference, which
increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
to de-allocate it -- won't this screw up the value of _Py_RefTotal?

Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
defined - I just wanted to check to make sure my understanding of
reference counting w.r.t. memory allocation and deallocation is
correct - if the above is in error, I'd appreciate any corrections...




From guido at python.org  Tue Jan 16 21:53:41 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 15:53:41 -0500
Subject: [Python-Dev] TELL64
In-Reply-To: Your message of "Tue, 16 Jan 2001 10:36:29 PST."
             <20010116103626.D30209@ActiveState.com> 
References: <E14Fo57-0007wR-00@usw-pr-cvs1.sourceforge.net> <20010108182056.C4640@lyra.org> <200101151602.LAA22272@cj20424-a.reston1.va.home.com> <20010115141026.I29870@ActiveState.com> <20010116005536.M1005@xs4all.nl> <20010115162454.D3864@ActiveState.com> <200101160408.XAA01368@cj20424-a.reston1.va.home.com>  
            <20010116103626.D30209@ActiveState.com> 
Message-ID: <200101162053.PAA13099@cj20424-a.reston1.va.home.com>

> I agree, that looks to me like it would. I guess I just missed that when I
> wrote it.

Excellent!  I've checked this in now -- we'll hear if it breaks
anywhere soon enough.

> I am not setup to test this on Win64 right [now] and I don't
> suppose there are a lot of you out there with your own Win64 setups.

What happened to ActiveState's Itanium boxes?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Tue Jan 16 22:53:22 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Tue, 16 Jan 2001 16:53:22 -0500
Subject: [Python-Dev] Re: Detecting install time
In-Reply-To: <200101160347.WAA01132@cj20424-a.reston1.va.home.com>; from guido@python.org on Mon, Jan 15, 2001 at 10:47:32PM -0500
References: <200101160303.WAA11632@207-172-111-91.s91.tnt1.ann.va.dialup.rcn.com> <200101160347.WAA01132@cj20424-a.reston1.va.home.com>
Message-ID: <20010116165322.B29674@kronos.cnri.reston.va.us>

[CC'ing to the distutils-sig]

On Mon, Jan 15, 2001 at 10:47:32PM -0500, Guido van Rossum wrote:
>> For PEP 229, the setup.py script needs to figure out if it's running
>> from the build directory, because then distutils.sysconfig needs to
>
>You could check for the presence of config.status -- that file is not
>installed.

This isn't a check suitable for inclusion in distutils.sysconfig,
though, because it's so liable to being fooled (consider a
Distutils-packaged module that comes with a configure script to build
some library).  Right now I'm using a hacked version of sysconfig
with several patches like this:

@@ -120,12 +121,16 @@
 def get_config_h_filename():
     """Return full pathname of installed config.h file."""
     inc_dir = get_python_inc(plat_specific=1)
+    # XXX
+    if 1: inc_dir = '.'
     return os.path.join(inc_dir, "config.h")
 
One hackish approach would be to add an assume_build_directories() to
distutils.sysconfig, a little back door to be used by the setup.py
script that comes with Python, so the above would become 'if
build_time_flag: ...'.  Anyone have a cleaner idea?

--amk




From akuchlin at mems-exchange.org  Wed Jan 17 02:46:47 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Tue, 16 Jan 2001 20:46:47 -0500
Subject: [Python-Dev] PEP 229 issues
Message-ID: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com>

I'm in a quandary about the patch implementing PEP 229.  The patch is
quite close to being ready, with only a few minor issues remaining,
but to fix those issues, I need to make some changes to the Distutils,
such as the sysconfig modification I recently suggested. 

Problem: I believe the patch *must* go in at the alpha stage, because
there are bound to be lots of platform-specific problems that will
show up; it should not be added in the beta stage, because it'll need
time to get tested and debugged, and I wouldn't be surprised if it has
to be reverted later because of some insurmountable problem.

Problem: Greg Ward, the Distutils maintainer, is away at the moment.
I can check in changes to the Distutils without his say-so, but when
Greg gets back he might shriek in horror and rip all of the changes
out again.  (Or he's stuck with maintaining them until 2.2.)

Problem: 2.1alpha1 is due on Friday.

So, what to do?  If I know there's going to be an alpha2, that's
probably fine; Greg should have resurfaced by then, and the patch can
go in for alpha2.  

Or, I can check in the changes before Friday, and if they're
unacceptable, they can be fixed for alpha2/beta1, or simply backed
out.  

Or, I can leave Distutils alone and make setup.py a tissue of hacks
and workarounds.  For example, it might insert new versions of various
functions into the distutils.sysconf module.  Icky and fragile, but
cleaning it up for beta1 would then be a priority.

Suggestions?  Pronouncements?

--amk



From guido at python.org  Wed Jan 17 02:39:35 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 20:39:35 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: Your message of "Tue, 16 Jan 2001 20:46:47 EST."
             <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> 
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> 
Message-ID: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>

I expect that there will be an alpha2, but I still recommend that you
check in *something* that works for alpha1, to get maximal testing
coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
with our big patches, respectively nested scopes and rich comparisons,
that we really want to have in alpha1).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 17 03:04:53 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 16 Jan 2001 21:04:53 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>

[Guido]
> Good idea [using string.encode()]!  This could also be used to
> "hexify" a string, for which currently one of the quickest ways
> is still the hack
>
>     "%02x"*len(s) % tuple(s)

Note that as of 2.0, a far quicker way is to use binascii.b2a_hex(), or its
absurdist (read "Barry" <wink>) synonym binascii.hexlify().

I'm wary of using string.encode() for this, because one normally hexlifies
binary data (e.g., like sha checksums), and 4 days of 7 we're more than not
in favor of moving away from strings to carry binary data.

Of course we can change our minds about this across releases, and have
even-numbered releases deprecate the function forms while odd-numbered ones
abjure methods.  Works for me <wink>.
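
A minimal sketch of the comparison above, written for a present-day
Python 3 where binary data is carried in bytes objects (an assumption;
the thread predates that split). The helper names hexify_hack and
hexify_fast are made up for illustration. It only checks that the
format-string hack and binascii.hexlify() agree:

```python
import binascii


def hexify_hack(data):
    # The "%02x" * len(s) % tuple(s) trick from the quoted message.
    # Iterating over a bytes object yields ints, which %02x accepts.
    return "%02x" * len(data) % tuple(data)


def hexify_fast(data):
    # binascii.hexlify() (same function as binascii.b2a_hex()) returns
    # bytes, so decode it to compare against the str built by the hack.
    return binascii.hexlify(data).decode("ascii")


sample = b"\x00\x0a\xffspam"
```

binascii.unhexlify() is the inverse, so a round trip through
hexify_fast() should give back the original bytes.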




From nas at arctrix.com  Tue Jan 16 22:08:23 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 16 Jan 2001 13:08:23 -0800
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
Message-ID: <20010116130823.C9640@glacier.fnational.com>

This message was on the debian-python list.  Does anyone know why
the patch is needed?

  Neil

----- Forwarded message from Danie Roux <droux at tuks.co.za> -----

Date: Tue, 16 Jan 2001 11:44:48 +0200
From: Danie Roux <droux at tuks.co.za>
Subject: Our application doesn't work with Debian packaged Python
To: Debian Python <debian-python at lists.debian.org>

Good day all,

Our program is an archiver for gnome that uses gnome-python with one
widget written in C.

I converted our program to autoconf and automake so anyone can (and please
do!) compile it and see what I mean.

Everything compiles fine. But when it runs it just throws a weird
exception.

The funny thing is, if I alien RedHat 6.2's python package, and install
that, it works! I need to change nothing else. Only the python package.

I then went and looked at the source rpm. They have this patch in there:

--- Python-1.5.2/Python/importdl.c.global	Sat Jul 17 16:52:26 1999
+++ Python-1.5.2/Python/importdl.c	Sat Jul 17 16:53:19 1999
@@ -441,13 +441,13 @@
 #ifdef RTLD_NOW
 		/* RTLD_NOW: resolve externals now
 		   (i.e. core dump now if some are missing) */
-		void *handle = dlopen(pathname, RTLD_NOW);
+		void *handle = dlopen(pathname, RTLD_NOW | RTLD_GLOBAL);
 #else
 		void *handle;
 		if (Py_VerboseFlag)
 			printf("dlopen(\"%s\", %d);\n", pathname,
-			       RTLD_LAZY);
-		handle = dlopen(pathname, RTLD_LAZY);
+			       RTLD_LAZY | RTLD_GLOBAL);
+		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);
 #endif /* RTLD_NOW */
 		if (handle == NULL) {
 			PyErr_SetString(PyExc_ImportError, dlerror());

Sure enough this fixes my problem. The thing is that this means our
program only works on Redhat (and for whoever else patched python 1.5.2 with this).

So what can I do now? How can I get this patch into debian-python? How can
I change my program to not need the patch?

btw the program is garchiver, it will be hosted at sourceforge as soon as
they get back to me, in the mean time I will mail anyone a copy of the
sources.

-- 
Danie Roux *shuffle* Adore Unix


-- 
To UNSUBSCRIBE, email to debian-python-request at lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster at lists.debian.org


----- End forwarded message -----



From guido at python.org  Wed Jan 17 05:16:48 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:16:48 -0500
Subject: [Python-Dev] [droux@tuks.co.za: Our application doesn't work with Debian packaged Python]
In-Reply-To: Your message of "Tue, 16 Jan 2001 13:08:23 PST."
             <20010116130823.C9640@glacier.fnational.com> 
References: <20010116130823.C9640@glacier.fnational.com> 
Message-ID: <200101170416.XAA20515@cj20424-a.reston1.va.home.com>

> This message was on the debian-python list.  Does anyone know why
> the patch is needed?

> -		handle = dlopen(pathname, RTLD_LAZY);

> +		handle = dlopen(pathname, RTLD_LAZY | RTLD_GLOBAL);

This comes back every once in a while.  It means that they have a
module whose shared library implementation exports symbols that are
needed by another shared library (probably another module).

IMO this approach is evil, because RTLD_GLOBAL means that *all*
external symbols defined by any module are exported to all other
shared libraries, and this will cause conflicts if the same symbol is
exported by two different modules -- which can happen quite easily.
(I don't know what happens on conflicts -- maybe you get an error,
maybe it links to the wrong symbol.)

The proper solution would be to put the needed entry points (other than
the init<module> entry point) in a separate shared library.  But that's
often not how quick-and-dirty extension modules are designed...
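
For readers on a later Python: the dlopen() flags eventually became
adjustable at run time through sys.setdlopenflags() (Unix only), which
lets one import opt in to RTLD_GLOBAL without patching importdl.c. A
hedged sketch follows; the context manager name global_symbols is
hypothetical, and the platform guard is an assumption about where the
flags are unavailable:

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def global_symbols():
    """Temporarily add RTLD_GLOBAL to the flags used for extension imports."""
    if not hasattr(sys, "setdlopenflags") or not hasattr(os, "RTLD_GLOBAL"):
        # No dlopen() flags on this platform (e.g. Windows): change nothing.
        yield None
        return
    saved = sys.getdlopenflags()
    sys.setdlopenflags(saved | os.RTLD_GLOBAL)
    try:
        yield sys.getdlopenflags()
    finally:
        # Restore the old flags so later imports keep private symbols.
        sys.setdlopenflags(saved)
```

An extension that really must export its symbols would then be imported
inside "with global_symbols():", which limits the conflict risk described
above to that one import instead of every module.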

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at python.org  Wed Jan 17 05:22:54 2001
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Jan 2001 23:22:54 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
Message-ID: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>

I've got a working version of the rich comparisons ready for preview.

The patch is here:

  http://www.python.org/~guido/richdiff.txt

It's also referenced at sourceforge:

  http://sourceforge.net/patch/?func=detailpatch&patch_id=103283&group_id=5470

Here's a summary:

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).  No other implications are made; in
  particular, Python does not assume that == is the inverse of !=, or
  that < is the inverse of >=.  This makes it possible to define types
  with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.

  XXX TO DO for this feature:

  - the test "test_compare" fails, because of the changed semantics
    for complex number comparisons (1j<2j raises an error now)
  - tuple, dict should implement EQ/NE so containers containing
    complex numbers can be compared for equality (list is already
    done) -- or complex numbers should be reverted to old behavior
  - list.sort() should use rich comparison
  - check for memory leaks
  - int, long, float contain new-style-cmp functions that aren't used
    to their full potential any more (the new-style-cmp functions
    introduced by Neil's coercion work are gone again)
  - decide on unresolved issues from PEP 207
  - documentation
  - more testing
  - compare performance to 2.0 (microbench?)
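
A small sketch of the partial-ordering case from the summary, written
for a present-day Python 3 (an assumption; the patch itself targets
2.1). The class name Subset is made up for illustration. It defines
only __eq__, __ne__, __lt__ and __le__; the > and >= operators work
because the interpreter retries them as the reversed < and <= on the
other operand:

```python
class Subset:
    """Wrap a frozenset and order instances by set inclusion."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items == other.items

    def __ne__(self, other):
        # Spelled out explicitly: no operator is derived from another here.
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items != other.items

    def __lt__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items < other.items    # proper subset

    def __le__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items <= other.items   # subset or equal


a = Subset({1, 2})
b = Subset({2, 3})
c = Subset({1, 2, 3})
```

Here a < c and c > a are both true, while a and b are unordered: a < b,
a <= b, a > b, a >= b and a == b are all false at once, which is why
< cannot be treated as the negation of >=.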

Please give this a good spin -- I'm hoping to check this in and
make it part of the alpha 1 release Friday...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Wed Jan 17 05:50:25 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 16 Jan 2001 23:50:25 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
References: <200101161409.JAA05268@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCIECBIJAA.tim.one@home.com>
Message-ID: <14949.9361.591610.684695@anthem.wooz.org>

>>>>> "TP" == Tim Peters <tim.one at home.com> writes:

    TP> Note that as of 2.0, a far quicker way is to use
    TP> binascii.b2a_hex(), or its absurdist (read "Barry" <wink>)
    TP> synonym binascii.hexlify().

Thanks for the compliment Tim, but I can't take credit for that name.
If it was me I'd have called it wudduptify() (and its inverse,
notmuchlify()).  I stole the name from Emacs's hexlify-buffer function
which kind of does the same thing.

would-converting-to-octal-digits-be-called-octopuslify-ly y'rs,
-Barry



From fredrik at effbot.org  Wed Jan 17 09:12:32 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 09:12:32 +0100
Subject: [Python-Dev] Re: [Patch #103248] Fix a memory leak in _sre.c
References: <14948.44221.876681.838046@buffalo.fnal.gov>
Message-ID: <00fe01c0805d$432d4cd0$e46940d5@hagrid>

Charles G Waldman wrote:
> Can you explain why you made this (seemingly arbitrary) change? 
> 
> I think that since "self" was created via:
> 
>  self = PyObject_NEW_VAR(PatternObject, &Pattern_Type, n);
> 
> which calls PyObject_INIT, which in turn calls _Py_NewReference, which
> increments _Py_RefTotal, it is incorrect to simply do a PyObject_DEL
> to de-allocate it -- won't this screw up the value of _Py_RefTotal?

and what do you think will happen if you call the destructor before
you've initialized all pointer fields in the object?

(according to the docs, the NEW/New functions return uninitialized
memory.  in this case, we're bailing out before the object has been
fully initialized.  pattern_dealloc definitely isn't prepared to deal with
random pointer values...)

> Admittedly this is a minor nit and only matters if Py_TRACE_REFS is
> defined - I just wanted to check to make sure my understanding of
> reference counting w.r.t. memory allocation and deallocation is
> correct - if the above is in error, I'd appreciate any corrections...

same here.  I don't doubt it's working as you say it does, but I find it
strange that you shouldn't be able to DEL an object you just created
with NEW...  maybe DEL should be fixed?

Cheers /F




From thomas at xs4all.net  Wed Jan 17 10:48:12 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 10:48:12 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Modules Setup.config.in,1.7,1.8 Setup.dist,1.7,1.8
In-Reply-To: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>; from esr@users.sourceforge.net on Wed, Jan 17, 2001 at 12:25:13AM -0800
References: <E14Inu5-00047g-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117104812.F2776@xs4all.nl>

On Wed, Jan 17, 2001 at 12:25:13AM -0800, Eric S. Raymond wrote:

> + # ndbm(3) may require -lndbm or similar
> + @USE_NDBM_MODULE@ndbm ndbmmodule.c @HAVE_LIBNDBM@

This is an interesting module... It's not in the Modules/ directory :-) Did
you mean 'dbmmodule.c' with a different library argument ? 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From skip at mojam.com  Wed Jan 17 16:17:39 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 09:17:39 -0600 (CST)
Subject: [Python-Dev] Rich comparison confusion
Message-ID: <14949.46995.259157.871323@beluga.mojam.com>

I'm a bit confused about Guido's rich comparison stuff.  In the description
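he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
From a boolean standpoint this just can't be so.  Guido mentions partial
orderings, but I'm still confused.  Consider this example: Objects of type A
implement rich comparisons.  Objects of type B don't.  If my code looks like

    a = A()
    b = B()
    ...
    if b < a:
        ...

My interpretation of the rich comparison stuff is that either

    1. Since b doesn't implement rich comparisons, the interpreter falls
       back to old fashioned comparisons which may or may not allow the
       comparison of B objects and A objects.

    or

    2. The sense of the inequality is switched (a > b) and the rich
       comparison code in A's implementation is called.

That's my reading of it.  It has to be wrong.  The inverse comparison should
be a >= b, not a > b, but the described pairing of comparison functions
would imply otherwise.

I'm sure I'm missing something obvious or revealing some fundamental failure
of my grade school education.  Please explain...

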

From akuchlin at mems-exchange.org  Wed Jan 17 16:42:13 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 10:42:13 -0500
Subject: [Python-Dev] PEP 229 issues
In-Reply-To: <200101170139.UAA17954@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 08:39:35PM -0500
References: <200101170146.UAA00542@207-172-112-159.s159.tnt4.ann.va.dialup.rcn.com> <200101170139.UAA17954@cj20424-a.reston1.va.home.com>
Message-ID: <20010117104213.B490@kronos.cnri.reston.va.us>

On Tue, Jan 16, 2001 at 08:39:35PM -0500, Guido van Rossum wrote:
>I expect that there will be an alpha2, but I still recommend that you
>check in *something* that works for alpha1, to get maximal testing
>coverage.  Alpha1 may slip a day or so (Jeremy and I are both late
>with our big patches, respectively nested scopes and rich comparisons,
>that we really want to have in alpha1).

OK; thanks for the pronouncement!

I've checked in all the smaller changes that shouldn't break anything.
All that's left now is to actually enable the new feature, which
requires the nasty changes:

     * In the top-level Makefile.in, the "sharedmods" target simply
       runs "./python setup.py build", and "sharedinstall" runs
       "./python setup.py install".  The "clobber" target also deletes
       the build/ subdirectory where Distutils puts its output.

     * Rip stuff out of the Setup files.  Modules/Setup.config.in only
       contains entries for the gc and thread modules; the readline,
       curses, and db modules are removed because it's now setup.py's
       job to handle them.
 
     * Modules/Setup.dist now contains entries for only 3 modules --
       _sre, posix, and strop.

Guido and Jeremy are rushing to finish their patches in time for the
alpha release, though Guido seems to be checking in the rich
comparison stuff now.  I don't want to impede them by making them stop
to debug build problems, so I can either wait until they've landed
their changes (at which point there's nothing major left, I think), or
they can simply not do a 'cvs update' after the serious changes go in.
Thoughts?

--amk



From barry at digicool.com  Wed Jan 17 16:54:06 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 10:54:06 -0500
Subject: [Python-Dev] Breakage in latest CVS
Message-ID: <14949.49182.636526.292265@anthem.wooz.org>

Looks like the latest CVS (updated just minutes ago) is broken.  I'm
trying to fix some of these complaints, but thought I'd at least
report what I've found...

-Barry

...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c floatobject.c -o floatobject.o
floatobject.c:675: warning: excess elements in struct initializer after `float_as_number'
floatobject.c:700: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
floatobject.c:700: initializer element for `PyFloat_Type.tp_flags' is not constant
...
intobject.c:800: warning: excess elements in struct initializer after `int_as_number'
intobject.c:825: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
intobject.c:825: initializer element for `PyInt_Type.tp_flags' is not constant
make[1]: *** [intobject.o] Error 1
...
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -I./../Include -I.. -DHAVE_CONFIG_H   -c longobject.c -o longobject.o
longobject.c:1865: warning: excess elements in struct initializer after `long_as_number'
longobject.c:1890: `Py_TPFLAGS_NEWSTYLENUMBER' undeclared here (not in a function)
longobject.c:1890: initializer element for `PyLong_Type.tp_flags' is not constant
make[1]: *** [longobject.o] Error 1



From guido at python.org  Wed Jan 17 17:09:27 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 11:09:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Wed, 17 Jan 2001 09:17:39 CST."
             <14949.46995.259157.871323@beluga.mojam.com> 
References: <14949.46995.259157.871323@beluga.mojam.com> 
Message-ID: <200101171609.LAA04102@cj20424-a.reston1.va.home.com>

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

> From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.

It's case 2.

> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.

We're trying very hard *not* to make any connections between a<b and
a>=b.  You've learned in grade school that these are each other's
Boolean inverse (a<b is true iff a>=b is false).  However, for partial
orderings this may not be true: for unordered a and b, none of a<b,
a<=b, a>b, a>=b, a==b may be true.

On the other hand, even for partially ordered types, a<b and b>a
(note: swapped arguments *and* swapped sense of comparison) always
give the same outcome!

> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

I think what threw you off was the ambiguity of "inverse".  This usually
means Boolean negation.  I'm not relying on Boolean negation here -- I'm
relying on the more fundamental property that a<b and b>a have the
same outcome.
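
Skip's example can be tried directly. The sketch below is written for a
present-day Python 3 (an assumption; there the old 3-way fallback no
longer exists, so a comparison that nobody handles raises TypeError).
The classes A and B mirror the names in the question, and the calls list
is only there to record which method ran:

```python
calls = []


class A:
    """Implements rich comparisons and records which method was used."""

    def __lt__(self, other):
        calls.append("A.__lt__")
        return False

    def __gt__(self, other):
        calls.append("A.__gt__")
        return True


class B:
    """Defines no comparison methods at all."""


a = A()
b = B()

# b has no __lt__ of its own, so the interpreter swaps the operands and
# the sense of the comparison, and asks a whether a > b.
result = b < a
```

After this runs, calls holds a single entry naming A.__gt__, which is
case 2 from the question: the swapped comparison, not a negated one.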

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh21 at cam.ac.uk  Wed Jan 17 17:13:32 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 17 Jan 2001 16:13:32 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Skip Montanaro's message of "Wed, 17 Jan 2001 09:17:39 -0600 (CST)"
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>

Skip Montanaro <skip at mojam.com> writes:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> >From a boolean standpoint this just can't be so.  Guido mentions partial
> orderings, but I'm still confused.  Consider this example: Objects of type A
> implement rich comparisons.  Objects of type B don't.  If my code looks like
> 
>     a = A()
>     b = B()
>     ...
>     if b < a:
>         ...
> 
> My interpretation of the rich comparison stuff is that either
> 
>     1. Since b doesn't implement rich comparisons, the interpreter falls
>        back to old fashioned comparisons which may or may not allow the
>        comparison of B objects and A objects.
> 
>     or
> 
>     2. The sense of the inequality is switched (a > b) and the rich
>        comparison code in A's implementation is called.
> 
> That's my reading of it.  It has to be wrong.  The inverse comparison should
> be a >= b, not a > b, but the described pairing of comparison functions
> would imply otherwise.
> 
> I'm sure I'm missing something obvious or revealing some fundamental failure
> of my grade school education.  Please explain...

For a total order:

a < b if and only if b > a.
This is what the rich comparison code does.

a < b if and only if not a >= b.
This is what the rich comparison code doesn't do.

Does this make sense?

Cheers,
M.

-- 
  Presumably pronging in the wrong place zogs it.
                                        -- Aldabra Stoddart, ucam.chat




From moshez at zadka.site.co.il  Thu Jan 18 01:08:06 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 18 Jan 2001 02:08:06 +0200 (IST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <14949.46995.259157.871323@beluga.mojam.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
Message-ID: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>

On Wed, 17 Jan 2001 09:17:39 -0600 (CST), Skip Montanaro <skip at mojam.com> wrote:

> I'm a bit confused about Guido's rich comparison stuff.  In the description
> he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.

I think that you're confused between two meanings of inverses.

You think:
op is an inverse of op' if for every a,b  (a op b) = not (a op' b)

Guido meant (and I hope, implemented):
op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

And a<b iff b>a 
a<=b iff b>=a

Sounds sane.

Unless I'm the one confused....
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!



From fredrik at effbot.org  Wed Jan 17 17:47:29 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 17:47:29 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com>
Message-ID: <012901c080a5$306023a0$e46940d5@hagrid>

tim wrote:
> > Should I check it in?
> 
> Absolutely!  But not like as for 2.0:  check it in *now*, so we have a few
> days to deal with surprises before the alpha release.

as it turned out, the source I had didn't build, and the table-
building python script generated something that wasn't quite
compatible with the C code.  bit rot.

I've almost sorted it all out.  will check it in later tonight (local
time).

</F>




From tim.one at home.com  Wed Jan 17 19:27:11 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 13:27:11 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Tools/idle CallTipWindow.py,1.2,1.3 CallTips.py,1.7,1.8 ClassBrowser.py,1.11,1.12 Debugger.py,1.14,1.15 Delegator.py,1.2,1.3 FileList.py,1.7,1.8 FormatParagraph.py,1.8,1.9 IdleConf.py,1.5,1.6 IdleHistory.py,1.3,1
In-Reply-To: <200101171358.IAA27661@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBIJAA.tim.one@home.com>

[an anonymous developer panics, after Tim "reindent"s the IDLE dir]

> Oh no!
>
> I have a whole slew of changes to IDLE sitting in my work directory.
> If I do an update half of these will turn into merge conflicts. :-(
>
> Don't worry, I'll get over it.

I imagine this will pop up from time to time until everything is normalized.
If it's about to burn you, run reindent.py on the affected directory
*before* you update ("python reindent.py -v .").  That will make all the
same changes to your local versions as were checked in, modulo the rare
hand-edit (of which there were none in the IDLE directory).




From akuchlin at mems-exchange.org  Wed Jan 17 20:04:04 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 14:04:04 -0500
Subject: [Python-Dev] PEP 229 checked in
Message-ID: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>

I've checked in the last bit of the PEP 229 changes.  Be sure to
rename your Modules/Setup file (or do a 'make distclean') before
rebuilding.  Squeal if you run into trouble, or file bugs on SF.

--am"Aieee!"k



From jeremy at alum.mit.edu  Wed Jan 17 20:12:47 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 14:12:47 -0500 (EST)
Subject: [Python-Dev] unexpected consequence of function attributes
Message-ID: <14949.61103.258714.325465@localhost.localdomain>

I have found one place in the library that depended on 
hasattr(func, '__dict__') to return false -- dis.dis.  You might want
to check and see if there is any other code that doesn't expect
functions to have extra attributes.  I expect that only introspective
code would be affected.
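
A rough sketch of the pitfall, using present-day Python 3 attribute
names (__func__ and __code__ where 2.x has im_func and func_code; that
translation is an assumption on my part, not part of the report). The
helper describe() is hypothetical. The point is that introspective code
has to look for a code object before it tests for __dict__, because
functions now pass the __dict__ test too:

```python
def sample(x):
    return x + 1


# Functions can carry arbitrary attributes, stored in their __dict__.
sample.author = "jeremy"


class Holder:
    def method(self):
        return None


def describe(obj):
    """Classify obj the way a disassembler front end might."""
    if hasattr(obj, "__func__"):     # bound method: unwrap to the function
        obj = obj.__func__
    if hasattr(obj, "__code__"):     # function: it has a code object
        return "code"
    if hasattr(obj, "__dict__"):     # class or module: a namespace to walk
        return "namespace"
    raise TypeError("don't know how to describe %s objects"
                    % type(obj).__name__)
```

With the __dict__ test placed first, describe(sample) would have been
treated as a namespace and its attributes walked instead of its code.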

Jeremy



From barry at wooz.org  Wed Jan 17 20:46:36 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:46:36 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63132.583025.303677@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

I guess we need a test_dis.py in the regression test suite, eh? :)

Here's an extremely quick and dirty fix to dis.py.
-Barry

-------------------- snip snip --------------------
Index: dis.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/dis.py,v
retrieving revision 1.28
diff -u -r1.28 dis.py
--- dis.py	2001/01/14 23:36:05	1.28
+++ dis.py	2001/01/17 19:45:40
@@ -15,6 +15,10 @@
         return
     if type(x) is types.InstanceType:
         x = x.__class__
+    if hasattr(x, 'im_func'):
+        x = x.im_func
+    if hasattr(x, 'func_code'):
+        x = x.func_code
     if hasattr(x, '__dict__'):
         items = x.__dict__.items()
         items.sort()
@@ -28,17 +32,12 @@
                 except TypeError, msg:
                     print "Sorry:", msg
                 print
+    elif hasattr(x, 'co_code'):
+        disassemble(x)
     else:
-        if hasattr(x, 'im_func'):
-            x = x.im_func
-        if hasattr(x, 'func_code'):
-            x = x.func_code
-        if hasattr(x, 'co_code'):
-            disassemble(x)
-        else:
-            raise TypeError, \
-                  "don't know how to disassemble %s objects" % \
-                  type(x).__name__
+        raise TypeError, \
+              "don't know how to disassemble %s objects" % \
+              type(x).__name__
 
 def distb(tb=None):
     """Disassemble a traceback (default: last traceback)."""



From barry at wooz.org  Wed Jan 17 20:49:51 2001
From: barry at wooz.org (Barry A. Warsaw)
Date: Wed, 17 Jan 2001 14:49:51 -0500
Subject: [Python-Dev] Re: unexpected consequence of function attributes
References: <14949.61103.258714.325465@localhost.localdomain>
Message-ID: <14949.63327.22745.359978@anthem.wooz.org>

>>>>> "JH" == Jeremy Hylton <jeremy at alum.mit.edu> writes:

    JH> I have found one place in the library that depended on
    JH> hasattr(func, '__dict__') to return false -- dis.dis.  You
    JH> might want to check and see if there is any other code
    JH> that doesn't expect functions to have extra attributes.  I
    JH> expect that only introspective code would be affected.

Patch #103303

http://sourceforge.net/patch/?func=detailpatch&patch_id=103303&group_id=5470



From tim.one at home.com  Wed Jan 17 21:51:57 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 15:51:57 -0500
Subject: [Python-Dev] Windows Python totally hosed
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>

Failures range from

test test_winsound skipped --  Module use of python20.dll
    conflicts with this version of Python.

to

test test_tokenize crashed -- exceptions.AttributeError: 're' module
    has no attribute 'compile'

I suspect the latter is really a disguised version of

C:\Code\python\dist\src\PCbuild>python
Python 2.1a1 (#8, Jan 17 2001, 13:15:23) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import re
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\code\python\dist\src\lib\re.py", line 28, in ?
    from sre import *
  File "c:\code\python\dist\src\lib\sre.py", line 17, in ?
    import sre_compile
  File "c:\code\python\dist\src\lib\sre_compile.py", line 11, in ?
    import _sre
ImportError: Module use of python20.dll conflicts with this version of
Python.
>>>

Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
it out, but if anyone knows the cure off the top of their head, don't be
shy!




From akuchlin at mems-exchange.org  Wed Jan 17 22:00:56 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 16:00:56 -0500
Subject: [Python-Dev] Re: 'Setup' buglet
In-Reply-To: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>; from guido@python.org on Wed, Jan 17, 2001 at 02:28:36PM -0500
References: <200101171928.OAA21460@cj20424-a.reston1.va.home.com>
Message-ID: <20010117160056.A20603@kronos.cnri.reston.va.us>

[Taking this bug public]

On Wed, Jan 17, 2001 at 02:28:36PM -0500, Guido van Rossum wrote:
>One problem seems to be that the creation
>of the (minimal) Modules/Setup file doesn't seem to be doing the right
>thing.  When I delete Modules/Setup, the next "make" doesn't create
>it; it used to be copied from Setup.dist if it doesn't exist.

This seems to have been removed from Modules/Makefile.pre.in in
revision 1.69 by Fred; instead the configure script now copies
Setup.dist to Setup, so you have to rerun configure in order to create
Modules/Setup after deleting it.  

--amk



From mal at lemburg.com  Wed Jan 17 22:04:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 22:04:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
Message-ID: <3A6608DD.E12A2422@lemburg.com>

I've just checked in a patch which removes all uses of the
assert statement in the regression tests. This makes the
tests compatible with the -O mode of Python and also allows
centralizing error reporting (many tests already provide their
own little test function for this purpose).

I urge you to only check in tests which use the new API
verify() to verify a certain condition. The API is defined
in the regression tools module test_support.
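
For anyone who has not looked at it yet, a stand-alone sketch of the
idea. The real helper lives in test_support; the body below is an
assumption about its general shape, not a copy of it:

```python
class TestFailed(Exception):
    """Raised when a regression-test check does not hold."""


def verify(condition, reason="test failed"):
    # An ordinary function call, so it still runs under "python -O",
    # where assert statements are compiled away.
    if not condition:
        raise TestFailed(reason)
```

A test then writes verify(x == 42, "x should be 42") where it used to
write "assert x == 42", and the failure is reported the same way with or
without -O.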

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Wed Jan 17 22:21:56 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:21:56 +0100
Subject: [Python-Dev] Windows Python totally hosed
References: <LNBBLJKPBEHFEDALKOLCEEEGIJAA.tim.one@home.com>
Message-ID: <028801c080cb$86658350$e46940d5@hagrid>

tim wrote:
> Suspect all of this has to do with patchlevel.h changing.  I'll try to dope
> it out, but if anyone knows the cure off the top of their head, don't be
> shy!

text.replace("python20", "python21") for all files in
the PCBuild directory, plus PC/config.h
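
Spelled out as a script, that could look roughly like this. The helper
name replace_in_tree is hypothetical, and working on raw bytes is my
choice so that the CRLF line endings in the project files survive
untouched:

```python
import os


def replace_in_tree(root, old, new):
    """Rewrite every file under root that contains old; return changed paths."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            if old not in data:
                continue          # leave untouched files alone
            with open(path, "wb") as f:
                f.write(data.replace(old, new))
            changed.append(path)
    return sorted(changed)
```

Calling replace_in_tree("PCbuild", b"python20", b"python21") would cover
the project directory; PC/config.h would still need the same
substitution applied separately.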

</F>




From tim.one at home.com  Wed Jan 17 22:42:13 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 16:42:13 -0500
Subject: [Python-Dev] Windows Python totally hosed
In-Reply-To: <028801c080cb$86658350$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEJIJAA.tim.one@home.com>

[/F]
> text.replace("python20", "python21") for all files in
> the PCBuild directory, plus PC/config.h

Brrrr.  It strikes me as insane to have the core Python files in an MS
project file *named* after the release number (python20.dsp).  So I'm going
to change that to core.dsp so that at least that much never needs to be
changed again.

gratefully y'rs  - tim




From fredrik at effbot.org  Wed Jan 17 22:47:28 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 17 Jan 2001 22:47:28 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com>
Message-ID: <02b401c080cf$1a3a5530$e46940d5@hagrid>

mal wrote:
> I urge you to only check in tests which use the new API
> verify() to verify a certain condition. The API is defined
> in the regression tools module test_support.

did you run the test yourself after applying that patch?

(a patch to the patch is on the way in.  please check
that the test suite still runs on non-Windows boxes...)

</F>




From gstein at lyra.org  Wed Jan 17 22:45:44 2001
From: gstein at lyra.org (Greg Stein)
Date: Wed, 17 Jan 2001 13:45:44 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Wed, Jan 17, 2001 at 01:27:04PM -0800
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010117134544.H7731@lyra.org>

On Wed, Jan 17, 2001 at 01:27:04PM -0800, Guido van Rossum wrote:
> Update of /cvsroot/python/python/dist/src/Objects
> In directory usw-pr-cvs1:/tmp/cvs-serv14991
> 
> Modified Files:
> 	object.c 
> Log Message:
> Deal properly (?) with comparing recursive datastructures.
>...
> - Change the in-progress code to use static variables instead of
>   globals (both the nesting level and the key for the thread dict were
>   globals but have no reason to be globals; the key can even be a
>   function-static variable in get_inprogress_dict()).

The "compare_nesting" variable is a bit troublesome long-term -- it will
cause threading issues in a free-threaded implementation. The solution is to
put the value into the thread-state.

[ not sure if it matters right now, but just bringing it up ]

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From fdrake at acm.org  Wed Jan 17 22:55:02 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 17 Jan 2001 16:55:02 -0500 (EST)
Subject: [Python-Dev] [PEP 205] weak references patch
Message-ID: <14950.5302.356566.778486@cj42289-a.reston1.va.home.com>

  I've updated the patch that implements PEP 205:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103203&group_id=5470

  The actual patch is too big for SF:

http://starship.python.net/crew/fdrake/patches/weakref.patch-5

  One thing about this is that it changes some of the low-level object
creation macros, so you'll need to do a "make clean" before "make"
when testing it.
  Have fun!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Wed Jan 17 23:16:29 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:16:29 +0100
Subject: [Python-Dev] Usage of "assert" in regression tests
References: <3A6608DD.E12A2422@lemburg.com> <02b401c080cf$1a3a5530$e46940d5@hagrid>
Message-ID: <3A6619BD.2AC8F6D3@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > I urge you to only check in tests which use the new API
> > verify() to verify a certain condition. The API is defined
> > in the regression tools module test_support.
> 
> did you run the test yourself after applying that patch?

Yes, but as I wrote in the SF patch message: I can only
test it on Linux and there not all tests are run due
to missing extensions. The alpha testing will hopefully catch all
possible bugs this patch introduced.
 
> (a patch to the patch is on the way in.  please check
> that the test suite still runs on non-Windows boxes...)

I'll have to leave that to the Windows wizards, sorry.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Wed Jan 17 23:49:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 17 Jan 2001 23:49:25 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 02:04:04PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
Message-ID: <20010117234925.A17392@xs4all.nl>

On Wed, Jan 17, 2001 at 02:04:04PM -0500, Andrew Kuchling wrote:
> I've checked in the last bit of the PEP 229 changes.  Be sure to
> rename your Modules/Setup file (or do a 'make distclean' before
> rebuilding.

make distclean doesn't remove Modules/Setup anymore :) Also, I couldn't get
it to work with an old tree, even after several make distclean/reconfigures.
I got tired looking for it, so I just grabbed a new tree.

> Squeal if you run into trouble, or file bugs on SF.

I have a couple of questions: what to do when setup.py doesn't work ? Is
there a way to make it bypass a module ? What about specifying include dirs
manually, for some modules (for instance, when you have readline source in a
separate directory, and want to link it statically.)

Here are some specific squeals. See the bottom for the most important
one :)

On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
setup.py. Also, SSL support for the socket module was not enabled, though
OpenSSL is installed, in the default path.

On Debian GNU/Linux' 'woody', the 'testing' (soon 'stable') branch, I can't
compile dbmmodule:

building 'dbm' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/dbmmodule.c -o build/temp.linux-i686-2.1/dbmmodule.o
/home/thomas/python/python/dist/src/Modules/dbmmodule.c:24: #error "No ndbm.h available!"
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

(ndbm.h does exist, as /usr/include/db1/ndbm.h. There is also
/usr/include/gdbm-ndbm.h, but I'm not sure if that's the same.)

Nor can I build the _tkinter module there:

building '_tkinter' extension
gcc -g -O2 -Wall -Wstrict-prototypes -fPIC -fpic -DWITH_APPINIT=1 -I/usr/X11R6/include -I. -I/home/thomas/python/python/dist/src/./Include -IInclude/ -c /home/thomas/python/python/dist/src/Modules/_tkinter.c -o build/temp.linux-i686-2.1/_tkinter.o
/home/thomas/python/python/dist/src/Modules/_tkinter.c:44: tcl.h: No such file or directory
In file included from /home/thomas/python/python/dist/src/Modules/_tkinter.c:45:/usr/include/tk.h:66: tcl.h: No such file or directory
error: command 'gcc' failed with exit status 1
make: *** [sharedmods] Error 1

The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
which I personally like a lot, though it's probably a bitch to autodetect.
(I tried, using autoconf ;-P)

On Debian GNU/Linux 'sid', the current unstable branch, I can't compile
Python at all, now:

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python-write/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
mv python ../python
make[1]: Leaving directory `/home/thomas/python/python-write/dist/src/Modules'
./python ./setup.py build
running build
running build_ext
Traceback (most recent call last):
  File "./setup.py", line 460, in ?
    main()
  File "./setup.py", line 455, in main
    ext_modules=[Extension('struct', ['structmodule.c'])]
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/core.py", line 138, in setup
    dist.run_commands()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 871, in run_commands
    self.run_command(cmd)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build.py", line 106, in run
    self.run_command(cmd_name)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/cmd.py", line 328, in run_command
    self.distribution.run_command(command)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/dist.py", line 891, in run_command
    cmd_obj.run()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/command/build_ext.py", line 202, in run
    customize_compiler(self.compiler)
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 121, in customize_compiler
    (cc, opt, ccshared, ldshared, so_ext) = \
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 389, in get_config_vars
    func()
  File "/home/thomas/python/python-write/dist/src/Lib/distutils/sysconfig.py", line 302, in _init_posix
    raise DistutilsPlatformError, my_msg
distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)
make: *** [sharedmods] Error 1

For the record, I don't have a /usr/lib/python2.1 directory on the other
machines either.

I haven't been able to test FreeBSD yet, will get to that later tonight.

And most importantly(!), on all these machines, 'make test' stops
functioning. In fact, after setup.py started building, you can't run 'make'
without 'make clean' anymore. You get a lot of undefined-symbol warnings
(see below.) If you run 'make clean;make test' it also doesn't work, because
the build directory is not in the Python library path, and regrtest.py
requires (at least) the time module.

c++  -Xlinker -export-dynamic python.o \
          ../libpython2.1.a   -lpthread -ldl  -lutil -lm  -o python 
../libpython2.1.a(posixmodule.o): In function `posix_tmpnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4115: the use of `tmpnam_r' is dangerous, better use `mkstemp'
../libpython2.1.a(posixmodule.o): In function `posix_tempnam':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:4071: the use of `tempnam' is dangerous, better use `mkstemp'
../libpython2.1.a(myreadline.o): In function `my_fgets':
/home/thomas/python/python/dist/src/Parser/myreadline.c:41: undefined reference to `PyOS_InterruptOccurred'
/home/thomas/python/python/dist/src/Parser/myreadline.c:35: undefined reference to `PyOS_InterruptOccurred'
../libpython2.1.a(errors.o): In function `PyErr_SetFromErrnoWithFilename':
/home/thomas/python/python/dist/src/Python/errors.c:260: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(pythonrun.o): In function `Py_Finalize':
/home/thomas/python/python/dist/src/Python/pythonrun.c:193: undefined reference to `PyOS_FiniInterrupts'
../libpython2.1.a(pythonrun.o): In function `initsigs':
/home/thomas/python/python/dist/src/Python/pythonrun.c:1161: undefined reference to `PyOS_InitInterrupts'
../libpython2.1.a(traceback.o): In function `tb_printinternal':
/home/thomas/python/python/dist/src/Python/traceback.c:213: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(fileobject.o): In function `get_line':
/home/thomas/python/python/dist/src/Objects/fileobject.c:883: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_format':
/home/thomas/python/python/dist/src/Objects/longobject.c:644: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `x_divrem':
/home/thomas/python/python/dist/src/Objects/longobject.c:855: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(longobject.o): In function `long_mul':
/home/thomas/python/python/dist/src/Objects/longobject.c:1193: undefined reference to `PyErr_CheckSignals'
../libpython2.1.a(object.o):/home/thomas/python/python/dist/src/Objects/object.c:174: more undefined references to `PyErr_CheckSignals' follow
../libpython2.1.a(posixmodule.o): In function `posix_fork':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1666: undefined reference to `PyOS_AfterFork'
../libpython2.1.a(posixmodule.o): In function `posix_forkpty':
/home/thomas/python/python/dist/src/Modules/./posixmodule.c:1733: undefined reference to `PyOS_AfterFork'
collect2: ld returned 1 exit status
make[1]: *** [link] Error 1
make[1]: Leaving directory `/home/thomas/python/python/dist/src/Modules'
make: *** [python] Error 2

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Wed Jan 17 23:56:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 17 Jan 2001 23:56:58 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <3A66233A.A6AE07BD@lemburg.com>

I'm currently busy building new versions of my mx packages. While
trying to convert all of them to distutils I found that there
seems to be no standard for installing documentation or other
data files of Python extensions. I also noted that for Windows
the standard extension installation defaults to \Python instead
of some \Python\Site-Packages. So the general question is:

Where should Python extensions install themselves and their docs ?

(On Linux the typical place for docs is /usr/doc/packages,
for Python code it is /usr/local/lib/pythonX.X/site-packages,
BTW)

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Thu Jan 18 00:04:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 18:04:09 -0500
Subject: [Python-Dev] Rich Comparisons technical prerelease
In-Reply-To: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>; from guido@python.org on Tue, Jan 16, 2001 at 11:22:54PM -0500
References: <200101170422.XAA20626@cj20424-a.reston1.va.home.com>
Message-ID: <20010117180409.A17897@thyrsus.com>

Guido van Rossum <guido at python.org>:
>   This makes it possible to define types with partial orderings.

Guido's time machine is working again, and seems now to have been
augmented by telepathy.  I was just thinking about bugging him about
this...

I will definitely check this out with my set() class -- it was waiting on
rich comparisons so I could do partial-orderings properly.  If it works,
we'll have set algebra for the standard library.  Coolness.
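
For illustration, subset inclusion is the classic partial ordering that rich comparisons make expressible. The sketch below uses the `frozenset` type built into later Python versions, not the set() class mentioned here (which is not shown in this message).

```python
a = frozenset([1, 2])
b = frozenset([1, 2, 3])
c = frozenset([4])

# a is a proper subset of b, so the two are ordered.
proper_subset = a < b

# a and c are unrelated: neither <, > nor == holds between them.
incomparable = (a < c, a > c, a == c)
```

A three-way cmp() has to return less, equal, or greater, which leaves no way to report "none of the above" for a pair such as `a` and `c`.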
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Under democracy one party always devotes its chief energies
to trying to prove that the other party is unfit to rule--and
both commonly succeed, and are right... The United States
has never developed an aristocracy really disinterested or an
intelligentsia really intelligent. Its history is simply a record
of vacillations between two gangs of frauds. 
	--- H. L. Mencken



From akuchlin at mems-exchange.org  Thu Jan 18 00:09:47 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Wed, 17 Jan 2001 18:09:47 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010117180947.E9384@kronos.cnri.reston.va.us>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>I have a couple of questions: what to do when setup.py doesn't work ? Is
>there a way to make it bypass a module ? What about specifying include dirs

There's a 'disabled_module_list' global in the code, but no way to set
it from the command-line yet, since I couldn't figure out how to do
that in time.

>On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>setup.py. Also, SSL support for the socket module was not enabled, though
>OpenSSL is installed, in the default path.

Can you take a look at the detection code in setup.py and see what's
going wrong?  I believe it should be found if OpenSSL is in
/usr/local/, but /usr/contrib isn't checked currently.

>The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
>which I personally like a lot, though it's probably a bitch to autodetect.
>(I tried, using autoconf ;-P)

There's code to handle Debian, though I have no way of testing it, and
it worked on Neil's Debian box for some reason.  Search for
debian_tcl_include in setup.py, and see if you can fix it.

>distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

Are you sure setup.py is up to date?  Do a 'cvs update setup.py' to check.
You might get a "setup.py is in the way; remove it" message if you
downloaded the first setup.py script manually.

>without 'make clean' anymore. You get a lot of undefined-symbol warnings
>(see below.) If you run 'make clean;make test' it also doesn't work, because
>the build directory is not in the Python library path, and regrtest.py
>requires (at least) the time module.

Again, be sure the tree is up to date; I think this stems from
attempting to compile the signal module as shared, which doesn't work.
I know that "make test" doesn't work, but am not sure how to fix it
yet.

--amk



From tim.one at home.com  Thu Jan 18 00:42:24 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 17 Jan 2001 18:42:24 -0500
Subject: [Python-Dev] Windows Python totally rad
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>

Windows Python runs normally again, modulo four test failures I figure are
due to the "get rid of assert" patch.

Note that the python20 DevStudio subproject is gone.  It's been replaced by
a new subproject named pythoncore.




From thomas at xs4all.net  Thu Jan 18 00:44:00 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 00:44:00 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117234925.A17392@xs4all.nl>; from thomas@xs4all.net on Wed, Jan 17, 2001 at 11:49:25PM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl>
Message-ID: <20010118004400.B17392@xs4all.nl>

On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:

I got around to testing on FreeBSD now, and it actually went pretty smooth!
However, some small points:

> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> setup.py. Also, SSL support for the socket module was not enabled, though
> OpenSSL is installed, in the default path.

Curiously enough, FreeBSD, with OpenSSL installed in /usr/include/openssl,
*did* get the socketmodule compiled with SSL support, but without the
necessary -I directive, so the compile failed. 

> And most importantly(!), on all these machines, 'make test' stops
> functioning. In fact, after setup.py started building, you can't run 'make'
> without 'make clean' anymore. You get a lot of undefined-symbol warnings

Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
or 'make test' after 'make' just fine. 'make test' still doesn't work
because of the incorrect library path, but it doesn't barf like the other
systems (BSDI and Debian Linux).

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Thu Jan 18 01:32:53 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 17 Jan 2001 19:32:53 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 18, 2001 at 02:08:06AM +0200
References: <14949.46995.259157.871323@beluga.mojam.com> <20010118000806.D1C04A828@darjeeling.zadka.site.co.il>
Message-ID: <20010117193253.A18565@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> I think that you're confused between two meanings of inverses.
> 
> You think:
> op is an inverse of op' if for every a,b  (a op b) = not (a op' b)
> 
> Guido meant (and I hope, implemented):
> op is an inverse of op' if for every a,b  (a op b) =  (b op' a)

I thought the same.

<pedantic role="defrocked mathematician">

if (a op1 b) <=> (b op2 a), op2 is properly described as the "reflection"
of op1, and vice-versa.

</pedantic>
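
A small sketch of the reflection rule as current Python implements it: when the left operand's `__lt__` declines by returning NotImplemented, the interpreter tries the right operand's `__gt__` with the arguments swapped. The class names are made up for the example.

```python
class Left:
    def __lt__(self, other):
        # Decline, so that the reflected operation gets a chance.
        return NotImplemented

class Right:
    def __init__(self):
        self.calls = []

    def __gt__(self, other):
        # Reached via "a < b" once Left.__lt__ has declined.
        self.calls.append("__gt__")
        return True

a, b = Left(), Right()
result = a < b
```

Note that nothing here negates a result: `a < b` is answered by `b > a`, not by `not (a >= b)`.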
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"



From greg at cosc.canterbury.ac.nz  Thu Jan 18 01:22:11 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 18 Jan 2001 13:22:11 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m3hf2y2m37.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21 at cam.ac.uk>:

> a < b if and only if b > a.
> This is what the rich comparison code does.

Someone is bound to come up with a use for comparison
operator overloading in which this isn't true, just
to be difficult!

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at python.org  Thu Jan 18 04:40:31 2001
From: guido at python.org (Guido van Rossum)
Date: Wed, 17 Jan 2001 22:40:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.106,2.107
In-Reply-To: Your message of "Wed, 17 Jan 2001 13:45:44 PST."
             <20010117134544.H7731@lyra.org> 
References: <E14J06i-0003ty-00@usw-pr-cvs1.sourceforge.net>  
            <20010117134544.H7731@lyra.org> 
Message-ID: <200101180340.WAA00655@cj20424-a.reston1.va.home.com>

> > - Change the in-progress code to use static variables instead of
> >   globals (both the nesting level and the key for the thread dict were
> >   globals but have no reason to be globals; the key can even be a
> >   function-static variable in get_inprogress_dict()).
> 
> The "compare_nesting" variable is a bit troublesome long-term -- it will
> cause threading issues in a free-threaded implementation. The solution is to
> put the value into the thread-state.
> 
> [ not sure if it matters right now, but just bringing it up ]

Good point -- especially since the in-progress-dict is already part of
the thread state.  Jeremy explained to me that the compare_nesting
variable is mostly an optimization (avoiding the work with the
in-progress-dict when we don't know for sure that it's worth it) but
yes, mixing nesting levels (even if the dicts are separate) could
cause coupling or interference between threads...
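
A rough Python analogy of the point, not the C code in object.c: a nesting counter kept in per-thread storage cannot be disturbed by comparisons running in another thread. The function names are hypothetical.

```python
import threading

_state = threading.local()   # each thread sees its own attributes

def enter_compare():
    """Increment and return this thread's nesting level."""
    _state.level = getattr(_state, "level", 0) + 1
    return _state.level

def leave_compare():
    _state.level -= 1

seen = {}

def worker(name, depth):
    level = 0
    for _ in range(depth):
        level = enter_compare()
    seen[name] = level        # deepest level this thread reached
    for _ in range(depth):
        leave_compare()

threads = [threading.Thread(target=worker, args=(name, depth))
           for name, depth in (("a", 1), ("b", 3))]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single shared counter the two workers could observe each other's increments; here each one only ever sees its own depth.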

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Thu Jan 18 05:20:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 17 Jan 2001 22:20:30 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
Message-ID: <14950.28430.572215.10643@beluga.mojam.com>

I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
repeated parameters properly.  If I call

    urllib.urlencode({"performers": ("U2","Lawrence Martin")})

instead of getting

    performers=U2&performers=Lawrence+Martin

I get a quoted stringified tuple:

    performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29

Obviously, fixing this will change the function's current semantics, but I
think it's worth treating lists and tuples (actually, any sequence) as
repeated values.  If the existing semantics are deemed valuable enough, a
third default parameter could be added to switch on the new behavior when
desired.

If others agree I'd be happy to whip up a patch.  I think it's a bug.
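
A sketch of the two behaviours side by side, written against Python 3's `urllib.parse` rather than the urllib of this era. Its `urlencode` takes an optional `doseq` flag, which postdates this message and works much like the optional-parameter idea above.

```python
from urllib.parse import urlencode

params = {"performers": ("U2", "Lawrence Martin")}

# Default: the tuple is passed through str() and then quoted.
stringified = urlencode(params)

# doseq=True: each element of the sequence becomes its own parameter.
repeated = urlencode(params, doseq=True)
# repeated == "performers=U2&performers=Lawrence+Martin"
```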

Skip



From jeremy at alum.mit.edu  Thu Jan 18 03:58:19 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Wed, 17 Jan 2001 21:58:19 -0500 (EST)
Subject: [Python-Dev] bug in grammar
Message-ID: <14950.23499.275398.963621@localhost.localdomain>

As part of the implementation of PEP 227 (and in an attempt to reach
some low-hanging fruit Guido mentioned on the types-sig long ago), I
have been working on a compiler pass that generates a module-level
symbol table.  I recently discovered a bug in the handling of list
comprehensions that was giving me headaches.

I realize now that the problem is with the current grammar and/or
compiler.  Here's a simple demonstration; try it in your friendly
python 2.0 interpreter.

>>> [i for i in range(10)] = (1, 2, 3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: unpack list of wrong size

The generated bytecode is:

          0 SET_LINENO               0

          3 SET_LINENO               1
          6 LOAD_CONST               0 (1)
          9 LOAD_CONST               1 (2)
         12 LOAD_CONST               2 (3)
         15 BUILD_TUPLE              3
         18 UNPACK_SEQUENCE          1
         21 STORE_NAME               0 (i)
         24 LOAD_CONST               3 (None)
         27 RETURN_VALUE        

I assume this isn't intended :-).  The compiler is ignoring everything
after the initial atom in the list comprehension.  It's basically
compiling the code as if it were:

[i] = (1, 2, 3)

I'm not sure how to try and fix this.  Should the grammar allow one to
construct the example statement above?  If not, I'm not sure how to
fix the grammar.  If so, I suppose the compiler should detect that
the list comp is misplaced.  This seems fairly messy, since there are
about 10 nodes between the expr_stmt and the list_for.

Or is this a cool way to use list comprehensions to generate
ValueErrors?
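
For comparison, a sketch of how a current Python 3 interpreter treats the same two statements (this says nothing about how the 2.x compiler was eventually changed): the list comprehension target is rejected at compile time, while the plain `[i]` target still compiles and fails only when unpacking.

```python
source = "[i for i in range(10)] = (1, 2, 3)"
try:
    compile(source, "<example>", "exec")
    comprehension_outcome = "compiled"
except SyntaxError:
    comprehension_outcome = "SyntaxError"

# The statement the 2.0 compiler effectively saw.
try:
    [i] = (1, 2, 3)
    unpack_outcome = "ok"
except ValueError:
    unpack_outcome = "ValueError"
```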

Jeremy



From akuchlin at mems-exchange.org  Thu Jan 18 06:19:31 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Thu, 18 Jan 2001 00:19:31 -0500
Subject: [Python-Dev] Embedded language discussion
Message-ID: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com>

http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280

The poster is on a project that's trying to use Python, but they're
encountering unspecified problems (perhaps because of the global
interpreter lock).

--amk



From mal at lemburg.com  Thu Jan 18 10:32:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 10:32:54 +0100
Subject: [Python-Dev] Windows Python totally rad
References: <LNBBLJKPBEHFEDALKOLCIEFCIJAA.tim.one@home.com>
Message-ID: <3A66B846.3D24B959@lemburg.com>

Tim Peters wrote:
> 
> Windows Python runs normally again, modulo four test failures I figure are
> due to the "get rid of assert" patch.

Could you tell me which these are ? The tests I ran all passed
just fine, so I guess these must be Windows-related problems.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Thu Jan 18 07:48:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 18 Jan 2001 07:48:41 +0100
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>
Message-ID: <008701c0811a$b3371c00$e46940d5@hagrid>

I wrote:
> I've almost sorted it all out.  will check it in later tonight (local
> time).

python build problems and real life got in the way.

will 2.1a1 be released according to plan?  will there
be a 2.1a2 release?  maybe I should postpone this?

</F>




From esr at thyrsus.com  Thu Jan 18 08:23:21 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 18 Jan 2001 02:23:21 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <20010118022321.A9021@thyrsus.com>

So I'm writing a module that needs to generate unique cookies.  The
module will run inside one of two environments: (1) a trivial test wrapper,
not threaded, and (2) a long-running multithreaded server.

Because Python garbage-collects, hash() of a just-created object isn't
good enough.  Because we may be threading, millisecond time isn't
good enough.  Because we may *not* be threading, thread ID isn't good
either.  

On the other hand, I'm on Linux getting millisecond time resolution.
And it's not hard to notice that an object hash is a memory address.

So, how about `time.time()` + hex(hash([]))?

It looks to me like this will remain unique forever, because another thread
would have to create an object at the same memory address during the same
millisecond to collide.

Furthermore, it looks to me like this hack might be portable to any OS
with a clock tick shorter than its timeslice.

Comments?
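
A minimal alternative sketch, assuming cookies only need to be unique within one process: pair the timestamp with a lock-protected serial number instead of an object address. This swaps in a different technique from the one proposed above, and `make_cookie` is a hypothetical name. (In current Python, `hash([])` raises TypeError because lists are unhashable; `id([])` is the call that returns an address-based value in CPython.)

```python
import itertools
import threading
import time

_serial = itertools.count()
_lock = threading.Lock()

def make_cookie():
    """Return a string that is unique for the life of this process."""
    with _lock:
        serial = next(_serial)    # never repeats, threaded or not
    # The timestamp is only there to tell separate runs apart.
    return "%r-%x" % (time.time(), serial)

cookies = [make_cookie() for _ in range(1000)]
```

Uniqueness here rests on the counter alone, so it does not depend on clock resolution or on where objects happen to be allocated.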
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Good intentions will always be pleaded for every assumption of
authority. It is hardly too strong to say that the Constitution was
made to guard the people against the dangers of good intentions. There
are men in all ages who mean to govern well, but they mean to
govern. They promise to be good masters, but they mean to be masters.
	-- Daniel Webster



From ping at lfw.org  Thu Jan 18 10:29:13 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 01:29:13 -0800 (PST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <200101161402.JAA05045@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>

On Tue, 16 Jan 2001, Guido van Rossum wrote:
> You mean the tp_print and tp_str function slots in type objects,
> right?  tp_print *should* always render exactly the same as tp_str.
> tp_print is used by the print statement, not by value display at the
> interactive prompt.

Uh, i hate to disagree with you about your own interpreter, but:

    com_expr_stmt in Python/compile.c
        inserts a PRINT_EXPR opcode if c_interactive is true;
    eval_code2 in Python/ceval.c
        handles PRINT_EXPR by calling displayhook;
    sys_displayhook in Python/sysmodule.c
        prints the object by calling PyFile_WriteObject on sys.stdout;
    PyFile_WriteObject in Objects/fileobject.c
        calls PyObject_Print if the file is really a PyFileObject;
    PyObject_Print in Objects/object.c
        calls op->ob_type->tp_print if it's not NULL.

The print statement produces a PRINT_ITEM opcode, which invokes
PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
flag is propagated down to PyObject_Print and into string_print,
where it causes the string to fwrite itself directly without quoting.
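
The visible effect of those two paths can be sketched in current Python 3, where `print` is a function and the original display hook is available as `sys.__displayhook__`. This illustrates only the quoted versus raw output, not the tp_print slot itself.

```python
import contextlib
import io
import sys

value = "a\nb"

# What the interactive prompt does with an expression's value.
shown = io.StringIO()
with contextlib.redirect_stdout(shown):
    sys.__displayhook__(value)

# What printing the same value does.
printed = io.StringIO()
with contextlib.redirect_stdout(printed):
    print(value)

shown_text = shown.getvalue()      # quoted, newline shown as an escape
printed_text = printed.getvalue()  # raw characters
```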

> So, string_print most definitely should *not* be changed -- only
> string_repr!

I had to change them both before i actually saw the change in the
interactive interpreter.  Actually, your statement above (that the
two should always render the same) seems to imply that if i change
one, i must also change the other.


-- ?!ng




From sjoerd at oratrix.nl  Thu Jan 18 11:11:09 2001
From: sjoerd at oratrix.nl (Sjoerd Mullender)
Date: Thu, 18 Jan 2001 11:11:09 +0100
Subject: [Python-Dev] distutils in Python 2.1 not ready for prime time
Message-ID: <20010118101110.6D29C31E1B8@bireme.oratrix.nl>

I just updated my copy of python with the current CVS version and I am
not happy.

The current version uses distutils for configuring and compiling most
modules that are written in C.  That is a nice idea in theory, but in
practice it's not ready for prime time yet.  The major advantage of
using a Setup file is that you can add your own -I and -L compiler
flags on a module-by-module basis.  I *need* those flags since not all
libraries and include files are in standard places (e.g. I need
-I/usr/local/include and -L/usr/local/lib for some modules which my
compiler doesn't provide by itself).  There seems to be no way to tell
distutils to supply those flags.  The documentation (only on the web
site, also not great, but I assume more documentation (at least an
up-to-date README) will be provided in the final release) says that
that has not yet been implemented.

-- Sjoerd Mullender <sjoerd.mullender at oratrix.com>



From ping at lfw.org  Thu Jan 18 11:14:19 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:14:19 -0800 (PST)
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <3A66BCCC.14997FE3@lemburg.com>
Message-ID: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>

I hope you don't mind that i'm taking this over to python-dev,
because it led me to discover a more general issue (see below).

For the others on python-dev, here's the background: MAL was
about to check in the unistr() function, described as follows:

> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.
> 
> The patch also adds a new object level C API PyObject_Unicode()
> which complements PyObject_Str().

I responded:
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

MAL responded:
> unistr() is meant to complement str() very closely. unicode()
> works as constructor for Unicode objects which can also take
> care of decoding encoded data. str() and unistr() don't provide
> this capability but instead always assume the default encoding.
> 
> There's also a subtle difference in that str() and unistr() 
> try the tp_str slot which unicode() doesn't. unicode()
> supports any character buffer which str() and unistr() don't.

Okay, given this explanation, i still feel fairly confident
that unicode() should subsume unistr().  Many of the other
type-named functions try various slots:

    int() looks for __int__
    float() looks for __float__
    long() looks for __long__
    str() looks for __str__

In testing this i also discovered the following:

    >>> class Foo:
    ...     def __int__(self):
    ...         return 3
    ... 
    >>> f = Foo()
    >>> int(f)
    3
    >>> long(f) 
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__long__'
    >>> float(f)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: Foo instance has no attribute '__float__'

This is kind of surprising.  How about:

    int() looks for __int__
    float() looks for __float__, then tries __int__
    long() looks for __long__, then tries __int__
    str() looks for __str__
    unicode() looks for __unicode__, then tries __str__
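
The fallback order proposed above can be sketched in a few lines.  This is
only an illustration, assuming a present-day Python 3 interpreter; the
helper names first_hook and proposed_float are made up for the example and
are not part of any patch.

```python
def first_hook(obj, names):
    """Call the first special method in `names` that obj's type defines."""
    for name in names:
        hook = getattr(type(obj), name, None)
        if hook is not None:
            return hook(obj)
    raise TypeError("no conversion method among: " + ", ".join(names))


def proposed_float(obj):
    # Proposed order: look for __float__ first, then fall back to __int__.
    return float(first_hook(obj, ("__float__", "__int__")))


class Foo:
    def __int__(self):
        return 3


value = proposed_float(Foo())   # uses __int__ because Foo has no __float__
```

An object defining neither method still fails, but with a TypeError that
names the methods that were tried.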

The extra parameter to unicode() is very similar to the extra
parameter to int(), so i think there is a natural parallel here.

Hmm... what about the other types?

Wow!!  __complex__ can produce a segfault!

    >>> complex
    <built-in function complex>
    >>> class Foo:
    ...   def __complex__(self): return 3
    ... 
    >>> Foo()
    <__main__.Foo instance at 0x81e8684>
    >>> f = _
    >>> complex(f)
    Segmentation fault (core dumped)

This happens because builtin_complex first retrieves and saves
the PyNumberMethods of the argument (in this case, from the
instance), then tries to call __complex__ (in this case, returning 3),
and THEN coerces the result using nbr->nb_float if the result is
not complex!  (This calls the instance's nb_float method on the
integer object 3!!)
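
The safe ordering is to coerce the *result* using the result's own
conversion, never a method table saved from the original argument.  A
rough sketch of that ordering, assuming a present-day Python 3 interpreter
(safe_complex is a hypothetical name, not the real builtin_complex):

```python
def safe_complex(obj):
    # Step 1: ask the argument's type for __complex__, if it has one.
    hook = getattr(type(obj), "__complex__", None)
    result = hook(obj) if hook is not None else obj

    # Step 2: a complex result needs no further work.
    if isinstance(result, complex):
        return result

    # Step 3: coerce via the result's *own* float conversion.  Nothing
    # looked up on the original argument is reused at this point.
    return complex(float(result))


class Foo:
    def __complex__(self):
        return 3


z = safe_complex(Foo())
```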

I think complex() should probably look for __complex__, then
__float__, then __int__.

One could argue for __list__, __tuple__, or __dict__, but that
seems much weaker; the Pythonic way has always been to implement
__getitem__ instead.  There is no built-in dict(); if it existed
i suppose it would do the opposite of x.items(); again a weak
argument, though i might have found such a function useful once
or twice.

And that about covers the built-in types for data.


-- ?!ng




From ping at lfw.org  Thu Jan 18 11:16:42 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Thu, 18 Jan 2001 02:16:42 -0800 (PST)
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>

On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
>     str() looks for __str__

Oops.  I forgot that

      str() looks for __str__, then tries __repr__

So, presumably,

      unicode() should look for __unicode__, then __str__, then __repr__


-- ?!ng




From mal at lemburg.com  Thu Jan 18 11:51:46 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 11:51:46 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A66CAC2.74FC894@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> >     str() looks for __str__
> 
> Oops.  I forgot that
> 
>       str() looks for __str__, then tries __repr__
> 
> So, presumably,
> 
>       unicode() should look for __unicode__, then __str__, then __repr__

Not quite... str() does this:

1. strings are passed back as-is
2. the type slot tp_str is tried
3. the method __str__ is tried
4. Unicode returns are converted to strings
5. anything other than a string return value is rejected

unistr() does the same, but makes sure that the return
value is a Unicode object.

unicode() does the following:

1. for instances, __str__ is called
2. Unicode objects are returned as-is
3. string objects or character buffers are used as basis for decoding
4. decoding is applied to the character buffer and the results
   are returned

I think we should perhaps merge the two approaches into one
which then applies all of the above in unicode() (and then
forget about unistr()). This might hide some type errors,
but since all other generic constructors behave more or less
in the same way, I think unicode() should too.

Thoughts ?
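
A merged converter along these lines might look roughly like the sketch
below.  It is written for a present-day Python 3 interpreter, which is an
assumption: there, str plays the role of the Unicode object and bytes the
role of the 8-bit string.  The name to_text and the __unicode__ hook are
hypothetical and only illustrate the order of the steps.

```python
def to_text(obj, encoding="ascii"):
    # Text objects are returned as-is.
    if isinstance(obj, str):
        return obj

    # Encoded data (byte strings and other buffers) is decoded.
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding)

    # A hypothetical __unicode__ hook is tried next ...
    hook = getattr(type(obj), "__unicode__", None)
    if hook is not None:
        return hook(obj)

    # ... and everything else goes through the ordinary string conversion.
    return str(obj)


class Greeting:
    def __str__(self):
        return "hello"


examples = [to_text("abc"), to_text(b"abc"), to_text(42), to_text(Greeting())]
```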

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From martin at mira.cs.tu-berlin.de  Thu Jan 18 11:48:30 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:48:30 +0100
Subject: [Python-Dev] Having extensions builtin
Message-ID: <200101181048.f0IAmU210251@mira.informatik.hu-berlin.de>

With the new distutils configuration scheme, it appears to be
difficult to build modules in a non-shared way. Building modules
non-shared is desirable when freezing is attempted, and also to reduce
the startup time and memory consumption.

It is still possible to add modules to Setup or Setup.local, so that
they will be built into the interpreter. However, setup.py will still
build them in a shared way afterwards. I propose that setup.py builds
only those modules that are not builtin.

Regards,
Martin




From martin at mira.cs.tu-berlin.de  Thu Jan 18 13:20:06 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:20:06 +0100
Subject: [Python-Dev] Standard install locations for Python ?
Message-ID: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>

> Where should Python extensions install themselves and their docs?

I feel that extensions should not need to care. For extensions,
distutils will pick a location, and the system administrator
configuring the package can choose a different location.

Unfortunately, distutils does not support the installation of
documentation, which I think it should.

Now switching sides, as an administrator, I'd wish distutils to follow
the system conventions by default. 

That means on Linux, documentation should go into the system's <doc>
directory, which is /usr/share/doc according to latest
standards. Distributions vary, so distutils should find out - e.g. by
querying the location from rpm. In addition, when building RPMs,
distutils should declare these files as %doc in the spec file, so RPM
will install it following the system conventions.

On Windows, the convention apparently is to put the documentation
"nearby" the software, so it should probably go into Doc or a
subdirectory thereof.

On Unix, there appears to be no standard location, unless the
documentation consists of man pages or perhaps info files. So
<prefix>/share/doc is probably a place as good as any other.

Regards,
Martin



From martin at mira.cs.tu-berlin.de  Thu Jan 18 11:39:30 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 11:39:30 +0100
Subject: [Python-Dev] SSL detection problem
Message-ID: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>

The distutils-based configuration fails to build on my system (SuSE
7.0) with the error

/usr/src/python/Modules/socketmodule.c:159: rsa.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:160: crypto.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:161: x509.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:162: pem.h: No such file or directory
/usr/src/python/Modules/socketmodule.c:163: ssl.h: No such file or directory

The problem is that these header files are in /usr/include/openssl,
which is not in the standard include search path.

So the obvious request is: could this be fixed? I guess when setup.py
finds the openssl library, it should also try to find ssl.h, in some
obvious locations.

The not-so-obvious question: How can one work-around such a problem
with the new setup scheme? In the old scheme, I could have chosen to
either provide the right -I option in Modules/Setup, to disable SSL
support, or to disable the _socket module altogether. How can I
achieve either configuration with the new scheme?

Regards,
Martin

P.S. As a quick hack, I added a custom include_dirs parameter to the
SSL extension.



From martin at mira.cs.tu-berlin.de  Thu Jan 18 13:39:54 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 13:39:54 +0100
Subject: [Python-Dev] bug in grammar
Message-ID: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>

> Should the grammar allow one to construct the example statement
> above?

It should not. Please note that the grammar allows a number of other
things, e.g.

  a+b = c

(pass this to parser.suite to see details)

> If not, I'm not sure how to fix the grammar.

The central problem is that it allows testlist on the LHS of an
augassign or '=', whereas the language only allows a small subset in
that position. It is not possible to restrict the grammar in itself,
as that will necessarily produce a conflict - you only know that the
'+' was incorrect when you see the '='.

> I suppose the compiler should detect that the list comp is misplaced

I think there should be a well-formedness pass in-between. I.e. after
the AST has been built, a single pass should descend through the tree,
looking for an expr_statement with more than a single testlist. Once
it finds one, it should confirm that this really is a well-formed
lvalue (in C speak). In this case, the test should be that each term
is an atom without factors.

If the parser itself performs such checks, the compiler could be
simplified in many places, I guess.
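
Such a pass can be sketched with the ast module of a present-day Python 3
interpreter (an assumption; the tree described above is the parser's
concrete tree, which looks different).  The names is_lvalue and
bad_targets are made up for the example.  Since a modern parser already
rejects "a+b = c", the malformed tree is built by hand.

```python
import ast

# Node types that may stand on their own as an assignment target.
SIMPLE_TARGETS = (ast.Name, ast.Attribute, ast.Subscript)


def is_lvalue(node):
    """Return True if node is a well-formed assignment target."""
    if isinstance(node, SIMPLE_TARGETS):
        return True
    if isinstance(node, ast.Starred):
        return is_lvalue(node.value)
    if isinstance(node, (ast.Tuple, ast.List)):
        return all(is_lvalue(element) for element in node.elts)
    return False   # BinOp, Call, ListComp, constants, ...


def bad_targets(tree):
    """Walk the whole tree once and collect malformed targets."""
    bad = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign):
            bad.extend(t for t in node.targets if not is_lvalue(t))
        elif isinstance(node, ast.AugAssign):
            if not isinstance(node.target, SIMPLE_TARGETS):
                bad.append(node.target)
    return bad


def name(identifier):
    return ast.Name(id=identifier, ctx=ast.Load())


# Hand-built equivalent of "a+b = c".
malformed = ast.Module(
    body=[ast.Assign(
        targets=[ast.BinOp(left=name("a"), op=ast.Add(), right=name("b"))],
        value=name("c"))],
    type_ignores=[])

well_formed = ast.parse("a, [b, c.d] = x\ne[0] += 1")
```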

Regards,
Martin



From thomas at xs4all.net  Thu Jan 18 10:53:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 10:53:14 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010117180947.E9384@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Wed, Jan 17, 2001 at 06:09:47PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010117180947.E9384@kronos.cnri.reston.va.us>
Message-ID: <20010118105314.D17392@xs4all.nl>

On Wed, Jan 17, 2001 at 06:09:47PM -0500, Andrew Kuchling wrote:

> >On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >setup.py. Also, SSL support for the socket module was not enabled, though
> >OpenSSL is installed, in the default path.
> 
> Can you take a look at the detection code in setup.py and see what's
> going wrong.  I believe it should be found if OpenSSL is in
> /usr/local/, but /usr/contrib isn't checked currently.

Well, OpenSSL rests in the default location, which is
/usr/local/ssl/include/openssl. Haven't the time to look into it right now,
sorry.

> >The Tcl/Tk header files are stored in /usr/include/tcl<ver>/ on Debian,
> >which I personally like a lot, though it's probably a bitch to autodetect.
> >(I tried, using autoconf ;-P)

> There's code to handle Debian, though I have no way of testing it, and
> it worked on Neil's Debian box for some reason.  Search for
> debian_tcl_include in setup.py, and see if you can fix it.

Ah, yes. The problem in my case is that the *library* files are just in
/usr/lib, but the include files are not. I re-indented the code to pull the
debian-specific code out of the 'if prefix + os.sep + 'lib' not in
lib_dirs' block, and it works now. Haven't tested it on other code yet, but
I think it should work regardless.

> >distutils.errors.DistutilsPlatformError: invalid Python installation: unable to open /usr/lib/python2.1/config/Makefile (No such file or directory)

> Are you sure setup.py is up to date; do a 'cvs update setup.py' to check.  
> You might get a "setup.py is in the way; remove it' message if you 
> downloaded the first setup.py script manually.

D'oh, I guess not. I thought I did (I did on all other platforms :) but I
guess I didn't, 'cause it works now. Thanx.

> >without 'make clean' anymore. You get a lot of undefined-symbol warnings
> >(see below.) If you run 'make clean;make test' it also doesn't work, because
> >the build directory is not in the Python library path, and regrtest.py
> >requires (at least) the time module.

> Again, be sure the tree is up to date; I think this stems from
> attempting to compile the signal module as shared, which doesn't work.

This happened even with completely fresh, newly checked out trees, on all
but FreeBSD (three different trees: Debian woody, BSDI 4.0 and BSDI 4.1) so
I'm pretty sure that's not it.

It works now, though, so I guess the move from a dynamic signalmodule to a
static one does the trick ;) I got 'make test' working by applying the
following patch to Makefile{,.in}, and running 'make PYTHONPATH=.:<builddir>
test' (determining builddir by hand, for now.):

***************
*** 216,223 ****
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH= $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall
--- 216,223 ----
  TESTPYTHON=   ./python$(EXE) -tt
  test:         all
                -rm -f $(srcdir)/Lib/test/*.py[co]
!               -PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
!               PYTHONPATH=$(PYTHONPATH) $(TESTPYTHON) $(TESTPROG) $(TESTOPTS)
  
  # Install everything
  install:      altinstall bininstall maninstall

And because of that, I also noticed something funny: BSDI calls itself
'BSD/OS <version>', so distutils actually makes a directory called 'lib.bsd'
and 'temp.bsd', with inside those a directory 'os-<version>-i386-2.1'. Is
that a distutils bug, a setup.py bug, or intentional behaviour of one of the
two ?


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Thu Jan 18 08:59:22 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 17 Jan 2001 23:59:22 -0800
Subject: [Python-Dev] new Makefile.in
Message-ID: <20010117235922.A12356@glacier.fnational.com>

Spurred on by comments made by Andrew, I spent some time last
night overhauling the Python Makefiles.  I now have a toplevel
non-recursive Makefile.in that seems to work fairly well.  I'm
pretty sure it still should be portable.  It doesn't use includes
or any special GNU make features.  It is half the size of the old
Makefiles.  The build is faster and it's now easier to follow if
something goes wrong.

A question: is it possible to break the Python static library up?
For example, instead of having libpython<version>.a have
Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
would still only be one shared library.  This would speed up
incremental builds and also help Andrew with PEP 229.  I'm
thinking that the Makefile could do something like this:

    all: python$(EXE)

    PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a

    python$(EXE): $(PYLIBS)
        $(LINKCC) -o python$(EXE) $(PYLIBS) ...

    Modules/modules.a: minpython$(EXE)
        ./minpython$(EXE) setup.py


AFAICT, the only thing affected by splitting up the static library
is Misc/Makefile.pre.in.  Is this correct?

  Neil



From guido at digicool.com  Thu Jan 18 15:52:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 09:52:23 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:22:11 +1300."
             <200101180022.NAA00898@s454.cosc.canterbury.ac.nz> 
References: <200101180022.NAA00898@s454.cosc.canterbury.ac.nz> 
Message-ID: <200101181452.JAA06899@cj20424-a.reston1.va.home.com>

> > a < b if and only if b > a.
> > This is what the rich comparison code does.
> 
> Someone is bound to come up with a use for comparison
> operator overloading in which this isn't true, just
> to be difficult!

They'll get what they deserve -- this will be clearly documented!
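
This rule is what lets one operator stand in for its mirror image.  A small
sketch, assuming a present-day Python 3 interpreter and a hypothetical
Version class that defines only __gt__:

```python
class Version:
    def __init__(self, number):
        self.number = number

    # Only "greater than" is defined by this class.
    def __gt__(self, other):
        return self.number > other.number


older, newer = Version(1), Version(2)

# Version has no __lt__ of its own, so "older < newer" is answered by
# the reflected call newer.__gt__(older).
reflected = older < newer
```

A class for which a < b and b > a disagree would get inconsistent answers
from this fallback, which is why the rule needs to be documented.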

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Jan 18 16:15:25 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 10:15:25 -0500 (EST)
Subject: [Python-Dev] Re: bug in grammar
In-Reply-To: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de>
Message-ID: <14951.2189.14393.52725@localhost.localdomain>

If I summarize your suggestion, I think you've said that ideally the
grammar should not allow assignment to list comprehensions (or a
variety of other constructs) -- but it doesn't so the compiler has to
deal with it.

This morning it seemed a lot easier to fix the bug than it did last
night :-).  com_assign() already has a number of checks for syntax
errors in assignments.  A test for list comprehensions belongs at the
same place as tests for assignment to [] and augmented assignments
applied to lists.

I'll include a fix for assignment to list comprehensions in my big
compiler patch.

Jeremy




From akuchlin at mems-exchange.org  Thu Jan 18 16:28:19 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:28:19 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>; from esr@thyrsus.com on Thu, Jan 18, 2001 at 02:23:21AM -0500
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <20010118102819.A21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 02:23:21AM -0500, Eric S. Raymond wrote:
>And it's not hard to notice that an object hash is a memory address.

Unless the object defines __hash__()!  If you want the memory address, 
use id() instead.
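
A short illustration, assuming a present-day Python 3 interpreter and a
hypothetical Point class: two distinct objects can share a hash once
__hash__ is defined, while id() still tells them apart.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    # The hash is derived from the value, not from the memory address.
    def __hash__(self):
        return hash((self.x, self.y))


a = Point(1, 2)
b = Point(1, 2)

same_hash = hash(a) == hash(b)       # equal values, equal hashes
same_identity = id(a) == id(b)       # two live objects, two identities
```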

--amk



From akuchlin at mems-exchange.org  Thu Jan 18 16:30:36 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 10:30:36 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118004400.B17392@xs4all.nl>; from thomas@xs4all.net on Thu, Jan 18, 2001 at 12:44:00AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl>
Message-ID: <20010118103036.B21503@kronos.cnri.reston.va.us>

>On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
>> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
>> setup.py. Also, SSL support for the socket module was not enabled, though
>> OpenSSL is installed, in the default path.

What does the layout of /usr/contrib look like?  Is it
/usr/contrib/openssl/include/, /usr/contrib/include/, or something
else?

>Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
>or 'make test' after 'make' just fine. 'make test' still doesn't work
>because of the incorrect library path, but it doesn't barf like the other
>systems (BSDI and Debian Linux)

Have you already run "make install"?  Perhaps it's picking up the
already-installed modules when running "make test", because it really
shouldn't be working.

--amk




From gward at cnri.reston.va.us  Thu Jan 18 16:42:51 2001
From: gward at cnri.reston.va.us (Greg Ward)
Date: Thu, 18 Jan 2001 10:42:51 -0500
Subject: [Python-Dev] Where's Greg Ward ?
In-Reply-To: <3A6237D7.673BBB30@lemburg.com>; from mal@lemburg.com on Mon, Jan 15, 2001 at 12:35:51AM +0100
References: <3A6237D7.673BBB30@lemburg.com>
Message-ID: <20010118104250.A27049@thrak.cnri.reston.va.us>

On 15 January 2001, M.-A. Lemburg said:
> He seems to be offline and the people on the distutils list have some
> patches and other things which would be nice to have in distutils 
> for 2.1.

Tim was right -- I'm *really* close to being back online.  Just have to
figure out why qmail's not answering port 25 and why LILO doesn't like my
newly repartitioned hard drive, and all will be well.  Oh yeah, and getting
insurance, and a credit card, and unpacking all these cardboard boxes, and
getting some furniture, ...

(If anyone is considering it, I do *not* recommend buying a new computer,
moving internationally, and getting a high speed home Internet connection
all at the same time.)

BTW I quite approve of Andrew being temporary Distutils dictator.  Should
have done it in December, but I didn't think I'd be out of commission for so
long.  Sigh.

        Greg



From moshez at zadka.site.co.il  Fri Jan 19 01:19:45 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 02:19:45 +0200 (IST)
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org>
Message-ID: <20010119001945.80DC8A83E@darjeeling.zadka.site.co.il>

On Thu, 18 Jan 2001 01:29:13 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:
> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.
> 
> 
> -- ?!ng
> 
> 
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Thu Jan 18 17:23:19 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:23:19 -0500
Subject: [Python-Dev] unistr() vs. unicode()
Message-ID: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>

Ping wrote in response to a SourceForge mail about MAL's unistr()
checkin:

------- Forwarded Message

Date:    Wed, 17 Jan 2001 23:51:48 -0800
From:    Ka-Ping Yee <ping at lfw.org>
To:      noreply at sourceforge.net
cc:      mal at lemburg.com, guido at python.org, patches at python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin +
	  PyObject_Unicode() C API

On Wed, 17 Jan 2001 noreply at sourceforge.net wrote:
> Comment:
> This patch adds a utility function unistr() which works just like
> the standard builtin str()  -- only that the return value will
> always be a Unicode object.

Sorry for barging in, but i have an issue/question:

Why are unistr() and unicode() two separate functions?

str() performs one task: convert to string.  It can convert anything,
including strings or Unicode strings, numbers, instances, etc.

The other type-named functions e.g. int(), long(), float(), list(),
tuple() are similar in intent.

Why have unicode() just for converting strings to Unicode strings,
and unistr() for converting everything else to a Unicode string?
What does unistr(x) do differently from unicode(x) if x is a string?


- -- ?!ng

------- End of Forwarded Message

(And no, Tim, this did *not* end up in the patches list because I made
Barry remove the reply-to.  SourceForge mails never had reply-to to
begin with.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 17:28:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:28:12 -0500
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: Your message of "Wed, 17 Jan 2001 22:20:30 CST."
             <14950.28430.572215.10643@beluga.mojam.com> 
References: <14950.28430.572215.10643@beluga.mojam.com> 
Message-ID: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>

> I'm pretty sure this has come up before, but urllib.urlencode doesn't handle
> repeated parameters properly.  If I call
> 
>     urllib.urlencode({"performers": ("U2","Lawrence Martin")})
> 
> instead of getting
> 
>     performers=U2&performers=Lawrence+Martin
> 
> I get a quoted stringified tuple:
> 
>     performers=%28%27U2%27%2c+%27Lawrence+Martin%27%29
> 
> Obviously, fixing this will change the function's current semantics, but I
> think it's worth treating lists and tuples (actually, any sequence) as
> repeated values.  If the existing semantics are deemed valuable enough, a
> third default parameter could be added to switch on the new behavior when
> desired.
> 
> If others agree I'd be happy to whip up a patch.  I think it's a bug.

Agreed.  If you can come up with something that supports all sequence
types, and treats singleton sequences the same as their one and only
item, it would even be the inverse of cgi.parse_qs()!
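
One possible shape for such a function, as a sketch only: it is written
for a present-day Python 3 interpreter (where quote_plus and parse_qs live
in urllib.parse), handles lists and tuples rather than every sequence
type, and uses the made-up name urlencode_repeated.

```python
from urllib.parse import parse_qs, quote_plus


def urlencode_repeated(query):
    """Encode a dict, emitting one key=value pair per item of a sequence."""
    pairs = []
    for key, value in query.items():
        # A bare value is treated like a one-item sequence.
        values = value if isinstance(value, (list, tuple)) else [value]
        for item in values:
            pairs.append(quote_plus(str(key)) + "=" + quote_plus(str(item)))
    return "&".join(pairs)


encoded = urlencode_repeated({"performers": ("U2", "Lawrence Martin")})
decoded = parse_qs(encoded)   # round trip back to a dict of lists
```

Because a bare value and a singleton sequence encode identically, the
result round-trips through the query-string parser.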

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Thu Jan 18 17:43:49 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 18 Jan 2001 17:43:49 +0100
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Thu, Jan 18, 2001 at 02:14:19AM -0800
References: <3A66BCCC.14997FE3@lemburg.com> <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010118174349.E17392@xs4all.nl>

On Thu, Jan 18, 2001 at 02:14:19AM -0800, Ka-Ping Yee wrote:

> Wow!!  __complex__ can produce a segfault!

>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)

> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

I've noticed that lurking bug in the coercion code when I added augmented
assignment, though I don't recall whether I fixed it then, nor do I know if
that part's been "touched" by the recent coercion changes. If none of the
coercion champions speak up, I'll look at this sometime this weekend.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From akuchlin at mems-exchange.org  Thu Jan 18 17:50:28 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 11:50:28 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 11:39:30AM +0100
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de>
Message-ID: <20010118115028.D21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
>The problem is that these header files are in /usr/include/openssl,
>which is not in the standard include search path.

I have an improved version of setup.py (not checked in yet) that tries
to do better, checking for both header and library files.  One point:
the OpenSSL docs imply that the headers should be loaded as
<openssl/rsa.h>, not as <rsa.h>; the header files themselves use the
openssl/*.h form, which means you'd need two -I directives.  I'll
patch the socket module accordingly.

>The not-so-obvious question: How can one work-around such a problem
>with the new setup scheme? In the old scheme, I could have chosen to
>either provide the right -I option in Modules/Setup, to disable SSL
>support, or to disable the _socket module altogether. How can I
>achieve either configuration with the new scheme?

I still need to implement command-line options to specify such
overrides, but that couldn't possibly get done in time for alpha1.  I
was thinking of something like --<modulename>-libs="foo bar",
--<modulename>-includes="/usr/include/blah/", and so forth.
Suggestions for a better interface welcomed...

--amk



From guido at digicool.com  Thu Jan 18 17:55:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 11:55:39 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Wed, 17 Jan 2001 21:58:19 EST."
             <14950.23499.275398.963621@localhost.localdomain> 
References: <14950.23499.275398.963621@localhost.localdomain> 
Message-ID: <200101181655.LAA08001@cj20424-a.reston1.va.home.com>

> As part of the implementation of PEP 227 (and in an attempt to reach
> some low-hanging fruit Guido mentioned on the types-sig long ago), I
> have been working on a compiler pass that generates a module-level
> symbol table.  I recently discovered a bug in the handling of list
> comprehensions that was giving me headaches.
> 
> I realize now that the problem is with the current grammar and/or
> compiler.  Here's a simple demonstration; try it in your friendly
> python 2.0 interpreter.
> 
> >>> [i for i in range(10)] = (1, 2, 3)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: unpack list of wrong size
> 
> The generated bytecode is:
> 
>           0 SET_LINENO               0
> 
>           3 SET_LINENO               1
>           6 LOAD_CONST               0 (1)
>           9 LOAD_CONST               1 (2)
>          12 LOAD_CONST               2 (3)
>          15 BUILD_TUPLE              3
>          18 UNPACK_SEQUENCE          1
>          21 STORE_NAME               0 (i)
>          24 LOAD_CONST               3 (None)
>          27 RETURN_VALUE        
> 
> I assume this isn't intended :-).  The compiler is ignoring everything
> after the initial atom in the list comprehension.  It's basically
> compiling the code as if it were:
> 
> [i] = (1, 2, 3)
> 
> I'm not sure how to try and fix this.  Should the grammar allow one to
> construct the example statement above?  If not, I'm not sure how to
> fix the grammar.  If so, I suppose the compiler should detect that
> the list comp is misplaced.  This seems fairly messy, since there are
> about 10 nodes between the expr_stmt and the list_for.
> 
> Or is this a cool way to use list comprehensions to generate
> ValueErrors?

Good catch!  Not everything cool deserves to be preserved.

It looks like this happens because the code that traverses lists on
the left-hand side of an assignment was never told about list
comprehensions.  You're right that the grammar can't be fixed; it's
for the same reason that it can't be fixed to disallow "f() = 1".

The solution is to add a test for this to the compiler that flags this
as an error.
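
A minimal sketch of what such a check looks like from the outside,
assuming a current Python 3 interpreter (where assignments like these
are rejected before any bytecode runs).  The helper name is_rejected is
made up for illustration and is not part of the thread:

```python
def is_rejected(source):
    """Return True if Python refuses to compile `source`."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError:
        return True
    return False

# Both targets look like ordinary expressions on the left-hand side,
# so they have to be flagged explicitly as invalid assignment targets.
print(is_rejected("[i for i in range(10)] = (1, 2, 3)"))
print(is_rejected("f() = 1"))

# A plain list of names is still a legal unpacking target.
print(is_rejected("[i] = (1,)"))
```

The first two calls report a rejection and the last one does not, which
is the behaviour the proposed compiler test is meant to produce.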

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 18:01:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:01:02 -0500
Subject: [Python-Dev] Embedded language discussion
In-Reply-To: Your message of "Thu, 18 Jan 2001 00:19:31 EST."
             <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com> 
References: <200101180519.AAA00612@207-172-111-227.s227.tnt1.ann.va.dialup.rcn.com> 
Message-ID: <200101181701.MAA08046@cj20424-a.reston1.va.home.com>

> http://www.kuro5hin.org/?op=displaystory;sid=2001/1/16/11334/2280
> 
> The poster is on a project that's trying to use Python, but they're
> encountering unspecified problems (perhaps because of the global
> interpreter lock).

I've sent the poster an email asking to be more specific about his
questions; probably doing the right dance when calling Python from a
thread created in C++ should do the trick.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Thu Jan 18 18:04:43 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 12:04:43 -0500
Subject: [Python-Dev] Strings: '\012' -> '\n'
In-Reply-To: Your message of "Thu, 18 Jan 2001 01:29:13 PST."
             <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101161640540.4389-100000@skuld.kingmanhall.org> 
Message-ID: <200101181704.MAA08074@cj20424-a.reston1.va.home.com>

> On Tue, 16 Jan 2001, Guido van Rossum wrote:
> > You mean the tp_print and tp_str function slots in type objects,
> > right?  tp_print *should* always render exactly the same as tp_str.
> > tp_print is used by the print statement, not by value display at the
> > interactive prompt.
> 
> Uh, i hate to disagree with you about your own interpreter, but:
> 
>     com_expr_stmt in Python/compile.c
>         inserts a PRINT_EXPR opcode if c_interactive is true;
>     eval_code2 in Python/ceval.c
>         handles PRINT_EXPR by calling displayhook;
>     sys_displayhook in Python/sysmodule.c
>         prints the object by calling PyFile_WriteObject on sys.stdout;
>     PyFile_WriteObject in Objects/fileobject.c
>         calls PyObject_Print if the file is really a PyFileObject;
>     PyObject_Print in Objects/object.c
>         calls op->ob_type->tp_print if it's not NULL.
> 
> The print statement produces a PRINT_ITEM opcode, which invokes
> PyFile_WriteObject with a Py_PRINT_RAW flag.  That Py_PRINT_RAW
> flag is propagated down to PyObject_Print and into string_print,
> where it causes the string to fwrite itself directly without quoting.
> 
> > So, string_print most definitely should *not* be changed -- only
> > string_repr!
> 
> I had to change them both before i actually saw the change in the
> interactive interpreter.  Actually, your statement above (that the
> two should always render the same) seems to imply that if i change
> one, i must also change the other.

Oops.  I'm so grateful that we have a collective memory! :-)

You're right: tp_print() can be invoked in two modes: with or without
Py_PRINT_RAW flag.  In raw mode, it should behave exactly like str();
in cooked mode exactly like repr().
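
The two modes can be mimicked at the Python level.  This is only an
illustration of the str()/repr() split described above, not the C code
in Objects/object.c; the function name render and its raw argument are
hypothetical:

```python
def render(obj, raw):
    """Mimic the two tp_print modes: raw -> str(), cooked -> repr()."""
    return str(obj) if raw else repr(obj)

s = "a\nb"
print(render(s, raw=True))    # raw mode: the newline is written as-is
print(render(s, raw=False))   # cooked mode: quoted, newline shown as \n
```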

--Guido van Rossum (home page: http://www.python.org/~guido/)



From martin at mira.cs.tu-berlin.de  Thu Jan 18 20:31:29 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:31:29 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
Message-ID: <200101181931.f0IJVTc00932@mira.informatik.hu-berlin.de>

> Comments?

Yes, three of them:

1. To guarantee uniqueness at least within the process, the easiest
   solution would be

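   import time

   # using_threads is a flag the caller sets: true in the threaded
   # server, false in the unthreaded test wrapper.
   if using_threads:
     import thread
     lock=thread.allocate_lock()
     _acquire = lock.acquire_lock
     _release = lock.release_lock
   else:
     _acquire = _release = lambda:None
     
   _cookie = time.time()
   def getCookie():
     global _cookie
     # The lock makes the increment-and-read atomic across threads.
     _acquire()
     _cookie+=1
     result = _cookie
     _release()
     return result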

2. Invoking [] repeatedly likely returns an object with the same
   id() when called twice in a row (i.e. with no intermediate objects
   allocated in-between).

3. Why did you send this question to python-dev? python-list is more
   appropriate.

Regards,
Martin




From tim.one at home.com  Thu Jan 18 20:49:12 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 14:49:12 -0500
Subject: [Python-Dev] Windows Python totally rad
In-Reply-To: <3A66B846.3D24B959@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGKIJAA.tim.one@home.com>

[MAL]
> Could you tell me which these are [new test failures on Windows]?
> The tests tested all passed just fine, so I guess these must be
> Windows-related problems.

Not to worry, all the tests pass now.  Don't want to spend time
backtracking, as I'm not the one who fixed them and don't know who did.
FWIW, they "smelled like" shallow failures (== easy to diagnose & fix).

onward!-ly y'rs  - tim




From martin at mira.cs.tu-berlin.de  Thu Jan 18 20:37:04 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 18 Jan 2001 20:37:04 +0100
Subject: [Python-Dev] new Makefile.in
Message-ID: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?

Please, no. It was that way in Python 1.4 (libModules, libObjects, and
I forgot which the others were :-). We had that all documented in our
book, then Guido tried to build an extension module for the first
time, saw that these many libraries were terrible, and combined them
into a single one. That was a good thing, and we have it documented in
our book. I'm not at all looking forward to answering all the
questions why the build infrastructure of Python changed yet again...

Regards,
Martin




From fdrake at acm.org  Thu Jan 18 21:22:30 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Thu, 18 Jan 2001 15:22:30 -0500 (EST)
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <14951.20614.176140.672447@cj42289-a.reston1.va.home.com>

  I'd like to put the weak references patch into the alpha, but
haven't received any feedback on the latest patch.  I have some
comments from Martin von Löwis on the PEP that need to be addressed,
and that could change the implementation a bit, but the basic
machinery seems to be pretty reasonable and works for me.
  Does anyone have any objections to it going into the alpha?  I'd
like to enable more wide-spread testing.
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Thu Jan 18 18:10:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:10:14 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A672376.4B951848@lemburg.com>

"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

A combination of time.time(), process id and counter should
work in all cases. Make sure you use a lock around the counter,
though.
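
A minimal sketch of that combination, written for a current Python 3
with the threading module.  The names make_cookie, _counter and
_counter_lock and the "time-pid-serial" layout are assumptions made for
illustration:

```python
import itertools
import os
import threading
import time

_counter = itertools.count(1)
_counter_lock = threading.Lock()

def make_cookie():
    """Combine wall-clock time, process id and a locked counter."""
    # The lock guarantees that no two threads draw the same serial.
    with _counter_lock:
        serial = next(_counter)
    # The pid separates concurrent processes on one machine; the time
    # stamp separates successive runs that happen to reuse a pid.
    return "%r-%d-%d" % (time.time(), os.getpid(), serial)

print(make_cookie())
```

Within one process the serial number alone already makes the cookies
distinct; the other two fields only matter across processes and runs.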

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Thu Jan 18 18:30:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 18 Jan 2001 18:30:52 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <3A67284C.B6C617A@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Where should Python extensions install themselves and their docs?
> 
> I feel that extensions should not need to care. For extensions,
> distutils will pick a location, and the system administrator
> configuring the package can choose a different location.
> 
> Unfortunately, distutils does not support the installation of
> documentation, which I think it should.

Right.
 
> Now switching sides, as an administrator, I'd wish distutils to follow
> the system conventions by default.
> 
> That means on Linux, documentation should go into the system's <doc>
> directory, which is /usr/share/doc according to latest
> standards. Distributions vary, so distutils should find out - e.g. by
> querying the location from rpm. In addition, when building RPMs,
> distutils should declare these files as %doc in the spec file, so RPM
> will install it following the system conventions.

You currently have to do this by hand (e.g. in setup.cfg or
using the doc_files option). It should be fairly easy to add
a command similar to install_data though which then applies
all the necessary magic to the paths.

Is there a common landmark to look for on Unix (e.g. in case the
system does not use RPM) ?

Which paths should distutils check ?

(/usr/share/doc/packages, /usr/share/doc, /usr/doc/packages,
/usr/doc in that order ?)
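
A rough sketch of such a lookup, only to make the "check in that order"
idea concrete.  The function find_doc_dir and its isdir parameter are
hypothetical and not part of distutils:

```python
import os

# Candidate directories in the order suggested above.
DOC_DIR_CANDIDATES = [
    "/usr/share/doc/packages",
    "/usr/share/doc",
    "/usr/doc/packages",
    "/usr/doc",
]

def find_doc_dir(candidates=DOC_DIR_CANDIDATES, isdir=os.path.isdir):
    """Return the first candidate that is an existing directory, else None."""
    for path in candidates:
        if isdir(path):
            return path
    return None

print(find_doc_dir())
```

Passing isdir as a parameter keeps the search order testable on systems
that have none of these directories.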
 
> On Windows, the convention apparently is to put the documentation
> "nearby" the software, so it should probably go into Doc or a
> subdirectory thereof.

Na, I'd rather have \Python\Site-Packages and \Python\Site-Docs
for that purpose.
 
> On Unix, there appears to be no standard location, unless the
> documentation consists of man pages or perhaps info files. So
> <prefix>/share/doc is probably a place as good as any other.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Thu Jan 18 18:45:29 2001
From: skip at mojam.com (Skip Montanaro)
Date: Thu, 18 Jan 2001 11:45:29 -0600 (CST)
Subject: [Python-Dev] urllib.urlencode & repeated values
In-Reply-To: <200101181628.LAA07406@cj20424-a.reston1.va.home.com>
References: <14950.28430.572215.10643@beluga.mojam.com>
	<200101181628.LAA07406@cj20424-a.reston1.va.home.com>
Message-ID: <14951.11193.150232.564700@beluga.mojam.com>

    >> If others agree I'd be happy to whip up a patch.  I think it's a bug.

    Guido> Agreed.

Patch #103314:

    http://sourceforge.net/patch/?func=detailpatch&patch_id=103314&group_id=5470

I assigned it to Fred for doc review.

Skip





From akuchlin at mems-exchange.org  Thu Jan 18 19:56:40 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Thu, 18 Jan 2001 13:56:40 -0500
Subject: [Python-Dev] Standard install locations for Python ?
In-Reply-To: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 01:20:06PM +0100
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de>
Message-ID: <20010118135640.G21503@kronos.cnri.reston.va.us>

On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
>On Unix, there appears to be no standard location, unless the
>documentation consists of man pages or perhaps info files. So
><prefix>/share/doc is probably a place as good as any other.

This seems like a good suggestion.  Should docs go in
<prefix>/share/doc/python<version>/, then?  Perhaps with
subdirectories for different extensions?

--amk





From tismer at tismer.com  Thu Jan 18 22:39:18 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:39:18 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>
Message-ID: <3A676286.C33823B4@tismer.com>


Guido van Rossum wrote:
> 
> > I'm a bit confused about Guido's rich comparison stuff.  In the description
> > he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
> 
> Yes.  By this I mean that A<B and B>A are interchangeable, ditto for
> A<=B and B>=A.  Also A==B interchanges for B==A, and A!=B for B!=A.

...

> I think what threw you off was the ambiguity of "inverse".  This means
> Boolean negation.  I'm not relying on Boolean negation here -- I'm
> relying on the more fundamental property that a<b and b>a have the
> same outcome.

Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
Like the other right-side operators __radd__, is it correct to
think of

   __ge__  == __rle__

if __rle__ was written in the same fashion like __radd__ ?
It looks semantically the same, although the reason for a
call might be different.

And if my above view is right, would it perhaps be less
confusing to use in fact __rle__ and __rlt__,
or would it be more confusing, since __rlt__ would also be
invoked left-to-right, implementing ">".

Not sure if I added even more confusion.
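
For reference, a small sketch of how the "reverse" reading plays out in
a current Python 3, where there are no separate __rlt__ or __rle__
methods and __gt__ acts as the reflection of __lt__.  The class Height
is invented for the example:

```python
class Height:
    def __init__(self, cm):
        self.cm = cm

    def __gt__(self, other):
        # Only ">" is defined; "<" with swapped operands reuses it.
        if isinstance(other, (int, float)):
            return self.cm > other
        return NotImplemented

h = Height(180)
print(h > 170)   # calls Height.__gt__(h, 170) directly
print(170 < h)   # int.__lt__ gives up, so Python tries h.__gt__(170)
```

Both comparisons give the same answer, which is the "a<b and b>a have
the same outcome" property from the quoted text.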

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From tim.one at home.com  Thu Jan 18 22:53:44 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 16:53:44 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGPIJAA.tim.one@home.com>

[Eric S. Raymond, in search of uniqueness]
> ...
> So, how about `time.time()` + hex(hash([]))?
>
> It looks to me like this will remain unique forever, because
> another thread would have to create an object at the same memory
> address during the same millisecond to collide.

I'm afraid it's much more vulnerable than that:  Python's thread granularity
is at the bytecode level, not the statement level.  It's very easy for
thread A and B to see the same `time.time()` value, and after that
arbitrarily long amounts of time may pass before they get around to doing
the hash([]) business.  When hash() completes, the storage for [] is
immediately reclaimed under CPython, and it's again very easy for another
thread to reuse the storage.
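
The storage reuse is easy to observe even without threads.  A small
sketch, assuming CPython and using id() in place of hash(), since
hash([]) raises TypeError in current Pythons; the function name
count_distinct_ids is made up:

```python
def count_distinct_ids(n=1000):
    """Create n short-lived lists and count how many distinct ids appear."""
    seen = set()
    for _ in range(n):
        # The temporary list is freed as soon as id() returns, so the
        # next allocation is free to land at the same address.
        seen.add(id([]))
    return len(seen)

print("distinct ids in 1000 allocations:", count_distinct_ids())
```

The exact count depends on the interpreter and allocator, but anything
well below 1000 shows that an address-based value cannot serve as a
unique key.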

I'm attaching an executable test case.  It uses time.clock() because that
has much higher resolution than time.time() on Windows (better than
microsecond), but rounds it back to three decimal places to simulate
millisecond resolution.  The first three runs:

    saw 14600 unique in 30000 total
    saw 14597 unique in 30000 total
    saw 14645 unique in 30000 total

So it sucks bigtime on my box.

Better idea:  borrow the _ThreadSafeCounter class from the tail end of the
current CVS tempfile.py.  The code works whether or not threads are
available.  Then

    `time.time()` + str(_counter.get_next())

is thread-safe.  For that matter, plain old

    str(_counter.get_next())

will always be unique within a single run.  However, in either case you're
still not safe against concurrent *processes* generating the same cookies.

tempfile.py has to worry about that too, of course, so the *best* idea is to
call tempfile.mktemp() and leave it at that.  It wastes some time checking
the filesystem for a file of the same name (which, btw, goes much quicker on
Linux than on Windows).


From tismer at tismer.com  Thu Jan 18 22:56:08 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 18 Jan 2001 22:56:08 +0100
Subject: [Python-Dev] Weird use of hash() -- will this work?
References: <20010118022321.A9021@thyrsus.com>
Message-ID: <3A676678.7E4AF278@tismer.com>


"Eric S. Raymond" wrote:
> 
> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.

What do you mean by "unique"? Unique regarding your long-running server?

If so, then I wonder why one should do
> 
> So, how about `time.time()` + hex(hash([]))?
>
instead of using a single, simple counter for all sessions?

> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.
> 
> Comments?

If I'm not overlooking something fundamental, the counter approach
seems to be simpler and most portable. :-)

but-sometimes-my-brain-malfunctions-badly-ly y'rs  - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From nas at arctrix.com  Thu Jan 18 16:07:13 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 07:07:13 -0800
Subject: [Python-Dev] Re: new Makefile.in
In-Reply-To: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 18, 2001 at 08:37:04PM +0100
References: <200101181937.f0IJb4N00997@mira.informatik.hu-berlin.de>
Message-ID: <20010118070713.A13581@glacier.fnational.com>

On Thu, Jan 18, 2001 at 08:37:04PM +0100, Martin v. Loewis wrote:
> > A question: is it possible to break the Python static library up?
> > For example, instead of having libpython<version>.a have
> > Parser/parser<version>.a, Objects/objects<version>.a, etc?
> 
> Please, no.

Okay.

> I'm not at all looking forward to answering all the questions
> why the build infrastructure of Python changed yet again...

My Makefile patch shouldn't change the way you build extensions.

  Neil



From tim.one at home.com  Fri Jan 19 02:45:42 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 20:45:42 -0500
Subject: [Python-Dev] unistr() vs. unicode()
In-Reply-To: <200101181623.LAA07389@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHLIJAA.tim.one@home.com>

[Guido]
> (And no, Tim, this did *not* end up in the patches list because I made
> Barry remove the reply-to.  SourceForge mails never had reply-to to
> begin with.)

Aha!  Another thing to blame Barry for <wink>.




From tim.one at home.com  Thu Jan 18 23:11:23 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:11:23 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: <008701c0811a$b3371c00$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEGPIJAA.tim.one@home.com>

[/F]
> python build problems and real life got in the way.
>
> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Depends on how confident you are.  Since this is purely an optimization, I
don't think it *needs* to get into a1 in order to make the final release;
postponing a few days would be better than pushing too hard on something
that's proved hairier than anticipated.

do-the-right-thing-whatever-that-is<wink>-ly y'rs  - tim




From guido at digicool.com  Fri Jan 19 03:17:36 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 21:17:36 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:14:19 PST."
             <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101180152470.1568-100000@skuld.kingmanhall.org> 
Message-ID: <200101190217.VAA01497@cj20424-a.reston1.va.home.com>

> I hope you don't mind that i'm taking this over to python-dev,
> because it led me to discover a more general issue (see below).

No -- in fact I wanted to see this here!  (My mail backlog seems to be
clearing -- or maybe it was only a temporary unclogging... :-)

> For the others on python-dev, here's the background: MAL was
> about to check in the unistr() function, described as follows:
> 
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> > 
> > The patch also adds a new object level C API PyObject_Unicode()
> > which complements PyObject_Str().
> 
> I responded:
> > Why are unistr() and unicode() two separate functions?
> > 
> > str() performs one task: convert to string.  It can convert anything,
> > including strings or Unicode strings, numbers, instances, etc.
> > 
> > The other type-named functions e.g. int(), long(), float(), list(),
> > tuple() are similar in intent.
> > 
> > Why have unicode() just for converting strings to Unicode strings,
> > and unistr() for converting everything else to a Unicode string?
> > What does unistr(x) do differently from unicode(x) if x is a string?
> 
> MAL responded:
> > unistr() is meant to complement str() very closely. unicode()
> > works as constructor for Unicode objects which can also take
> > care of decoding encoded data. str() and unistr() don't provide
> > this capability but instead always assume the default encoding.
> > 
> > There's also a subtle difference in that str() and unistr() 
> > try the tp_str slot which unicode() doesn't. unicode()
> > supports any character buffer which str() and unistr() don't.
> 
> Okay, given this explanation, i still feel fairly confident
> that unicode() should subsume unistr().  Many of the other
> type-named functions try various slots:
> 
>     int() looks for __int__
>     float() looks for __float__
>     long() looks for __long__
>     str() looks for __str__
> 
> In testing this i also discovered the following:
> 
>     >>> class Foo:
>     ...     def __int__(self):
>     ...         return 3
>     ... 
>     >>> f = Foo()
>     >>> int(f)
>     3
>     >>> long(f) 
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__long__'
>     >>> float(f)
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: Foo instance has no attribute '__float__'
> 
> This is kind of surprising.  How about:
> 
>     int() looks for __int__
>     float() looks for __float__, then tries __int__
>     long() looks for __long__, then tries __int__
>     str() looks for __str__
>     unicode() looks for __unicode__, then tries __str__

For the numeric types this could perhaps be done by calling
PyNumber_Long() from PyNumber_Float(), calling PyNumber_Int() from
PyNumber_Long().  Complex is a bit of an exception -- there's no
PyNumber_Complex(), just because I felt that nobody would need it. :-)
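
The thread does not say whether this chaining was adopted.  As a data
point, the sketch below shows what a current Python 3 does with the
quoted Foo class: int() uses __int__, while float() does not fall back
to it and raises TypeError (rather than the AttributeError seen in the
2.0-era session):

```python
class Foo:
    def __int__(self):
        return 3

f = Foo()
print(int(f))

try:
    float(f)
except TypeError as exc:
    # Only __int__ is defined, and float() does not consult it.
    print("float() refused:", exc)
```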

> The extra parameter to unicode() is very similar to the extra
> parameter to int(), so i think there is a natural parallel here.

Makes sense.

> Hmm... what about the other types?
> 
> Wow!!  __complex__ can produce a segfault!
> 
>     >>> complex
>     <built-in function complex>
>     >>> class Foo:
>     ...   def __complex__(self): return 3
>     ... 
>     >>> Foo()
>     <__main__.Foo instance at 0x81e8684>
>     >>> f = _
>     >>> complex(f)
>     Segmentation fault (core dumped)
> 
> This happens because builtin_complex first retrieves and saves
> the PyNumberMethods of the argument (in this case, from the
> instance), then tries to call __complex__ (in this case, returning 3),
> and THEN coerces the result using nbr->nb_float if the result is
> not complex!  (This calls the instance's nb_float method on the
> integer object 3!!)

Thanks!  Fixed now in CVS.
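
For comparison, a sketch of the same mistake under a current Python 3
(an assumption about the reader's interpreter, not a description of the
2001 fix): a __complex__ that returns a non-complex value is reported
as an ordinary exception.  The class name BadComplex is invented:

```python
class BadComplex:
    def __complex__(self):
        return 3  # wrong on purpose: not a complex number

try:
    complex(BadComplex())
except TypeError as exc:
    print("rejected:", exc)
```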

> I think __complex__ should probably look for __complex__, then
> __float__, then __int__.

I make it call PyNumber_Float(), which could be made smarter as
explained above.

> One could argue for __list__, __tuple__, or __dict__, but that
> seems much weaker; the Pythonic way has always been to implement
> __getitem__ instead.

Yes -- since __list__ etc. aren't used, let's not add them.

> There is no built-in dict(); if it existed
> i suppose it would do the opposite of x.items(); again a weak
> argument, though i might have found such a function useful once
> or twice.

Yeah, it's not very common.  Dict comprehensions anyone?

    d = {k:v for k,v in zip(range(10), range(10))}    # :-)
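
Both ideas exist in current Python: dict() is a built-in that accepts
an iterable of key/value pairs (the opposite of x.items()), and the
comprehension above is valid syntax.  A short sketch, assuming a
current Python 3:

```python
pairs = list(zip(range(10), range(10)))

# dict() rebuilds a mapping from (key, value) pairs.
d1 = dict(pairs)

# The dict comprehension spelled as in the message above.
d2 = {k: v for k, v in pairs}

print(d1 == d2)
print(sorted(d1.items()) == pairs)
```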

> And that about covers the built-in types for data.

Thanks!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Thu Jan 18 23:13:14 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:13:14 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <20010118022321.A9021@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>

BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised

    TypeError: unhashable type

Did someone change this deliberately?




From tim.one at home.com  Thu Jan 18 23:58:22 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 17:58:22 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEHAIJAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIJAA.tim.one@home.com>

[Tim whined]
> BTW, why doesn't hash([]) blow up in 2.1a1?  In 2.0 it raised
>
>     TypeError: unhashable type
>
> Did someone change this deliberately?

Answer:  it's an unintended consequence of the rich-comparison changes.
Guido knows how to fix it and probably <wink> will.  The list type grew a
tp_richcompare slot but lost its non-NULL tp_compare pointer.  PyObject_Hash
wasn't changed accordingly (it now believes lists support neither direct
hashing nor comparison, so does them a favor and hashes their memory
addresses).  Something trickier is probably going wrong elsewhere too, but I
won't try to remember what that is unless Guido gets hit by a bus tonight.
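
A small check for the intended behaviour, assuming a current Python 3
where lists are unhashable again; the helper is_hashable is made up for
illustration:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable([]))   # lists are mutable and compare by value
print(is_hashable(()))   # an empty tuple is hashable
```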

in-which-case-we-can-push-off-the-funeral-until-after-the-release-ly
    y'rs  - tim




From thomas at xs4all.net  Fri Jan 19 00:02:09 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:02:09 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010118103036.B21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 10:30:36AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>
Message-ID: <20010119000209.F17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:30:36AM -0500, Andrew Kuchling wrote:
> >On Wed, Jan 17, 2001 at 11:49:25PM +0100, Thomas Wouters wrote:
> >> On BSDI, readline sits in /usr/local or /usr/contrib, and isn't detected by
> >> setup.py. Also, SSL support for the socket module was not enabled, though
> >> OpenSSL is installed, in the default path.

> What does the layout of /usr/contrib look like?  Is it
> /usr/contrib/openssl/include/, /usr/contrib/include/, or something
> else?

Actually, it's /usr/local, not /usr/contrib. I've never installed OpenSSL in
/usr/contrib, though I could, and maybe BSDI will, in the future. (BSDI
installs its own software in /usr, and optional free, pre-compiled software
in /usr/contrib.) OpenSSL installs into
/usr/local/ssl/include/openssl by default, and installing into /usr/contrib
would make it /usr/contrib/ssl/include/openssl.

> >Strangely enough, this problem does not exist on FreeBSD. I can run 'make'
> >or 'make test' after 'make' just fine. 'make test' still doesn't work
> >because of the incorrect library path, but it doesn't barf like the other
> >systems (BSDI and Debian Linux)

> Have you already run "make install"?  Perhaps it's picking up the
> already-installed modules when running "make test", because it really
> shouldn't be working.

Hm, I think you misread my statement. 'make test' *doesn't* work. But it
doesn't barf on the signal module being built dynamically either. You fixed
that for every platform now, I was just pointing out that this was not a
problem for FreeBSD for some reason.

'make test' still doesn't work, but I can make it work by specifying a
hand-tweaked PYTHONPATH that includes the OS/arch-dependent build directory.

This brings me to another point: how can 'make test' work at all ? Does
python always check for './Lib' (and './Modules') for modules ? If that's
specific for 'make test' and running python in the source distribution, that
sounds like a bit of a weird hack. I can't find any such hackery in the
source, but I also can't figure out how else it's working :)

More-later--Meteor-((c)-1979)-is-on-ly y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at mira.cs.tu-berlin.de  Fri Jan 19 00:14:05 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Fri, 19 Jan 2001 00:14:05 +0100
Subject: [Python-Dev] weak references in 2.1alpha
Message-ID: <200101182314.f0INE5B00338@mira.informatik.hu-berlin.de>

> Does anyone have any objections to it going into the alpha? 

I'd like to request that the .clear() method is removed from the patch
for this alpha, and also that the weak dictionaries are removed until
their semantics is clarified.

It's always easier to add stuff later than to remove it.

Regards,
Martin




From nas at arctrix.com  Thu Jan 18 17:31:09 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 08:31:09 -0800
Subject: [Python-Dev] SSL detection problem
In-Reply-To: <20010118115028.D21503@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Thu, Jan 18, 2001 at 11:50:28AM -0500
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> <20010118115028.D21503@kronos.cnri.reston.va.us>
Message-ID: <20010118083109.A13972@glacier.fnational.com>

On Thu, Jan 18, 2001 at 11:50:28AM -0500, Andrew Kuchling wrote:
> On Thu, Jan 18, 2001 at 11:39:30AM +0100, Martin v. Loewis wrote:
> >The not-so-obvious question: How can one work-around such a problem
> >with the new setup scheme?
> 
> I still need to implement command-line options to specify such
> overrides, but that couldn't possibly get done in time for alpha1.

My non-recursive makefile patch allows you to use both Setup and
setup.py.  It's not quite ready for prime time but it's getting
close.

I would be interested if someone could point me to the source for
some crappy makes.  I've tried GNU make, BSD 4.4 pmake and
whatever comes with SunOS 5.6.  Searching for "make" doesn't work
too well. :-(

  Neil



From thomas at xs4all.net  Fri Jan 19 00:45:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 00:45:32 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 08:46:54AM -0800
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119004532.G17392@xs4all.nl>

On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:

> filename = '/tmp/delete_me'

This reminds me: we need a portable way to handle test-files :)
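
One possible shape for this, as a minimal sketch rather than what the test
suite actually does: ask the tempfile module for a scratch file instead of
hard-coding /tmp.  The helper name make_test_filename is hypothetical, and
tempfile.mkstemp() comes from later Python versions than the one discussed
in this thread.

```python
import os
import tempfile


def make_test_filename(prefix="delete_me"):
    """Create an empty scratch file in the platform's temp directory.

    Returns the path; the caller is responsible for removing the file.
    """
    fd, path = tempfile.mkstemp(prefix=prefix)
    os.close(fd)  # the caller reopens the file by name
    return path


filename = make_test_filename()
try:
    with open(filename, "w") as f:
        f.write("scratch data")
finally:
    os.remove(filename)  # always clean up, even if the test body fails
```

The point of the design is only that the directory is chosen by the
platform, not by the test.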

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 00:56:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 18:56:04 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Wed, 17 Jan 2001 23:59:22 PST."
             <20010117235922.A12356@glacier.fnational.com> 
References: <20010117235922.A12356@glacier.fnational.com> 
Message-ID: <200101182356.SAA19616@cj20424-a.reston1.va.home.com>

Hi Neil,

My mail suffers delays of 12-24 hours while mail.python.org is working
on some enormous backlog.  So I just saw your message about a new
Makefile...

> Spurred on by comments made by Andrew, I spent some time last
> night overhauling the Python Makefiles.  I now have a toplevel
> non-recursive Makefile.in that seems to work fairly well.  I'm
> pretty sure it still should be portable.  It doesn't use includes
> or any special GNU make features.  It is half the size of the old
> Makefiles.  The build is faster and it's now easier to follow if
> something goes wrong.

I'd like to see this!

> A question: is it possible to break the Python static library up?
> For example, instead of having libpython<version>.a have
> Parser/parser<version>.a, Objects/objects<version>.a, etc?  There
> would still only be one shared library.  This would speed up
> incremental builds and also help Andrew with PEP 229.  I'm
> thinking that the Makefile do something like this:
> 
>     all: python$(EXE)
> 
>     PYLIBS= Parser/parser.a Objects/objects.a ...  Modules/modules.a
> 
>     python$(EXE): $(PYLIBS)
>         $(LINKCC) -o python$(EXE) $(PYLIBS) ...
> 
>     Modules/modules.a: minpython$(EXE)
>         ./minpython$(EXE) setup.py

Sounds cool to me.  (Where's the patch for a shared libpython???)

> AFAICT, the only thing affected by splitting up the static library
> is Misc/Makefile.pre.in.  Is this correct?

Yeah, and that should be phased out in favor of distutils anyway.  Now
would be a great time!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:34:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:34:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
Message-ID: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>

Through no fault of my own, email to guido at python.org (which includes
the python-dev list) is currently suffering delays of 12-24 hours.  I
have a feeling this is probably true for all mail going through
python.org, so checkin messages and python-dev discussion have been
greatly frustrated, with about 1 day to go until the planned 2.1a1
release date!

On top of that, the SourceForge bug manager has developed a problem:
all references to http://sourceforge.net/bugs/?group_id=5470/ come
back with this error:

  An error occured in the logger. ERROR: pg_atoi: error in "5470/":
  can't parse "/"

I'm still hoping to release Python 2.1a1 tomorrow, unless Jeremy tells
me that he needs more time for his nested scopes patch.

In the mean time, please everybody, do check out the latest CVS
version and give it a good workout!  Andrew's setup.py still has some
rough edges, I believe that in order to run it from the build
directory you still have to point PYTHONPATH to the build/lib*
directory, where he hides the shared libraries for all modules.
Andrew, are you planning to fix this?

If there's anything that you need me to know about, please mail to
guido at digicool.com -- that address suffers no delays.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Jan 19 01:51:19 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 19:51:19 -0500
Subject: [Python-Dev] RE: [Pycabal] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHGIJAA.tim.one@home.com>

[Guido notes current woes w/ python.org email, and SourceForge]

Note too that, over the past two days, it's not possible to follow
Python-Dev email via

http://mail.python.org/pipermail/python-dev/2001-January/date.html

either, as (unlike during previous occurrences of python.org email delays)
msgs aren't showing up there in a timely fashion either (for example, the
msg of Guido's to which I'm replying isn't there).

good-thing-guido's-so-easy-to-channel<wink>-ly y'rs  - tim




From guido at digicool.com  Fri Jan 19 01:52:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:52:02 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Thu, 18 Jan 2001 02:23:21 EST."
             <20010118022321.A9021@thyrsus.com> 
References: <20010118022321.A9021@thyrsus.com> 
Message-ID: <200101190052.TAA26849@cj20424-a.reston1.va.home.com>

> So I'm writing a module that needs to generate unique cookies.  The
> module will run inside one of two environments: (1) a trivial test wrapper,
> not threaded, and (2) a long-running multithreaded server.
> 
> Because Python garbage-collects, hash() of a just-created object isn't
> good enough.  Because we may be threading, millisecond time isn't
> good enough.  Because we may *not* be threading, thread ID isn't good
> either.  
> 
> On the other hand, I'm on Linux getting millisecond time resolution.
> And it's not hard to notice that an object hash is a memory address.
> 
> So, how about `time.time()` + hex(hash([]))?
> 
> It looks to me like this will remain unique forever, because another thread
> would have to create an object at the same memory address during the same
> millisecond to collide.
> 
> Furthermore, it looks to me like this hack might be portable to any OS
> with a clock tick shorter than its timeslice.

Argh!  hash([]) should raise TypeError, since lists are not hashable
objects -- mutable objects can't be allowed as dictionary keys.  This
(hash([])) accidentally returned a value for a brief period after I
checked in the rich comparisons -- I've fixed that now.

But not to worry: instead of using hash([]), you can use hex(id([])).
Same thing.
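
A quick way to check both claims, as a small sketch (the helper name
is_hashable is made up for illustration):

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable([]))   # lists are mutable, so hash() refuses them
print(is_hashable(()))   # an empty tuple is immutable and hashable
print(hex(id([])))       # id() works for any object; the value varies per run
```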

On the other hand, remember how much you can do in a millisecond!
(E.g. I can call tempfile.mktemp() 5 times in that time.)  And when
you create an object and immediately delete it, the next object
created is very likely to have the same address.

But what's wrong with this:

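    try:
        # Thread support is available: use the thread ID as the cookie.
        from thread import get_ident as unique_id
    except ImportError:
        # No thread module: fall back to the address of a fresh list.
        def unique_id(): return id([])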
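
For cookies that must stay unique across the whole process, with or
without threads, one alternative is a plain counter behind a lock.  This
is only a sketch under that assumption; next_cookie and the cookie format
are invented here, and the syntax is that of later Python versions.

```python
import itertools
import threading

# Shared counter; the lock makes "read and advance" one atomic step.
_counter = itertools.count(1)
_counter_lock = threading.Lock()


def next_cookie():
    """Return a string cookie that is unique within this process."""
    with _counter_lock:
        return "cookie-%d" % next(_counter)


print(next_cookie())
print(next_cookie())
```

Unlike an object address, a counter value is never reused after the
object that produced it goes away.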

--Guido van Rossum (home page: http://www.python.org/~guido/)



From billtut at microsoft.com  Fri Jan 19 01:53:15 2001
From: billtut at microsoft.com (Bill Tutt)
Date: Thu, 18 Jan 2001 16:53:15 -0800
Subject: [Python-Dev] MS CRT crashing:
Message-ID: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com>


From guido at digicool.com  Fri Jan 19 01:53:13 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:53:13 -0500
Subject: [Python-Dev] 2.1 alpha: what about the unicode name database?
In-Reply-To: Your message of "Thu, 18 Jan 2001 07:48:41 +0100."
             <008701c0811a$b3371c00$e46940d5@hagrid> 
References: <LNBBLJKPBEHFEDALKOLCGENEIIAA.tim.one@home.com> <012901c080a5$306023a0$e46940d5@hagrid>  
            <008701c0811a$b3371c00$e46940d5@hagrid> 
Message-ID: <200101190053.TAA26862@cj20424-a.reston1.va.home.com>

> I wrote:
> > I've almost sorted it all out.  will check it in later tonight (local
> > time).
> 
> python build problems and real life got in the way.

What?  You've got a real life?  Can't be allowed, not when we're
working on a release!

> will 2.1a1 be released according to plan?  will there
> be a 2.1a2 release?  maybe I should postpone this?

Please check it in, there's still time (2.1a1 won't go out before
Friday night, possibly it'll be delayed until Monday).

And yes, there will be a 2.1a2.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:55:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:55:15 -0500
Subject: [Python-Dev] SSL detection problem
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:39:30 +0100."
             <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> 
References: <200101181039.f0IAdUT09947@mira.informatik.hu-berlin.de> 
Message-ID: <200101190055.TAA26905@cj20424-a.reston1.va.home.com>

> The distutils-based configuration fails to build on my system (SuSE
> 7.0) with the error
> 
> /usr/src/python/Modules/socketmodule.c:159: rsa.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:160: crypto.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:161: x509.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:162: pem.h: Datei oder Verzeichnis nicht gefunden
> /usr/src/python/Modules/socketmodule.c:163: ssl.h: Datei oder Verzeichnis nicht gefunden

The same happened to Fred on Mandrake 7.0 (except for the German
messages :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 01:58:16 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 19:58:16 -0500
Subject: [Python-Dev] Re: unistr() vs. unicode()
Message-ID: <200101190058.TAA26931@cj20424-a.reston1.va.home.com>

MAL's reply to Ping in this thread.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Thu, 18 Jan 2001 10:52:12 +0100
From:    "M.-A. Lemburg" <mal at lemburg.com>
To:      Ka-Ping Yee <ping at lfw.org>
cc:      guido at python.org, patches at python.org
Subject: Re: [Patches] [Patch #101664] Add new unistr() builtin + PyObject_Unicode() C API

Ka-Ping Yee wrote:
> 
> On Wed, 17 Jan 2001 noreply at sourceforge.net wrote:
> > Comment:
> > This patch adds a utility function unistr() which works just like
> > the standard builtin str()  -- only that the return value will
> > always be a Unicode object.
> 
> Sorry for barging in, but i have an issue/question:
> 
> Why are unistr() and unicode() two separate functions?
> 
> str() performs one task: convert to string.  It can convert anything,
> including strings or Unicode strings, numbers, instances, etc.
> 
> The other type-named functions e.g. int(), long(), float(), list(),
> tuple() are similar in intent.
> 
> Why have unicode() just for converting strings to Unicode strings,
> and unistr() for converting everything else to a Unicode string?
> What does unistr(x) do differently from unicode(x) if x is a string?

unistr() is meant to complement str() very closely. unicode()
works as constructor for Unicode objects which can also take
care of decoding encoded data. str() and unistr() don't provide
this capability but instead always assume the default encoding.

There's also a subtle difference in that str() and unistr() 
try the tp_str slot which unicode() doesn't. unicode()
supports any character buffer which str() and unistr() don't.

Perhaps you are right though in that we should make all three
APIs behave in the same way with respect to coercing their
arguments. This could hide some errors... still in the long
run, I agree that the existing setup probably causes more confusion
than good.

Guido ?

- -- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/

_______________________________________________
Patches mailing list
Patches at python.org
http://mail.python.org/mailman/listinfo/patches

------- End of Forwarded Message




From guido at digicool.com  Fri Jan 19 02:04:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:04:22 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Thu, 18 Jan 2001 11:51:46 +0100."
             <3A66CAC2.74FC894@lemburg.com> 
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>  
            <3A66CAC2.74FC894@lemburg.com> 
Message-ID: <200101190104.UAA27056@cj20424-a.reston1.va.home.com>

> Ka-Ping Yee wrote:
> > 
> > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > >     str() looks for __str__
> > 
> > Oops.  I forgot that
> > 
> >       str() looks for __str__, then tries __repr__
> > 
> > So, presumably,
> > 
> >       unicode() should look for __unicode__, then __str__, then __repr__
> 
> Not quite... str() does this:
> 
> 1. strings are passed back as-is
> 2. the type slot tp_str is tried
> 3. the method __str__ is tried
> 4. Unicode returns are converted to strings
> 5. anything other than a string return value is rejected
> 
> unistr() does the same, but makes sure that the return
> value is an Unicode object.
> 
> unicode() does the following:
> 
> 1. for instances, __str__ is called
> 2. Unicode objects are returned as-is
> 3. string objects or character buffers are used as basis for decoding
> 4. decoding is applied to the character buffer and the results
>    are returned
> 
> I think we should perhaps merge the two approaches into one
> which then applies all of the above in unicode() (and then
> forget about unistr()). This might hide some type errors,
> but since all other generic constructors behave more or less
> in the same way, I think unicode() should too.

Yes, I would like to see these merged.  I noticed that e.g. there is
special code to compare Unicode strings in the comparison code (I
think I *could* get rid of this now we have rich comparisons, but I
decided to put that off), and when I looked at it it uses the same set
of conversions as unicode().  Some of these seem questionable to me --
why do you try so many ways to get a string out of an object?  (On the
other hand the merge of unicode() and unistr() might have this effect
anyway...)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at digicool.com  Fri Jan 19 02:06:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 18 Jan 2001 20:06:23 -0500
Subject: [Python-Dev] bug in grammar
In-Reply-To: Your message of "Thu, 18 Jan 2001 13:39:54 +0100."
             <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de> 
References: <200101181239.f0ICdsY10703@mira.informatik.hu-berlin.de> 
Message-ID: <200101190106.UAA27073@cj20424-a.reston1.va.home.com>

> I think there should be a well-formedness pass in-between. I.e. after
> the AST has been built, a single pass should descend through the tree,
> looking for an expr_statement with more than a single testlist. Once
> it finds one, it should confirm that this really is a well-formed
> lvalue (in C speak). In this case, the test should be that each term
> is an atom without factors.

Good idea.

> If the parser itself performs such checks, the compiler could be
> simplified in many places, I guess.

Not sure that in practice it makes much of a difference: there aren't
that many of these kinds of checks, and writing a separate pass is
expensive.  On the other hand, Jeremy is just writing a separate pass
anyway, to collect name usage information for the nested scopes.
Maybe it could be folded into that pass...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Fri Jan 19 04:20:08 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:20:08 -0500 (EST)
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
Message-ID: <14951.45672.806978.600944@localhost.localdomain>

There are several modules in the standard library that use the regex
module.  When they are imported, they print a warning about using a
deprecated module.  I think this is bad form.  Either the modules that
depend on regex should be updated to use re or they should be
deprecated themselves.  

I discovered the following offenders:
asynchat
knee
poplib
reconvert

I would suggest fixing asynchat and poplib and deprecating knee.  The
reconvert module may be a special case.

Jeremy



From jeremy at alum.mit.edu  Fri Jan 19 04:31:02 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 18 Jan 2001 22:31:02 -0500 (EST)
Subject: [Python-Dev] setup.py and build subdirectories
Message-ID: <14951.46326.743921.988828@localhost.localdomain>

I have a bunch of build directories under the source tree, e.g.
src/python/dist/src/build
src/python/dist/src/build-pg
src/python/dist/src/build-O3
...

The new setup.py did not successfully build in these directories.  I
hacked distutils a tiny bit and had some success.  Patch below.  I'm
not sure if the approach is kosher, but it allows me to build
successfully.

I also have a problem running 'make test' from these build
directories.  The reference to the distutils build directory has '..'
prepended to it that shouldn't exist.

Jeremy


Index: setup.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/setup.py,v
retrieving revision 1.8
diff -c -r1.8 setup.py
*** setup.py	2001/01/18 20:39:34	1.8
--- setup.py	2001/01/19 03:26:55
***************
*** 536,540 ****
            
  # --install-platlib
  if __name__ == '__main__':
!     sysconfig.set_python_build()
      main()
--- 536,541 ----
            
  # --install-platlib
  if __name__ == '__main__':
!     path, file = os.path.split(sys.argv[0])
!     sysconfig.set_python_build(path)
      main()
Index: Lib/distutils/sysconfig.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/sysconfig.py,v
retrieving revision 1.31
diff -c -r1.31 sysconfig.py
*** Lib/distutils/sysconfig.py	2001/01/17 15:16:52	1.31
--- Lib/distutils/sysconfig.py	2001/01/19 03:27:01
***************
*** 24,37 ****
  
  python_build = 0
  
! def set_python_build():
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = 1
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
--- 24,37 ----
  
  python_build = 0
  
! def set_python_build(loc):
      """Set the python_build flag to true; this means that we're
      building Python itself.  Only called from the setup.py script
      shipped with Python.
      """
      
      global python_build
!     python_build = loc + "/"
  
  def get_python_inc(plat_specific=0, prefix=None):
      """Return the directory containing installed Python header files.
***************
*** 48,54 ****
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
--- 48,54 ----
          prefix = (plat_specific and EXEC_PREFIX or PREFIX)
      if os.name == "posix":
          if python_build:
!             return python_build + "Include/"
          return os.path.join(prefix, "include", "python" + sys.version[:3])
      elif os.name == "nt":
          return os.path.join(prefix, "Include") # include or Include?
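
One case worth checking with this approach: when the script is invoked as
plain setup.py, os.path.split() yields an empty directory part, so
loc + "/" would be "/" and the include path would point at the filesystem
root.  A hedged sketch of the same lookup using os.path.join(), which
leaves the path relative in that case (build_include_dir is a hypothetical
helper, not part of distutils):

```python
import os


def build_include_dir(script_path):
    """Return the Include directory that sits next to the given setup.py."""
    path, _name = os.path.split(script_path)
    # os.path.join() ignores an empty leading component, so an empty
    # directory part gives a relative "Include" rather than "/Include".
    return os.path.join(path, "Include")


print(build_include_dir("setup.py"))
print(build_include_dir(os.path.join("..", "setup.py")))
```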





From tim.one at home.com  Fri Jan 19 04:46:16 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 18 Jan 2001 22:46:16 -0500
Subject: [Python-Dev] Type-converting functions, esp. unicode() vs. unistr() 
Message-ID: <LNBBLJKPBEHFEDALKOLCGEIBIJAA.tim.one@home.com>

[attribution lost]
> There is no built-in dict(); if it existed i suppose it would do
> the opposite of x.items(); again a weak argument, though i might
> have found such a function useful once or twice.

[Guido]
> Yeah, it's not very common.  Dict comprehensions anyone?
>
>    d = {k:v for k,v in zip(range(10), range(10))}    # :-)

It's very common in Perl code, but is in no sense the inverse of .items()
there:   when you build a dict from a list L in Perl, it acts like Python

   {L[0]: L[1],
    L[2]: L[3],
    L[4]: L[5],
    ...
   }

That's what seems most practical most often; e.g., when crunching over text
files with records of the form

    key value

(e.g., mail headers are of this form; simple contact databases; to-do lists
segregated by date; etc), whatever fancy re.split() is used to break things
apart naturally returns a flat list.  A list of two-tuples is natural only
if it was obtained from another dict's .items() <0.9 wink>.
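
A minimal sketch of that pairing in Python, assuming a flat list that
alternates keys and values (the helper name pairs_to_dict is made up
here, and rejecting odd-length input is a choice of this sketch):

```python
def pairs_to_dict(flat):
    """Build a dict from a flat list [k0, v0, k1, v1, ...]."""
    if len(flat) % 2:
        raise ValueError("flat list must hold an even number of items")
    d = {}
    # Step through the list two items at a time: key, then value.
    for i in range(0, len(flat), 2):
        d[flat[i]] = flat[i + 1]
    return d


fields = "From tim Subject hello".split()
print(pairs_to_dict(fields))
```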

pushing-the-limits-of-"practicality-beats-purity"?-ly y'rs  - tim




From tim.one at home.com  Fri Jan 19 07:00:27 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:00:27 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIGIJAA.tim.one@home.com>

test test_urllib crashed 
    -- exceptions.AssertionError: urllib.quote problem




From tim.one at home.com  Fri Jan 19 07:39:30 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 01:39:30 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCAEIIIJAA.tim.one@home.com>

[some MS internal support group]
> Turns out the C standard explicitly says you can't have an input
> follow output on a stream without doing fflush or fseek in-between,
> to make sure the stdio buffer is cleared.  So this program is illegal.

It's undefined (there are no "illegal" programs -- that word doesn't appear
in the std; "undefined" does and has a precise technical meaning).

In the presence of threads-- which the C std doesn't mention --you have to
address issues the std doesn't touch.  To date, MS's is the only C runtime
we've seen that corrupts itself in this situation.  It can do anything it
likes short of blowing up and still be considered a good threaded
implementation.  As is, it has to be considered sub-standard, in the
ordinary sense of displaying worse behavior than other threaded C stdio
implementations.  It falls short there on other counts too (like the lack of
getc_unlocked() & friends), but internal corruption is a particularly
egregious failing.

and-that's-the-end-of-it-for-me-ly y'rs  - tim




From mwh21 at cam.ac.uk  Fri Jan 19 09:31:18 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 08:31:18 +0000
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Thomas Wouters's message of "Fri, 19 Jan 2001 00:02:09 +0100"
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <m3vgrc0wq1.fsf@atrus.jesus.cam.ac.uk>

Thomas Wouters <thomas at xs4all.net> writes:

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ? If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's in Modules/getpath.c

Cheers,
M.

-- 
  I really hope there's a catastrophic bug in some future e-mail
  program where if you try and send an attachment it cancels your
  ISP account, deletes your harddrive, and pisses in your coffee
                                                         -- Adam Rixey




From gstein at lyra.org  Fri Jan 19 09:38:54 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 00:38:54 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>; from gvanrossum@users.sourceforge.net on Thu, Jan 18, 2001 at 04:28:10PM -0800
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010119003854.F7731@lyra.org>

On Thu, Jan 18, 2001 at 04:28:10PM -0800, Guido van Rossum wrote:
>...
>   PyTypeObject PyCursesWindow_Type = {
> ! 	PyObject_HEAD_INIT(NULL)
>   	0,			/*ob_size*/
>   	"curses window",	/*tp_name*/
>...
> --- 2432,2443 ----
>   /* Initialization function for the module */
>   
> ! DL_EXPORT(void)
>   init_curses(void)
>   {
>   	PyObject *m, *d, *v, *c_api_object;
>   	static void *PyCurses_API[PyCurses_API_pointers];
> + 
> + 	/* Initialize object type */
> + 	PyCursesWindow_Type.ob_type = &PyType_Type;
>   
>   	/* Initialize the C API pointer array */


I've never truly understood this. Is it because Windows cannot initialize
(at load-time) a pointer to a data structure that is located in a different
DLL?

It is a bit painful to keep moving inits from load-time to run-time.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From tim.one at home.com  Fri Jan 19 10:01:22 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 04:01:22 -0500
Subject: [Python-Dev] test_urllib failing on Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCCEINIJAA.tim.one@home.com>

Bet it was failing everywhere; it's fixed now.




From moshez at zadka.site.co.il  Fri Jan 19 18:53:36 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Fri, 19 Jan 2001 19:53:36 +0200 (IST)
Subject: [Python-Dev] Dbm failure
Message-ID: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>

test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey

Did it happen to anyone else? 
Anything else you need to know?

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From mal at lemburg.com  Fri Jan 19 10:58:08 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 10:58:08 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. 
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org>  
	            <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>
Message-ID: <3A680FB0.AED2DB55@lemburg.com>

Guido van Rossum wrote:
> 
> > Ka-Ping Yee wrote:
> > >
> > > On Thu, 18 Jan 2001, Ka-Ping Yee wrote:
> > > >     str() looks for __str__
> > >
> > > Oops.  I forgot that
> > >
> > >       str() looks for __str__, then tries __repr__
> > >
> > > So, presumably,
> > >
> > >       unicode() should look for __unicode__, then __str__, then __repr__
> >
> > Not quite... str() does this:
> >
> > 1. strings are passed back as-is
> > 2. the type slot tp_str is tried
> > 3. the method __str__ is tried
> > 4. Unicode returns are converted to strings
> > 5. anything other than a string return value is rejected
> >
> > unistr() does the same, but makes sure that the return
> > value is an Unicode object.
> >
> > unicode() does the following:
> >
> > 1. for instances, __str__ is called
> > 2. Unicode objects are returned as-is
> > 3. string objects or character buffers are used as basis for decoding
> > 4. decoding is applied to the character buffer and the results
> >    are returned
> >
> > I think we should perhaps merge the two approaches into one
> > which then applies all of the above in unicode() (and then
> > forget about unistr()). This might hide some type errors,
> > but since all other generic constructors behave more or less
> > in the same way, I think unicode() should too.
> 
> Yes, I would like to see these merged.  I noticed that e.g. there is
> special code to compare Unicode strings in the comparison code (I
> think I *could* get rid of this now we have rich comparisons, but I
> decided to put that off), and when I looked at it it uses the same set
> of conversions as unicode().  Some of these seem questionable to me --
> why do you try so many ways to get a string out of an object?  (On the
> other hand the merge of unicode() and unistr() might have this effect
> anyway...)

... because there are so many ways to get at string
representations of objects in Python at C level.

If we agree to merge the semantics of the two APIs, then str()
would have to change too: is this desirable ? (IMHO, yes)

Here's what we could do:

a) merge the semantics of unistr() into unicode()
b) apply the same semantics in str()
c) remove unistr() -- how's that for a short-living builtin ;)

About the semantics:

These should be backward compatible to str() in that everything
that worked before should continue to work after the merge.

A strawman for processing str() and unicode():

1. strings/Unicode is passed back as-is
2. tp_str is tried
3. the method __str__ is tried
4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
5. for str(): Unicode return values are converted to strings using
              the default encoding
   for unicode(): Unicode return values are passed back as-is;
              string return values are decoded according to the
              encoding parameter
6. the return object is type-checked: str() will always return
   a string object, unicode() always a Unicode object

Note that passing back Unicode is only allowed in case no encoding
was given. Otherwise an exception is raised: you can't decode
Unicode.
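
The strawman can be sketched in later-Python terms, where str plays the
role of the Unicode object and bytes the role of the 8-bit string.  This
is a loose analogue only, not the C implementation: to_text is a made-up
name, "ascii" stands in for the default encoding, and the buffer check is
moved ahead of __str__ because bytes objects there answer __str__ with a
repr-like string.

```python
def to_text(obj, encoding=None, errors="strict"):
    """Convert obj to text, loosely following the strawman's order."""
    # Step 1: text is passed back as-is, but only if no encoding was given.
    if isinstance(obj, str):
        if encoding is not None:
            raise TypeError("cannot decode an object that is already text")
        return obj
    # Step 4 (checked early in this sketch): byte buffers are decoded.
    if isinstance(obj, (bytes, bytearray, memoryview)):
        return bytes(obj).decode(encoding or "ascii", errors)
    # Steps 2-3: ask the object for its own string conversion.
    result = obj.__str__()
    # Step 5: a byte-string result is decoded with the requested encoding.
    if isinstance(result, bytes):
        result = result.decode(encoding or "ascii", errors)
    # Step 6: the final result is type-checked.
    if not isinstance(result, str):
        raise TypeError("__str__ returned a non-text object")
    return result


print(to_text(42))
print(to_text(b"abc"))
```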

As extension we could add encoding and error parameters to str()
as well. The result would be either an encoding of Unicode objects
passed back by tp_str or __str__ or a recoding of string objects
returned by checks 2, 3 or 4.

If we agree to take this approach, then we should remove the
unistr() Python API before the alpha ships.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at effbot.org  Fri Jan 19 11:19:06 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 11:19:06 +0100
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
References: <E14JPPW-0008Bt-00@usw-pr-cvs1.sourceforge.net> <20010119003854.F7731@lyra.org>
Message-ID: <010c01c08201$4b0ec050$e46940d5@hagrid>

greg wrote:
> I've never truly understood this. Is it because Windows cannot initialize
> (at load-time) a pointer to a data structure that is located in a different
> DLL?

Windows can do it (via DLL initialization code), but the compiler
doesn't generate initialization code for C programs.

you can compile the module as C++, but that's also a bit painful...

</F>




From jack at oratrix.nl  Fri Jan 19 12:02:00 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Fri, 19 Jan 2001 12:02:00 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
Message-ID: <20010119110200.9E455373C95@snelboot.oratrix.nl>

I get the impression that I'm currently seeing a non-NULL third argument in my 
(C) methods even though the method is called without keyword arguments.

Is this new semantics that I missed the discussion about, or is this a bug?
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++





From thomas at xs4all.net  Fri Jan 19 13:22:06 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:22:06 +0100
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: <14951.45672.806978.600944@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 18, 2001 at 10:20:08PM -0500
References: <14951.45672.806978.600944@localhost.localdomain>
Message-ID: <20010119132206.H17392@xs4all.nl>

On Thu, Jan 18, 2001 at 10:20:08PM -0500, Jeremy Hylton wrote:

> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Can't reconvert just disable the warning before importing regex ? That would
seem the sane thing to do, at least to me.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Jan 19 13:26:31 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:26:31 +0100
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 18, 2001 at 07:34:02PM -0500
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>
Message-ID: <20010119132631.I17392@xs4all.nl>

On Thu, Jan 18, 2001 at 07:34:02PM -0500, Guido van Rossum wrote:

> Through no fault of my own, email to guido at python.org (which includes
> the python-dev list) is currently suffering delays of 12-24 hours.  I
> have a feeling this is probably true for all mail going through
> python.org, so checkin messages and python-dev discussion have been
> greatly frustrated, with about 1 day to go until the planned 2.1a1
> release date!

I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
talked with Barry about it, too. It looks like it's clearing up a bit, now,
but it's confusing as hell, for sure ;)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Fri Jan 19 13:33:47 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 13:33:47 +0100
Subject: [Python-Dev] Dbm failure
In-Reply-To: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Fri, Jan 19, 2001 at 07:53:36PM +0200
References: <20010119175336.3B5A0A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010119133347.J17392@xs4all.nl>

On Fri, Jan 19, 2001 at 07:53:36PM +0200, Moshe Zadka wrote:
> test test_dbm skipped --  /home/moshez/prog/src/python/python/dist/src/build/lib.linux-i686-2.1/dbm.so: undefined symbol: dbm_firstkey
> Did it happen to anyone else? 

Yes, to me. You're suffering from the same thing I did: GNU sucks. Okay,
okay, not as much as MS products or most other UNIX software, but still ;)
The problem is a conflict between gdbm and glibc.

gdbm (1.7.3, which is what woody currently carries, not sure why it isn't
updated) offers a dbm interface/replacement, which includes a libdbm.(so|a)
and /usr/include/gdbm-ndbm.h. Glibc (or at least the debian package) *also*
offers a dbm interface/replacement, which consists of libdb1.(so|a) and
/usr/include/db1/ndbm.h (which needs /usr/include/db1/*.h). If you add
/usr/include/db1 to your include path, and -ldbm to the dbmmodule, you end
up with the wrong versions. You need either to include /usr/include/db1 in
your includepath and use -ldb1, or fix up dbmmodule.c so it includes
gdbm-ndbm.h and uses -ldbm.

I only figured this out yesterday, and sent Andrew a mail about that... I'm
not sure what the Right(tm) way to fix this is :( I've always loathed these
library/version mismatches :P

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mal at lemburg.com  Fri Jan 19 14:07:00 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 14:07:00 +0100
Subject: [Python-Dev] Standard install locations for Python ?
References: <200101181220.f0ICK6K10612@mira.informatik.hu-berlin.de> <20010118135640.G21503@kronos.cnri.reston.va.us>
Message-ID: <3A683BF4.BD74A979@lemburg.com>

Andrew Kuchling wrote:
> 
> On Thu, Jan 18, 2001 at 01:20:06PM +0100, Martin v. Loewis wrote:
> >On Unix, there appears to be no standard location, unless the
> >documentation consists of man pages or perhaps info files. So
> ><prefix>/share/doc is probably a place as good as any other.
> 
> This seems like a good suggestion.  Should docs go in
> <prefix>/share/doc/python<version>/, then?  Perhaps with
> subdirectories for different extensions?

Hmm, I guess it's better to follow bdist_rpm here: put
the docs into a subdir under .../doc/ using the package
name and version.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Fri Jan 19 15:39:13 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:39:13 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <20010119110200.9E455373C95@snelboot.oratrix.nl>
References: <20010119110200.9E455373C95@snelboot.oratrix.nl>
Message-ID: <14952.20881.848489.869512@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack at oratrix.nl> writes:

  JJ> I get the impression that I'm currently seeing a non-NULL third
  JJ> argument in my (C) methods even though the method is called
  JJ> without keyword arguments.

  JJ> Is this new semantics that I missed the discussion about, or is
  JJ> this a bug? 

This is a bug in the changes I made to the call function
implementation.  I wasn't sure what was supposed to happen to a
function that expected a kw argument but was called without one.  I
thought I saw some crashes when I passed NULL, so I changed the
implementation to pass an empty dictionary.

(Is the correct behavior documented anywhere?)

If a NULL value is correct, I'll update the implementation and see if
I can rediscover those crashes.

Jeremy



From nas at arctrix.com  Fri Jan 19 08:39:50 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 18 Jan 2001 23:39:50 -0800
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119000209.F17392@xs4all.nl>; from thomas@xs4all.net on Fri, Jan 19, 2001 at 12:02:09AM +0100
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl>
Message-ID: <20010118233950.A15636@glacier.fnational.com>

On Fri, Jan 19, 2001 at 12:02:09AM +0100, Thomas Wouters wrote:
> I can't find any such hackery in the source, but I also can't
> figure out how else it's working :)

I think you want to look at getpath.c.

  Neil



From jeremy at alum.mit.edu  Fri Jan 19 15:44:50 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 19 Jan 2001 09:44:50 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.107,2.108
In-Reply-To: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
References: <E14JND2-0004Tl-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14952.21218.416551.695660@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <gvanrossum at users.sourceforge.net> writes:

  GvR> Log Message: Changes to recursive-object comparisons, having to
  GvR> do with a test case I found where rich comparison of unequal
  GvR> recursive objects gave unintuituve results.  In a discussion
  GvR> with Tim, where we discovered that our intuition on when a<=b
  GvR> should be true was failing, we decided to outlaw ordering
  GvR> comparisons on recursive objects.  (Once we have fixed our
  GvR> intuition and designed a matching algorithm that's practical
  GvR> and reasonable to implement, we can allow such orderings
  GvR> again.)

Sounds sensible to me!  I was quite puzzled about what <= should
return for recursive objects.

  GvR> - Changed the nesting limit to a more reasonable small 20; this
  GvR>   only slows down comparisons of very deeply nested objects
  GvR>   (unlikely to occur in practice), while speeding up
  GvR>   comparisons of recursive objects (previously, this would
  GvR>   first waste time and space on 500 nested comparisons before
  GvR>   it would start detecting recursion).

After we talked through this code yesterday, I was also thinking that
the limit was too high :-).
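
For readers who have not seen the code, the idea in the log message
can be sketched roughly as follows. This is an illustration in
present-day Python, not the code in object.c: it handles only
equality of nested lists, the names equal() and _items_equal() are
made up, and treating a pair that is already being compared as equal
is an assumption of the sketch.

```python
NESTING_LIMIT = 20  # value taken from the log message above


def equal(a, b, depth=0, in_progress=None):
    """Equality for nested lists that may contain themselves."""
    if not (isinstance(a, list) and isinstance(b, list)):
        return a == b
    if depth < NESTING_LIMIT:
        # Cheap path: no bookkeeping for the first few levels.
        return _items_equal(a, b, depth, in_progress)
    # Deep nesting: remember which pairs are being compared right now.
    if in_progress is None:
        in_progress = set()
    pair = (id(a), id(b))
    if pair in in_progress:
        return True  # came back to the same pair: assume equal
    in_progress.add(pair)
    try:
        return _items_equal(a, b, depth, in_progress)
    finally:
        in_progress.discard(pair)


def _items_equal(a, b, depth, in_progress):
    """Compare two lists element by element, one level deeper."""
    if len(a) != len(b):
        return False
    return all(equal(x, y, depth + 1, in_progress) for x, y in zip(a, b))
```

Shallow structures never touch the in_progress set; two lists that
each contain themselves hit the limit after 20 levels and are then
resolved by the recursion check instead of 500 nested calls.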

Jeremy



From guido at digicool.com  Fri Jan 19 16:49:54 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:49:54 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: Your message of "Thu, 18 Jan 2001 18:56:04 EST."
             <200101182356.SAA19616@cj20424-a.reston1.va.home.com> 
References: <20010117235922.A12356@glacier.fnational.com>  
            <200101182356.SAA19616@cj20424-a.reston1.va.home.com> 
Message-ID: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>

[Neil]
> > A question: is it possible to break the Python static library up?

[me]
> Sounds cool to me.

Of course after Martin's response I agree with him -- let's keep it
one library.  (Although I expect that the combined effect of setup.py
and Neil's flat Makefile will still affect the infrastructure to build
extensions... :-( )

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 16:56:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 10:56:58 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: Your message of "Thu, 18 Jan 2001 16:53:15 PST."
             <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com> 
References: <58C671173DB6174A93E9ED88DCB0883DB863F1@red-msg-07.redmond.corp.microsoft.com> 
Message-ID: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>

Bill Tutt writes:
> From the internal support squad:
> Turns out the C standard explicitly says you can't have an input follow
> output on a stream without doing fflush or fseek in-between, to make sure
> the stdio buffer is cleared.  So this program is illegal.
> 
> They've gone and resolved it by design.

I'd just like to note for the record that this is exactly what I had
predicted.

I'd also like to note that I *agree*.  Tim seems to think there's a
race condition in the threading code, but it's really much simpler
than that: the same bug can easily be provoked with a single-threaded
program: just randomly read and write alternatingly.  So obviously the
people who wrote the threading code aren't interested in the bug,
because it's not in their code -- and the people who wrote the code
that doesn't behave well when abused are protected by the C standard...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:00:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:00:30 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:39:18 +0100."
             <3A676286.C33823B4@tismer.com> 
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>  
            <3A676286.C33823B4@tismer.com> 
Message-ID: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>

> Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> Like the other right-side operators __radd__, is it correct to
> think of
> 
>    __ge__  == __rle__
> 
> if __rle__ was written in the same fashion like __radd__ ?
> It looks semantically the same, although the reason for a
> call might be different.

Yes, it's semantically the same, and the reason for the call is the
same too ("the left argument doesn't support the operator so let's try
if the right one knows").

> And if my above view is right, would it perhaps be less
> confusing to use in fact __rle__ and __rlt__,
> or would it be more confusing, since __rlt__ would also be
> invoked left-to-right, implementing ">".

I prefer 6 new operators over 12 any day.  I can see no valid reason
why someone would want to overload a>b different than b<a, while
there are plenty of reasons why a+b and b+a should be different:
e.g. string concatenation.
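
A small sketch of the "reverse" behaviour, written for a current
Python; the Meters class and the calls list are made up for the
example. The int on the left does not know how to compare itself
with a Meters object, so the interpreter asks the right operand, and
it does so through __lt__ because no separate __rgt__ exists.

```python
calls = []  # records which method the interpreter actually invoked


class Meters:
    """Hypothetical length type that only defines __lt__."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        calls.append("Meters.__lt__")
        if isinstance(other, Meters):
            other = other.value
        if not isinstance(other, (int, float)):
            return NotImplemented
        return self.value < other


# int does not support ">" with a Meters operand, so the right
# operand's __lt__ is tried with the arguments swapped: 3 < 5.
result = 5 > Meters(3)
```

After this runs, result is true and calls contains a single entry,
"Meters.__lt__". Contrast with addition, where "ab" + "cd" and
"cd" + "ab" legitimately differ and a separate __radd__ is needed.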

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Fri Jan 19 17:14:55 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Fri, 19 Jan 2001 11:14:55 -0500
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <200101191549.KAA28699@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 10:49:54AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com>
Message-ID: <20010119111455.C25056@kronos.cnri.reston.va.us>

On Fri, Jan 19, 2001 at 10:49:54AM -0500, Guido van Rossum wrote:
>Of course after Martin's response I agree with him -- let's keep it
>one library.  (Although I expect that the combined effect of setup.py
>and Neil's flat Makefile will still affect the infrastructure to build
>extensions... :-( )

Which reminds me... there should really be a way to ignore the
setup.py stuff and use the old method.  How should that be done?  A
--use-makesetup flag to configure, maybe?

--amk




From guido at digicool.com  Fri Jan 19 17:14:20 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:14:20 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Thu, 18 Jan 2001 21:59:23 PST."
             <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> 
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101191614.LAA28881@cj20424-a.reston1.va.home.com>

>       if not condition:
> !         raise AssertionError(reason)

Wouldn't it be better if this raised TestFailed rather than
AssertionError?  Or is there code that catches the AssertionError?

[...grep...]

Yes, there's code that catches AssertionError:

(1) in Marc-Andre's own test_unicode.py;

(2) in test_re, which catches AssertionError and raises TestFailed
    instead.

Proposal:

(1) change verify() to raise TestFailed;

(2) change test_unicode.py to catch TestFailed instead.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tismer at tismer.com  Fri Jan 19 17:17:06 2001
From: tismer at tismer.com (Christian Tismer)
Date: Fri, 19 Jan 2001 17:17:06 +0100
Subject: [Python-Dev] Rich comparison confusion
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com>  
	            <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <3A686882.F78C1268@tismer.com>


Guido van Rossum wrote:
> 
> > Yes, the "inverse" is confusing. Is what you mean the "reverse" ?
> > Like the other right-side operators __radd__, is it correct to
> > think of
> >
> >    __ge__  == __rle__
> >
> > if __rle__ was written in the same fashion like __radd__ ?
> > It looks semantically the same, although the reason for a
> > call might be different.
> 
> Yes, it's semantically the same, and the reason for the call is the
> same too ("the left argument doesn't support the operator so let's try
> if the right one knows").
> 
> > And if my above view is right, would it perhaps be less
> > confusing to use in fact __rle__ and __rlt__,
> > or would it be more confusing, since __rlt__ would also be
> > invoked left-to-right, implementing ">".
> 
> I prefer 6 new operators over 12 any day.  I can see no valid reason
> why someone would want to overload a>b different than b<a, while
> there are plenty of reasons why a+b and b+a should be different:
> e.g. string concatenation.

Sure, I didn't want to introduce new operators, but use the
"r" versions for three of the six new operators. But I should have
read your proposal before. The confusion is not due to you,
but Skip had a read error, since you don't talk about inverses
at all:

Skip=="""
In the description
he states that __le__ and __ge__ are inverses as are __lt__ and __gt__.
"""

Truth=="""
There are no explicit "reversed argument" versions of
  these; instead, __lt__ and __gt__ are each other's reverse, likewise
  for __le__ and __ge__; __eq__ and __ne__ are their own reverse
  (similar at the C level).
"""

No reason for confusion at all > python-dev/null - ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From thomas at xs4all.net  Fri Jan 19 17:20:56 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:20:56 +0100
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <20010119172056.K17392@xs4all.nl>

I'm currently seeing a failure in test_ucn:

test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
error: Illegal Unicode character

It looks like one of the unicode literals in test_ucn is invalid, but it's
damned hard to pin down which:

Python 2.1a1 (#7, Jan 19 2001, 17:06:32) 
[GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> import test.test_ucn
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeError: Unicode-Escape decoding error: Illegal Unicode character
>>> 

I get the same crashes on FreeBSD and (Debian) Linux.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 17:26:34 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:26:34 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:02:09 +0100."
             <20010119000209.F17392@xs4all.nl> 
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us>  
            <20010119000209.F17392@xs4all.nl> 
Message-ID: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>

> This brings me to another point: how can 'make test' work at all ? Does
> python always check for './Lib' (and './Modules') for modules ?

Look at the logic in Modules/getpath.c, which calculates the initial
(default) sys.path.  It detects that it's running from the build tree
and then modifies the default path a bit to include Lib and Modules
relative to where the python executable was found.

> If that's
> specific for 'make test' and running python in the source distribution, that
> sounds like a bit of a weird hack. I can't find any such hackery in the
> source, but I also can't figure out how else it's working :)

It's not just for 'make test' -- it's to make life easy for developers
in general (and me in particular :-) who want to try out their hacks
without going through 'make install'.
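
In rough terms the detection amounts to something like the sketch
below. It is a Python illustration of the idea only, not a
translation of getpath.c: the function name is made up, and using
Modules/Setup as the landmark file that marks a build tree is an
assumption of the sketch.

```python
import os


def build_tree_path(executable, landmark=os.path.join("Modules", "Setup")):
    """Return [Lib, Modules] next to the executable if it sits in a build tree.

    Returns None otherwise, in which case the caller would fall back
    to the installed prefix.
    """
    exec_dir = os.path.dirname(os.path.abspath(executable))
    if os.path.isfile(os.path.join(exec_dir, landmark)):
        return [os.path.join(exec_dir, "Lib"),
                os.path.join(exec_dir, "Modules")]
    return None
```

Calling build_tree_path("./python") from the top of a source tree
would yield the two directories relative to the executable; for an
installed interpreter it returns None.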

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 19 17:34:58 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 17:34:58 +0100
Subject: [Python-Dev] Re: test_support.py
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>
Message-ID: <3A686CB2.C75D184D@lemburg.com>

Guido van Rossum wrote:
> 
> >       if not condition:
> > !         raise AssertionError(reason)
> 
> Wouldn't it be better if this raised TestFailed rather than
> AssertionError?  Or is there code that catches the AssertionError?
> 
> [...grep...]
> 
> Yes, there's code that catches AssertionError:
> 
> (1) in Marc-Andre's own test_unicode.py;
> 
> (2) in test_re, which catches AssertionError and raises TestFailed
>     instead.
> 
> Proposal:
> 
> (1) change verify() to raise TestFailed;
> 
> (2) change test_unicode.py to catch TestFailed instead.

+1

Why not simply make TestFailed a subclass of AssertionError ?
Then we wouldn't have to fear about breaking test code...

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Fri Jan 19 17:34:15 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 17:34:15 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101191626.LAA29165@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 11:26:34AM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>
Message-ID: <20010119173415.M17295@xs4all.nl>

On Fri, Jan 19, 2001 at 11:26:34AM -0500, Guido van Rossum wrote:
> > This brings me to another point: how can 'make test' work at all ? Does
> > python always check for './Lib' (and './Modules') for modules ?

> Look at the logic in Modules/getpath.c, which calculates the initial
> (default) sys.path.  It detects that it's running from the build tree
> and then modifies the default path a bit to include Lib and Modules
> relative to where the python executable was found.

Aye, I found it now.

> > If that's
> > specific for 'make test' and running python in the source distribution, that
> > sounds like a bit of a weird hack. I can't find any such hackery in the
> > source, but I also can't figure out how else it's working :)

> It's not just for 'make test' -- it's to make life easy for developers
> in general (and me in particular :-) who want to try out their hacks
> without going through 'make install'.

Well, after some old SF movies & some sleep, I realized that :) But it is
going to have to change: you now have to include the build tree as well, and
that is quite a bit more difficult to figure out. I'd suggest a 'make run'
that calls python with the appropriate PYTHONPATH environment variable, but
that doesn't cover test-scripts (which I use a lot myself.)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 17:34:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:34:45 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:02:00 +0100."
             <20010119110200.9E455373C95@snelboot.oratrix.nl> 
References: <20010119110200.9E455373C95@snelboot.oratrix.nl> 
Message-ID: <200101191634.LAA29239@cj20424-a.reston1.va.home.com>

> I get the impression that I'm currently seeing a non-NULL third
> argument in my (C) methods even though the method is called without
> keyword arguments.

> Is this new semantics that I missed the discussion about, or is this a bug?

Can't tell without spending more time looking at the code and
experimenting than I can afford today; but Jeremy refactored the
calling code, and it could be that you're seeing an empty dictionary
instead of a NULL.

Do you really need the NULL?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:41:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:41:02 -0500
Subject: [Python-Dev] Mail delays and SourceForge bugs
In-Reply-To: Your message of "Fri, 19 Jan 2001 13:26:31 +0100."
             <20010119132631.I17392@xs4all.nl> 
References: <200101190034.TAA26664@cj20424-a.reston1.va.home.com>  
            <20010119132631.I17392@xs4all.nl> 
Message-ID: <200101191641.LAA29324@cj20424-a.reston1.va.home.com>

> I doubt it's (just) you, Guido. I'm seeing similar delays, and I already
> talked with Barry about it, too. It looks like it's clearing up a bit, now,
> but it's confusing as hell, for sure ;)

It's worse for me though than for most people: for others, only mail
sent through mailman at mail.python.org is affected.  For me, mail
sent directly to guido at python.org is affected too (which is why I've
changed my From address again to that old standby,
guido at digicool.com).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 17:53:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 11:53:39 -0500
Subject: [Python-Dev] deprecated regex used by un-deprecated modules
In-Reply-To: Your message of "Thu, 18 Jan 2001 22:20:08 EST."
             <14951.45672.806978.600944@localhost.localdomain> 
References: <14951.45672.806978.600944@localhost.localdomain> 
Message-ID: <200101191653.LAA29774@cj20424-a.reston1.va.home.com>

> There are several modules in the standard library that use the regex
> module.  When they are imported, they print a warning about using a
> deprecated module.  I think this is bad form.  Either the modules that
> depend on regex should be updated to use re or they should be
> deprecated themselves.  
> 
> I discovered the following offenders:
> asynchat
> knee
> poplib
> reconvert
> 
> I would suggest fixing asynchat and poplib and deprecating knee.  The
> reconvert module may be a special case.

Agreed.  There's an idiom to disable the warning, which you can find
in regsub.py:

    import warnings
    warnings.filterwarnings("ignore", "", DeprecationWarning, __name__)

(The "" should be replaced by the specific warning message though.)
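
A self-contained sketch of what filling in the message looks like,
for a current Python. The warning texts and the function names are
illustrative assumptions, and the module argument from the idiom
above is left out so that the example does not depend on where it is
run. The message argument is a regular expression matched against
the start of the warning text.

```python
import warnings


def import_deprecated_module():
    """Stand-in for 'import regex': emits the deprecation warning."""
    warnings.warn("the regex module is deprecated; please use the re module",
                  DeprecationWarning, stacklevel=2)


def demo():
    """Return the warning messages that get past the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Ignore only this specific warning; later filters take priority.
        warnings.filterwarnings("ignore", "the regex module is deprecated",
                                DeprecationWarning)
        import_deprecated_module()
        warnings.warn("something else is deprecated", DeprecationWarning)
    return [str(w.message) for w in caught]
```

Here demo() reports only the second warning, which is the point of
naming the message: unrelated deprecation warnings stay visible.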

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Fri Jan 19 18:21:28 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 12:21:28 -0500
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:20:56 +0100."
             <20010119172056.K17392@xs4all.nl> 
References: <20010119172056.K17392@xs4all.nl> 
Message-ID: <200101191721.MAA31937@cj20424-a.reston1.va.home.com>

> I'm currently seeing a failure in test_ucn:
> 
> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character
> 
> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

Feels to me like there's a bug in the string literal processing that
makes *any* string literal containing \N{...} fail during code
generation.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Fri Jan 19 18:37:41 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 18:37:41 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>
Message-ID: <023801c0823e$86fcedc0$e46940d5@hagrid>

> test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> error: Illegal Unicode character

Make sure you rebuild Objects/unicodeobject.o and the
ucnhash extension.  If they build without warnings, run
the following script.

import ucnhash
count = 0
for code in range(65536):
    try:
        name = ucnhash.getname(code)
        if ucnhash.getcode(name) != code:
            print name
        count += 1
    except ValueError:
        pass
print count

if it prints anything but "10538", let me know.
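
For anyone reading this without a ucnhash module at hand, roughly
the same round-trip check can be written against the unicodedata
module of a current Python. This is a sketch, not the script above:
check_bmp_names() is a made-up name, and the count it returns
depends on the Unicode version of the interpreter, so it will not
be 10538.

```python
import unicodedata


def check_bmp_names():
    """Round-trip every named BMP character through name() and lookup()."""
    count = 0
    mismatches = []
    for code in range(65536):
        char = chr(code)
        try:
            name = unicodedata.name(char)
        except ValueError:
            continue  # this code point has no name
        if unicodedata.lookup(name) != char:
            mismatches.append(name)
        count += 1
    return count, mismatches
```

A healthy build returns an empty mismatch list.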

> It looks like one of the unicode literals in test_ucn is invalid, but it's
> damned hard to pin down which:

If the ucnhash extension cannot be found, the script won't
even compile...  shouldn't be too hard to fix.

</F>




From Barrett at stsci.edu  Fri Jan 19 18:32:26 2001
From: Barrett at stsci.edu (Paul Barrett)
Date: Fri, 19 Jan 2001 12:32:26 -0500 (EST)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
References: <14949.46995.259157.871323@beluga.mojam.com>
	<200101171609.LAA04102@cj20424-a.reston1.va.home.com>
	<3A676286.C33823B4@tismer.com>
	<200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <14952.30800.112503.123675@nem-srvr.stsci.edu>

Guido van Rossum writes:
 > 
 > ... I can see no valid reason why someone would want to overload
 > a>b different than b<a, ... 
 > 

I agree.  But this assumes that the result of A<B and B>A is a
collection of Booleans.  In the Interactive Data Language (IDL) these
operators are essentially mapped to ceiling and floor functions which
are not commutative.  I personally find this silly, but IDL users
coming to Python may be surprised when the comparison of two Numeric
arrays returns a Boolean-like result.
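
To make the "collection of Booleans" case concrete, here is a
minimal sketch for a current Python. The Array class is made up for
the example and is not the Numeric implementation; it only shows
that with rich comparisons A<B may return one result per element,
and that B>A gives the same answer.

```python
class Array:
    """Hypothetical array type with element-wise comparisons."""

    def __init__(self, values):
        self.values = list(values)

    def __lt__(self, other):
        return [x < y for x, y in zip(self.values, other.values)]

    def __gt__(self, other):
        return [x > y for x, y in zip(self.values, other.values)]


A = Array([1, 5, 3])
B = Array([2, 4, 3])
less = A < B      # one Boolean per element
greater = B > A   # same question asked from the other side
```

Both less and greater come out as [True, False, False].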

-- 
Dr. Paul Barrett       Space Telescope Science Institute
Phone: 410-338-4475    ESS/Science Software Group
FAX:   410-338-4767    Baltimore, MD 21218



From nas at arctrix.com  Fri Jan 19 11:43:12 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 19 Jan 2001 02:43:12 -0800
Subject: [Python-Dev] new Makefile.in
In-Reply-To: <20010119111455.C25056@kronos.cnri.reston.va.us>; from akuchlin@mems-exchange.org on Fri, Jan 19, 2001 at 11:14:55AM -0500
References: <20010117235922.A12356@glacier.fnational.com> <200101182356.SAA19616@cj20424-a.reston1.va.home.com> <200101191549.KAA28699@cj20424-a.reston1.va.home.com> <20010119111455.C25056@kronos.cnri.reston.va.us>
Message-ID: <20010119024312.A16179@glacier.fnational.com>

On Fri, Jan 19, 2001 at 11:14:55AM -0500, Andrew Kuchling wrote:
> Which reminds me... there should really be a way to ignore the
> setup.py stuff and use the old method.  How should that be done?  A
> --use-makesetup flag to configure, maybe?

A different target for make would be easy.

  Neil



From fredrik at effbot.org  Fri Jan 19 19:13:15 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:13:15 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03a201c08243$7fa62af0$e46940d5@hagrid>

thomas wrote:
> > I'm currently seeing a failure in test_ucn:
> > 
> > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > error: Illegal Unicode character
> > 
> > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > damned hard to pin down which:
> 
> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

I took another look at the error message: the only explanation
I can see here is that the lookup succeeds, but the call to ucn-
hash returns a value larger than 0x10ffff.

What is Py_UCS4 set to under gcc?

Confusing /F




From guido at digicool.com  Fri Jan 19 19:11:21 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:11:21 -0500
Subject: [Python-Dev] Re: test_support.py
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:58 +0100."
             <3A686CB2.C75D184D@lemburg.com> 
References: <E14JUa3-0006xu-00@usw-pr-cvs1.sourceforge.net> <200101191614.LAA28881@cj20424-a.reston1.va.home.com>  
            <3A686CB2.C75D184D@lemburg.com> 
Message-ID: <200101191811.NAA32539@cj20424-a.reston1.va.home.com>

> > Proposal:
> > 
> > (1) change verify() to raise TestFailed;
> > 
> > (2) change test_unicode.py to catch TestFailed instead.
> 
> +1
> 
> Why not simply make TestFailed a subclass of AssertionError ?
> Then we wouldn't have to fear about breaking test code...

No, I'd rather see the two separated.  There can be assert statements
in the modules we're testing, and I'd prefer not to see those caught
by test code that is trying to catch TestFailed.

I'll check this in momentarily.
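
A minimal sketch of the separation, assuming a simplified verify() and
TestFailed (the real ones in test_support.py may differ in detail, and
module_under_test and run are hypothetical names):

```python
class TestFailed(Exception):
    """Raised by test helpers; deliberately not an AssertionError."""


def verify(condition, reason="test failed"):
    # Raise TestFailed instead of relying on an assert statement.
    if not condition:
        raise TestFailed(reason)


def module_under_test():
    # Hypothetical code being tested; its own assert fails.
    assert 1 + 1 == 3, "internal invariant broken"


def run(test):
    # Catch only TestFailed; an AssertionError from the module propagates.
    try:
        test()
    except TestFailed as exc:
        return "TestFailed: %s" % exc
    return "ok"
```

With TestFailed as a subclass of AssertionError, run(module_under_test)
would report the module's own assert as a test failure instead of letting
it through.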

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Fri Jan 19 19:19:37 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 19 Jan 2001 19:19:37 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com>
Message-ID: <03b301c08244$627f22a0$e46940d5@hagrid>

> Feels to me like there's a bug in the string literal processing that
> makes *any* string literal containing \N{...} fail during code
> generation.

umm.  can anyone explain how this can happen:

python ../lib/test/regrtest.py test_ucn
test_ucn
1 test OK.

python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

how can a test that works under regrtest.py fail when
it's run separately?  what am I missing here?

</F>




From mal at lemburg.com  Fri Jan 19 19:48:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 19:48:53 +0100
Subject: [Python-Dev] test_ucn errors ?
References: <20010119172056.K17392@xs4all.nl>  <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03a201c08243$7fa62af0$e46940d5@hagrid>
Message-ID: <3A688C15.8C9CFF46@lemburg.com>

Fredrik Lundh wrote:
> 
> thomas wrote:
> > > I'm currently seeing a failure in test_ucn:
> > >
> > > test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
> > > error: Illegal Unicode character
> > >
> > > It looks like one of the unicode literals in test_ucn is invalid, but it's
> > > damned hard to pin down which:
> >
> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> I took another look at the error message: the only explanation
> I can see here is that the lookup succeeds, but the call to ucn-
> hash returns a value larger than 0x10ffff.
> 
> What is Py_UCS4 set to under gcc?

Should be "unsigned int" on all modern Intel platforms.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 19 19:48:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 13:48:45 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Your message of "Fri, 19 Jan 2001 12:32:26 EST."
             <14952.30800.112503.123675@nem-srvr.stsci.edu> 
References: <14949.46995.259157.871323@beluga.mojam.com> <200101171609.LAA04102@cj20424-a.reston1.va.home.com> <3A676286.C33823B4@tismer.com> <200101191600.LAA28788@cj20424-a.reston1.va.home.com>  
            <14952.30800.112503.123675@nem-srvr.stsci.edu> 
Message-ID: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>

>  > ... I can see no valid reason why someone would want to overload
>  > a>b different than b<a, ... 
>  > 
> 
> I agree.  But this assumes that the result of A<B and B>A is a
> collection of Booleans.  In the Interactive Data Language (IDL) these
> operators are essentially mapped to ceiling and floor functions which
> are not commutative.  I personally find this silly, but IDL users
> coming to Python may be surprised when the comparison of two Numeric
> arrays returns a Boolean-like result.

This means that Python can't be used to emulate this part of IDL.  I
don't understand how these can be not commutative unless they have a
side effect on the left argument, and that's not possible in Python
anyway.
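
To illustrate the assumption that a>b and b<a mirror each other, here is a
small sketch as I understand the rich comparison design; the Version class
is hypothetical and defines only __lt__:

```python
class Version:
    """Hypothetical class that implements only the < operator."""

    def __init__(self, n):
        self.n = n

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self.n < other.n


a, b = Version(1), Version(2)

# a < b calls a.__lt__(b) directly.
less = a < b

# b > a finds no usable __gt__, so the reflected a.__lt__(b) is tried.
greater = b > a
```

Both expressions end up in the same method, which is why overloading the
two operators differently has no obvious use.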

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Fri Jan 19 20:18:04 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 14:18:04 -0500
Subject: [Python-Dev] test_ucn errors ?
Message-ID: <LNBBLJKPBEHFEDALKOLCGELEIJAA.tim.one@home.com>

[/F]
> umm.  can anyone explain how this can happen:
>
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> 1 test OK.
>
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name
>
> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Dunno, but add to the pile of mysteries that you're unique.  Here on
Win98SE:

python ../lib/test/regrtest.py test_ucn
test_ucn
test test_ucn crashed -- exceptions.UnicodeError: Unicode-Escape decoding
error: Invalid Unicode Character Name
1 test failed: test_ucn


python ../lib/test/test_ucn.py
UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name


I suggest you reformat your hard drive, and reinstall Windows <wink>.




From mwh21 at cam.ac.uk  Fri Jan 19 20:25:03 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 19 Jan 2001 19:25:03 +0000
Subject: [Python-Dev] test_ucn errors ?
In-Reply-To: "Fredrik Lundh"'s message of "Fri, 19 Jan 2001 19:19:37 +0100"
References: <20010119172056.K17392@xs4all.nl> <200101191721.MAA31937@cj20424-a.reston1.va.home.com> <03b301c08244$627f22a0$e46940d5@hagrid>
Message-ID: <m3n1cn1h0w.fsf@atrus.jesus.cam.ac.uk>

"Fredrik Lundh" <fredrik at effbot.org> writes:

> > Feels to me like there's a bug in the string literal processing that
> > makes *any* string literal containing \N{...} fail during code
> > generation.
> 
> umm.  can anyone explain how this can happen:
> 
> python ../lib/test/regrtest.py test_ucn
> test_ucn
> 1 test OK.

This will run the .pyc if present?
 
> python ../lib/test/test_ucn.py
> UnicodeError: Unicode-Escape decoding error: Invalid Unicode Character Name

This won't?  

Note: no traceback -> (in effect, if not design) compile time error.

> how can a test that works under regrtest.py fail when
> it's run separately?  what am I missing here?

Well, this is just my guess.
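
A sketch of why a bad \N{...} escape shows up without a traceback: the
literal is rejected while the source is compiled, before any line runs.
This assumes a current Python 3 interpreter, where the failure surfaces as
SyntaxError and not the UnicodeError quoted above; compiles is a
hypothetical helper name:

```python
def compiles(source):
    """Return True if source compiles, False on a compile-time error."""
    try:
        compile(source, "<sketch>", "eval")
    except SyntaxError:
        return False
    return True


# Source text of two string literals, one with a made-up character name.
good = '"\\N{LATIN SMALL LETTER A}"'
bad = '"\\N{NO SUCH CHARACTER NAME}"'

good_ok = compiles(good)
bad_ok = compiles(bad)
```

A stale .pyc skips this compile step entirely, which would fit the
difference between the two runs.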

Cheers,
M.

-- 
  Well, you pretty much need Microsoft stuff to get misbehaviours bad
  enough to actually tear the time-space continuum.  Luckily for you,
  MS Internet Explorer is available for Solaris.
                              -- Calle Dybedahl, alt.sysadmin.recovery




From skip at mojam.com  Fri Jan 19 20:55:29 2001
From: skip at mojam.com (Skip Montanaro)
Date: Fri, 19 Jan 2001 13:55:29 -0600 (CST)
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <20010119173415.M17295@xs4all.nl>
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us>
	<20010117234925.A17392@xs4all.nl>
	<20010118004400.B17392@xs4all.nl>
	<20010118103036.B21503@kronos.cnri.reston.va.us>
	<20010119000209.F17392@xs4all.nl>
	<200101191626.LAA29165@cj20424-a.reston1.va.home.com>
	<20010119173415.M17295@xs4all.nl>
Message-ID: <14952.39857.83065.24889@beluga.mojam.com>

    Thomas> But it is going to have to change: you now have to include the
    Thomas> build tree as well, and that is quite a bit more difficult to
    Thomas> figure out. I'd suggest a 'make run' that calls python with the
    Thomas> appropriate PYTHONPATH environment variable, but that doesn't
    Thomas> cover test-scripts (which I use a lot myself.)

Doesn't Andrew's new "platform" target in the top-level Makefile do the
right thing?  It *should* generate a platform-specific path to the correct
build subdirectory.

Skip



From MarkH at ActiveState.com  Fri Jan 19 21:11:02 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Fri, 19 Jan 2001 12:11:02 -0800
Subject: [Python-Dev] initializing ob_type (was: CVS: python/dist/src/Modules _cursesmodule.c,2.46,2.47)
In-Reply-To: <010c01c08201$4b0ec050$e46940d5@hagrid>
Message-ID: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>

> you can compile the module as C++, but that's also a bit painful...

My understanding is that the C std doesn't guarantee the order of static
object initialization, whereas C++ does provide these semantics.  At least
that is the excuse I found when digging into this some years ago.

Can't-believe-I-mentioned-the-C-standard-while-Tim-is-listening ly,

Mark.




From guido at digicool.com  Fri Jan 19 21:44:53 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 15:44:53 -0500
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. unistr()
In-Reply-To: Your message of "Fri, 19 Jan 2001 10:58:08 +0100."
             <3A680FB0.AED2DB55@lemburg.com> 
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>  
            <3A680FB0.AED2DB55@lemburg.com> 
Message-ID: <200101192044.PAA04154@cj20424-a.reston1.va.home.com>

> If we agree to merge the semantics of the two APIs, then str()
> would have to change too: is this desirable ? (IMHO, yes)

Not clear.  Which is why I'm backing off from my initial support for
merging the two.

I believe unicode() (which is really just an interface to
PyUnicode_FromEncodedObject()) currently already does too much.  In
particular this whole business with calling __str__ on instances seems
to me to be unnecessary.  I think it should *only* bother to look for
something that supports the buffer interface (checking for regular
strings only as a tiny optimization), or existing unicode objects.

> Here's what we could do:
> 
> a) merge the semantics of unistr() into unicode()
> b) apply the same semantics in str()
> c) remove unistr() -- how's that for a short-living builtin ;)
> 
> About the semantics:
> 
> These should be backward compatible to str() in that everything
> that worked before should continue to work after the merge.
> 
> A strawman for processing str() and unicode():
> 
> 1. strings/Unicode is passed back as-is

I hope you mean str() passes 8-bit strings back as-is, unicode()
passes Unicode strings back as-is, right?

> 2. tp_str is tried
> 3. the method __str__ is tried

Shouldn't have to -- instances should define tp_str and all the magic
for calling __str__ should be there.  I don't understand why it's not
done that way, probably just for historical reasons.  I also don't
think __str__ should be tried for non-instance types.

But, more seriously, I believe tp_str or __str__ shouldn't be tried at
all by unicode().

> 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> 5. for str(): Unicode return values are converted to strings using
>               the default encoding
>    for unicode(): Unicode return values are passed back as-is;
>               string return values are decoded according to the
>               encoding parameter
> 6. the return object is type-checked: str() will always return
>    a string object, unicode() always a Unicode object
> 
> Note that passing back Unicode is only allowed in case no encoding
> was given. Otherwise an exception is raised: you can't decode
> Unicode.
> 
> As extension we could add encoding and error parameters to str()
> as well. The result would be either an encoding of Unicode objects
> passed back by tp_str or __str__ or a recoding of string objects
> returned by checks 2, 3 or 4.

Naaaah!

> If we agree to take this approach, then we should remove the
> unistr() Python API before the alpha ships.

Frankly, I believe we need more time to sort this out, and therefore I
propose to remove the unistr() built-in before the release.  Marc,
would you do the honors?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Fri Jan 19 21:55:53 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Fri, 19 Jan 2001 21:55:53 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <14952.39857.83065.24889@beluga.mojam.com>; from skip@mojam.com on Fri, Jan 19, 2001 at 01:55:29PM -0600
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <14952.39857.83065.24889@beluga.mojam.com>
Message-ID: <20010119215552.O17295@xs4all.nl>

On Fri, Jan 19, 2001 at 01:55:29PM -0600, Skip Montanaro wrote:
> 
>     Thomas> But it is going to have to change: you now have to include the
>     Thomas> build tree as well, and that is quite a bit more difficult to
>     Thomas> figure out. I'd suggest a 'make run' that calls python with the
>     Thomas> appropriate PYTHONPATH environment variable, but that doesn't
>     Thomas> cover test-scripts (which I use a lot myself.)

> Doesn't Andrew's new "platform" target in the top-level Makefile do the
> right thing?  It *should* generate a platform-specific path to the correct
> build subdirectory.

Yes, it does, that's what I meant with 'make run'. But that isn't quite as
user-friendly as the current method. How would you run a script with the
current python ? 'make SCRIPT=./spamtest.py runscript' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Fri Jan 19 23:06:03 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:06:03 -0500
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: Your message of "Fri, 19 Jan 2001 17:34:15 +0100."
             <20010119173415.M17295@xs4all.nl> 
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com>  
            <20010119173415.M17295@xs4all.nl> 
Message-ID: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>

I finally figured out the best way to fix sys.path to find shared modules
built by setup.py.  At first I thought I had to add it to getpath.c,
but the problem is that the name is calculated by calling
distutils.util.get_platform(), and that requires a working Python
interpreter, so we'd end up with a chicken-or-egg situation.

So instead I added 5 lines to site.py, which tests for
os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
test only succeeds when running from the build directory.  Then it
calls distutils.util.get_platform() and uses the result to calculate
the correct directory name, which is then appended to sys.path.

Yes, this slows down startup (it imports a large portion of the
distutils package), but I don't care -- after all this is mostly for
me so I can play with the interpreter right after I've built it,
right?
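
Roughly, the logic looks like the sketch below.  This is not the code that
was checked in: build_lib_dir is a hypothetical helper, the
"build/lib.<platform>-<version>" layout is an assumption, and
sysconfig.get_platform() stands in for distutils.util.get_platform() so
the sketch runs on interpreters that no longer ship distutils.

```python
import os
import posixpath
import sys
import sysconfig


def build_lib_dir(last_entry, platform=None, version=None, os_name=os.name):
    """Guess the setup.py build directory, or None outside a build tree."""
    # Only applies on posix, and only when the last sys.path entry is
    # the Modules directory of the source tree.
    if os_name != "posix" or not last_entry.endswith("/Modules"):
        return None
    if platform is None:
        platform = sysconfig.get_platform()
    if version is None:
        version = "%d.%d" % sys.version_info[:2]
    root = posixpath.dirname(last_entry)
    # Assumed layout: <root>/build/lib.<platform>-<version>
    return posixpath.join(root, "build", "lib.%s-%s" % (platform, version))


# Usage in a site.py-like hook (hypothetical):
# extra = build_lib_dir(sys.path[-1])
# if extra is not None:
#     sys.path.append(extra)
```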

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 19 22:32:34 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:32:34 +0100
Subject: [Python-Dev] Re: Type-converting functions, esp. unicode() vs. 
 unistr()
References: <Pine.LNX.4.10.10101180215590.1568-100000@skuld.kingmanhall.org> <3A66CAC2.74FC894@lemburg.com> <200101190104.UAA27056@cj20424-a.reston1.va.home.com>  
	            <3A680FB0.AED2DB55@lemburg.com> <200101192044.PAA04154@cj20424-a.reston1.va.home.com>
Message-ID: <3A68B272.BBBAECD1@lemburg.com>

Guido van Rossum wrote:
> 
> > If we agree to merge the semantics of the two APIs, then str()
> > would have to change too: is this desirable ? (IMHO, yes)
> 
> Not clear.  Which is why I'm backing off from my initial support for
> merging the two.
> 
> I believe unicode() (which is really just an interface to
> PyUnicode_FromEncodedObject()) currently already does too much.  In
> particular this whole business with calling __str__ on instances seems
> to me to be unnecessary.  I think it should *only* bother to look for
> something that supports the buffer interface (checking for regular
> strings only as a tiny optimization), or existing unicode objects.

Hmm, unicode() should (just like str()) take an object and
convert it to a Unicode string. Since many objects don't
support the tp_str slot (instances don't for some reason -- just
as they don't support tp_call), I had to add some special cases to
make Python instances compatible with Unicode in the same way
str() does.

What I think is really needed is a concept for "stringification"
in Python. We currently have these schemes:

1. tp_str
2. method __str__ (not only of Python instances, but any object)
3. character buffer interface

These three could easily be unified into the tp_str slot:
e.g. tp_str could do the necessary magic to call __str__
or the buffer interface.

Note that the same is true for e.g. tp_call -- the special
cases we have in ceval.c for the different builtin callable
objects would not be necessary if they would implement tp_call.

> > Here's what we could do:
> >
> > a) merge the semantics of unistr() into unicode()
> > b) apply the same semantics in str()
> > c) remove unistr() -- how's that for a short-living builtin ;)
> >
> > About the semantics:
> >
> > These should be backward compatible to str() in that everything
> > that worked before should continue to work after the merge.
> >
> > A strawman for processing str() and unicode():
> >
> > 1. strings/Unicode is passed back as-is
> 
> I hope you mean str() passes 8-bit strings back as-is, unicode()
> passes Unicode strings back as-is, right?

Right.
 
> > 2. tp_str is tried
> > 3. the method __str__ is tried
> 
> Shouldn't have to -- instances should define tp_str and all the magic
> for calling __str__ should be there.  I don't understand why it's not
> done that way, probably just for historical reasons.  I also don't
> think __str__ should be tried for non-instance types.

Ok.
 
> But, more seriously, I believe tp_str or __str__ shouldn't be tried at
> all by unicode().

Hmm, but how would you implement generic conversion to Unicode 
then ? 

We'll need some way for instances (and other types) to
provide a conversion to Unicode. Some time ago we discussed this
issue and came to the conclusion that tp_str should be allowed
to return Unicode data instead of inventing a new tp_unicode
slot for this purpose.

> > 4. the PyObject_AsCharBuffer() API is tried (bf_getcharbuffer)
> > 5. for str(): Unicode return values are converted to strings using
> >               the default encoding
> >    for unicode(): Unicode return values are passed back as-is;
> >               string return values are decoded according to the
> >               encoding parameter
> > 6. the return object is type-checked: str() will always return
> >    a string object, unicode() always a Unicode object
> >
> > Note that passing back Unicode is only allowed in case no encoding
> > was given. Otherwise an exception is raised: you can't decode
> > Unicode.
> >
> > As extension we could add encoding and error parameters to str()
> > as well. The result would be either an encoding of Unicode objects
> > passed back by tp_str or __str__ or a recoding of string objects
> > returned by checks 2, 3 or 4.
> 
> Naaaah!

Would be nice for symmetry and useful in the light of making
Unicode the only string type in Py4k ;-)
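
A sketch of what encoding and error parameters on str() could look like,
written against Python 3, where (as far as I know) str() accepts exactly
these parameters and also refuses to decode text that is already Unicode.
The decode_again helper is a hypothetical name:

```python
# UTF-8 encoded bytes for "cafe" with an acute accent on the e.
data = b"caf\xc3\xa9"

# Decode with an explicit encoding and error handler.
text = str(data, "utf-8", "strict")


def decode_again(value):
    """Return True if decoding an already-decoded string is rejected."""
    try:
        str(value, "utf-8")
    except TypeError:
        return True
    return False
```

This matches the "you can't decode Unicode" rule in the strawman above.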
 
> > If we agree to take this approach, then we should remove the
> > unistr() Python API before the alpha ships.
> 
> Frankly, I believe we need more time to sort this out, and therefore I
> propose to remove the unistr() built-in before the release.  Marc,
> would you do the honors?

Ok. 

I'll remove the builtin and the docs, but will leave the
PyObject_Unicode() API enabled.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From uche.ogbuji at fourthought.com  Fri Jan 19 22:42:40 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Fri, 19 Jan 2001 14:42:40 -0700
Subject: [Python-Dev] Extension doc bugs
Message-ID: <200101192142.OAA29168@localhost.localdomain>

I'm using the bleeding-edge documentation at 

http://python.sourceforge.net/devel-docs/api/api.html

I know that it won't be complete until someone has the time to finish it, but 
I've run into a few places where it's completely wrong.

For instance, from the object protocol docs: 

"""
int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
      Compare the values of o1 and o2 using a routine provided by o1, if one
      exists, otherwise with a routine provided by o2. The result of the
      comparison is returned in result. Returns -1 on failure. This is the
      equivalent of the Python statement "result = cmp(o1, o2)".
"""

After getting weird behavior implementing this, and then squinting at the 
relevant Python 2.0 code, it appears that in actuality the Cmp function 
returns the direct comparison result (-1, 0 or 1, based on the ordering of the 
parameters); furthermore, there is no such "result" argument.
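
For reference, the three-way convention described here can be sketched at
the Python level.  The cmp() defined below is a stand-in so the sketch
runs on interpreters that lack the builtin; it is not the C
implementation:

```python
def cmp(o1, o2):
    """Three-way comparison: -1 if o1 < o2, 0 if equal, 1 if o1 > o2."""
    # Booleans subtract as integers, giving -1, 0 or 1.
    return (o1 > o2) - (o1 < o2)


results = [cmp(1, 2), cmp(2, 2), cmp(3, 2)]
```

Note that a direct -1 return for "less than" collides with -1 as a failure
code, which is presumably why the documented signature has a separate
result argument.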

4Suite has a lot of C extension code developed by squinting at Python sources 
and long gdb sessions and I have a feeling that in many cases we're taking up 
hacks that would get us into trouble across versions, and all that; but the 
"official" interfaces and behaviors are not documented (or only poorly 
documented).  In general, the C API docs are in a rather sorry state and 
though I doubt I could do a great deal about fixing it, I'd be interested in 
discussion of the matter, and perhaps making what contribution I can.

Is the doc-sig the best place for this?  My experience there wouldn't seem to 
encourage this conclusion (most of the discussion is of docstring syntax and 
neat-o automagic document generators).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From mal at lemburg.com  Fri Jan 19 22:46:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:46:24 +0100
Subject: [Python-Dev] readline and setup.py
Message-ID: <3A68B5B0.771412F7@lemburg.com>

The new setup.py procedure for Python causes readline not to
be built on my machine. Instead I get a linker error telling
me that termcap is not found.

Looking at my old Setup file, I have this line:

readline readline.c \
	 -I/usr/include/readline -L/usr/lib/termcap \
	 -lreadline -lterm

I guess, setup.py should be modified to include additional
library search paths -- shouldn't hurt on platforms which
don't need them.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Jan 19 22:50:53 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 19 Jan 2001 22:50:53 +0100
Subject: [Python-Dev] _tkinter and setup.py
Message-ID: <3A68B6BD.BAD038D6@lemburg.com>

Why does setup.py stop with an error in case _tkinter cannot
be built (due to an old Tk/Tcl version in my case) ?

I think the policy in setup.py should be to output warnings,
but continue building the rest of the Python modules.
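
A minimal sketch of that policy, assuming a per-module build callable;
build_all and build_one are hypothetical names, not part of setup.py or
distutils:

```python
import warnings


def build_all(modules, build_one):
    """Build every module; warn about failures instead of stopping."""
    built, failed = [], []
    for name in modules:
        try:
            build_one(name)
        except Exception as exc:
            # Report the problem and move on to the next module.
            warnings.warn("building %s failed: %s" % (name, exc))
            failed.append(name)
        else:
            built.append(name)
    return built, failed
```

The list of failed modules can be printed at the end so the user still
sees what was skipped.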

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 19 23:38:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 17:38:22 -0500
Subject: [Python-Dev] 2.1 alpha 1 release schedule
Message-ID: <200101192238.RAA12413@cj20424-a.reston1.va.home.com>

Practicality beats purity: we're very close to a release, but I've
decided to hold off to give Jeremy a chance to finish the nested
scopes, to give Fred a chance to revise the weak references according
to Martin's wishes, and in general for things to settle.

Most likely we'll be able to release Monday night (Jan 22).

Unfortunately email through python.org seems to be wedged again (I
swear, it seems like it starts getting wedged every afternoon between
3 and 4!) so I don't have a clear view of what the latest checkins
were; but from cvs update it seems that the following things happened
this afternoon:

- Barry fixed a core dump in function attribute assignments

- Marc-Andre withdrew unistr(), pending more discussion

- Fredrik fixed the ucnhash problem

- I fixed two path problems in the new build process that only
  occurred when you were building in a subdirectory of the source tree

Good work, crew!  I'm taking the weekend off.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jack at oratrix.nl  Sat Jan 20 00:23:18 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Sat, 20 Jan 2001 00:23:18 +0100
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments 
In-Reply-To: Message by Guido van Rossum <guido@digicool.com> ,
	     Fri, 19 Jan 2001 11:34:45 -0500 , <200101191634.LAA29239@cj20424-a.reston1.va.home.com> 
Message-ID: <20010119232323.70B03116392@oratrix.oratrix.nl>

Recently, Guido van Rossum <guido at digicool.com> said:
> > I get the impression that I'm currently seeing a non-NULL third
> > argument in my (C) methods even though the method is called without
> > keyword arguments.
> 
> > Is this new semantics that I missed the discussion about, or is this a bug?
> 
> [...] 
> Do you really need the NULL?

The places that I know I was counting on the NULL now have "if ( kw && 
PyObject_IsTrue(kw))", so I'll just have to hope there aren't any more 
lingering in there.
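
The same defensive test, sketched at the Python level where an empty
keyword dictionary is simply falsy (method is a hypothetical name):

```python
def method(*args, **kw):
    # kw is always a dict here, possibly empty, never None.
    if kw:
        return "keywords: %s" % sorted(kw)
    return "no keywords"


plain = method(1, 2)
with_kw = method(1, x=3)
```
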
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | see http://www.xs4all.nl/~tank/spg-l/sigaction.htm 



From tim.one at home.com  Sat Jan 20 01:04:10 2001
From: tim.one at home.com (Tim Peters)
Date: Fri, 19 Jan 2001 19:04:10 -0500
Subject: [Python-Dev] MS CRT crashing:
In-Reply-To: <200101191556.KAA28761@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENLIJAA.tim.one@home.com>

[Guido]
> I'd just like to note for the record that this is exactly what I had
> predicted.

I would have hoped you'd be content to let the record speak for itself
<wink>.

> I'd also like to note that I *agree*.

With what?  That the program is undefined by the C std was never in dispute.

> Tim seems to think there's a race condition in the threading code,
> but it's really much simpler than that: the same bug can easily be
> provoked with a single-threaded program: just randomly read and
> write alternatingly.

And this is a point in their favor?!  "It's OK that the MT library corrupts
itself, because even the single-threaded library does"?

> So obviously the people who wrote the threading code aren't interested
> in the bug,

I don't know that it ever got as far as the people who wrote the threading
code, but I sure doubt it:  when the reply starts "Turns out the C standard
explicitly says  ...", it strongly suggests it was written by someone who
didn't already know what the C std says, and went looking for an excuse to
get it off their plate without further effort.  Par for the course, if so.

> because it's not in their code -- and the people who wrote the code
> that doesn't behave well when abused are protected by the C standard...

The behavior of things designated "undefined" and "implementation-defined"
by the std fall under "quality of implementation".  In the real world, the
latter is what vendors compete on; meeting the letter of the std is a bare
minimum for playing the game at all.

The plain fact is that their library is less robust than others in this
case.  I worked on a multithreaded stdio implementation at KSR, and that
sure couldn't corrupt itself.  Looks like no flavor of Linux does either.
It's not *reasonable* for a library to corrupt itself in this case, although
it's certainly reasonable for its behavior to vary from run to run.  There's
nothing in the C std that says a conforming implementation can't *crash* on
the program

void main() {int i = 1;}

either <wink>.

a-std-is-a-floor-on-acceptable-behavior-not-a-ceiling-ly y'rs  - tim




From gstein at lyra.org  Sat Jan 20 02:21:56 2001
From: gstein at lyra.org (Greg Stein)
Date: Fri, 19 Jan 2001 17:21:56 -0800
Subject: [Python-Dev] initializing ob_type
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>; from MarkH@ActiveState.com on Fri, Jan 19, 2001 at 12:11:02PM -0800
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com>
Message-ID: <20010119172156.Y7731@lyra.org>

On Fri, Jan 19, 2001 at 12:11:02PM -0800, Mark Hammond wrote:
> > you can compile the module as C++, but that's also a bit painful...
> 
> My understanding is that the C std doesn't guarantee the order of static
> object initialization, whereas C++ does provide these semantics.  At least
> that is the excuse I found when digging into this some years ago.

True, but when PyWhatever_Type is initialized, &PyType_Type ought to be
ready (even if it isn't initialized). Heck, &PyType_Type points into the
Python core which is *definitely* loaded by that point.

Now, if "initialization" also means "relocation to a specific address" then
I can understand.

Hrm... I've just spent some time with the Windows SDK docs, and I can't find
anything that really discusses the problem and resolution. There certainly
isn't any warning about "don't do this." It all talks about how fixups are
stored with the DLL, how you can optionally use BIND to pre-bind the values,
blah blah blah. But nothing saying "it doesn't work."

It would be interesting to know more about the actual symptoms that appear
when the ob_type init is performed by the structure (rather than at
runtime). What happens? Bad address? NULL value? Failure to resolve and
load? Is PyType_Type not exported correctly or something?

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



From guido at digicool.com  Sat Jan 20 03:05:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 19 Jan 2001 21:05:39 -0500
Subject: [Python-Dev] How to get setup.py to build expat?
Message-ID: <200101200205.VAA13299@cj20424-a.reston1.va.home.com>

The setup.py script does not build the expat module for me.

I have expat installed in /usr/local, at least I believe so: I have
/usr/local/include/xmlparse.h and /usr/local/lib/libexpat.a -- do I
need more?

How can I get setup.py to spit out what it tries, and why it fails?
setup.py -v build doesn't give any extra output.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From fredrik at effbot.org  Sat Jan 20 03:41:43 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sat, 20 Jan 2001 03:41:43 +0100
Subject: [Python-Dev] initializing ob_type
References: <010c01c08201$4b0ec050$e46940d5@hagrid> <LCEPIIGDJPKCOIHOBJEPIEHFCPAA.MarkH@ActiveState.com> <20010119172156.Y7731@lyra.org>
Message-ID: <00f001c0828a$bc903900$e46940d5@hagrid>

greg wrote:

> It would be interesting to know more about the actual symptoms that appear
> when the ob_type init is performed by the structure (rather than at runtime).
> What happens?

    http://www.python.org/doc/FAQ.html#3.24
    "3.24. "Initializer not a constant" while building DLL
    on MS-Windows

    "Static type object initializers in extension modules
    may cause compiles to fail with an error message
    like "initializer not a constant"

Cheers /F




From uche.ogbuji at fourthought.com  Sat Jan 20 06:29:23 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Fri, 19 Jan 2001 22:29:23 -0700
Subject: [Python-Dev] Extension doc bugs 
In-Reply-To: Message from uche.ogbuji@fourthought.com 
   of "Fri, 19 Jan 2001 14:42:40 MST." <200101192142.OAA29168@localhost.localdomain> 
Message-ID: <200101200529.WAA30349@localhost.localdomain>

> For instance, from the object protocol docs: 
> 
> """
> int PyObject_Cmp (PyObject *o1, PyObject *o2, int *result) 
>       Compare the values of o1 and o2 using a routine provided by o1, if one   
>        exists, otherwise with a routine provided by o2. The result of the
>       comparison is returned in result. Returns -1 on failure. This is the     
>        equivalent of the Python statement "result = cmp(o1, o2)".
> """
> 
> After getting weird behavior implementing this, and then squinting at the 
> relevant Python 2.0 code, it appears that in actuality the Cmp function is to 
> return the direct comparison results (-1, 0, 1 based on ordering of the 
> parameters)  furthermore, there is no such "result" argument.

Bother.  I didn't squint hard enough.  I mistook the tp_compare slot for the 
PyObject_Cmp equivalent.  I have indeed run into what I'm sure are nits in the 
Python/C API but given that my greatest alarm was false, I'll be more careful 
before bringing up the others.

I'm still curious as to the best forum for this.

-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tim.one at home.com  Sat Jan 20 06:36:12 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 00:36:12 -0500
Subject: [Python-Dev] Extension doc bugs
In-Reply-To: <200101192142.OAA29168@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPKIJAA.tim.one@home.com>

[uche.ogbuji at fourthought.com]
> ...
> In general, the C API docs are in a rather sorry state and though
> I doubt I could do a great deal about fixing it, I'd be interested in
> discussion of the matter, and perhaps making what contribution I can.
>
> Is the doc-sig the best place for this?

Nope!  Discussing it won't do any good, there or anywhere else.  What it
needs is for people to send better docs to python-docs at python.org or upload
LaTeX patches to SourceForge, and to report doc bugs on SourceForge (which
is where the start of this msg should have gone!).  Most days we just work
on whatever is backed up at SourceForge; if doc bugs don't show up there,
they won't get repaired.

the-docs-are-only-10x-better-than-the-sum-of-the-individual-
    contributions<wink>-ly y'rs  - tim




From tim.one at home.com  Sat Jan 20 07:17:04 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:17:04 -0500
Subject: [Python-Dev] RE: [Python-checkins] CVS: python/dist/src/Objects object.c,2.109,2.110
In-Reply-To: <E14JrC9-00056U-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPNIJAA.tim.one@home.com>

[Barry]
> Modified Files:
> 	object.c
> Log Message:
> default_3way_compare(): When comparing the pointers, they must be cast
> to integer types (i.e. Py_uintptr_t, our spelling of C9X's uintptr_t).
> ANSI specifies that pointer compares other than == and != to
> non-related structures are undefined.  This quiets an Insure
> portability warning.

Barry, that comment belongs in the code, not in the checkin msg.  The code
*used* to do this correctly (as you well know, since you & I went thru
considerable pain to fix this the first time).  However, because the
*reason* for the convolution wasn't recorded in the code as a comment,
somebody threw it all away the first time it got reworked.

c-code-isn't-often-self-explanatory-ly y'rs  - tim




From tim.one at home.com  Sat Jan 20 07:30:42 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 01:30:42 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com>

I had a huge string and wanted to put a double-quote on each end.  The
boring:

    '"' + huge + '"'

does the job, but is inefficient <snort>.  Then this transparent variation
sprang unbidden from my hoary brow:

    huge.join('""')

*That* should put to rest the argument over whether .join() is more properly
a method of the separator or the sequence -- '""'.join(huge) instead would
look plain silly <wink>.

not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim
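
A minimal sketch of why the trick works, assuming a current Python 3
interpreter rather than the 2.1a1 of the thread: str.join treats its
argument as a sequence, so the two-character string '""' is seen as two
one-character items, and huge is placed between them.  The variable names
are illustrative only.

```python
huge = "x" * 20  # stand-in for the "huge" string; the size is arbitrary

# '""' is a sequence of two items: '"' and '"'.
# huge.join(seq) inserts `huge` between consecutive items of seq.
quoted_by_join = huge.join('""')
quoted_by_concat = '"' + huge + '"'

print(quoted_by_join == quoted_by_concat)
```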





From tim.one at home.com  Sat Jan 20 10:28:18 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 04:28:18 -0500
Subject: [Python-Dev] Comparison of recursive objects
In-Reply-To: <14952.21218.416551.695660@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>

[Guido's checkin msg]
> ...
> In a discussion with Tim, where we discovered that our intuition
> on when a<=b should be true was failing, we decided to outlaw
> ordering comparisons on recursive objects.  (Once we have fixed our
> intuition and designed a matching algorithm that's practical and
> reasonable to implement, we can allow such orderings again.)

[Jeremy]
> Sounds sensible to me!  I was quite puzzled about what <= should
> return for recursive objects.

That's easy:  x <= y for recursive objects should return true if and only if
x < y or x == y return true <0.9 wink>.

x == y isn't a problem, although Python gives a remarkable answer:
recursive objects in Python are instances of rooted, ordered, directed,
finite, node-labeled graphs, and "x == y" in Python answers whether their
graphs are isomorphic.

Viewed that way (which is the correct way <0.5 wink>), the *natural* meaning
for "x <= y" is "y contains a subgraph isomorphic to x".  And that has
*almost* all the nice properties we like:

    x <= x is true
    (x <= y and y <= z) implies x <= z
    (x <= y and y <= x) if and only if x == y

However,

1. That's much harder to compute.
2. It implies, e.g., [2] <= [1, 2], and that's not what we *want*
   non-recursive sequence comparison to mean.
3. It's a partial ordering:  given arbitrary x and y, it may be that
   neither contains an isomorphic image of the other.
4. We've again given up on avoiding surprises in *simple* comparisons
   among builtin types, like (under current CVS):

>>> 1 < [1] < 0L < 1
1
>>> 1 < 1
0
>>>

   so it's hard to see why we should do any work at all to avoid
   violating "intuition" when comparing recursive objects:  we're
   already scrubbing the face of intuition with steel wool,
   setting it on fire, then putting it out with an axe <wink>.

Now let's look at Guido's example (or one of them, anyway):

>>> a = []
>>> a.append(a)
>>> a.append("x")
>>> b = []
>>> b.append(b)
>>> b.append("y")
>>> a
[[...], 'x']
>>> b
[[...], 'y']
>>>

I think it's a trick of *typography* that caused my first thought to be
"well, clearly, a < b".  That is, the *display* shows me two 2-element
lists, each with the same "blob" as the first element, and where a[1] is
obviously less than b[1].  Since "the blobs" are the same, the second
elements control the outcome.

But those "blobs" aren't really the same:  a[0] is a, and b[0] is b, so
asking whether a < b by looking first at their first elements just leads
back to the original question:  asking whether a[0] < b[0] is again asking
whether a < b, and that makes no progress.  Saying that a is less than b by
fiat is *consistent* with the rules for lexicographic ordering, but so is
insisting that a is greater than b.  There's no basis for picking one over
the other, and so no clear hope of coming up with a generally consistent
scheme.  Well, one clear hope:  if recursive comparison says "not equal", it
could resolve the dilemma by comparing object id instead.  That would be
consistent (I mostly think at the moment ...), but if you run the program
above multiple times it may say a < b on some runs and b < a on others.

WRT "the right way", it should be clear from the attached picture that
neither a nor b contains an isomorphic image of the other, so from that POV
they're not comparable (a != b, but neither a <= b nor b <= a holds).

So this is what Guido made Python do:

>>> a == b  # still cool:  they're not isomorphic and Python knows it
0
>>> a < b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values
>>> a <= b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: can't order recursive values

In light of that, I still find these mildly surprising:

>>> a < a
0
>>> a <= a
1
>>>

I guess some recursive values are more orderable than others <wink -- but
that's true!  the ones Python can prove are equal are indeed "more
orderable">.

>>> import copy
>>> c = copy.deepcopy(a)
>>> c
[[...], 'x']
>>> a == c
1
>>> a <= c
1
>>> a < c
0
>>>

BTW, this kind of construction appears to give equality-testing that's at
best(!) exponential-time in the size of the dicts:

def timeeq(x, y):
    from time import clock
    import sys
    s = clock()
    result = x == y
    f = clock()
    print x, result, round(f-s, 1), "seconds"
    sys.stdout.flush()

d = {}
e = {}
timeeq(d, e)
d[0] = d
e[0] = e
timeeq(d, e)
d[1] = d
e[1] = e
timeeq(d, e)
d[2] = d
e[2] = e
timeeq(d, e)

Output:

{} 1 0.0 seconds
{0: {...}} 1 0.0 seconds
{1: {...}, 0: {...}} 1 6.5 seconds

After more than 15 minutes, the 3-element dict comparison still hasn't
completed (yikes!).

ackermann's-function-eat-your-heart-out-ly y'rs  - tim
-------------- next part --------------
A non-text attachment was scrubbed...
Name: loopy.jpg
Type: image/jpeg
Size: 11363 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010120/fce25c79/attachment-0001.jpg>
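
The "are the graphs isomorphic" reading of == described above can be
sketched as follows.  This is only an illustration for nested lists, not
the algorithm CPython used: the helper name rec_eq and the "assume equal
while a pair is still being compared" rule are assumptions made for the
example.

```python
import copy

def rec_eq(x, y, _in_progress=None):
    """Equality for possibly self-referential lists (illustrative helper)."""
    if _in_progress is None:
        _in_progress = set()
    if isinstance(x, list) and isinstance(y, list):
        key = (id(x), id(y))
        if key in _in_progress:
            # We are already comparing this pair further up the call
            # stack; assume equal so the recursion terminates.
            return True
        _in_progress.add(key)
        if len(x) != len(y):
            return False
        return all(rec_eq(p, q, _in_progress) for p, q in zip(x, y))
    return x == y

a = []
a.append(a)
a.append("x")
b = []
b.append(b)
b.append("y")
c = copy.deepcopy(a)

print(rec_eq(a, b), rec_eq(a, c))
```

The same idea does not extend to <, which is the point of the message:
assuming "less" for an in-progress pair is as consistent as assuming
"greater".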

From thomas at xs4all.net  Sat Jan 20 15:30:26 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 15:30:26 +0100
Subject: [Python-Dev] PEP 229 checked in
In-Reply-To: <200101192206.RAA12072@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Fri, Jan 19, 2001 at 05:06:03PM -0500
References: <E14IxsK-0004GJ-00@kronos.cnri.reston.va.us> <20010117234925.A17392@xs4all.nl> <20010118004400.B17392@xs4all.nl> <20010118103036.B21503@kronos.cnri.reston.va.us> <20010119000209.F17392@xs4all.nl> <200101191626.LAA29165@cj20424-a.reston1.va.home.com> <20010119173415.M17295@xs4all.nl> <200101192206.RAA12072@cj20424-a.reston1.va.home.com>
Message-ID: <20010120153026.L17392@xs4all.nl>

On Fri, Jan 19, 2001 at 05:06:03PM -0500, Guido van Rossum wrote:

> So instead I added 5 lines to site.py, which tests for
> os.name=='posix', then for sys.path[-1] ending in '/Modules' -- this
> test only succeeds when running from the build directory.  Then it
> calls distutils.util.get_platform() and uses the result to calculate
> the correct directory name, which is then appended to sys.path.

> Yes, this slows down startup (it imports a large portion of the
> distutils package), but I don't care -- after all this is mostly for
> me so I can play with the interpreter right after I've built it,
> right?

Right. The only downside (as far as I can tell) is that 'python -S' no
longer works, in the build tree. I don't think that's that big a deal, but
it should be documented somewhere, so we don't end up being boggled by it
once we forget about it :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Sat Jan 20 17:18:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:18:39 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Fri, 19 Jan 2001 00:45:32 +0100."
             <20010119004532.G17392@xs4all.nl> 
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>  
            <20010119004532.G17392@xs4all.nl> 
Message-ID: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>

> On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> 
> > filename = '/tmp/delete_me'
> 
> This reminds me: we need a portable way to handle test-files :)

Yeah, I noticed that this test failed on Windows -- fixed now.

The test_support module exports TESTFN; there's also tempfile.mktemp()
which should generate temporary files on all platforms.

Is that enough?

--Guido van Rossum (home page: http://www.python.org/~guido/)
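
A minimal sketch of a portable scratch file for a test, assuming a current
Python 3 interpreter.  It swaps in tempfile.mkstemp() for the
tempfile.mktemp() mentioned above, because mkstemp() creates the file
itself instead of only returning a name; the prefix and the file contents
are illustrative.

```python
import os
import tempfile

# mkstemp() picks a platform-appropriate temp directory, creates the file,
# and returns an open OS-level descriptor plus the file's path.
fd, filename = tempfile.mkstemp(prefix="delete_me")
try:
    with os.fdopen(fd, "w") as f:
        f.write("scratch data")
    with open(filename) as f:
        contents = f.read()
finally:
    os.remove(filename)  # clean up even if the test body fails

print(contents)
```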



From thomas at xs4all.net  Sat Jan 20 17:36:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 17:36:05 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201618.LAA15675@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 11:18:39AM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>
Message-ID: <20010120173605.P17295@xs4all.nl>

On Sat, Jan 20, 2001 at 11:18:39AM -0500, Guido van Rossum wrote:
> > On Thu, Jan 18, 2001 at 08:46:54AM -0800, Guido van Rossum wrote:
> > 
> > > filename = '/tmp/delete_me'
> > 
> > This reminds me: we need a portable way to handle test-files :)
> Yeah, I noticed that this test failed on Windows -- fixed now.

> The test_support module exports TESTFN; there's also tempfile.mktemp()
> which should generate temporary files on all platforms.
> Is that enough?

Well, there is one more issue, which we can't fix terribly easily: test_fcntl
tries to flock() the file. flock() doesn't work on all filesystems (like
NFS) :P If we cared a lot, we could try several alternatives (current dir,
/tmp, /var/tmp) in the specific case of flock, but personally I don't want to
bother, and real sysadmins (who should care about the test failure) are more
likely to build Python on a local disk than in their NFS-mounted
home directory. At least that's how we do it :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Sat Jan 20 17:43:49 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 11:43:49 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: Your message of "Sat, 20 Jan 2001 01:30:42 EST."
             <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPOIJAA.tim.one@home.com> 
Message-ID: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>

> I had a huge string and wanted to put a double-quote on each end.  The
> boring:
> 
>     '"' + huge + '"'
> 
> does the job, but is inefficient <snort>.  Then this transparent variation
> sprang unbidden from my hoary brow:
> 
>     huge.join('""')

Points off for obscurity though!  My favorite for this is:

    '"%s"' % huge

Worth a microbenchmark?

> *That* should put to rest the argument over whether .join() is more properly
> a method of the separator or the sequence -- '""'.join(huge) instead would
> look plain silly <wink>.
> 
> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

Give up the channeling for a while -- there's too much interference in
the air from the Microsoft threaded stdio debate still. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Sat Jan 20 17:47:44 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 10:47:44 -0600 (CST)
Subject: [Python-Dev] how to test my __all__ lists?
Message-ID: <14953.49456.654121.987189@beluga.mojam.com>

How do I test the __all__ lists I'm building?  I'm worried about a couple
things:

    1. I may have typos
    2. I may leave something out of a list that should be imported by
       from-module-import-*.

Thoughts?

Skip



From guido at digicool.com  Sat Jan 20 18:00:05 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:00:05 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: Your message of "Sat, 20 Jan 2001 17:36:05 +0100."
             <20010120173605.P17295@xs4all.nl> 
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com>  
            <20010120173605.P17295@xs4all.nl> 
Message-ID: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>

> > > > filename = '/tmp/delete_me'
> > > 
> > > This reminds me: we need a portable way to handle test-files :)
> > Yeah, I noticed that this test failed on Windows -- fixed now.
> 
> > The test_support module exports TESTFN; there's also tempfile.mktemp()
> > which should generate temporary files on all platforms.
> > Is that enough?
> 
> Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> tries to flock() the file. flock() doesn't work on all filesystems (like
> NFS) :P If we cared a lot, we could try several alternatives (current dir,
> /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> bother, and real sysadmins (who should care about the test failure) are more
> likely to build Python on a local disk than in their NFS-mounted
> home directory. At least that's how we do it :-)

These days, I would think that it's a pretty sure bet that the
system's tmp directory is not on NFS.  Then we could just use
tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
on Linux (which AFAIK is a RAM disk implemented in virtual memory so
it uses swap space when it runs out of RAM) not support locking?

I don't particularly care about fixing this -- I haven't seen bug
reports about this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 20 18:38:38 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 12:38:38 -0500
Subject: [Python-Dev] how to test my __all__ lists?
In-Reply-To: Your message of "Sat, 20 Jan 2001 10:47:44 CST."
             <14953.49456.654121.987189@beluga.mojam.com> 
References: <14953.49456.654121.987189@beluga.mojam.com> 
Message-ID: <200101201738.MAA16636@cj20424-a.reston1.va.home.com>

> How do I test the __all__ lists I'm building?  I'm worried about a couple
> things:
> 
>     1. I may have typos

Do "from M import *" -- this will raise an AttributeError if there's
something in __all__ that's not defined in the module.

>     2. I may leave something out of a list that should be imported by
>        from-module-import-*.

That's what alpha-testing's for.

--Guido van Rossum (home page: http://www.python.org/~guido/)
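
The typo check can also be done without "import *".  A minimal sketch,
assuming a current Python 3 interpreter; check_all and the throwaway
module "demo" are hypothetical names made up for the example.

```python
import types

def check_all(module):
    """Return the names listed in module.__all__ that the module lacks."""
    return [name for name in getattr(module, "__all__", [])
            if not hasattr(module, name)]

# Build a throwaway module whose __all__ contains one misspelled name.
demo = types.ModuleType("demo")
demo.spam = 1
demo.__all__ = ["spam", "eggs"]

missing = check_all(demo)
print(missing)
```

This catches only the first worry (names in __all__ that don't exist); it
cannot tell which public names were left out.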



From esr at netaxs.com  Sat Jan 20 18:49:43 2001
From: esr at netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 12:49:43 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <3A672376.4B951848@lemburg.com>; from M.-A. Lemburg on Thu, Jan 18, 2001 at 06:10:14PM +0100
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>
Message-ID: <20010120124943.C6073@unix3.netaxs.com>

> A combination of time.time(), process id and counter should
> work in all cases. Make sure you use a lock around the counter,
> though.

Yes, but...this hack has to work in a multithreaded environment,
so process ID isn't good enough.  And I don't want to keep a counter
around if I don't have to.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>



From guido at digicool.com  Sat Jan 20 19:01:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 20 Jan 2001 13:01:04 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: Your message of "Sat, 20 Jan 2001 12:49:43 EST."
             <20010120124943.C6073@unix3.netaxs.com> 
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com>  
            <20010120124943.C6073@unix3.netaxs.com> 
Message-ID: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>

> > A combination of time.time(), process id and counter should
> > work in all cases. Make sure you use a lock around the counter,
> > though.
> 
> Yes, but...this hack has to work in a multithreaded environment,
> so process ID isn't good enough.  And I don't want to keep a counter
> around if I don't have to.

Sorry Eric, this just doesn't make sense.  Keeping a counter around in
your module (protected by a semaphore) is obviously the right
solution.  Why are you fighting it?

--Guido van Rossum (home page: http://www.python.org/~guido/)
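
A minimal sketch of the counter approach, assuming a current Python 3
interpreter.  It uses a threading.Lock where the message says semaphore;
the function name unique_id and the dotted string layout are illustrative
choices, not anything specified in the thread.

```python
import itertools
import os
import threading
import time

_counter = itertools.count()
_counter_lock = threading.Lock()

def unique_id():
    """Combine time, process id, and a lock-protected per-process counter."""
    with _counter_lock:
        serial = next(_counter)
    return "%d.%d.%d" % (int(time.time()), os.getpid(), serial)

ids = []

def worker():
    for _ in range(100):
        ids.append(unique_id())

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(ids), len(set(ids)))
```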



From esr at netaxs.com  Sat Jan 20 19:20:26 2001
From: esr at netaxs.com (Eric Raymond)
Date: Sat, 20 Jan 2001 13:20:26 -0500
Subject: [Python-Dev] Weird use of hash() -- will this work?
In-Reply-To: <200101201801.NAA16880@cj20424-a.reston1.va.home.com>; from Guido van Rossum on Sat, Jan 20, 2001 at 01:01:04PM -0500
References: <20010118022321.A9021@thyrsus.com> <3A672376.4B951848@lemburg.com> <20010120124943.C6073@unix3.netaxs.com> <200101201801.NAA16880@cj20424-a.reston1.va.home.com>
Message-ID: <20010120132026.E6073@unix3.netaxs.com>

On Sat, Jan 20, 2001 at 01:01:04PM -0500, Guido van Rossum wrote:
> > Yes, but...this hack has to work in a multithreaded environment,
> > so process ID isn't good enough.  And I don't want to keep a counter
> > around if I don't have to.
> 
> Sorry Eric, this just doesn't make sense.  Keeping a counter around in
> your module (protected by a semaphore) is obviously the right
> solution.  Why are you fighting it?

Actually, I'm not fighting it any more.  I changed my mind a few minutes
after shipping that response.
-- 
	<a href="http://www.tuxedo.org/~esr/home.html">Eric S. Raymond</a>



From thomas at xs4all.net  Sat Jan 20 19:37:10 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 20 Jan 2001 19:37:10 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Sat, Jan 20, 2001 at 12:00:05PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <20010120193710.Q17295@xs4all.nl>

On Sat, Jan 20, 2001 at 12:00:05PM -0500, Guido van Rossum wrote:

> > Well, there is one more issue, which we can't fix terribly easily: test_fcntl
> > tries to flock() the file. flock() doesn't work on all filesystems (like
> > NFS) :P If we cared a lot, we could try several alternatives (current dir,
> > /tmp, /var/tmp) in the specific case of flock, but personally I don't want to
> > bother, and real sysadmins (who should care about the test failure) are
> > more likely to build Python on a local disk than in their NFS-mounted
> > home directory. At least that's how we do it :-)

> These days, I would think that it's a pretty sure bet that the
> system's tmp directory is not on NFS.  Then we could just use
> tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
> on Linux (which AFAIK is a RAM disk implemented in virtual memory so
> it uses swap space when it runs out of RAM) not support locking?

Actually, most Linux distributions don't care enough about /tmp to make it a
RAM-based filesystem. At least Debian and RedHat don't :) (There's a good
reason for that: Linux's disk-data cache rocks if you have enough RAM, so
there's no real gain in using a ramdisk) BSDI does (optionally) have such a
/tmp, and probably the other BSD derived systems as well. But that doesn't
mean it doesn't support locking, so that's not a real excuse.

But like I said, I don't care enough to worry about it. I'll look at it
before alpha2.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Sat Jan 20 21:10:51 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 15:10:51 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>

[Tim]
> ...
> 4. We've again given up on avoiding surprises in *simple* comparisons
>    among builtin types, like (under current CVS):
>
> >>> 1 < [1] < 0L < 1
> 1
> >>> 1 < 1
> 0
> >>>

I really dislike that.  Here's a consequence at a higher level:

N = 5
x = [1 for i in range(N)] + \
    [[1] for i in range(N)] + \
    [0L for i in range(N)]

x.sort()
print x

from random import shuffle
tries = failures = 0
while failures < 5:
    tries += 1
    y = x[:]
    shuffle(y)
    y.sort()
    if x != y:
        print "oops, on try number", tries
        print y
        failures += 1

and here's a typical run (2.1a1):

[1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L, 0L]
oops, on try number 3
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 5
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 6
[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
oops, on try number 7
[[1], 0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1]]
oops, on try number 8
[0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1], 0L, 0L, 0L, 0L]

I've often used list.sort() on a heterogeneous list simply to bring the
elements of the same type next to each other.  But as "try number 5" shows,
I can no longer rely on even getting all the lists together.  Indeed,
heterogeneous list.sort() has become a very bad (biased and slow)
implementation of random.shuffle() <wink>.

Under 2.0, the program never prints "oops", because the only violations of
transitivity in 2.0's ordering of builtin types were bugs in the
implementation (none of which show up in this simple test case); 2.0's
.sort() *always* produces

[0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]

The base trick in 2.0 was sound:  when falling back to the "compare by name
of the type" last resort, treat all numeric types as if they had the same
name.

While Python can't enforce that any user-defined __cmp__ is consistent, I
think it should continue to set a good example in the way it implements its
own comparisons.

grumblingly y'rs  - tim
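
The effect can be reproduced on a current Python 3 interpreter, where
mixed-type < no longer exists, by supplying a deliberately cyclic
comparison.  This is a sketch under that assumption: the strings "int",
"list" and "long" and the CYCLE table are stand-ins for the 2.1a1
behavior 1 < [1] < 0L < 1, not the real comparison code.

```python
from functools import cmp_to_key
from itertools import permutations

# Hypothetical intransitive ordering: int < list < long < int.
CYCLE = {("int", "list"), ("list", "long"), ("long", "int")}

def cyclic_cmp(x, y):
    if x == y:
        return 0
    return -1 if (x, y) in CYCLE else 1

# Sort every arrangement of the same three items and collect the results.
outcomes = set()
for perm in permutations(["int", "list", "long"]):
    outcomes.add(tuple(sorted(perm, key=cmp_to_key(cyclic_cmp))))

print(len(outcomes), "distinct 'sorted' results from the same items")
```

With a transitive comparison the set would hold exactly one result; more
than one means the output depends on the input order.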




From skip at mojam.com  Sat Jan 20 21:42:27 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 14:42:27 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
Message-ID: <14953.63539.629197.232848@beluga.mojam.com>

A bit late for 2.1alpha1, but it just occurred to me that perhaps there
should be an annotation in the documentation that indicates whether or not a
module is thread-safe.  For example, many functions in fileinput rely on a
module global called _state.  It strikes me that this module is not likely
to be thread-safe, yet the documentation doesn't appear to mention this,
certainly not in an obvious fashion.

Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
LaTeX macros in Fred's arsenal?  This would make documenting these
properties both easy and consistent across modules.

Skip




From tim.one at home.com  Sat Jan 20 22:13:41 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 16:13:41 -0500
Subject: [Python-Dev] Stupid Python Tricks, Volume 38 Number 1
In-Reply-To: <200101201643.LAA16269@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEBEIKAA.tim.one@home.com>

[Tim]
>     huge.join('""')

[Guido]
> Points off for obscurity though!

The Subject line was "Stupid Python Tricks" for a reason <wink>.  Those who
don't know the language inside-out should be tickled by figuring out why it
even *works* (hint for the baffled:  you have to view '""' as a sequence
rather than as an atomic string).

> My favorite for this is:
>
>     '"%s"' % huge
>
> Worth a microbenchmark?

Absolutely!  I get:

     obvious  15.574
     obscure   8.165
     sprintf   8.133

after running:

ITERS = 1000
indices = [0] * ITERS

def obvious(huge):
    for i in indices:  '"' + huge + '"'

def obscure(huge):
    for i in indices:  huge.join('""')

def sprintf(huge):
    for i in indices:  '"%s"' % huge

def runtimes(huge):
    from time import clock
    for f in obvious, obscure, sprintf:
        start = clock()
        f(huge)
        finish = clock()
        print "%12s %7.3f" % (f.__name__, finish - start)

runtimes("x" * 1000000)

under current 2.1a1.  Not a dead-quiet machine, but the difference is too
small to care.  Speed up huge.join attr lookup, and it would probably be
faster <wink>.  Hmm:  if I boost ITERS high enough and cut back the size of
huge, "obscure" eventually becomes *slower* than "obvious", even if the
"huge.join" lookup is floated out of the loop.  I guess that points to the
relative burden of calling a bound method.  So, in real life, the huge.join
approach may well be the slowest!

>> not-entirely-sure-i'm-channeling-on-this-one-ly y'rs  - tim

> Give up the channeling for a while -- there's too much interference in
> the air from the Microsoft threaded stdio debate still. :-)

What debate?  You need two arguably valid points of view for a debate to
even start <wink>.

gloating-in-victory-vicious-in-defeat-but-simply-unbearable-in-
    ambiguity-ly y'rs  - tim
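
For anyone repeating the measurement on a current Python 3 interpreter, a
sketch using the timeit module in place of time.clock(), which has since
been removed.  The string size and repeat count are arbitrary choices, and
the lambdas add a little call overhead to all three candidates alike.

```python
import timeit

huge = "x" * 100000  # smaller than the 1,000,000 used above, to keep runs short

candidates = {
    "obvious": lambda: '"' + huge + '"',
    "obscure": lambda: huge.join('""'),
    "sprintf": lambda: '"%s"' % huge,
}

# Sanity check first: all three spellings must build the same string.
results = {name: func() for name, func in candidates.items()}

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=200)
    print("%12s %7.3f" % (name, seconds))
```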




From fdrake at acm.org  Sat Jan 20 22:23:58 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 16:23:58 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <200101201700.MAA16491@cj20424-a.reston1.va.home.com>
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net>
	<20010119004532.G17392@xs4all.nl>
	<200101201618.LAA15675@cj20424-a.reston1.va.home.com>
	<20010120173605.P17295@xs4all.nl>
	<200101201700.MAA16491@cj20424-a.reston1.va.home.com>
Message-ID: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>

Guido van Rossum writes:
 > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
 > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
 > it uses swap space when it runs out of RAM) not support locking?

  I thought it was Solaris that used available+virtual memory for
/tmp; that was what we ran into at CNRI.  (Which doesn't preclude
Linux from doing the same, I just don't recall that we've encountered
that.)


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From fdrake at acm.org  Sat Jan 20 23:05:27 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 20 Jan 2001 17:05:27 -0500 (EST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>

Skip Montanaro writes:
 > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
 > should be an annotation in the documentation that indicates whether or not a
 > module is thread-safe.  For example, many functions in fileinput rely on a

  If you can create a list of the known thread safe and known thread
unsafe modules, I'll come up with appropriate annotations for the
documentation.

 > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
 > LaTeX macros in Fred's arsenal?  This would make documenting these
 > properties both easy and consistent across modules.

  Not sure that this is exactly the right approach to the markup; I'll
think about this one.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From skip at mojam.com  Sat Jan 20 23:31:52 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sat, 20 Jan 2001 16:31:52 -0600 (CST)
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
References: <14953.63539.629197.232848@beluga.mojam.com>
	<14954.2983.450755.761653@cj42289-a.reston1.va.home.com>
Message-ID: <14954.4568.460875.662560@beluga.mojam.com>

    Fred> If you can create a list of the known thread safe and known thread
    Fred> unsafe modules, I'll come up with appropriate annotations for the
    Fred> documentation.

I think that's going to be a significant undertaking, requiring examination
of a lot of Python and C code.  I'd rather approach it incrementally, which
was why I suggested the LaTeX macros.  As modules are determined to be safe
or unsafe, the appropriate safety macro could just be inserted into the
correct lib*.tex file.  It would (in my mind) expand to a stock bit of text
inserted at a standard place in the file.

Skip



From tim.one at home.com  Sat Jan 20 23:52:09 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 20 Jan 2001 17:52:09 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: <14953.63539.629197.232848@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEBNIKAA.tim.one@home.com>

[Skip Montanaro]
> ...
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to
> the litany of LaTeX macros in Fred's arsenal?  This would make
> documenting these properties both easy and consistent across
> modules.

When a module is *not* threadsafe, that's usually considered "a bug" in the
module.  So we should just point out modules that aren't threadsafe by
design.  Alas, that's A Project.




From nas at arctrix.com  Sat Jan 20 16:59:14 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 07:59:14 -0800
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>; from tim.one@home.com on Sat, Jan 20, 2001 at 03:10:51PM -0500
References: <LNBBLJKPBEHFEDALKOLCAEABIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com>
Message-ID: <20010120075914.B18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 03:10:51PM -0500, Tim Peters wrote:
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think the 2.0 behavior should be fairly easy to restore.  I'll
leave it up to Guido though since he's "Mr. Comparison" now and I
haven't looked at the code since I checked in the coercion patch.

  Neil



From nas at arctrix.com  Sat Jan 20 17:03:36 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sat, 20 Jan 2001 08:03:36 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test_dumbdbm.py,NONE,1.1
In-Reply-To: <14954.494.223724.705495@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Sat, Jan 20, 2001 at 04:23:58PM -0500
References: <E14JID8-0003nI-00@usw-pr-cvs1.sourceforge.net> <20010119004532.G17392@xs4all.nl> <200101201618.LAA15675@cj20424-a.reston1.va.home.com> <20010120173605.P17295@xs4all.nl> <200101201700.MAA16491@cj20424-a.reston1.va.home.com> <14954.494.223724.705495@cj42289-a.reston1.va.home.com>
Message-ID: <20010120080336.C18840@glacier.fnational.com>

On Sat, Jan 20, 2001 at 04:23:58PM -0500, Fred L. Drake, Jr. wrote:
> 
> Guido van Rossum writes:
>  > tempfile.mktemp() in that module, right?  Or does the /tmp filesystem
>  > on Linux (which AFAIK is a RAM disk implemented in virtual memory so
>  > it uses swap space when it runs out of RAM) not support locking?
> 
>   I thought it was Solaris that used available+virtual memory for
> /tmp; that was what we ran into at CNRI.  (Which doesn't preclude
> Linux from doing the same, I just don't recall that we've encountered
> that.)

I don't know of any Linux system that uses a RAM based /tmp.  The
Linux implementation of ext2 is so fast it doesn't make any sense.
If you have enough memory all the data is stored in the buffer,
page, and inode caches anyhow.


  Neil



From trentm at ActiveState.com  Sun Jan 21 00:35:56 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Sat, 20 Jan 2001 15:35:56 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
Message-ID: <20010120153556.C18375@ActiveState.com>

... or am I missing something?

With Python 2.0 on Windows 2000, when playing with sys.exit() and sys.argv
I get some unexpected results.

First here is a simple case that shows what I expect. I run "caller_good.py"
which calls "callee_good.py" and prints its return value. "callee_good.py"
returns 42 so "42" is printed:
    ----------------- caller_good.py --------------------
    import os
    retval = os.system("python callee_good.py")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_good.py --------------------
    import sys
    sys.exit(42)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_good.py
    caller: the retval is 42


Now here is what I didn't expect. I changed "caller_bad.py" to pass, as an
argument, the value that "callee_bad.py" should return.

    ----------------- caller_bad.py ---------------------
    import os
    retval = os.system("python callee_bad.py 42")
    print "caller: the retval is", retval
    -----------------------------------------------------

    ----------------- callee_bad.py ---------------------
    import sys
    firstarg = sys.argv[1]
    print "callee_bad: firstarg is", firstarg
    sys.exit(firstarg)
    -----------------------------------------------------

    D:\trentm\tmp>python caller_bad.py
    callee_bad: firstarg is 42
    42                             # <---- where did *this* print come from?
    caller: the retval is 1        # <---- and this retval is incorrect


Any ideas? I have not tried to track this down yet nor have I tried the
latest Python-CVS state.

Trent

-- 
Trent Mick
TrentM at ActiveState.com



From moshez at zadka.site.co.il  Sun Jan 21 13:37:57 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sun, 21 Jan 2001 14:37:57 +0200 (IST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010121123757.D897BA83E@darjeeling.zadka.site.co.il>

Yay! I can change to python-dev manually!
(hear sounds of the timbot's teeth grinding)

On Sat, 20 Jan 2001, Skip Montanaro <montanaro at users.sourceforge.net> wrote:
> def check_all(_modname):
>     exec "import %s" % _modname
>     verify(hasattr(sys.modules[_modname],"__all__"),
>            "%s has no __all__ attribute" % _modname)
>     exec "del %s" % _modname
>     exec "from %s import *" % _modname
>     
>     _keys = locals().keys()
....

Wouldn't it be better to use the idiom

d = {}
exec "foo", d

and verify "d" instead?
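
For illustration, here is a minimal sketch of that approach in present-day
Python syntax, where exec is a function rather than the statement used
above.  The helper name star_names is hypothetical and textwrap is only a
stand-in module; this is not the code in test___all__.py.

```python
def star_names(modname):
    """Return the names that 'from modname import *' would bind."""
    namespace = {}
    # Run the star-import inside a private dict instead of the caller's locals.
    exec("from %s import *" % modname, namespace)
    # exec() adds __builtins__ to the dict as a side effect; drop it.
    namespace.pop("__builtins__", None)
    return set(namespace)


import textwrap

names = star_names("textwrap")
```

Because the dict starts out empty, nothing needs to be filtered out of
locals() afterwards, and the result can be compared directly against the
module's __all__.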

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Sun Jan 21 17:51:45 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 11:51:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sat, 20 Jan 2001 15:10:51 EST."
             <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCCEBDIKAA.tim.one@home.com> 
Message-ID: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>

[Tim, complaining that numerical types are no longer lumped together
in default comparisons:]
> I've often used list.sort() on a heterogeneous list simply to bring the
> elements of the same type next to each other.  But as "try number 5" shows,
> I can no longer rely on even getting all the lists together.  Indeed,
> heterogeneous list.sort() has become a very bad (biased and slow)
> implementation of random.shuffle() <wink>.
> 
> Under 2.0, the program never prints "oops", because the only violations of
> transitivity in 2.0's ordering of builtin types were bugs in the
> implementation (none of which show up in this simple test case); 2.0's
> .sort() *always* produces
> 
> [0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
> 
> The base trick in 2.0 was sound:  when falling back to the "compare by name
> of the type" last resort, treat all numeric types as if they had the same
> name.
> 
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think I can put this behavior back.  (I believe that before I
reorganized the comparison code, it seemed really tricky to do this,
but after refactoring the code, it's quite easy to do.)

My only concern is that under the old scheme, two different numeric
extension types that somehow can't be compared will end up being
*equal*.  To fix this, I propose that if the names compare equal, as a
last resort we compare the type pointers -- this should be consistent
too.
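
The proposed ordering can be sketched in Python as a sort key.  This is
only an illustration under stated assumptions: fallback_key is a
hypothetical name, numbers.Number stands in for "the type has
tp_as_number", and id(type(obj)) stands in for the type pointer.  It models
only the last-resort step, not the comparisons tried before it.

```python
import numbers


def fallback_key(obj):
    """Sort key mimicking the proposed last-resort ordering of mixed types."""
    # Assumption: numbers.Number approximates "tp_as_number is set" in C.
    name = "" if isinstance(obj, numbers.Number) else type(obj).__name__
    # id() of the type plays the role of the type pointer as final tie-breaker,
    # so two distinct numeric types never compare equal.
    return (name, id(type(obj)))


mixed = [[1], 1, 0.5, [1], 1, 0.5, "a"]
ordered = sorted(mixed, key=fallback_key)
kinds = [type(x).__name__ for x in ordered]
```

All numbers share the empty name and therefore sort ahead of every other
type, and each type still ends up in one contiguous run.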

Here's a patch that stops your test program from reporting failures:

*** object.c	2001/01/21 16:25:18	2.112
--- object.c	2001/01/21 16:50:16
***************
*** 522,527 ****
--- 522,528 ----
  default_3way_compare(PyObject *v, PyObject *w)
  {
  	int c;
+ 	char *vname, *wname;
  
  	if (v->ob_type == w->ob_type) {
  		/* When comparing these pointers, they must be cast to
***************
*** 550,557 ****
  	}
  
  	/* different type: compare type names */
! 	c = strcmp(v->ob_type->tp_name, w->ob_type->tp_name);
! 	return (c < 0) ? -1 : (c > 0) ? 1 : 0;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)
--- 551,571 ----
  	}
  
  	/* different type: compare type names */
! 	if (v->ob_type->tp_as_number)
! 		vname = "";
! 	else
! 		vname = v->ob_type->tp_name;
! 	if (w->ob_type->tp_as_number)
! 		wname = "";
! 	else
! 		wname = w->ob_type->tp_name;
! 	c = strcmp(vname, wname);
! 	if (c < 0)
! 		return -1;
! 	if (c > 0)
! 		return 1;
! 	/* Same type name, or (more likely) incomparable numeric types */
! 	return (v->ob_type < w->ob_type) ? -1 : 1;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)

Let me know if you agree with this.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sun Jan 21 18:00:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sun, 21 Jan 2001 12:00:02 -0500
Subject: [Python-Dev] should a module's thread safety be documented?
In-Reply-To: Your message of "Sat, 20 Jan 2001 14:42:27 CST."
             <14953.63539.629197.232848@beluga.mojam.com> 
References: <14953.63539.629197.232848@beluga.mojam.com> 
Message-ID: <200101211700.MAA25479@cj20424-a.reston1.va.home.com>

> A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> should be an annotation in the documentation that indicates whether or not a
> module is thread-safe.  For example, many functions in fileinput rely on a
> module global called _state.  It strikes me that this module is not likely
> to be thread-safe, yet the documentation doesn't appear to mention this,
> certainly not in an obvious fashion.
> 
> Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> LaTeX macros in Fred's arsenal?  This would make documenting these
> properties both easy and consistent across modules.

It's hard to say whether a *whole module* is threadsafe.  E.g. in the
fileinput example, there's the clear implication that if you use this
in multiple threads, you should instantiate your own FileInput
instances, and then you're totally thread-safe.  Clearly the semantics
of the module-global functions are thread-unsafe though.
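
A minimal sketch of the per-thread usage described above, in present-day
Python.  The helper names count_lines and count_in_threads are
hypothetical; the only library calls assumed are fileinput.FileInput and
threading.Thread.

```python
import fileinput
import threading


def count_lines(paths, results, slot):
    """Count lines using a FileInput instance owned by this thread alone."""
    # A private instance avoids the module-global state shared by
    # fileinput.input() and the other module-level functions.
    reader = fileinput.FileInput(files=paths)
    try:
        results[slot] = sum(1 for _ in reader)
    finally:
        reader.close()


def count_in_threads(groups):
    """Run one counting thread per group of paths and collect the totals."""
    results = [None] * len(groups)
    threads = [
        threading.Thread(target=count_lines, args=(group, results, i))
        for i, group in enumerate(groups)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each thread writes only to its own slot of results, so no locking is needed
in this sketch.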

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Sun Jan 21 19:45:07 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 13:45:07 -0500
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDGIKAA.tim.one@home.com>

test test_sax crashed -- 
    exceptions.SystemError: 'finally' pops bad exception

Sometimes it crashes (some flavor of memory fault) instead.

Elsewhere?




From nas at arctrix.com  Sun Jan 21 13:28:35 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 04:28:35 -0800
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <20010121042835.A19774@glacier.fnational.com>

I've been working a bit on the build process lately.  I came
across this in the autoconf documentation:


    If a software package has optional compile-time features, the
    user can give `configure' command line options to specify
    whether to compile them. The options have one of these forms:

        --enable-FEATURE[=ARG]
        --disable-FEATURE

    Some packages require, or can optionally use, other software
    packages which are already installed.  The user can give
    `configure' command line options to specify which such
    external software to use.  The options have one of these
    forms:

        --with-package[=ARG]
        --without-package


Is it worth fixing the Python configure script to comply with
these definitions?  It looks like with-cycle-gc and maybe
with-pydebug would have to be changed.

  Neil

    AC_ARG_ENABLE

    



From tim.one at home.com  Sun Jan 21 20:44:38 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 14:44:38 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101211651.LAA25346@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>

[Guido, on again lumping numbers together]
> I think I can put this behavior back.  (I believe that before I
> reorganized the comparison code, it seemed really tricky to do this,
> but after refactoring the code, it's quite easy to do.)

I can believe that; and I believe the "bugs" in 2.0 ended up somewhere in or
around the bowels of the xxxHalfBinOp-like routines (which were really
tricky to my eyes -- the interactions among coercions and comparisons were
hard to keep straight).

> My only concern is that under the old scheme, two different numeric
> extension types that somehow can't be compared will end up being
> *equal*.  To fix this, I propose that if the names compare equal, as a
> last resort we compare the type pointers -- this should be consistent
> too.

Agreed, and sounds fine!  Save Barry a little work, though:

> ! 	/* Same type name, or (more likely) incomparable numeric types */
> ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

That's non-std C in a way Insure complains about elsewhere; change to

	return ((Py_uintptr_t)v->ob_type <
		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
     compile-at-all<wink>-ly y'rs  - tim




From trentm at ActiveState.com  Sun Jan 21 21:01:44 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Sun, 21 Jan 2001 12:01:44 -0800
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010120153556.C18375@ActiveState.com>; from trentm@ActiveState.com on Sat, Jan 20, 2001 at 03:35:56PM -0800
References: <20010120153556.C18375@ActiveState.com>
Message-ID: <20010121120144.B28643@ActiveState.com>

On Sat, Jan 20, 2001 at 03:35:56PM -0800, Trent Mick wrote:
> 
> ... or am I missing something?

Ignore me. RTFM (sys.exit), Trent.

Sorry,
Trent


-- 
Trent Mick
TrentM at ActiveState.com



From tim.one at home.com  Sun Jan 21 21:13:02 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 15:13:02 -0500
Subject: [Python-Dev] spurious print and faulty return values: Is this a bug...?
In-Reply-To: <20010121120144.B28643@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEDKIKAA.tim.one@home.com>

[Trent, quoting Trent]
>>
>> ... or am I missing something?

[and back to Trent]
> Ignore me. RTFM (sys.exit), Trent.

Nobody wants to ignore *you*, Trent!  If it's not the case that you wanted
to code

sys.exit(int(firstarg))

instead, holler, cuz if that wasn't the problem I'm still baffled.
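
For reference, a small sketch of the difference, written for present-day
Python.  It uses subprocess instead of the os.system call in the original
scripts, and exit_status is a hypothetical helper name.  A non-integer
argument to sys.exit() is printed to stderr and the exit status becomes 1,
which would account for both the extra "42" line and the retval of 1.

```python
import subprocess
import sys


def exit_status(arg_expr):
    """Run a child interpreter calling sys.exit(<arg_expr>); return (status, stderr)."""
    child = subprocess.run(
        [sys.executable, "-c", "import sys; sys.exit(%s)" % arg_expr],
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return child.returncode, child.stderr.strip()


# A string argument, as in callee_bad.py: printed to stderr, status 1.
string_case = exit_status("'42'")
# The same value converted with int() first: nothing printed, status 42.
int_case = exit_status("int('42')")
```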

or-if-it-was-it-caught-you-because-sys.exit's-tricks-aren't-
    really-pythonic-ly y'rs  - tim




From loewis at informatik.hu-berlin.de  Sun Jan 21 22:21:24 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:21:24 +0100 (MET)
Subject: [Python-Dev] test_sax failing (Windows)
Message-ID: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>

> Elsewhere?

Not for me, on either Solaris or Linux. What expat version?

Regards,
Martin



From loewis at informatik.hu-berlin.de  Sun Jan 21 22:22:44 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:22:44 +0100 (MET)
Subject: [Python-Dev] autoconf --enable vs. --with
Message-ID: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>

> It looks like with-cycle-gc and maybe with-pydebug would have to be
> changed.

I'm in favour of changing it.

Regards,
Martin



From loewis at informatik.hu-berlin.de  Sun Jan 21 22:34:08 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 22:34:08 +0100 (MET)
Subject: [Python-Dev] test___all__ fails with no bsddb
Message-ID: <200101212134.WAA16446@pandora.informatik.hu-berlin.de>

On my Solaris 2.6 installation, with no bsddb module, I get

test test___all__ failed -- dbhash has no __all__ attribute

This is caused by anydbm importing dbhash first. After that fails,
dbhash is still in sys.modules, and the next import of dbhash silently
loads an incomplete module.
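
One way to guard against this, sketched in present-day Python.  The helper
name import_or_forget is hypothetical, and this is not the fix applied to
anydbm or the test.  On interpreters that already clean up after a failed
import, the pop is simply a no-op.

```python
import importlib
import sys


def import_or_forget(modname):
    """Import modname; on failure, drop any half-built entry before re-raising."""
    try:
        return importlib.import_module(modname)
    except ImportError:
        # Make sure a later import starts from scratch instead of silently
        # picking up an incomplete module object.
        sys.modules.pop(modname, None)
        raise
```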

Regards,
Martin



From tim.one at home.com  Sun Jan 21 22:38:11 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 16:38:11 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212121.WAA16327@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>

[Martin von Loewis]
> Not for me, on either Solaris or Linux. What expat version?

Tell me how to answer the question, and I'll be happy to (I have no idea
what any of this stuff is or does).

My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in particular
is revision 2.33.

xmltok.dll and xmlparse.dll were obtained from

    ftp://ftp.jclark.com/pub/xml/expat.zip

for the 2.0 release.

Is any of that relevant?

The tests passed in the wee hours (EST; UTC -0500) this morning.  They began
failing after I updated around 1pm EST today.




From thomas at xs4all.net  Sun Jan 21 22:54:05 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 22:54:05 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 02:44:38PM -0500
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com>
Message-ID: <20010121225405.M17392@xs4all.nl>

On Sun, Jan 21, 2001 at 02:44:38PM -0500, Tim Peters wrote:

> > ! 	/* Same type name, or (more likely) incomparable numeric types */
> > ! 	return (v->ob_type < w->ob_type) ? -1 : 1;

> That's non-std C in a way Insure complains about elsewhere; change to

> 	return ((Py_uintptr_t)v->ob_type <
> 		  (Py_uintptr_t)w->ob_type) ? -1 : 1;

Why is comparing v->ob_type with w->ob_type illegal ? They're both pointers
to the same type, aren't they ?

> if-vendors-stuck-to-the-letter-of-the-c-std-python-wouldn't-
>      compile-at-all<wink>-ly y'rs  - tim

That's easy to check; gcc has these nice (and from a user's point of view,
fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
'-ansi' disables some GCC-specific features, '-pedantic' turns gcc into a
whiny pedant I'm sure you'd get along with just fine <wink>, and
-pedantic-errors turns those whines into errors.

Doing a quick check I see one error I added myself (but haven't committed) in
the continue-inside-try patch (a trailing comma in an enumerator
definition), and one error in configure (it mis-detects the arguments to
setpgrp() in strict-ANSI mode, for some reason.) I don't see any errors in
the core Python. I see an error in the nis module (missing function
prototype, and broken system-include file) and a *lot* of errors in
linuxaudiodev, but nothing else in the set of modules I can compile. Not
bad!

Note that this was tested in a current tree. I couldn't find either Guido's
'broken' code or your proposed 'good' code, so I don't know if you checked
in a fix yet. If you didn't, don't bother, it's not broken :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Sun Jan 21 23:00:47 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Sun, 21 Jan 2001 23:00:47 +0100 (MET)
Subject: [Python-Dev] Re: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>

> [Martin von Loewis]
> > Not for me, on either Solaris or Linux. What expat version?
> 
> Tell me how to answer the question, and I'll be happy to (I have no idea
> what any of this stuff is or does).
>
> My pyexpat.c (well, my *everything*) is current CVS, pyexpat.c in
> particular is revision 2.33.

That's good; mine too.

> xmltok.dll and xmlparse.dll were obtained from
> 
>     ftp://ftp.jclark.com/pub/xml/expat.zip
> 
> for the 2.0 release.
> 
> Is any of that relevant?

That gives some clue, yes. Unfortunately, that URL itself is a symlink
that was expat1_1.zip (157936 bytes) at some point, and now is
expat1_2.zip (153591 bytes). The files themselves are not
self-identifying, it's hard to tell once unzipped...

Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
works for me. I never tested 1.95.x (which is also not available from
jclark.com).

> The tests passed in the wee hours (EST; UTC -0500) this morning.
> They began failing after I updated around 1pm EST today.

I just merged pyexpat changes from PyXML into Python 2 so that could
be the cause. However, this very code has been used for some time by
PyXML users; why it crashes for you is a mystery to me.

Any chance of producing a C backtrace?

Regards,
Martin



From tim.one at home.com  Sun Jan 21 23:09:30 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:09:30 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEDPIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>

FYI, under the debug-build Python, running test_sax.py under the debugger
dies like so:

Passed test_attrs_empty
Passed test_attrs_wattr
Passed test_escape_all
Passed test_escape_basic
Passed test_escape_extra
Passed test_expat_attrs_empty
Passed test_expat_attrs_wattr
Passed test_expat_dtdhandler
Passed test_expat_entityresolver
Passed test_expat_file
Traceback (most recent call last):
  File "../lib/test/test_sax.py", line 603, in ?
    confirm(value(), name)
  File "../lib/test/test_sax.py", line 435, in test_expat_incomplete
    parser.parse(StringIO("<foo>"))
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 42, in
parse
    xmlreader.IncrementalParser.parse(self, source)
  File "c:\code\python\dist\src\lib\xml\sax\xmlreader.py", line 122, in
parse
    self.close()
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 91, in
close
    self.feed("", isFinal = 1)
  File "c:\code\python\dist\src\lib\xml\sax\expatreader.py", line 82, in
feed
    except expat.error:
SystemError: 'finally' pops bad exception

Running it from a command line instead produces the same output up to but
not including the traceback, and Python crashes with a memory fault then.
Attaching to the process with a debugger at that point shows it trying to do
_Py_Dealloc on an op whose op->ob_type member is NULL.  Here's the call
stack at that point:

_Py_Dealloc(_object * 0x007af100) line 1304 + 6 bytes
insertdict(dictobject * 0x007637ec, _object * 0x007a8270,
           long -1601350627, _object * 0x1e1eff18 __Py_NoneStruct)
           line 364 + 48 bytes
PyDict_SetItem(_object * 0x007637ec, _object * 0x007a8270,
          _object * 0x1e1eff18 __Py_NoneStruct) line 498 + 21 bytes
PyDict_SetItemString(_object * 0x007637ec, char * 0x1e1d84fc,
          _object * 0x1e1eff18 __Py_NoneStruct) line 1272 + 17 bytes
PySys_SetObject(char * 0x1e1d84fc, _object * 0x1e1eff18 __Py_NoneStruct)
          line 67 + 17 bytes
reset_exc_info(_ts * 0x00760630) line 2207 + 17 bytes
eval_code2(PyCodeObject * 0x00993df0, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a9d28, int 2,
          _object * * 0x007a9d30, int 1, _object * * 0x009a0b60,
          int 1) line 2125 + 9 bytes
fast_function(_object * 0x009a4f6c, _object * * * 0x0063f5a0, int 4,
          int 2, int 1) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x00993910, _object * 0x0098794c,
          _object * 0x00000000, _object * * 0x007a05e8, int 1,
          _object * * 0x007a05ec, int 0, _object * * 0x00000000,
         int 0) line 1860 + 37 bytes
fast_function(_object * 0x009a549c, _object * * * 0x0063f738, int 1,
         int 1, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x007b35e0, _object * 0x0098110c,
          _object * 0x00000000, _object * * 0x009beb10, int 2,
          _object * * 0x00000000, int 0, _object * * 0x00000000,
          int 0) line 1860 + 37 bytes
call_eval_code2(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2765 + 57 bytes
call_object(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2594 + 17 bytes
call_method(_object * 0x0098a97c, _object * 0x009beafc,
         _object * 0x00000000) line 2717 + 17 bytes
call_object(_object * 0x007e125c, _object * 0x009beafc,
         _object * 0x00000000) line 2592 + 17 bytes
do_call(_object * 0x007e125c, _object * * * 0x0063f96c, int 2,
        int 0) line 2915 + 17 bytes
eval_code2(PyCodeObject * 0x00991560, _object * 0x0098794c,
        _object * 0x00000000, _object * * 0x009bce98, int 2,
        _object * * 0x009bcea0, int 0, _object * * 0x00000000,
        int 0) line 1863 + 30 bytes
fast_function(_object * 0x009a7dfc, _object * * * 0x0063fb04, int 2,
        int 2, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f7e00, _object * 0x0076f14c,
       _object * 0x00000000, _object * * 0x00775904, int 0,
       _object * * 0x00775904, int 0, _object * * 0x00000000,
       int 0) line 1860 + 37 bytes
fast_function(_object * 0x009bc8ac, _object * * * 0x0063fc9c, int 0,
       int 0, int 0) line 2817 + 61 bytes
eval_code2(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c, _object * * 0x00000000, int 0,
      _object * * 0x00000000, int 0, _object * * 0x00000000,
      int 0) line 1860 + 37 bytes
PyEval_EvalCode(PyCodeObject * 0x009f86d0, _object * 0x0076f14c,
      _object * 0x0076f14c) line 338 + 29 bytes
run_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 919 + 17 bytes
run_err_node(_node * 0x007aa740, char * 0x00760dd9, _object * 0x0076f14c,
     _object * 0x0076f14c) line 907 + 21 bytes
PyRun_FileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 257,
     _object * 0x0076f14c, _object * 0x0076f14c, int 1) line 899 + 21 bytes
PyRun_SimpleFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 612 + 30 bytes
PyRun_AnyFileEx(_iobuf * 0x10261888, char * 0x00760dd9, int 1)
      line 466 + 17 bytes
Py_Main(int 2, char * * 0x00760da0) line 295 + 44 bytes
main(int 2, char * * 0x00760da0) line 10 + 13 bytes

insertdict is doing

    Py_DECREF(old_value);

reset_exc_info is doing

    PySys_SetObject("exc_type", frame->f_exc_type);

Bet that's as helpful to you as it was to me <wink>.




From thomas at xs4all.net  Sun Jan 21 23:13:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 21 Jan 2001 23:13:02 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>; from thomas@xs4all.net on Sun, Jan 21, 2001 at 10:54:05PM +0100
References: <200101211651.LAA25346@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <20010121225405.M17392@xs4all.nl>
Message-ID: <20010121231302.N17392@xs4all.nl>

On Sun, Jan 21, 2001 at 10:54:05PM +0100, Thomas Wouters wrote:
> I see an error in the nis module (missing function prototype, and broken
> system-include file) and a *lot* of errors in linuxaudiodev

The errors in linuxaudiodev are only errors because for some reason, in
-ansi -pedantic-errors mode, gcc doesn't define the 'linux' symbol. IMHO,
not worth fixing. The nismodule is 'broken' because of this:

static
nismaplist *
nis_maplist (void)
{
        nisresp_maplist *list;
        char *dom;
        CLIENT *cl, *clnt_create();

clnt_create() should be declared by the system include files. Anyone have
objections to me moving it to pyport.h, inside the '#if 0' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Sun Jan 21 23:28:45 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 17:28:45 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <20010121225405.M17392@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>

[Thomas Wouters]
> Why is comparing v->ob_type with w->ob_type illegal ? They're
> both pointers to the same type, aren't they ?

Non-equality comparison of pointers is defined if and only if the pointers
are both addresses in the same contiguous structure (think struct or array);
an exception is made for a pointer "one beyond the end" of an array, i.e. if

    sometype a[N];

then (&a[0] < &a[N]) == 1 is guaranteed even though &a[N] is outside the
bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
e.g., it's OK if they compare equal, or if the comparison causes a hardware
fault, or ...).

> That's easy to check; gcc has these nice (and from a user's point of view,
> fairly useless) options: '-ansi', '-pedantic' and '-pedantic-errors'.
> '-ansi' disables some GCC-specific features, '-pedantic' turns gcc into a
> whiny pedant I'm sure you'd get along with just fine <wink>, and
> -pedantic-errors turns those whines into errors.

Your faith in gcc is as charming as it is naive <wink>:  the most
interesting cases of undefined behavior can't be checked no-way, no-how at
compile-time.  That's why Barry keeps talking employers into dumping
thousands of dollars into a single Insure++ license.  Insure++ actually tags
every pointer at runtime with its source, and gripes if non-equality
comparisons are done on a pair not derived from the same array or malloc
etc.  Since Python type objects are individually allocated (not taken from a
preallocated contiguous vector), Insure++ should complain about that
compare.

> ...
> Note that this was tested in a current tree. I couldn't find
> either Guido's 'broken' code or your proposed 'good' code, so I
> don't know if you checked in a fix yet. If you didn't, don't bother,
> it's not broken :-)

Guido hasn't checked it in yet, but gcc isn't smart enough to detect *this*
breakage anyway.





From fredrik at effbot.org  Mon Jan 22 00:02:10 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 00:02:10 +0100
Subject: [Python-Dev] more unicode database changes
Message-ID: <030501c083fe$2fe7dbf0$e46940d5@hagrid>

Just checked in another unicode database patch, which
saves another ~60k.  On my Windows box, the Unicode
tables are now about 200k (down from 600k in 2.0).

After this change, Modules/unicodedatabase.[ch] are no
longer used.

Since I'm on a Windows box with MSVC 5.0, I don't really
want to try removing them from the official build files.
Instead, I've checked in empty versions of the files.

Can anyone help me get rid of all references to them from
the build files (and CVS)?

</F>

PS. btw, if my changes broke the build somewhere, let me
know asap!




From tim.one at home.com  Mon Jan 22 00:07:14 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:07:14 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
> ...
> That gives some clue, yes. Unfortunately, that URL itself is a symlink
> that was expat1_1.zip (157936 bytes) at some point,

That's the one I've been using.

> and now is expat1_2.zip (153591 bytes).

I'm assuming you're recommending that one!  Based on that assumption, I've
downloaded a new one and will put that in the 2.1a1 Windows release.  Scream
if that's not what you want.

> ...
> Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
> works for me. I never tested 1.95.x (which is also not available from
> jclark.com).

If you do and love it, let me know where to get it and I'll ship that
instead.

>> The tests passed in the wee hours (EST; UTC -0500) this morning.
>> They began failing after I updated around 1pm EST today.

> I just merged pyexpat changes from PyXML into Python 2 so that could
> be the cause. However, this very code has been used for some time by
> PyXML users, why it crashes for you is a mystery to me.

Perhaps gc, perhaps uninitialized vars, ..., hard to say.  Unfortunately,
it's not unusual for flawed code to display different behavior across
platforms; or, from the long-term QA perspective, it's *great* that flawed
code doesn't always appear to work on all platforms <wink>.

> Any chance of producing a C backtrace?

Sent that before; doesn't look like much help; we're seeing a NULL type
pointer, but at that stage there's no telling when or where or why it
*became* NULL.

I'm going to rebuild the world from scratch, and use the new DLLs.  You
should assume that didn't help unless I say otherwise within 15 minutes.




From tim.one at home.com  Mon Jan 22 00:09:51 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:09:51 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEEIKAA.tim.one@home.com>

[/F]
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Yay!  I take it CNRI wasn't paying you by the byte <wink>.

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.
>
> Since I'm on a Windows box with MSVC 5.0, I don't really
> want to try removing them from the official build files.
> Instead, I've checked in empty versions of the files.

That's fine.

> Can anyone help me get rid of all references to them from
> the build files (and CVS)?
>
> </F>
>
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

I'll take care of the MS project files -- and I was just about to rebuild
the world from scratch anyway.




From tim.one at home.com  Mon Jan 22 00:20:03 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:20:03 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEEFIKAA.tim.one@home.com>

> After this change, Modules/unicodedatabase.[ch] are no
> longer used.

Not so:  unicodedata.c still #includes unicodedatabase.h.




From tim.one at home.com  Mon Jan 22 00:53:13 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 18:53:13 -0500
Subject: [Python-Dev] more unicode database changes
In-Reply-To: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEGIKAA.tim.one@home.com>

[/F]
> ...
> PS. btw, if my changes broke the build somewhere, let me
> know asap!

The Windows build is fine now and changes checked-in.  You can remove

    Modules/unicodedatabase.[ch]

from the project without hurting it (although I imagine the Unixish builds
still need to learn about this!).




From tim.one at home.com  Mon Jan 22 01:12:21 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:12:21 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>

More FYI:  With the new expat1_2.zip (153591 bytes) DLLs, all tests pass on
Windows except for test_sax.  No change in symptoms.  The failure modes for
test_sax depend on all of:

+ Whether run in release or debug builds.

+ Whether test_sax.py is run directly or via regrtest.py.

+ Whether I delete all .pyc/.pyo files first, or use precompiled ones.

+ In debug builds, whether the test is started from within the
  debugger, or I start it via cmdline and attach to the process after
  it crashes (with a memory fault).

Here's a new failure mode:

test test_sax crashed -- XMLParserType: no element found: line 1, column 5

So this smells to high heaven of either a nasty gc problem or referencing
uninitialized memory.  Symptoms don't change if I stick

    import gc
    gc.disable()

at the start of test_sax.py.

Barry, can you try running test_sax under Insure?  I've got little chance of
making enough time tonight to figure this out the hard way ...




From nas at arctrix.com  Sun Jan 21 18:28:52 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Sun, 21 Jan 2001 09:28:52 -0800
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 07:12:21PM -0500
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de> <LNBBLJKPBEHFEDALKOLCIEEHIKAA.tim.one@home.com>
Message-ID: <20010121092852.A24605@glacier.fnational.com>

On Sun, Jan 21, 2001 at 07:12:21PM -0500, Tim Peters wrote:
> So this smells to high heaven of either a nasty gc problem or referencing
> uninitialized memory.  Symptoms don't change if I stick
> 
>     import gc
>     gc.disable()
> 
> at the start of test_sax.py.

Can you try it with WITH_CYCLE_GC undefined?

  Neil



From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:25:08 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:25:08 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101191600.LAA28788@cj20424-a.reston1.va.home.com>
Message-ID: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>

Suppose I have a class which checks whether it knows
how to do a comparison, and if not, wants to pass it
on to the other operand in case it knows:

  class Foo:

    def __lt__(self, other):
      if I_know_about(other):
        # do the comparison
      else:
        return other.__gt__(self)

If the other operand has a __gt__ method which is
doing similar tricks, infinite recursion could result.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:36:51 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <200101191848.NAA02765@cj20424-a.reston1.va.home.com>
Message-ID: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>

Guido:

> I don't understand how these can be not commutative unless they have a
> side effect on the left argument

I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
ceil(a,b), then clearly a<b != b>a.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From mwh21 at cam.ac.uk  Mon Jan 22 01:48:16 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 22 Jan 2001 00:48:16 +0000
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: Greg Ewing's message of "Mon, 22 Jan 2001 13:36:51 +1300 (NZDT)"
References: <200101220036.NAA01813@s454.cosc.canterbury.ac.nz>
Message-ID: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>

Greg Ewing <greg at cosc.canterbury.ac.nz> writes:

> Guido:
> 
> > I don't understand how these can be not commutative unless they have a
> > side effect on the left argument
> 
> I think he meant "not reflective". If a<b == floor(a,b) and a>b ==
> ceil(a,b), then clearly a<b != b>a.

What's floor of two arguments?  In common lisp, (floor a b) is the
largest integer n such that (<= n (/ a b)), in Python it's a type
error...  if you meant min(a,b), then I think the programmer who
thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
deal with (if min has a symbol it's /\, but never mind that).

More generally, people who define their comparison operators in
non-intuitive ways shouldn't really expect intuitive behaviour.  I
thought Guido threatened to document this fact in large letters
somewhere...

Cheers,
M.

-- 
  Premature optimization is the root of all evil in programming.  
                                                       -- C.A.R. Hoare




From greg at cosc.canterbury.ac.nz  Mon Jan 22 01:52:25 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 13:52:25 +1300 (NZDT)
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure

I'm not sure that the proposed alternative (casting both
pointers to ints and comparing the ints) is any better.
Does the C std define the result of doing that to two
unrelated pointers?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 01:56:16 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 19:56:16 -0500
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <20010121092852.A24605@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEEJIKAA.tim.one@home.com>

[Neil Schemenauer]
> Can you try it with WITH_CYCLE_GC undefined?

Good idea -- for someone with an infinite amount of free time <wink>.

But being a good sport, I did as you asked with giddy cheer.  Alas, it
didn't help (all the same bizarre context-dependent test_sax failure modes).
I'm sure I disabled WITH_CYCLE_GC correctly, because "import gc" now fails
with ImportError in both release and debug builds.

BTW, a refcount-too-low problem is another good candidate.




From greg at cosc.canterbury.ac.nz  Mon Jan 22 02:00:46 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 14:00:46 +1300 (NZDT)
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <200101220100.OAA01820@s454.cosc.canterbury.ac.nz>

Michael Hudson <mwh21 at cam.ac.uk>:

> if you meant min(a,b),

Yes, sorry, that's what I meant. Or at least that's what
I thought the original poster meant - if he didn't, then
I'm confused, too!

Anyway, I agree that it's a silly thing to want to make
a>b mean, and I'm not all that disappointed that it won't
be possible.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 02:11:52 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:11:52 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101220052.NAA01817@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEEKIKAA.tim.one@home.com>

[Greg Ewing]
> I'm not sure that the proposed alternative (casting both
> pointers to ints and comparing the ints) is any better.
> Does the C std define the result of doing that to two
> unrelated pointers?

C99 guarantees that, if the type exists, casting a pointer to type uintptr_t
won't blow up, and also guarantees that comparisons between (at least) ints
of the same type won't blow up.  Beyond that, we don't care what it returns.
Mostly we're trying to eliminate warnings Barry has to wade thru from
Insure++ -- same reason we have a "no compiler warnings!" build policy.
Doing the cast is obviously "better" when viewed through Barry's 4AM eyes.

You can find out *why* C has this rule (which was in C89, not new in C99) by
reading the C FAQ.




From tim.one at home.com  Mon Jan 22 02:23:27 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:23:27 -0500
Subject: [Python-Dev] Rich comparison confusion
In-Reply-To: <m31ytw1kfj.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEELIKAA.tim.one@home.com>

[Michael Hudson]
> ...
> if you meant min(a,b), then I think the programmer who
> thinks "min(a,b)" is spelt "a<b" has problems we can't be expected to
> deal with (if min has a symbol it's /\, but never mind that).

Curiously, in the Icon language, if a is less than b then

   a < b

returns b while

   b > a

returns a.

In this way they get the same effect as Python's chained comparisons

   a < b < c < d

via purely binary operators (if a is *not* less than b, a < b in Icon
"fails", which is a silent event that causes the expression's context to
backtrack -- but we won't go into that here <wink>).

Anyway, that accounts for this curious Icon idiom:

   a <:= b

which is short for

   a := a < b

and binds a to max(a, b) (if a is smaller, a < b returns b and the
assignment proceeds; but if a is not smaller, a < b fails and that
propagates into its context, which here has no other possibilities to
backtrack into, so the stmt just ends leaving a alone).
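
For readers who don't know Icon, here is a rough Python sketch of the
behaviour described above.  It is only an approximation:  None stands in for
Icon's "failure" (so it can't be used with operands that are themselves
None), there is no real backtracking, and the names icon_lt, chain_lt and
augmented_lt are made up for illustration.

```python
def icon_lt(a, b):
    """Return b if a < b; otherwise return None to stand in for failure."""
    if a is None or b is None:
        # An earlier failure simply propagates outward.
        return None
    return b if a < b else None


def chain_lt(*values):
    """Emulate a < b < c < ... using only the binary icon_lt."""
    result = values[0]
    for nxt in values[1:]:
        result = icon_lt(result, nxt)
    return result


def augmented_lt(a, b):
    """Emulate a <:= b: the new value of a, unchanged if the compare fails."""
    result = icon_lt(a, b)
    return a if result is None else result
```

Each successful comparison hands its right operand to the next one, which is
why a chain of purely binary operators behaves like a chained comparison, and
why augmented_lt ends up computing the larger of its two arguments.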

"<"-and-">"-are-just-bags-of-pixels-ly y'rs  - tim




From uche.ogbuji at fourthought.com  Mon Jan 22 02:24:46 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Sun, 21 Jan 2001 18:24:46 -0700
Subject: [Python-Dev] should a module's thread safety be documented? 
In-Reply-To: Message from Guido van Rossum <guido@digicool.com> 
   of "Sun, 21 Jan 2001 12:00:02 EST." <200101211700.MAA25479@cj20424-a.reston1.va.home.com> 
Message-ID: <200101220124.SAA08868@localhost.localdomain>

> > A bit late for 2.1alpha1, but it just occurred to me that perhaps there
> > should be an annotation in the documentation that indicates whether or not a
> > module is thread-safe.  For example, many functions in fileinput rely on a
> > module global called _state.  It strikes me that this module is not likely
> > to be thread-safe, yet the documentation doesn't appear to mention this,
> > certainly not in an obvious fashion.
> > 
> > Anyone for adding \notthreadsafe{} and \threadsafe{} macros to the litany of
> > LaTeX macros in Fred's arsenal?  This would make documenting these
> > properties both easy and consistent across modules.
> 
> It's hard to say whether a *whole module* is threadsafe.  E.g. in the
> fileinput example, there's the clear implication that if you use this
> in multiple threads, you should instantiate your own FileInput
> instances, and then you're totally thread-safe.  Clearly the semantics
> of the module-global functions are thread-unsafe though.

Perhaps what is needed rather is a prose annotation for thread-safety issues.

My TeX is rusty, but in Docbook, with the use of role attributes, one could 
have, taking your FileInput example

<sect1 role="thread-safety"><para>
  The module-global functions are not safe, but if you instantiate your own 
FileInput instances, they will be totally thread-safe.
</para></sect1>

That way the MT issues could be styled differently on rendering, gathered into 
separate documentation, stripped by those who don't care, etc.  I imagine this 
is also possible in TeX.
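
As a concrete illustration of the usage that note describes, here is a
minimal sketch in which every thread creates its own FileInput instance
rather than calling the module-global functions.  It assumes a recent
Python 3 (where FileInput works as a context manager); the helper names
count_lines and demo, and the temporary sample files, are made up for the
example.

```python
import fileinput
import os
import tempfile
import threading


def count_lines(paths, results, key):
    # Each thread builds its own FileInput object instead of calling the
    # module-level fileinput.input(), which keeps its state in a global.
    with fileinput.FileInput(files=paths) as stream:
        results[key] = sum(1 for _ in stream)


def demo():
    """Count lines of two temporary files, one thread per file."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i, n_lines in enumerate((3, 5)):
            path = os.path.join(tmp, "sample%d.txt" % i)
            with open(path, "w") as f:
                f.write("line\n" * n_lines)
            paths.append(path)
        threads = [
            threading.Thread(target=count_lines, args=([p], results, i))
            for i, p in enumerate(paths)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    return results
```

Calling demo() should give one line count per file, with no thread touching
another thread's FileInput object.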


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tim.one at home.com  Mon Jan 22 02:32:30 2001
From: tim.one at home.com (Tim Peters)
Date: Sun, 21 Jan 2001 20:32:30 -0500
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <200101220025.NAA01809@s454.cosc.canterbury.ac.nz>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>

[Greg Ewing]
> Suppose I have a class which checks whether it knows
> how to do a comparison, and if not, wants to pass it
> on to the other operand in case it knows:
>
>   class Foo:
>
>     def __lt__(self, other):
>       if I_know_about(other):
>         # do the comparison
>       else:
>         return other.__gt__(self)
>
> If the other operand has a __gt__ method which is
> doing similar tricks, infinite recursion could result.

Does this have something to do with comparisons?  That is, wouldn't the same
be true if you coded two methods named "spam" and "eggs" in this way?

whatever = 0

class Foo:
    def spam(self, other):
       if whatever:
           return 1
       else:
           return other.eggs(self)

class Bar:
    def eggs(self, other):
       if whatever:
           return 1
       else:
           return other.spam(self)

Foo().spam(Bar())  # RuntimeError: Maximum recursion depth exceeded

If that's all there is to it, you got what you asked for.




From greg at cosc.canterbury.ac.nz  Mon Jan 22 04:31:41 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Mon, 22 Jan 2001 16:31:41 +1300 (NZDT)
Subject: [Python-Dev] a>b == b<a dangerous?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEEMIKAA.tim.one@home.com>
Message-ID: <200101220331.QAA01833@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> Does this have something to do with comparisons?  That is, wouldn't the same
> be true if you coded two methods named "spam" and "eggs" in this
> way?

Yes, but Guido hasn't decreed that a.spam(b) and b.eggs(a) are
to have a reflective relationship with each other.

But don't worry - I've belatedly realised that the correct way
to do what I was talking about is to return NotImplemented and
let the interpreter take care of calling the reflected method.
So I withdraw my objection.
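
A minimal sketch of that approach, for the record.  The classes Metres,
Feet and Opaque are made up for illustration, and the example assumes a
current Python 3, where a failed comparison raises TypeError:

```python
class Metres:
    """Hypothetical class that only knows how to compare with itself."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        if isinstance(other, Metres):
            return self.value < other.value
        # Unknown operand: don't call other.__gt__ by hand, just let the
        # interpreter try the reflected method.
        return NotImplemented


class Feet:
    """Hypothetical class that knows about both Feet and Metres."""

    def __init__(self, value):
        self.value = value

    def _in_metres(self, other):
        if isinstance(other, Feet):
            return other.value * 0.3048
        if isinstance(other, Metres):
            return other.value
        return None

    def __gt__(self, other):
        theirs = self._in_metres(other)
        if theirs is None:
            return NotImplemented
        return self.value * 0.3048 > theirs


class Opaque:
    """Hypothetical class that defines no ordering at all."""
```

Here Metres(1) < Feet(10) is answered by Feet.__gt__, and comparing a
Metres with an Opaque ends in a TypeError rather than unbounded recursion,
because neither side ever calls the other's method directly.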

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Mon Jan 22 08:54:32 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 02:54:32 -0500
Subject: [Python-Dev] Worse news
Message-ID: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>

I still don't have a clue about test_sax, but have stumbled into more
failure modes.  Most of them seem related to the SystemError ("'finally'
pops bad exception").  Around that part of ceval.c, sometimes the v popped
off the stack has a NULL type pointer, other times it's a pointer to a
damaged PyTuple_Type (for example, with a tp_dealloc field of 0x61, which
leads to an illegal instruction exception).

The MS debug heap routines fill all newly malloc'ed memory with 0xcd ("clean
landfill"), fill free'ed memory with 0xdd ("dead landfill"), and *pad*
malloc'ed memory with some number of 0xfd bytes on both sides ("no-man's
land").  The clean landfill and no-man's land patterns are showing up more
often than they should "by chance", and especially in high-order bytes.  Just
more evidence of the obvious:  something is really screwed up <wink>.
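
For anyone who wants to eyeball a suspect value for those patterns, here is
a small Python sketch.  The three byte values are the ones listed above;
the helper names classify_bytes and looks_suspicious, and the 4-byte default
width, are assumptions made for the example.

```python
# Fill bytes used by the MS debug heap, as described above.
FILL_PATTERNS = {
    0xCD: "clean landfill",
    0xDD: "dead landfill",
    0xFD: "no-man's land",
}


def classify_bytes(value, width=4):
    """Name the fill pattern matched by each byte, high-order byte first.

    Bytes that match no known pattern are reported as None.
    """
    raw = value.to_bytes(width, "big")
    return [FILL_PATTERNS.get(b) for b in raw]


def looks_suspicious(value, width=4):
    """True if any byte of value matches a debug-heap fill pattern."""
    return any(name is not None for name in classify_bytes(value, width))
```

A single matching byte can of course occur by chance; the point of the
paragraph above is that such bytes were turning up far more often than that.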

I cannot get the subtest that test_sax is calling (test_expat_incomplete) to
fail in isolation.

Next headache:  If I delete all .pyc files from Lib/ and Lib/test/, and then
run:

python ../lib/test/regrtest.py -x test_sax

by hand, all the 98 tests that *should* run on Windows (excluding, of
course, test_sax, which is no longer tried) pass.  If I immediately run them
again (without deleting .pyc) by hand:

python ../lib/test/regrtest.py -x test_sax

then they again pass.  However, if I do

rt -x test_sax

which does exactly those steps (delete .pyc, run regrtest excluding test_sax,
run regrtest again) via the little MS batch file rt.bat, then on the second
time thru regrtest, and 5 times out of 5, it died in test_extcall with an
"illegal operation", while executing

		if (TYPE(c) == DOUBLESTAR) {

near the end of symtable_params in compile.c.  This is an optimized build,
and the debugger has no idea what's in c at this point; to judge from the
offending machine instruction and register contents, though, c is a bad
pointer.

Have not been able to get test_extcall to fail in isolation.

Have also been unable to get test_extcall to fail in the debug build.


So there's evidence of Deep Rot beyond test_sax, but test_sax remains the
only test that fails every time and under both build types.

Running regrtest with -r (randomize test order) is also "interesting":
first time I tried that, test_cpickle failed (truncated output) as well as
test_sax.

I doubt anyone has run the tests more often than me over the last week, so
I'm not surprised I'm seeing the most problems.  However, since *nobody* is
seeing anything on Linux, I'd at least like to get *someone* else to run the
tests on Windows.  While I'm not having any unusual problems with my box,
it's certainly possible that I've got a corrupted file or a flaky memory
chip etc, or that MSVC is generating bad code for some recent change
(although that's unlikely since the debug build generates *really*
straightforward code).

Deleting my entire PCbuild subtree and refetching it from CVS didn't make
any difference.




From esr at thyrsus.com  Mon Jan 22 09:01:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 03:01:27 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>; from loewis@informatik.hu-berlin.de on Sun, Jan 21, 2001 at 10:22:44PM +0100
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>
Message-ID: <20010122030127.C20804@thyrsus.com>

Martin von Loewis <loewis at informatik.hu-berlin.de>:
> > It looks like with-cycle-gc and maybe with-pydebug would have to be
> > changed.
> 
> I'm in favour of changing it.

Likewise.  Let's be good neighbors.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Where rights secured by the Constitution are involved, there can be no
rule making or legislation which would abrogate them.
        -- Miranda vs. Arizona, 384 US 436 p. 491



From loewis at informatik.hu-berlin.de  Mon Jan 22 09:26:15 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 09:26:15 +0100 (MET)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCOEDPIKAA.tim.one@home.com>
Message-ID: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>

> Running it from a command line instead produces the same output up to but
> not including the traceback, and Python crashes with a memory fault then.
> Attaching to the process with a debugger at that point shows it trying to do
> _Py_Dealloc on an op whose op->op_type member is NULL.
[...]
> Bet that's as helpful to you as it was to me <wink>.

Well, it was at least motivating enough to try it out on my Whistler
installation. Purify would probably find this rather quickly; the code
writes into the 257th element of a 256-element array. I've committed
a fix.

Depending on the exact organization of globals, this could have easily
gone unnoticed. MSVC packs variables more than gcc does, so the write
would overwrite one byte in ErrorObject, which would then not point to
a PyObject anymore.
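
The effect can be simulated in a few lines of Python. This is only a sketch
under stated assumptions: one-byte table elements, a 4-byte little-endian
pointer placed directly after the table, and a made-up helper name
simulate_overrun. The real layout of the pyexpat globals may well differ.

```python
import struct

TABLE_SIZE = 256


def simulate_overrun(pointer_value, stray_byte):
    """Return the neighbouring pointer after a write to table[256]."""
    # A 256-byte table immediately followed by a 4-byte little-endian
    # "pointer", mimicking tightly packed globals.
    memory = bytearray(TABLE_SIZE) + bytearray(struct.pack("<I", pointer_value))
    # Valid indices are 0..255; index 256 is the 257th element and lands on
    # the low-order byte of the pointer that follows the table.
    memory[TABLE_SIZE] = stray_byte
    (corrupted,) = struct.unpack_from("<I", memory, TABLE_SIZE)
    return corrupted
```

Only one byte of the pointer changes, so the result is an address close to
the original that no longer refers to a valid object. With looser packing
the stray byte would fall into padding and nothing would visibly break.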

Thanks for your patience,
Martin



From tim.one at home.com  Mon Jan 22 10:18:04 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 04:18:04 -0500
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing (Windows))
In-Reply-To: <200101220826.JAA20819@pandora.informatik.hu-berlin.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>

[Martin]
> Well, it was at least motivating enough to try it out on my Whistler
> installation. Purify would probably find this rather quickly; the code
> writes into the 257th element of a 256-element array.

Ah!  You shouldn't do that <wink>.

> I've committed a fix.

But you should do that.  Thank you!

Here's where I am now:

=========================================================================
All test_sax failures have gone away (yay!).
=========================================================================
Running

    rt -x test_sax

on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
up:

    using the debug build; or
    if test_sax is *not* excluded; or
    in the 1st pass; or
    when running test_extcall in isolation; or
    if the steps rt performs are done by hand
=========================================================================
Running

    rt -r

on Windows still sees test_cpickle fail in the first pass (with truncated
output), but succeed in the second pass.  First-pass failure is always like
so (modulo line breaks I'm inserting by hand):

test test_cpickle failed -- Tail of expected stdout unseen:
'dumps()\012
loads()\012
ok\012
loads() DATA\012
ok\012
dumps() binary\012
loads() binary\012
ok\012
loads() BINDATA\012
ok\012
dumps() RECURSIVE\012
ok\012'

I've also seen it fail at least once when doing the same thing by hand:

    del ..\lib\*.pyc
    del ..\lib\test\*.pyc
    python ../lib/test/regrtest.py -r

else-i-would-have-asked-martin-to-look-for-a digit-to-change-in-
    command.com<wink>-ly y'rs  - tim




From mal at lemburg.com  Mon Jan 22 11:19:18 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:19:18 +0100
Subject: [Python-Dev] more unicode database changes
References: <030501c083fe$2fe7dbf0$e46940d5@hagrid>
Message-ID: <3A6C0926.D0A004E4@lemburg.com>

Fredrik Lundh wrote:
> 
> Just checked in another unicode database patch, which
> saves another ~60k.  On my Windows box, the Unicode
> tables are now about 200k (down from 600k in 2.0).

Great work, Fredrik :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 11:42:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:42:52 +0100
Subject: [Python-Dev] readline and setup.py
References: <3A68B5B0.771412F7@lemburg.com>
Message-ID: <3A6C0EAC.7D322174@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> The new setup.py procedure for Python causes readline not to
> be built on my machine. Instead I get a linker error telling
> me that termcap is not found.
> 
> Looking at my old Setup file, I have this line:
> 
> readline readline.c \
>          -I/usr/include/readline -L/usr/lib/termcap \
>          -lreadline -lterm
> 
> I guess, setup.py should be modified to include additional
> library search paths -- shouldn't hurt on platforms which
> don't need them.

Here's a patch which works for me:

projects/Python> diff CVS-Python/setup.py Dev-Python/
--- CVS-Python/setup.py Mon Jan 22 11:36:56 2001
+++ Dev-Python/setup.py Mon Jan 22 11:40:15 2001
@@ -216,10 +216,11 @@ class PyBuildExt(build_ext):
             exts.append( Extension('rgbimg', ['rgbimgmodule.c']) )
 
         # readline
         if (self.compiler.find_library_file(lib_dirs, 'readline')):
             exts.append( Extension('readline', ['readline.c'],
+                                   library_dirs=['/usr/lib/termcap'],
                                    libraries=['readline', 'termcap']) )
 
         # The crypt module is now disabled by default because it breaks builds
         # on many systems (where -lcrypt is needed), e.g. Linux (I believe).
 


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 11:52:17 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 11:52:17 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com>
Message-ID: <3A6C10E1.EF890356@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Why does setup.py stop with an error in case _tkinter cannot
> be built (due to an old Tk/Tcl version in my case) ?
> 
> I think the policy in setup.py should be to output warnings,
> but continue building the rest of the Python modules.

I haven't heard anything from the powers that be... what should the
policy be for auto-detected and -configured modules ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan 22 13:37:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 13:37:04 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C10E1.EF890356@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 11:52:17AM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com>
Message-ID: <20010122133704.O17392@xs4all.nl>

On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> "M.-A. Lemburg" wrote:

> > I think the policy in setup.py should be to output warnings,
> > but continue building the rest of the Python modules.

> I haven't heard anything from the powers that be... what should the
> policy be for auto-detected and -configured modules ?

I think Andrew is still working on a way to disable modules from the command
line somehow. (I think moving setup.py to setup.py.in, and using autoconf
--options would be easiest on both developer and user, but that's just me.)
I also think everyone agrees with you that a module that can't be built
shouldn't stop the entire process in the final release (and possibly the
betas) but that it's definitely a good way to debug setup.py in the alphas.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Mon Jan 22 14:13:46 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 14:13:46 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com>
Message-ID: <3A6C320A.37CBB4E5@tismer.com>

Maybe I can help.

Tim Peters wrote:
...
> Here's where I am now:
> 
> =========================================================================
> All test_sax failures have gone away (yay!).
> =========================================================================
> Running
> 
>     rt -x test_sax
> 
> on Windows still blows up in test_extcall on the 2nd pass.  It does not blow
> up:
> 
>     using the debug build; or
>     if test_sax is *not* excluded; or
>     in the 1st pass; or
>     when running test_extcall in isolation; or
>     if the steps rt performs are done by hand
...

I got problems with XML as well. I'm not using SAX, but plain
expat for speed. The following error happens after parsing
thousands of small XML files:

from_my_log_window="""
\\bned-s1\tismer\pxml\sdf\mdl\DisplayRGB\1
\\bned-s1\tismer\pxml\sdf\mdl\DisplayVideo\1
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 151, in getall
    getall(here, res)
  File "D:\crml_doc\pxml\clean.py", line 149, in getall
    res.append(p.parse())
  File "D:\crml_doc\pxml\clean.py", line 81, in parse
    self.parsers[0].Parse(self.txt1, 1)
  File "D:\crml_doc\pxml\clean.py", line 53, in endElementMaster
    if self.txt2: self.parsers[1].Parse(self.txt2, 1)
  File "D:\crml_doc\pxml\clean.py", line 46, in startElementOther
    if name <> "MASTER":
UnicodeError: UTF-8 decoding error: invalid data
"""

The good news: The error is reproducible, happens the same under
PythonWin and DOS Python, and I can reduce it to a single XML file.
That suggests I am close to the cause of the bug,
rather than chasing late, indirect effects.
It also *might* be related to Unicode.

I will now try to create a minimized script and XML data that
produces the above again.

back in an hour - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From thomas at xs4all.net  Mon Jan 22 14:52:44 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 14:52:44 +0100
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>; from tim.one@home.com on Sun, Jan 21, 2001 at 05:28:45PM -0500
References: <20010121225405.M17392@xs4all.nl> <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com>
Message-ID: <20010122145244.Y17295@xs4all.nl>

On Sun, Jan 21, 2001 at 05:28:45PM -0500, Tim Peters wrote:
> [Thomas Wouters]
> > Why is comparing v->ob_type with w->ob_type illegal ? They're
> > both pointers to the same type, aren't they ?

> Non-equality comparison of pointers is defined if and only if the pointers
> are both addresses in the same contiguous structure (think struct or array);
> an exception is made for a pointer "one beyond the end" of an array, i.e. if

>     sometype a[N];

> then &a[0] < &a[N] == 1 is guaranteed despite that &a[N] is outside the
> bounds of a; but &a[0] < &a[N+1] is undefined (which *means* undefined!
> e.g., it's OK if they compare equal, or if the comparison causes a hardware
> fault, or ...).

Ok, I guess I stand corrected. I was confused by the name of Py_uintptr_t: I
thought it was a pointer-to-int, not an int large enough to hold a pointer.
I'm also positively appalled by the fact the standard refuses to define sane
behaviour for out-of-bounds access on an array, but attaches some weird
significance to what pointers are pointing *to*, when comparing the values
of those pointers, regardless of what type of object they are stored in. But
I guess I don't have to whine about that to you, Tim :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tismer at tismer.com  Mon Jan 22 15:03:25 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 15:03:25 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com>
Message-ID: <3A6C3DAD.522CE623@tismer.com>


Christian Tismer wrote:
> 
> Maybe I can help.

...

...
> I will now try to create a minimized script and XML data that
> produces the above again.
> 
> back in an hour - chris

Here we go.
The following session produces the mentioned UTF8 error:

>>> txt = "<master desc='blah\325weird' />"
>>> def startelt(name, dic):
... 	print name, dic
... 	
>>> p=expat.ParserCreate()
>>> p.StartElementHandler = startelt
>>> p.Parse(txt)
Traceback (innermost last):
  File "<interactive input>", line 1, in ?
UnicodeError: UTF-8 decoding error: invalid data

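Behavior depends on the ASCII code.
For codes 128 (0200) to 191 (0277) the parser gives a
not well-formed exception, as it should.

The codes from 192 to 236, 238-243 produce
"UTF-8 decoding error: invalid data",
the rest gives "not well-formed".

I would like to know if this happens with your (Tim) modified
version as well. I'm using plain vanilla BeOpen Python 2.0 .


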
From jeremy at alum.mit.edu  Mon Jan 22 15:19:34 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 09:19:34 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.16758.68050.257212@localhost.localdomain>

Tim,

Funny (strange or haha?) that test_extcall is failing since the two
pieces of code I've modified most recently are compile.c and the
section of ceval.c that handles extended call syntax.  I just got
through my mail this morning and I'll see what I can reproduce on
Linux.

As for the test_sax failure, is any of the Python code being executed
conditional on platform?  The compiler may be generating bad bytecode
for a code path that is only executed on Windows.

Jeremy




From mal at lemburg.com  Mon Jan 22 15:27:38 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:27:38 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com>
Message-ID: <3A6C4359.BCB06252@lemburg.com>

Christian Tismer wrote:
> 
> Christian Tismer wrote:
> >
> > Maybe I can help.
> 
> ...
> 
> ...
> > I will now try to create a minimized script and XML data that
> > produces the above again.
> >
> > back in an hour - chris
> 
> Here we go.
> The following session produces the mentioned UTF8 error:
> 
> >>> txt = "<master desc='blah\325weird' />"
> >>> def startelt(name, dic):
> ...     print name, dic
> ...
> >>> p=expat.ParserCreate()
> >>> p.StartElementHandler = startelt
> >>> p.Parse(txt)
> Traceback (innermost last):
>   File "<interactive input>", line 1, in ?
> UnicodeError: UTF-8 decoding error: invalid data
> 
> Behavior depends on the ASCII code.
> From code 128 (0200) to 191 (0277) the parser gives a
> not well-formed exception, as it should be.
> 
> The codes from 192 to 236, 238-243 produce
> "UTF-8 decoding error: invalid data",
> the rest gives "not well-formed".
> 
> I would like to know if this happens with your (Tim) modified
> version as well. I'm using plain vanilla BeOpen Python 2.0 .

This has nothing to do with Python. UTF-8 marks the codes 
from 128-191 as illegal prefix. See Objects/unicodeobject.c:

static 
char utf8_code_length[256] = {
    /* Map UTF-8 encoded prefix byte to sequence length.  zero means
       illegal prefix.  see RFC 2279 for details */
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
    3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
    4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
};

Perhaps the parser should catch the UnicodeError and
instead return a not-wellformed exception ?!
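
To make the table concrete, here is a small sketch for a present-day
Python interpreter (not the 2.0 build discussed in this thread). The
helper names utf8_sequence_length and is_continuation are made up for
illustration; the ranges simply mirror the table above. It shows why the
byte \325 in Christian's string is rejected: it announces a two-byte
sequence, but the byte after it, "w", is not a continuation byte.

```python
def utf8_sequence_length(lead_byte):
    """Mirror the utf8_code_length table: 0 means illegal prefix."""
    if lead_byte < 0x80:
        return 1          # plain ASCII
    if lead_byte < 0xC0:
        return 0          # 128-191: continuation bytes, never a valid prefix
    if lead_byte < 0xE0:
        return 2
    if lead_byte < 0xF0:
        return 3
    if lead_byte < 0xF8:
        return 4
    if lead_byte < 0xFC:
        return 5          # 5- and 6-byte forms are from RFC 2279, as in the table
    if lead_byte < 0xFE:
        return 6
    return 0              # 0xFE and 0xFF are never valid


def is_continuation(byte):
    """Continuation bytes have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


data = b"blah\325weird"   # same bytes as the attribute value in the session
lead = data[4]            # octal 325 == 0xD5 == 213
following = data[5]       # ord("w")

print(utf8_sequence_length(lead))   # length announced by the prefix byte
print(is_continuation(following))   # whether "w" can continue the sequence
```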
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 22 15:38:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 15:38:14 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl>
Message-ID: <3A6C45D5.9A6FA25C@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 11:52:17AM +0100, M.-A. Lemburg wrote:
> > "M.-A. Lemburg" wrote:
> 
> > > I think the policy in setup.py should be to output warnings,
> > > but continue building the rest of the Python modules.
> 
> > I haven't heard anything from the powers that be... what should the
> > policy be for auto-detected and -configured modules ?
> 
> I think Andrew is still working on a way to disable modules from the command
> line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> --options would be easiest on both developer and user, but that's just me.)

This is fairly simple to do: distutils allows great flexibility
when it comes to adding user options, e.g. we could have

python setup.py --enable-tkinter --disable-readline

or more generic

python setup.py --enable-package tkinter --disable-package readline

The options could then be edited in setup.cfg.

> I also think everyone agrees with you that a module that can't be built
> shouldn't stop the entire process in the final release (and possibly the
> betas) but that it's definitely a good way to debug setup.py in the alphas.

True... but currently the only way to get Python to compile is
to hand-edit setup.py and this is not easy for people with no 
prior distutils experience.

BTW, in my case, setup.py did find the TK-libs for 8.0, but for
a beta version -- as a result, _tkinter.c's version #error line 
triggered and the build failed.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan 22 15:38:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 09:38:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,NONE,1.1
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:37:57 +0200."
             <20010121123757.D897BA83E@darjeeling.zadka.site.co.il> 
References: <E14K45e-00030e-00@usw-pr-cvs1.sourceforge.net>  
            <20010121123757.D897BA83E@darjeeling.zadka.site.co.il> 
Message-ID: <200101221438.JAA29303@cj20424-a.reston1.va.home.com>

> Wouldn't it be better to use the
> 
> d = {}
> exec "foo", d

Surely you meant

    exec "foo" in d
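
For readers following along on a current interpreter: in Python 3, exec
is a function rather than a statement, so the namespace dictionary is
passed as an argument. A minimal sketch of the same idea, using the
string module purely as an example:

```python
# Run an import inside a private namespace so the caller's globals stay clean.
d = {}
exec("from string import *", d)

# The star-import binds only the names the module chooses to export.
exported = sorted(name for name in d if not name.startswith("__"))
print(exported)
```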

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Jan 22 15:43:42 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 15:43:42 +0100
Subject: [Python-Dev] _tkinter and setup.py
In-Reply-To: <3A6C45D5.9A6FA25C@lemburg.com>; from mal@lemburg.com on Mon, Jan 22, 2001 at 03:38:14PM +0100
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com>
Message-ID: <20010122154342.B17295@xs4all.nl>

On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:

> > I think Andrew is still working on a way to disable modules from the command
> > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > --options would be easiest on both developer and user, but that's just me.)

> This is fairly simple to do: distutils allows great flexibility
> when it comes to adding user options, e.g. we could have
> 
> python setup.py --enable-tkinter --disable-readline
> 
> or more generic
> 
> python setup.py --enable-package tkinter --disable-package readline
> 
> The options could then be edited in setup.cfg.

Note that the 'user' only has 'configure' and 'make' to run, so optimally,
the options would have to be given to one of those (preferably to
'configure', to keep it similar to 90% of the packages out there.)

> but currently the only way to get Python to compile is
> to hand-edit setup.py and this is not easy for people with no 
> prior distutils experience.

You only have to edit the 'disabled_module_list' variable... not too hard
even if you don't have distutils experience (though you do need some python
experience.) I don't think it's wrong to expect people who compile alpha
versions to have at least that much knowledge (though it should be noted in
the README somewhere.)
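
A rough sketch of what such a filter amounts to, runnable on its own.
Only the name disabled_module_list is taken from setup.py as mentioned
above; the FakeExtension class and the remove_disabled helper are
stand-ins invented here, not the actual setup.py code.

```python
class FakeExtension:
    """Hypothetical stand-in for distutils' Extension: a name plus sources."""

    def __init__(self, name, sources):
        self.name = name
        self.sources = sources


# Modules the person building Python does not want (edited by hand).
disabled_module_list = ["_tkinter"]


def remove_disabled(extensions, disabled):
    """Return only the extensions whose names are not in the disabled list."""
    return [ext for ext in extensions if ext.name not in disabled]


detected = [
    FakeExtension("readline", ["readline.c"]),
    FakeExtension("_tkinter", ["_tkinter.c", "tkappinit.c"]),
]
to_build = remove_disabled(detected, disabled_module_list)
print([ext.name for ext in to_build])
```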

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From loewis at informatik.hu-berlin.de  Mon Jan 22 15:46:39 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Mon, 22 Jan 2001 15:46:39 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: <3A6C4359.BCB06252@lemburg.com> (mal@lemburg.com)
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <200101221446.PAA05164@pandora.informatik.hu-berlin.de>

> This has nothing to do with Python. UTF-8 marks the codes 
> from 128-191 as illegal prefix. 
[...]
> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

Right on both counts. If no encoding is specified, and if the
document appears not to be UTF-16 in any endianness, an XML processor
shall assume it is UTF-8. As Marc-Andre explains, your document is not
proper UTF-8, hence the error.

The confusing thing is that expat itself does not care about it not
being UTF-8; that is only detected when the callback is invoked in
pyexpat, and therefore conversion to a Unicode object is attempted.

The right solution probably would be to change expat so that it
determines correctness of the encoding for each string it gets as part
of the wellformedness analysis, and produces illformedness exceptions
when an encoding error occurs. Patches are welcome, although they
probably should go to sourceforge.net/projects/expat.
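
Until expat does that, the effect can be approximated at the Python
level. The following is a sketch for a current interpreter; the
NotWellFormed class and the parse_document helper are invented for this
illustration and are not part of pyexpat. The only point is that an
encoding failure and a syntax failure reach the caller as the same kind
of error.

```python
from xml.parsers import expat


class NotWellFormed(ValueError):
    """Hypothetical single error type for any rejected document."""


def parse_document(data, start_handler):
    """Parse bytes with expat, folding encoding errors into NotWellFormed."""
    parser = expat.ParserCreate()
    parser.StartElementHandler = start_handler
    try:
        parser.Parse(data, True)
    except (UnicodeError, expat.ExpatError) as exc:
        raise NotWellFormed(str(exc)) from exc


seen = []
parse_document(b"<master desc='blah' />",
               lambda name, attrs: seen.append((name, attrs)))

try:
    parse_document(b"<master desc='blah\325weird' />",
                   lambda name, attrs: None)
except NotWellFormed as exc:
    print("rejected:", exc)
```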

Regards,
Martin



From jack at oratrix.nl  Mon Jan 22 15:57:33 2001
From: jack at oratrix.nl (Jack Jansen)
Date: Mon, 22 Jan 2001 15:57:33 +0100
Subject: [Python-Dev] test_sax and site-python
Message-ID: <20010122145733.85E51373C95@snelboot.oratrix.nl>

I'm not sure whether this is really a bug, but I had the problem that there 
was something wrong with the xml package I had installed into my 
Lib/site-python, and this caused test_sax to complain.

If the test stuff is expected to test only the core functionality maybe 
sys.path should be edited so that it only contains directories that are part 
of the core distribution?
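
A sketch of what I mean, assuming that a directory name is enough to
recognise the add-on locations. The helper name core_only, the marker
names and the example paths are all made up for illustration; a real
test driver would probably need something more careful.

```python
import sys


def core_only(path_entries, markers=("site-python", "site-packages")):
    """Drop entries whose path contains a site-specific directory name."""
    return [p for p in path_entries
            if not any(marker in p for marker in markers)]


# Hypothetical search path, not taken from any real installation.
example = [
    "/usr/local/lib/python2.1",
    "/usr/local/lib/python2.1/lib-dynload",
    "/usr/local/lib/site-python",
    "/usr/local/lib/python2.1/site-packages",
]
print(core_only(example))

# A test driver could then narrow the real search path before importing tests:
# sys.path[:] = core_only(sys.path)
```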
--
Jack Jansen             | ++++ stop the execution of Mumia Abu-Jamal ++++
Jack.Jansen at oratrix.com | ++++ if you agree copy these lines to your sig ++++
www.oratrix.nl/~jack    | ++++ see http://www.xs4all.nl/~tank/ ++++





From tismer at tismer.com  Mon Jan 22 16:05:24 2001
From: tismer at tismer.com (Christian Tismer)
Date: Mon, 22 Jan 2001 16:05:24 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <LNBBLJKPBEHFEDALKOLCIEFFIKAA.tim.one@home.com> <3A6C320A.37CBB4E5@tismer.com> <3A6C3DAD.522CE623@tismer.com> <3A6C4359.BCB06252@lemburg.com>
Message-ID: <3A6C4C34.4D1252C9@tismer.com>


"M.-A. Lemburg" wrote:
...
> > The codes from 192 to 236, 238-243 produce
> > "UTF-8 decoding error: invalid data",
> > the rest gives "not well-formed".
> >
> > I would like to know if this happens with your (Tim) modified
> > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> 
> This has nothing to do with Python. UTF-8 marks the codes
> from 128-191 as illegal prefix. See Objects/unicodeobject.c:
...

Too bad.

> Perhaps the parser should catch the UnicodeError and
> instead return a not-wellformed exception ?!

I believe it would be better.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Mon Jan 22 16:06:06 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:06:06 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: Your message of "Sun, 21 Jan 2001 15:34:14 PST."
             <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> 
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>

> Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> supposed to be declared in system include files (with a proper prototype.)
> Should be moved to a platform-specific block if anyone finds out which
> broken platforms need it :-)

[The following is inside #if 0]
> + /* From Modules/nismodule.c */
> + CLIENT *clnt_create();
> + 

Thomas, I'm not sure if this particular declaration belongs in
pyport.h, even inside #if 0.

CLIENT is declared in a NIS-specific header file that's not included by
pyport.h, but which *is* included by nismodule.c.

I think you did the right thing to nismodule.c; the pyport.h patch is
redundant in my eyes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Mon Jan 22 16:12:49 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 16:12:49 +0100
Subject: [Python-Dev] _tkinter and setup.py
References: <3A68B6BD.BAD038D6@lemburg.com> <3A6C10E1.EF890356@lemburg.com> <20010122133704.O17392@xs4all.nl> <3A6C45D5.9A6FA25C@lemburg.com> <20010122154342.B17295@xs4all.nl>
Message-ID: <3A6C4DF1.F71AA631@lemburg.com>

Thomas Wouters wrote:
> 
> On Mon, Jan 22, 2001 at 03:38:14PM +0100, M.-A. Lemburg wrote:
> 
> > > I think Andrew is still working on a way to disable modules from the command
> > > line somehow. (I think moving setup.py to setup.py.in, and using autoconf
> > > --options would be easiest on both developer and user, but that's just me.)
> 
> > This is fairly simple to do: distutils allows great flexibility
> > when it comes to adding user options, e.g. we could have
> >
> > python setup.py --enable-tkinter --disable-readline
> >
> > or more generic
> >
> > python setup.py --enable-package tkinter --disable-package readline
> >
> > The options could then be edited in setup.cfg.
> 
> Note that the 'user' only has 'configure' and 'make' to run, so optimally,
> the options would have to be given to one of those (preferably to
> 'configure', to keep it similar to 90% of the packages out there.)

Hmm, but then you'll have to hack autoconf again... (even if only
to pass the options to setup.py somehow, e.g. via your proposed
setup.cfg.in trick).
 
> > but currently the only way to get Python to compile is
> > to hand-edit setup.py and this is not easy for people with no
> > prior distutils experience.
> 
> You only have to edit the 'disabled_module_list' variable... not too hard
> even if you don't have distutils experience (though you do need some python
> experience.) I don't think it's wrong to expect people who compile alpha
> versions to have at least that much knowledge (though it should be noted in
> the README somewhere.)

Oops, you're right; must have overlooked that one in setup.py.


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From thomas at xs4all.net  Mon Jan 22 16:14:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:14:02 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.24,2.25
In-Reply-To: <200101221506.KAA29773@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:06:06AM -0500
References: <E14KTzy-0002Xt-00@usw-pr-cvs1.sourceforge.net> <200101221506.KAA29773@cj20424-a.reston1.va.home.com>
Message-ID: <20010122161402.D17295@xs4all.nl>

On Mon, Jan 22, 2001 at 10:06:06AM -0500, Guido van Rossum wrote:
> > Move declaration of 'clnt_create()' NIS function to pyport.h, as it's
> > supposed to be declared in system include files (with a proper prototype.)
> > Should be moved to a platform-specific block if anyone finds out which
> > broken platforms need it :-)
> 
> [The following is inside #if 0]
> > + /* From Modules/nismodule.c */
> > + CLIENT *clnt_create();
> > + 
> 
> Thomas, I'm not sure if this particular declaration belongs in
> pyport.h, even inside #if 0.
> 
> CLIENT is declared in a NIS-specific header file that's not included by
> pyport.h, but which *is* included by nismodule.c.
> 
> I think you did the right thing to nismodule.c; the pyport.h patch is
> redundant in my eyes.

The same goes for most prototypes inside that '#if 0'. I see it more as an
easy list to see what prototypes were removed than as proper examples of the
prototype. You're right about CLIENT being defined in system-specific
include files, I just wasn't worried about it because it was inside an '#if 0'
that will never be turned into an '#if 1'. If a specific platform needs that
prototype, we'll figure out how to arrange the prototype then :)

But if you want me to remove it, that's fine.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 22 16:22:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:22:29 -0500
Subject: [Python-Dev] autoconf --enable vs. --with
In-Reply-To: Your message of "Mon, 22 Jan 2001 03:01:27 EST."
             <20010122030127.C20804@thyrsus.com> 
References: <200101212122.WAA16371@pandora.informatik.hu-berlin.de>  
            <20010122030127.C20804@thyrsus.com> 
Message-ID: <200101221522.KAA30287@cj20424-a.reston1.va.home.com>

> I've been working a bit on the build process lately.  I came
> across this in the autoconf documentation:
> 
> 
>     If a software package has optional compile-time features, the
>     user can give `configure' command line options to specify
>     whether to compile them. The options have one of these forms:
> 
>         --enable-FEATURE[=ARG]
>         --disable-FEATURE
> 
>     Some packages require, or can optionally use, other software
>     packages which are already installed.  The user can give
>     `configure' command line options to specify which such
>     external software to use.  The options have one of these
>     forms:
> 
>         --with-package[=ARG]
>         --without-package
> 
> 
> Is it worth fixing the Python configure script to comply with
> these definitions?  It looks like with-cycle-gc and maybe
> with-pydebug would have to be changed.

OK, but please add explicit checks for the old --with[out]-cycle-gc
and --with[out]-pydebug flags that cause errors (not just warnings)
when these forms are used.  It's bad enough that configure doesn't
flag typos in such options as errors; if we change the option names,
we really owe users who were using the old forms a clear error.

(Is this stupid autoconf behavior changeable?  Does it also apply to
enable/disable?)

--Guido van Rossum (home page: http://www.python.org/~guido/)




From fdrake at acm.org  Mon Jan 22 16:19:49 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 10:19:49 -0500 (EST)
Subject: [Python-Dev] RE: test_sax failing (Windows)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
References: <200101212200.XAA16672@pandora.informatik.hu-berlin.de>
	<LNBBLJKPBEHFEDALKOLCKEEDIKAA.tim.one@home.com>
Message-ID: <14956.20373.104748.573294@cj42289-a.reston1.va.home.com>

[Martin, on ftp://ftp.jclark.com/pub/xml/expat.zip]
 > Anyway, I was using 1.1 in my own tests, and 1.2 in PyXML - either
 > works for me. I never tested 1.95.x (which is also not available from
 > jclark.com).

Tim Peters writes:
 > If you do and love it, let me know where to get it and I'll ship that
 > instead.

  I'll recommend not updating to 1.95.1; let's wait at least until
1.95.2 is out.  These are really just pre-2.0 releases to shake things
out.  I have been using the current Expat CVS lightly, but need to do
more testing before I can be confident in it and our bindings (not
yet checked in anywhere; should be in PyXML soon).


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From jeremy at alum.mit.edu  Mon Jan 22 16:44:41 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:44:41 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAEFDIKAA.tim.one@home.com>
Message-ID: <14956.21865.943601.735426@localhost.localdomain>

On Linux, I am also seeing test_cpickle failures.  I have not been
able to reproduce failures in test_extcall or test_sax.

I ran 'regrtest.py -r -x test_thread test_unicodedata test_signal
test_select test_poll' 10 times and test_cpickle failed five times.
(I did the peculiar run because excluding those five tests shaves two
minutes off the running time of the test suite.)

No more time to look into this...

Jeremy



From jeremy at alum.mit.edu  Mon Jan 22 16:26:27 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 10:26:27 -0500 (EST)
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <14956.20771.447958.389724@localhost.localdomain>

The pyexpat module uses functions named getcode() and
call_with_frame() for handlers of some sort.  I can make this much out
from the code, but the rest is a bit of a mystery.  I was trying to
read this code because of the errors Tim is seeing with test_sax on
Windows.  A few comments to explain this highly stylized and
macro-laden code would be appreciated.

The module appears to be creating empty code objects and calling
them.  I say they appear to be empty, because when they are created 
they don't appear to have anything initialized except name, filename,
and firstlineno.

    getcode(EndNamespaceDecl, 419)
    <code at 0x81b73c0
        co_name = 'EndNamespaceDecl'
        co_filename = 'pyexpat.c'
        co_firstlineno = 419
        co_argcount = 0
        co_nlocals = 0
        co_stacksize = 0
        co_flags = 0
        co_consts = ()
        co_names = ()
        co_varnames = ()
        co_freevars = ()
        co_cellvars = ()
        co_code = ''
    >

(The freevars and cellvars entries are part of the support for nested
scopes.  They can be safely ignored for the moment.) 

I simply don't understand what's going on -- and I'm deeply suspicious
that it is the source of whatever problems Tim is seeing with
test_sax.

Jeremy



From thomas at xs4all.net  Mon Jan 22 16:55:35 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 16:55:35 +0100
Subject: [Python-Dev] 'make distclean' broken.
Message-ID: <20010122165535.P17392@xs4all.nl>

'make distclean' seems broken, at least with non-GNU makes:

[snip]
clobbering subdirectory Modules
rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
rm -f add2lib hassignal
rm -f *.a tags TAGS config.c Makefile.pre
rm -f *.so *.sl so_locations
make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
"./Makefile.in", line 134: Need an operator
make: fatal errors encountered -- cannot continue
*** Error code 1 (ignored)
rm -f config.status config.log config.cache config.h Makefile
rm -f buildno platform
rm -f Modules/Makefile
[snip]

(This is using FreeBSD's 'make'.)

Looking at line 134, I'm not sure why it works with GNU make other than that
it avoids complaining about syntax errors it doesn't run into (which could
be both bad and good :) or that it avoids complaining about obvious GNU
autoconf tricks. But I don't know enough about make to say for sure, nor to
fix the above problem.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 22 16:55:42 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 10:55:42 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 17:28:45 EST."
             <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEEBIKAA.tim.one@home.com> 
Message-ID: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>

> Your faith in gcc is as charming as it is naive <wink>:  the most
> interesting cases of undefined behavior can't be checked no-way, no-how at
> compile-time.  That's why Barry keeps talking employers into dumping
> thousands of dollars into a single Insure++ license.  Insure++ actually tags
> every pointer at runtime with its source, and gripes if non-equality
> comparisons are done on a pair not derived from the same array or malloc
> etc.  Since Python type objects are individually allocated (not taken from a
> preallocated contiguous vector), Insure++ should complain about that
> compare.

IMHO, *this* *particular* gripe of Insure++ is just a pain in the
butt, and I wish there was a way to turn it off in Insure++ without
having to fix the code.

IMHO, this was included in the standard to allow segmented-memory
implementations of C.  Think certain DOS or Windows 3.1 memory models
where a pointer is a segment plus an offset.  This is not current
practice even on Palmpilots!

The standard may say that such comparisons are undefined, but I don't
care about this particular undefinedness, and I'm annoyed by the
required patches.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 22 17:02:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 11:02:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: Your message of "Sun, 21 Jan 2001 14:44:38 EST."
             <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> 
Message-ID: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>

> > My only concern is that under the old scheme, two different numeric
> > extension types that somehow can't be compared will end up being
> > *equal*.  To fix this, I propose that if the names compare equal, as a
> > last resort we compare the type pointers -- this should be consistent
> > too.
> 
> Agreed, and sounds fine!

Checked in now.

While fixing the test_b1 code again, which depends on this behavior, I
thought of a refinement: it wouldn't be hard to make None compare
smaller than *anything* (including numbers).

Is this worth it?

diff -c -r2.113 object.c
*** object.c	2001/01/22 15:59:32	2.113
--- object.c	2001/01/22 16:03:38
***************
*** 550,555 ****
--- 550,561 ----
  		PyErr_Clear();
  	}
  
+ 	/* None is smaller than anything */
+ 	if (v == Py_None)
+ 		return -1;
+ 	if (w == Py_None)
+ 		return 1;
+ 
  	/* different type: compare type names */
  	if (v->ob_type->tp_as_number)
  		vname = "";


--Guido van Rossum (home page: http://www.python.org/~guido/)



From mwh21 at cam.ac.uk  Mon Jan 22 17:12:47 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: Mon, 22 Jan 2001 16:12:47 +0000 (GMT)
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.21865.943601.735426@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>

On Mon, 22 Jan 2001, Jeremy Hylton wrote:

> On Linux, I am also seeing test_cpickle failures.  I have not been
> able to reproduce failures in test_extcall or test_sax.

Hmm - my machine's done 28 exemplary "make clean; make test" runs this
morning.  I last updated yesterday afternoon my time (~1700 GMT).

Of course, I don't build pyexpat...

> No more time to look into this...

Don't you just love memory corruption bugs?

Cheers,
M.




From akuchlin at mems-exchange.org  Mon Jan 22 17:28:59 2001
From: akuchlin at mems-exchange.org (Andrew Kuchling)
Date: Mon, 22 Jan 2001 11:28:59 -0500
Subject: [Python-Dev] Python 2.1 article
Message-ID: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>

I've put together an almost-complete first draft of a "What's New in
2.1" article.  The only missing piece is a section on the Nested
Scopes PEP, which obviously has to wait for the changes to get checked
in.  http://www.amk.ca/python/2.1/ ; as usual, nitpicking comments are
welcomed.

--amk




From nas at arctrix.com  Mon Jan 22 11:00:43 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:00:43 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>; from mwh21@cam.ac.uk on Mon, Jan 22, 2001 at 04:12:47PM +0000
References: <14956.21865.943601.735426@localhost.localdomain> <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <20010122020043.A25687@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:12:47PM +0000, Michael Hudson wrote:
> Don't you just love memory corruption bugs?

Great fun.

I've played around with efence and debauch on the weekend.  I
even went as far as merging an updated fmalloc from the XFree
source tree into debauch and writing a reporting script in
Python.

I probably would have caught the pyexpat overrun if I would have
used efence with EF_ALIGNMENT=0 and compiled with -fpack-struct.
I'll have to try it tonight.  Maybe something else will turn up.

  Neil



From guido at digicool.com  Mon Jan 22 18:12:29 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 12:12:29 -0500
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:55:35 +0100."
             <20010122165535.P17392@xs4all.nl> 
References: <20010122165535.P17392@xs4all.nl> 
Message-ID: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>

> 'make distclean' seems broken, at least on non-GNU make's:
> 
> [snip]
> clobbering subdirectory Modules
> rm -f *.o python core *~ [@,#]* *.old *.orig *.rej
> rm -f add2lib hassignal
> rm -f *.a tags TAGS config.c Makefile.pre
> rm -f *.so *.sl so_locations
> make -f ./Makefile.in  SUBDIRS="Include Lib Misc Demo" clobber
> "./Makefile.in", line 134: Need an operator
> make: fatal errors encountered -- cannot continue
> *** Error code 1 (ignored)
> rm -f config.status config.log config.cache config.h Makefile
> rm -f buildno platform
> rm -f Modules/Makefile
> [snip]
> 
> (This is using FreeBSD's 'make'.)
> 
> Looking at line 134, I'm not sure why it works with GNU make other than that
> it avoids complaining about syntax errors it doesn't run into (which could
> be both bad and good :) or that it avoids complaining about obvious GNU
> autoconf tricks. But I don't know enough about make to say for sure, nor to
> fix the above problem.

There's one line in Makefile.in that trips over Make (mine also
complains about it):

    @SET_DLLLIBRARY@

Looking at the code in configure.in that generates this macro:

    AC_SUBST(SET_DLLLIBRARY)
    LDLIBRARY=''
    SET_DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  SET_DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

I don't see why we couldn't change this so that Makefile.in just
contains

    DLLLIBRARY=		@DLLLIBRARY@

and then configure.in could be changed to

    AC_SUBST(DLLLIBRARY)
    LDLIBRARY=''
    DLLLIBRARY=''
       .
       . (and later)
       .
    cygwin*)
	  LDLIBRARY='libpython$(VERSION).dll.a'
	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
	  ;;

Or am I missing something?

Does this fix the problem?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Mon Jan 22 18:21:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:21:09 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 11:02:15AM -0500
References: <LNBBLJKPBEHFEDALKOLCAEDJIKAA.tim.one@home.com> <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <20010122122109.A14952@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
> 
> Is this worth it?

I think so, if only for the sake of well-definedness.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"They that can give up essential liberty to obtain a little temporary 
safety deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.



From thomas at xs4all.net  Mon Jan 22 18:25:30 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 18:25:30 +0100
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122182530.E17295@xs4all.nl>

On Mon, Jan 22, 2001 at 12:12:29PM -0500, Guido van Rossum wrote:

> and then configure.in could be changed to

>     AC_SUBST(DLLLIBRARY)
>     LDLIBRARY=''
>     DLLLIBRARY=''
>        .
>        . (and later)
>        .
>     cygwin*)
> 	  LDLIBRARY='libpython$(VERSION).dll.a'
> 	  DLLLIBRARY='DLLLIBRARY=	$(basename $(LDLIBRARY))'
> 	  ;;

You mean 
 	  DLLLIBRARY='$(basename $(LDLIBRARY))'

But yes, that fixes it.
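
A rough sketch of why the "FOO = @FOO@" form is friendlier, assuming the
"Need an operator" error comes from make reading the unsubstituted
Makefile.in directly (as in the "make -f ./Makefile.in" line of the log).
Both helper functions are hypothetical: substitute() only imitates what
configure does with AC_SUBST variables, and looks_like_assignment() is a
crude stand-in for make's parser, not a description of any real make.

```python
import re


def substitute(template, values):
    """Replace each @NAME@ placeholder with values[NAME] (empty if unset)."""
    return re.sub(r"@(\w+)@", lambda m: values.get(m.group(1), ""), template)


def looks_like_assignment(line):
    """Crude check: does the line start with 'NAME =' ?"""
    return re.match(r"\s*[A-Za-z_][A-Za-z0-9_]*\s*=", line) is not None


old_line = "@SET_DLLLIBRARY@"            # current Makefile.in
new_line = "DLLLIBRARY=\t@DLLLIBRARY@"   # proposed Makefile.in

# Before substitution, only the new form parses as an assignment.
raw_old_ok = looks_like_assignment(old_line)
raw_new_ok = looks_like_assignment(new_line)

# Values configure would supply on cygwin, using the corrected setting.
cygwin = {
    "SET_DLLLIBRARY": "DLLLIBRARY=\t$(basename $(LDLIBRARY))",
    "DLLLIBRARY": "$(basename $(LDLIBRARY))",
}
generated_old = substitute(old_line, cygwin)
generated_new = substitute(new_line, cygwin)
print(raw_old_ok, raw_new_ok, generated_old == generated_new)
```

After substitution both forms yield the same Makefile line, so the change
affects only how the template reads before configure has run.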

> Or am I missing something?

Well, on *that* I'm not sure, that's why I asked :P If things in the Python
source boggle me, they are always there for a good reason. Well, maybe just
'almost always', but practically always :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From nas at arctrix.com  Mon Jan 22 11:39:59 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 02:39:59 -0800
Subject: [Python-Dev] 'make distclean' broken.
In-Reply-To: <200101221712.MAA00694@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 12:12:29PM -0500
References: <20010122165535.P17392@xs4all.nl> <200101221712.MAA00694@cj20424-a.reston1.va.home.com>
Message-ID: <20010122023959.A25798@glacier.fnational.com>

[Guido on change SET_DLLLIBRARY]
> Or am I missing something?

I don't think so.  My new Makefile uses "FOO = @FOO@" everywhere.
SET_CXX is the same way in the current Makefile.

  Neil



From esr at thyrsus.com  Mon Jan 22 18:41:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 12:41:59 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
Message-ID: <20010122124159.A14999@thyrsus.com>

\section{\module{set} ---
         Basic set algebra for Python}

\declaremodule{standard}{set}
\modulesynopsis{Basic set algebra operations on sequences.}
\moduleauthor{Eric S. Raymond}{esr at thyrsus.com}
\sectionauthor{Eric S. Raymond}{esr at thyrsus.com}

The \module{set} module defines functions for treating lists and other
sequences as mathematical sets, and defines a set class that uses
these operations natively and overloads Python's standard operator set.

The \module{set} functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
Set or sequence elements may be of any type and may be mutable.
Comparisons and membership tests of elements against sequence objects
are done using \keyword{in}, and so can be customized by supplying a 
suitable \method{__getitem__} method for the sequence type.

The running time of these functions is O(n**2) in the worst case
unless otherwise noted.  For cases that can be short-circuited by 
cardinality comparisons, this has been done.

\begin{funcdesc}{setify}{list1}
Returns a list of the argument sequence's elements with duplicates removed.
\end{funcdesc}

\begin{funcdesc}{union}{list1, list2}
Set union.  All elements of both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{intersection}{list1, list2}
Set intersection.  All elements common to both sets or sequences are returned.
\end{funcdesc}

\begin{funcdesc}{difference}{list1, list2}
Set difference.  All elements of the first set or sequence not present
in the second are returned.
\end{funcdesc}

\begin{funcdesc}{symmetric_difference}{list1, list2}
Set symmetric difference.  All elements present in one sequence or the other
but not in both are returned.
\end{funcdesc}

\begin{funcdesc}{cartesian}{list1, list2}
Returns a list of tuples consisting of all possible pairs of elements
from the first and second sequences or sets.
\end{funcdesc}

\begin{funcdesc}{equality}{list1, list2}
Set comparison.  Return 1 if the two sets or sequences contain exactly
the same elements, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{subset}{list1, list2}
Set subset test.  Return 1 if all elements of the first set or
sequence are members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{proper_subset}{list1, list2}
Set subset test, excluding equality.  Return 1 if the arguments fail a
set equality test, and all elements of the first set or sequence are
members of the second, 0 otherwise.
\end{funcdesc}

\begin{funcdesc}{powerset}{list1}
Return the set of all subsets of the argument set or
sequence. Warning: this produces huge results from small arguments and
is O(2**n) in both running time and space requirements; you can
readily run yourself out of memory using it.
\end{funcdesc}

\subsection{set Objects \label{set-objects}}

A \class{set} instance uses the \module{set} module functions to
implement set semantics on the list it contains, and to support 
a full set of Python list methods and operators.  Thus, the set
methods can take a set or any sequence type as an argument.  

A set object contains a single data member:

\begin{memberdesc}{elements}
List containing the elements of the set.  
\end{memberdesc}

Set objects can be treated as mutable sequences; they support the
special methods 
\method{__len__}, 
\method{__getitem__},
\method{__setitem__}, 
and \method{__delitem__}.  
Through
\method{__getitem__}, they support the membership test via
\keyword{in}. All the standard mutable-sequence methods
\method{list}, 
\method{append}, 
\method{extend}, 
\method{count}, 
\method{index}, 
\method{insert} (the index argument is ignored), 
\method{pop}, 
\method{remove}, 
\method{reverse}, 
and \method{sort}
are also supported.  After method calls that add elements
(\method{__setitem__},
\method{append}, \method{extend}, \method{insert}), the
elements of the data member are re-setified, so it is not possible to
introduce duplicates.

Calling \function{repr()} on a set returns the result of calling
\function{repr} on its element list.  Calling \function{str()} returns
a representation resembling mathematical notation for the set: an
open set bracket, followed by a comma-separated list of \function{str()}
representations of the elements, followed by a close set bracket.

Set objects support the following Python operators:

\begin {tableiii}{l|l|l}{code}{Operator}{Function}{Description}
\lineiii{|,+}{union}{Union}
\lineiii{\&}{intersection}{Intersection}
\lineiii{-}{difference}{Difference}
\lineiii{\^{}}{symmetric_difference}{Symmetric difference}
\lineiii{*}{cartesian}{Cartesian product}
\lineiii{==}{equality}{Equality test}
\lineiii{!=,<>}{}{Inequality test}
\lineiii{<}{proper_subset}{Proper-subset test}
\lineiii{<=}{subset}{Subset test}
\lineiii{>}{}{Proper superset test}
\lineiii{>=}{}{Superset test}
\end {tableiii}
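
A minimal sketch of how list-based set functions like those described
above could behave, assuming membership is tested with "in" on plain
lists. That assumption is what makes the worst case O(n**2) and lets
elements be mutable (unhashable). This is an illustration only, not the
module's actual implementation, and it returns True/False where the text
above says 1/0.

```python
def setify(seq):
    """Return a list of seq's elements with duplicates removed, order kept."""
    result = []
    for item in seq:
        if item not in result:  # linear scan, hence O(n**2) overall
            result.append(item)
    return result


def union(a, b):
    """All elements of either sequence."""
    return setify(list(a) + list(b))


def intersection(a, b):
    """Elements common to both sequences."""
    return [x for x in setify(a) if x in b]


def difference(a, b):
    """Elements of the first sequence that are not in the second."""
    return [x for x in setify(a) if x not in b]


def symmetric_difference(a, b):
    """Elements present in exactly one of the two sequences."""
    return difference(a, b) + difference(b, a)


def subset(a, b):
    """True if every element of a is also in b."""
    return all(x in b for x in a)


def equality(a, b):
    """True if both sequences contain exactly the same elements."""
    return subset(a, b) and subset(b, a)


# Lists are unhashable, yet they work as elements here.
print(union([1, [2], 1], ([2], 3)))
print(symmetric_difference([1, 2], [2, 3]))
```

Because nothing is hashed, the functions accept any sequence whose
elements support ==, at the cost of quadratic running time.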

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 



From esr at snark.thyrsus.com  Mon Jan 22 19:28:57 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 13:28:57 -0500
Subject: [Python-Dev] I still can't build HTML in a current CVS tree.
Message-ID: <200101221828.f0MISvH15121@snark.thyrsus.com>

Fred, I still can't build HTML documentation in a current CVS tree -- same
complaint about lib/modindex.html being absent.  Can we get this fixed
before 2.1 ships?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

...Virtually never are murderers the ordinary, law-abiding people
against whom gun bans are aimed.  Almost without exception, murderers
are extreme aberrants with lifelong histories of crime, substance
abuse, psychopathology, mental retardation and/or irrational violence
against those around them, as well as other hazardous behavior, e.g.,
automobile and gun accidents."
        -- Don B. Kates, writing on statistical patterns in gun crime



From fredrik at effbot.org  Mon Jan 22 19:33:56 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Mon, 22 Jan 2001 19:33:56 +0100
Subject: [Python-Dev] Python 2.1 article
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>
Message-ID: <059b01c084a1$e431e490$e46940d5@hagrid>

> I've put together an almost-complete first draft of a "What's New in
> 2.1" article.  The only missing piece is a section on the Nested
> Scopes PEP, which obviously has to wait for the changes to get checked
> in.

what's the current 2.1a1 eta?  (pep 226 still
says last friday)

today?  wednesday?  this week?  this month?

Curious /F




From mal at lemburg.com  Mon Jan 22 19:33:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 22 Jan 2001 19:33:24 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <20010122124159.A14999@thyrsus.com>
Message-ID: <3A6C7CF4.F10AA77B@lemburg.com>

[LaTeX file]

Eric, we are all hackers, but plain LaTeX is not really the right
format for a posting to a mailing list... at least not if
you really expect feedback ;-)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From martin at mira.cs.tu-berlin.de  Mon Jan 22 19:36:16 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 22 Jan 2001 19:36:16 +0100
Subject: [Python-Dev] getcode() function in pyexpat.c
Message-ID: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>

> A few comments to explain this highly stylized and macro-laden code
> would be appreciated.

I probably can't do that before 2.1a1, but I promise to suggest
something right afterwards.

In general, the macro magic is designed to make the many expat
callbacks available to Python. RC_HANDLER (for return code) is the
most general template; VOID_HANDLER and INT_HANDLER are common
specializations. In the core of RC_HANDLER, a tuple is built and
a Python function is called.

The code used to do PyEval_CallObject right inside the macro; the
call_with_frame feature is new compared to 2.0. It solves the specific
problem of incomprehensible tracebacks.

In a typical SAX application, the user code calls
expatreader.ExpatParser.parse, which in turn calls 

            self._parser.Parse(data, isFinal)

Now, in 2.0, a common problem was a traceback

            self._parser.Parse(data, isFinal)
TypeError: not enough arguments; expected 4, got 2

Everybody assumes a problem in the call to Parse; the real problem is
in the call to the callback inside RC_HANDLER, which tried to call a
user's function with two arguments that expected four.

2.1 would improve this slightly on its own, writing

            self._parser.Parse(data, isFinal)
TypeError: characters() takes exactly 4 arguments (2 given)

With that code, you get

  File "/usr/local/lib/python2.1/xml/sax/expatreader.py", line 81, in feed
    self._parser.Parse(data, isFinal)
  File "pyexpat.c", line 379, in CharacterData
TypeError: characters() takes exactly 4 arguments (2 given)

So that tells you that it is the CharacterData handler that invokes
characters(). You are right that the frame object is not used
otherwise; it is just there to make a nice traceback.
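
A pure-Python analogue of the effect described above, as a sketch only.
The three functions are hypothetical stand-ins: parse() for the Parse
call, character_data() for the C-level CharacterData handler, and
characters() for the user's callback. In Python code the intermediate
frame exists automatically; in pyexpat.c it has to be created by hand,
which is what call_with_frame is for.

```python
import traceback


def characters(a, b, c, d):
    """User handler that (wrongly) expects four arguments."""


def character_data(handler, data, is_final):
    # Stand-in for the C-level handler: this call, not the caller's
    # parse(), is the one that passes the wrong number of arguments.
    handler(data, is_final)


def parse(handler, data, is_final):
    character_data(handler, data, is_final)


try:
    parse(characters, "<doc/>", True)
except TypeError as exc:
    # Names of the functions on the traceback, outermost first.
    frames = [f.name for f in traceback.extract_tb(exc.__traceback__)]
    message = str(exc)

print(frames)
print(message)
```

The last two names on the traceback are parse and character_data, so the
reader can see that the failing call happened inside the handler and not
in the call to parse itself.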

> I simply don't understand what's going on -- and I'm deeply
> suspicious that it is the source of whatever problems Tim is seeing
> with test_sax.

I thought so, too, at first; it turned out that the problem was
elsewhere.

Regards,
Martin



From guido at digicool.com  Mon Jan 22 20:04:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:04:02 -0500
Subject: [Python-Dev] Python 2.1 article
In-Reply-To: Your message of "Mon, 22 Jan 2001 19:33:56 +0100."
             <059b01c084a1$e431e490$e46940d5@hagrid> 
References: <E14Kjpz-0000cu-00@ute.cnri.reston.va.us>  
            <059b01c084a1$e431e490$e46940d5@hagrid> 
Message-ID: <200101221904.OAA01170@cj20424-a.reston1.va.home.com>

> what's the current 2.1a1 eta?  (pep 226 still
> says last friday)

You missed my email that I sent out Friday.  Tentatively it's going
out tonight.  No point in updating the PEP each time there's slippage.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 22 20:10:54 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 14:10:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 12:41:59 EST."
             <20010122124159.A14999@thyrsus.com> 
References: <20010122124159.A14999@thyrsus.com> 
Message-ID: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>

Eric,

There's already a PEP on a set object type, and everybody and their
aunt has already implemented a set datatype.

If *your* set module is ready for prime time, why not publish it in
the Vaults of Parnassus?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Mon Jan 22 20:29:18 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 14:29:18 -0500 (EST)
Subject: [Python-Dev] Re: getcode() function in pyexpat.c
In-Reply-To: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
References: <200101221836.f0MIaGL00923@mira.informatik.hu-berlin.de>
Message-ID: <14956.35342.724657.865367@localhost.localdomain>

>>>>> "MvL" == Martin v Loewis <martin at mira.cs.tu-berlin.de> writes:

  >> I simply don't understand what's going on -- and I'm deeply
  >> suspicious that it is the source of whatever problems Tim is
  >> seeing with test_sax.

  MvL> I thought so, too, at first; it turned out that the problem was
  MvL> elsewhere.

What was the cause of that problem?  I didn't see any mail after Tim's
middle-of-the-night message "Worse news."

Jeremy




From tim.one at home.com  Mon Jan 22 21:01:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:01:59 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221602.LAA31103@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEGIIKAA.tim.one@home.com>

[Guido]
> ...
> While fixing the test_b1 code again, which depends on this behavior, I
> thought of a refinement: it wouldn't be hard to make None compare
> smaller than *anything* (including numbers).
>
> Is this worth it?

First, an attempt to see what Python did in this morning's CVS turned up an
internal error for Jeremy:

>>> [None < x for x in (1, 1L, 1j, 1.0, [1], {}, (1,))]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global

abnormal program termination

A simpler way to provoke that:

>>> [None < 2 for x in "x"]
name: None, in ?, file '<stdin>', line 1
locals: {'[1]': 0, 'x': 1}
globals: {}
Fatal Python error: compiler did not label name as local or global


Anyway, I think forcing None to be "the smallest" is cute!  Inexpensive to
do, and while I don't see a compelling *use* for it, I bet it would be least
surprising to newbies.  +1.




From fdrake at acm.org  Mon Jan 22 21:08:54 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Mon, 22 Jan 2001 15:08:54 -0500 (EST)
Subject: [Python-Dev] Re: I still can't build HTML in a current CVS tree.
In-Reply-To: <200101221828.f0MISvH15121@snark.thyrsus.com>
References: <200101221828.f0MISvH15121@snark.thyrsus.com>
Message-ID: <14956.37718.968912.189834@cj42289-a.reston1.va.home.com>

Eric S. Raymond writes:
 > Fred, I still can't build HTML documentation in a current CVS tree -- same
 > complaint about lib/modindex.html being absent.  Can we get this fixed
 > before 2.1 ships?

  I'm guessing I've lost a previous email on the topic, or it's buried
in my inbox.  If this is still a problem after today's checkins, could
you please file a bug report and assign it to me?
  Thanks!


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Mon Jan 22 21:26:15 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:26:15 -0500
Subject: More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)
In-Reply-To: <200101221555.KAA30935@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEGJIKAA.tim.one@home.com>

[Guido]
> IMHO, *this* *particular* gripe of Insure++ is just a pain in the
> butt, and I wish there was a way to turn it off in Insure++ without
> having to fix the code.

Maybe there is.  Barry?

> IMHO, this was included in the standard to allow segmented-memory
> implementations of C.  Think certain DOS or Windows 3.1 memory models
> where a pointer is a segment plus an offset.  This is not current
> practice even on Palmpilots!

I could ask Tom MacDonald (former X3J11 chair), but don't want to bother
him.  The way these things usually turn out:  the committee debated it 100
times over 10 years, but some committee member steadfastly claimed it was
important.  Since ANSI/ISO committees work via consensus, one implacable
objector is enough.

WRT pointers, I know that while the C committee did worry about segmented
architectures a lot in the past, tagged architectures gave them much
thornier problems (the HW tags each "word" with some manner of metadata
(such as a busy/free or empty/full bit, or read+write permission bits, or a
data type identifier, or a "capability" tag tying into a HW-enforced
security architecture, ...), and checks those on each access, and some of
the metadata can propagate into a pointer, and the HW can raise faults on
pointer comparisons if the metadata doesn't match).  While such machines
aren't in common use, the US Govt does all sorts of things they don't talk
about -- if it's not IBM's representative protecting a 40-year old
architecture, it's someone emphatically not from the NSA <wink> protecting
something they're not at liberty to discuss.  Of course Python wants to run
there too, even if we never hear about it ...

> The standard may say that such comparisons are undefined, but I don't
> care about this particular undefinedness, and I'm annoyed by the
> required patches.

Ya, and I'm annoyed that MS stdio corrupts itself -- but they're just
clinging to the letter of the std too, and I've learned to live with it
gracefully <wink>.

pointer-ordering-comparisons-should-be-very-rare-anyway-ly y'rs  - tim




From tim.one at home.com  Mon Jan 22 21:55:30 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 15:55:30 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <Pine.LNX.4.10.10101221609430.24819-100000@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEGNIKAA.tim.one@home.com>

[Michael Hudson]
> Hmm - my machine's done 28 exemplary "make clean; make test" runs this
> morning.  I last updated yesterday afternoon my time (~1700 GMT).

So does mine now.  The remaining failures require *unusual* ways of running
the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
under Linux; and in an extremely specialized and seemingly Windows-specific
way to get test_extcall to blow up w/ a bad pointer).




From tim.one at home.com  Mon Jan 22 22:07:27 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:07:27 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <14956.16758.68050.257212@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>

[Jeremy Hylton]
> Funny (strange or haha?) that test_extcall is failing since the two
> pieces of code I've modified most recently are compile.c and the
> section of ceval.c that handles extended call syntax.

Ya, I knew that, but I avoided wagging a Finger of Shame in your direction
because coincidence isn't proof <wink>.

> ...
> As for the test_sax failure,

There is no test_sax failure anywhere anymore that I know of (Martin found a
dead-wrong array decl in contributed pyexpat.c code and repaired it).

And I believe my "rt -x test_sax" failure in test_extcall almost certainly
has nothing to do with test_sax -- far more likely the connection to
test_sax is an accident, and that if I spend umpteen hours trying other
things at random I'll provoke the same memory accident leading to a bad
pointer via excluding some other test.  I just picked test_sax because that
*was* broken and I wanted to get thru the rest of the tests.

BTW, delighted(?) to hear that test_cpickle fails for you too!  I'm sure
test_extcall is going to blow up for other people eventually too -- but it
is sooooo hard to provoke even for me.  I've dropped the effort pending news
from someone running Insure++ or efence or whatever.




From guido at digicool.com  Mon Jan 22 22:18:26 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 16:18:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:07:27 EST."
             <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> 
Message-ID: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>

[Tim]
> So does mine now.  The remaining failures require *unusual* ways of running
> the test suite (with -r to get test_cpickle to fail, confirmed now by Jeremy
> under Linux;
[and later]
> BTW, delighted(?) to hear that test_cpickle fails for you too!

This (test_cpickle) is a red herring -- it's a shallow failure in the
test suite.  test_cpickle imports test_pickle, but test_pickle first
outputs the test output from testing pickle -- unless test_pickle has
been run before!  This succeeds:

  ./python Lib/test/regrtest.py test_cpickle test_pickle

and this fails:

  ./python Lib/test/regrtest.py test_pickle test_cpickle

Use regrtest.py -v to find out why. :-)
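
A small sketch of the underlying mechanism, assuming the output in
question is produced by module-level code: a module body runs only on the
first import, so anything it prints appears once per process. The module
name fake_test_pickle is made up for this illustration and has nothing to
do with the real test files.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for test_pickle: its output is a side
    # effect of importing it.
    with open(os.path.join(tmp, "fake_test_pickle.py"), "w") as f:
        f.write("print('output from testing pickle')\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = io.StringIO()
        with contextlib.redirect_stdout(first):
            importlib.import_module("fake_test_pickle")  # body runs
        second = io.StringIO()
        with contextlib.redirect_stdout(second):
            importlib.import_module("fake_test_pickle")  # cached, body skipped
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("fake_test_pickle", None)

print(repr(first.getvalue()))
print(repr(second.getvalue()))
```

Whichever test imports the module second sees none of that output, which
is why the result depends on the order in which the tests run.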

I'm not sure how to restructure this, but it's not of the same quality
as test_extcall or test_sax failing.  Neither of those has failed for
me on Linux during hours of testing.  However on Windows I get an
occasional appfail dialog box when using rt.bat.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 22 15:44:00 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 06:44:00 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122064400.A26543@glacier.fnational.com>

On Mon, Jan 22, 2001 at 04:07:27PM -0500, Tim Peters wrote:
> I've dropped the effort pending news from someone running
> Insure++ or efence or whatever.

efence to the rescue!  I compiled with -fpack-struct and used
EF_ALIGNMENT=0 and now I can trigger a core dump by running
test_extcall.  More news coming...

  Neil



From tim.one at home.com  Mon Jan 22 22:41:08 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 16:41:08 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: <20010122145733.85E51373C95@snelboot.oratrix.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com>

[Jack Jansen]
> I'm not sure whether this is really a bug, but I had the problem
> that there  was something wrong with the xml package I had
> installed into my Lib/site-python, and this caused test_sax to
> complain.
>
> If the test stuff is expected to test only the core functionality
> maybe sys.path should be edited so that it only contains directories
> that are part of the core distribution?

AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
released.  The wisdom of that decision is debatable with hindsight, but
AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
3rd-party code to work, but part of the core all the same.  The Windows
installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
Mac Python package also should.




From nas at arctrix.com  Mon Jan 22 16:00:57 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 07:00:57 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 22, 2001 at 04:07:27PM -0500
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
Message-ID: <20010122070057.A26575@glacier.fnational.com>

Perhaps this will help someone track down the bug:

[running test_extcall...]
unbound method method() must be called with instance as first argument
unbound method method() must be called with instance as first argument

Program received signal SIGSEGV, Segmentation fault.
symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
4330                    if (TYPE(c) == DOUBLESTAR) {
(gdb) l
4325                            symtable_add_def(st, STR(CHILD(n, i)), 
4326                                             DEF_PARAM | DEF_STAR);
4327                            i += 2;
4328                            c = CHILD(n, i);
4329                    }
4330                    if (TYPE(c) == DOUBLESTAR) {
4331                            i++;
4332                            symtable_add_def(st, STR(CHILD(n, i)), 
4333                                             DEF_PARAM | DEF_DOUBLESTAR);
4334                    }
(gdb) p c
$3 = (node *) 0x42a43fff
(gdb) p *c
$4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
(gdb) p n
$5 = (node *) 0x42a3ffd7
(gdb) p *n
$6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
  n_child = 0x42a43fc3}
(gdb) bt 10
#0  symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
#1  0x8060126 in symtable_funcdef (st=0x429bafd0, n=0x42a23feb)
    at Python/compile.c:4245
#2  0x805fd29 in symtable_node (st=0x429bafd0, n=0x429b0fc3)
    at Python/compile.c:4128
#3  0x80600da in symtable_node (st=0x429bafd0, n=0x4290cfeb)
    at Python/compile.c:4232
#4  0x805f443 in symtable_build (c=0xbffff5c8, n=0x4290cfeb)
    at Python/compile.c:3816
#5  0x805f130 in jcompile (n=0x4290cfeb, filename=0x80a040f "<string>", 
    base=0x0) at Python/compile.c:3720
#6  0x805f0c2 in PyNode_Compile (n=0x4290cfeb, filename=0x80a040f "<string>")
    at Python/compile.c:3699
#7  0x8069adf in run_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:915
#8  0x8069ac0 in run_err_node (n=0x4290cfeb, filename=0x80a040f "<string>", 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:907
#9  0x8069a30 in PyRun_String (
    str=0x429f9fd1 "def zv(*v): print \"ok zv\", a, b, d, e, v, k", start=257, 
    globals=0x40644fe0, locals=0x40644fe0) at Python/pythonrun.c:881
(More stack frames follow...)




From thomas at xs4all.net  Mon Jan 22 23:13:29 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:13:29 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122070057.A26575@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 22, 2001 at 07:00:57AM -0800
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com>
Message-ID: <20010122231329.A27785@xs4all.nl>

On Mon, Jan 22, 2001 at 07:00:57AM -0800, Neil Schemenauer wrote:
> Perhaps this will help someone track down the bug:

> [running test_extcall...]
> unbound method method() must be called with instance as first argument
> unbound method method() must be called with instance as first argument
> 
> Program received signal SIGSEGV, Segmentation fault.
> symtable_params (st=0x429bafd0, n=0x42a3ffd7) at Python/compile.c:4330
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> (gdb) l
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);
> 4327                            i += 2;
> 4328                            c = CHILD(n, i);
> 4329                    }
> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4331                            i++;
> 4332                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4333                                             DEF_PARAM | DEF_DOUBLESTAR);
> 4334                    }

> (gdb) p c
> $3 = (node *) 0x42a43fff
> (gdb) p *c
> $4 = {n_type = 0, n_str = 0x0, n_lineno = 0, n_nchildren = 0, n_child = 0x0}
> (gdb) p n
> $5 = (node *) 0x42a3ffd7
> (gdb) p *n
> $6 = {n_type = 261, n_str = 0x0, n_lineno = 1, n_nchildren = 2, 
>   n_child = 0x42a43fc3}

n_child is 0x42a43fc3. That's n_child[0]. 0x42a43fff is the child being
handled now. That would be n_child[3] (0x42a43fff - 0x42a43fc3 == 60, and a
struct node is 20 bytes.) But n_nchildren is 2, so it's an off-by-two error
somewhere -- and look, there's an 'i += 2' right above it! It *looks* like
this code will blow up whenever you use '*eggs' without '**spam' in a
function definition. That's a fairly wild guess, but it's worth a try. Try
this patch:

Index: Python/compile.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Python/compile.c,v
retrieving revision 2.148
diff -c -c -r2.148 compile.c
*** Python/compile.c    2001/01/22 04:35:57     2.148
--- Python/compile.c    2001/01/22 22:12:31
***************
*** 4324,4329 ****
--- 4324,4331 ----
                        i++;
                        symtable_add_def(st, STR(CHILD(n, i)), 
                                         DEF_PARAM | DEF_STAR);
+                       if (NCH(n) <= i+2)
+                               return;
                        i += 2;
                        c = CHILD(n, i);
                }
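
A minimal sketch of the reasoning above, written in Python rather than C.
The addresses and the 20-byte node size are taken from the gdb output and
the message; the helper name params_after_star and the token lists are
hypothetical stand-ins for the parse-tree children, not code from
compile.c.

```python
NODE_SIZE = 20           # bytes per struct node, as stated in the message

n_child = 0x42a43fc3     # n->n_child, i.e. the address of n_child[0]
c = 0x42a43fff           # the child pointer that was being dereferenced
n_nchildren = 2          # from "p *n" in the gdb session

# Pointer difference divided by the element size gives the array index.
index = (c - n_child) // NODE_SIZE
print(index, index >= n_nchildren)


def params_after_star(children, i):
    """Model of the patched STAR branch; children[i] is the '*' token.

    Returns the name that follows '*' and the index where a '**' token
    would sit, or None when the node has no such child.
    """
    i += 1
    star_name = children[i]
    if len(children) <= i + 2:   # the guard added by the patch
        return star_name, None
    i += 2
    return star_name, i


print(params_after_star(["*", "v"], 0))
print(params_after_star(["*", "v", ",", "**", "k"], 0))
```

With only '*v' present the guard returns before index 3 is ever read, which
is the out-of-range access seen in the crash.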


-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Mon Jan 22 21:13:09 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 22 Jan 2001 15:13:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101221910.OAA01218@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 02:10:54PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>
Message-ID: <20010122151309.C15236@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> There's already a PEP on a set object type, and everybody and their
> aunt has already implemented a set datatype.

I've just read the PEP.  Greg's proposal has a couple of problems.
The biggest one is that the interface design isn't very Pythonic --
it's formally adequate, but doesn't exploit the extent to which sets
naturally have common semantics with existing Python sequence types.
This is bad; it means that a lot of code that could otherwise ignore
the difference between lists and sets would have to be specialized 
one way or the other for no good reason.

The only other set module I can find in the Vaults or anywhere else is
kjBuckets (which I knew about before).  Looks like a good design, but
complicated -- and requires installation of an extension.

> If *your* set module is ready for prime time, why not publish it in
> the Vaults of Parnassus?

I suppose that's what I'll do if you don't bless it for the standard
library.  But here are the reasons I suggest you should do so:

1. It supports a set of operations that are both often useful and
fiddly to get right, thus enhancing the "batteries are included"
effect.  (I used its ancestor for representing seen-message numbers in
a specialized mailreader, for example.)

2. It's simple for application programmers to use.  No extension module
to integrate.

3. It's unsurprising.  My set objects behave almost exactly like other
mutable sequences, with all the same built-in methods working, except for 
the fact that you can't introduce duplicates with the mutators.

4. It's already completely documented in a form suitable for the library.

5. It's simple enough not to cause you maintenance hassles down the
road, and even if it did the maintainer is unlikely to disappear :-).
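
To make point 3 concrete, here is a rough sketch of a set that behaves like
a mutable sequence. It is not the module discussed in this message: the
class name SeqSet is hypothetical, it leans on collections.abc from a much
later Python than the one under discussion, and the choice to ignore
duplicate inserts silently (and to reject duplicate item assignment) is an
assumption.

```python
from collections.abc import MutableSequence


class SeqSet(MutableSequence):
    """Hypothetical list-like set: ordered, indexable, no duplicates."""

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.append(item)    # append() is built on insert() below

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __delitem__(self, index):
        del self._items[index]

    def __setitem__(self, index, value):
        # Integer indexes only. Assumption: overwriting a slot with a
        # value that is already stored elsewhere is an error.
        if value in self._items and self._items[index] != value:
            raise ValueError("duplicate element: %r" % (value,))
        self._items[index] = value

    def insert(self, index, value):
        # Assumption: inserting an element already present is a no-op.
        if value not in self._items:
            self._items.insert(index, value)


s = SeqSet([1, 2, 2, 3])
s.append(2)        # ignored, 2 is already a member
s.extend([3, 4])   # only 4 is new
print(list(s))     # [1, 2, 3, 4]
```

Because append(), extend(), index(), remove() and friends come from the
MutableSequence mixin, code written against lists can use such an object
largely unchanged.
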
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The United States is in no way founded upon the Christian religion
	-- George Washington & John Adams, in a diplomatic message to Malta.



From guido at digicool.com  Mon Jan 22 23:29:26 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 17:29:26 -0500
Subject: [Python-Dev] test_sax and site-python
In-Reply-To: Your message of "Mon, 22 Jan 2001 16:41:08 EST."
             <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCGEHHIKAA.tim.one@home.com> 
Message-ID: <200101222229.RAA28667@cj20424-a.reston1.va.home.com>

> [Jack Jansen]
> > I'm not sure whether this is really a bug, but I had the problem
> > that there  was something wrong with the xml package I had
> > installed into my Lib/site-python, and this caused test_sax to
> > complain.
> >
> > If the test stuff is expected to test only the core functionality
> > maybe sys.path should be edited so that it only contains directories
> > that are part of the core distribution?
> 
[Tim]
> AFAIK, xml *is* considered part of the core now, and has been since 2.0 was
> released.  The wisdom of that decision is debatable with hindsight, but
> AFAICT xml is in the same boat as, say, zlib now:  not builtin, and requires
> 3rd-party code to work, but part of the core all the same.  The Windows
> installer comes w/ the necessary xml (and zlib) pieces, and I suppose the
> Mac Python package also should.

Yes, but Jack was talking about a non-std xml package in
site-python...  I agree that this shouldn't be picked up.  But is it
worth taking draconian measures to avoid this?
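
For illustration only, the "draconian" variant could look roughly like the
sketch below. The helper name core_only_path and the marker strings are
assumptions, not anything regrtest.py actually does.

```python
def core_only_path(path, markers=("site-packages", "site-python")):
    """Return the entries of path that do not look like add-on directories."""
    return [p for p in path if not any(m in p for m in markers)]


example = [
    "/usr/local/lib/python2.1",
    "/usr/local/lib/site-python",
    "/usr/local/lib/python2.1/site-packages",
]
print(core_only_path(example))
```

A test driver would assign the filtered list back to sys.path before
importing any test module.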

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Mon Jan 22 23:35:08 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 22 Jan 2001 17:35:08 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222118.QAA28305@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEHLIKAA.tim.one@home.com>

[Guido]
> This (test_cpickle) is a red herring -- it's a shallow failure in the
> test suite.

Fixed now -- thanks!

Please note that Neil got test_extcall to fail in exactly the same place
(see his recent Python-Dev mail).  That's the only remaining failure I know
of.

> ...
> However on Windows I get an occasional appfail dialog box when
> using rt.bat.

I don't believe I've ever seen one of those ("appfail" rings no bells), and
rt has never acted strangely for me.   Your DOS-box properties may be
screwed up:  use Start -> Find -> Files or Folders ...; set "Look in" to C:;
enter *.pif in the "Named:" box; click Find.  You'll probably get a dozen
hits.  One of them will correspond to the method you use to open a DOS box
(which I don't know).  Right-click on that one and select Properties.  On
the Memory tab of the dialog that pops up, the four dropdown lists should
have "Auto" selected.  "Uses HMA" should be checked.  Hmm ... looks like
"Protected" *should* be checked but mine isn't ... oh, this goes on and on.
I don't even know which version of Windows you're using here!  How about I
look at it next time I'm at your house ...




From greg at cosc.canterbury.ac.nz  Mon Jan 22 23:50:07 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 11:50:07 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
Message-ID: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>

> 4330                    if (TYPE(c) == DOUBLESTAR) {
> 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> 4326                                             DEF_PARAM | DEF_STAR);

Shouldn't line 4330 say if (TYPE(c) == STAR) ?

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From thomas at xs4all.net  Mon Jan 22 23:56:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 22 Jan 2001 23:56:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 11:50:07AM +1300
References: <20010122231329.A27785@xs4all.nl> <200101222250.LAA01929@s454.cosc.canterbury.ac.nz>
Message-ID: <20010122235602.B27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:50:07AM +1300, Greg Ewing wrote:
> > 4330                    if (TYPE(c) == DOUBLESTAR) {
> > 4325                            symtable_add_def(st, STR(CHILD(n, i)), 
> > 4326                                             DEF_PARAM | DEF_STAR);

> Shouldn't line 4330 say if (TYPE(c) == STAR) ?

No, that's line 4323. You can't have doublestar without having star, and
star should precede doublestar. (Grammar should enforce that.) 

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From paulp at ActiveState.com  Tue Jan 23 00:02:07 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Mon, 22 Jan 2001 15:02:07 -0800
Subject: [Python-Dev] pydoc - put it in the core
References: <14945.59192.400783.403810@beluga.mojam.com> <200101142055.PAA13041@cj20424-a.reston1.va.home.com>
Message-ID: <3A6CBBEF.4732BFF2@ActiveState.com>

Guido van Rossum wrote:
> 
> ....
>
> Yes, wow!
> 
> ....

I apologize but I'm not clear on my responsibilities here, if any. I
wrote a PEP for online help. I submitted a partial implementation. Ping
wrote a full implementation that basically supersedes mine. There are
various ideas for improving it, but I think that we agree that the core
is solid. Several people have said that it should be moved into the core
library. Nobody has said that it shouldn't. Whose move is it? What's
next?

 Paul Prescod



From fredrik at effbot.org  Tue Jan 23 00:08:40 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 00:08:40 +0100
Subject: [Python-Dev] test___all__ fails if bsddb not available
Message-ID: <079a01c084c8$43023e40$e46940d5@hagrid>

test___all__
test test___all__ failed -- dbhash has no __all__ attribute

maybe this test shouldn't depend on optional modules?

</F>




From nas at arctrix.com  Mon Jan 22 17:24:34 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 22 Jan 2001 08:24:34 -0800
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 22, 2001 at 11:13:29PM +0100
References: <14956.16758.68050.257212@localhost.localdomain> <LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com> <20010122070057.A26575@glacier.fnational.com> <20010122231329.A27785@xs4all.nl>
Message-ID: <20010122082433.B26765@glacier.fnational.com>

On Mon, Jan 22, 2001 at 11:13:29PM +0100, Thomas Wouters wrote:
> That's a fairly wild guess, but it's worth a try. Try this
> patch:
[...]

Works for me.

  Neil



From greg at cosc.canterbury.ac.nz  Tue Jan 23 00:21:14 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 12:21:14 +1300 (NZDT)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122235602.B27785@xs4all.nl>
Message-ID: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas at xs4all.net>:

> You can't have doublestar without having star

What?!? You could in 1.5.2. Has that changed?

Anyway, it just looked a bit odd that it seemed to be testing
for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
But I guess I should shut up until I've seen all of the code.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From thomas at xs4all.net  Tue Jan 23 00:26:02 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:26:02 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>; from greg@cosc.canterbury.ac.nz on Tue, Jan 23, 2001 at 12:21:14PM +1300
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz>
Message-ID: <20010123002602.C27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> Thomas Wouters <thomas at xs4all.net>:

> > You can't have doublestar without having star

> What?!? You could in 1.5.2. Has that changed?

Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
so ignore my delusions :)

> Anyway, it just looked a bit odd that it seemed to be testing
> for DOUBLESTAR and then adding a DEF_STAR thing to the symtab.
> But I guess I should shut up until I've seen all of the code.

No, it's not doing that. It's adding the symbol name to the symtab, with
DEF_DOUBLESTAR as one of its flags. Not sure what the flag does, but I could
guess. (But see the above mentioned delusions as to why I'm not doing that
out loud anymore :-) The 'if' in front of it adds the symbol to the symtab
with DEF_STAR as a flag, in the case of 'STAR' (rather than DOUBLESTAR).
Really. go check :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 23 00:31:03 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 23 Jan 2001 00:31:03 +0100
Subject: [Python-Dev] Worse news
In-Reply-To: <20010123002602.C27785@xs4all.nl>; from thomas@xs4all.net on Tue, Jan 23, 2001 at 12:26:02AM +0100
References: <20010122235602.B27785@xs4all.nl> <200101222321.MAA01957@s454.cosc.canterbury.ac.nz> <20010123002602.C27785@xs4all.nl>
Message-ID: <20010123003103.D27785@xs4all.nl>

On Tue, Jan 23, 2001 at 12:26:02AM +0100, Thomas Wouters wrote:
> On Tue, Jan 23, 2001 at 12:21:14PM +1300, Greg Ewing wrote:
> > Thomas Wouters <thomas at xs4all.net>:
> 
> > > You can't have doublestar without having star
> 
> > What?!? You could in 1.5.2. Has that changed?

> Sorry, my bad, I'm wrong. (I just tested this.) I could swear it was that
> way, but it's 0:25 right now, after a night with about 2 hours decent sleep,
> so ignore my delusions :)

Ah, yeah, what I meant to *think* was: you can't have *spam *after* **eggs:

>>> def foo(x, **kwarg, *arg)
  File "<stdin>", line 1
    def foo(x, **kwarg, *arg)
                      ^
SyntaxError: invalid syntax

So the logic of the latter part of the function seems okay (after the little
patch I posted before.) Jeremy should give his expert opinion before it goes
in, though, since it's his code :)
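
The ordering rule is easy to check without an interactive session. This is
a small sketch using the built-in compile(); the helper name compiles is
made up for the example.

```python
def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles("def foo(x, **kwarg): pass"))        # '**' alone is fine
print(compiles("def foo(x, *arg, **kwarg): pass"))  # '*' before '**' is fine
print(compiles("def foo(x, **kwarg, *arg): pass"))  # '*' after '**' is rejected
```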

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Tue Jan 23 00:36:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 18:36:17 -0500
Subject: [Python-Dev] test___all__ fails if bsddb not available
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:08:40 +0100."
             <079a01c084c8$43023e40$e46940d5@hagrid> 
References: <079a01c084c8$43023e40$e46940d5@hagrid> 
Message-ID: <200101222336.SAA30480@cj20424-a.reston1.va.home.com>

> test test___all__ failed -- dbhash has no __all__ attribute
> 
> maybe this test shouldn't depend on optional modules?

Fixed -- I just skip dbhash if bsddb can't be imported.
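
The general pattern, sketched with hypothetical helper names (this is not
the actual test___all__ code): only check a module when the optional
module it depends on can be imported.

```python
import importlib


def importable(name):
    """Return True if the module called name can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def modules_to_check(candidates):
    """candidates maps a module name to the optional module it needs, or None."""
    return [mod for mod, dep in candidates.items()
            if dep is None or importable(dep)]


# 'no_such_optional_module' stands in for a missing bsddb.
print(modules_to_check({"json": None, "dbhash": "no_such_optional_module"}))
```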

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Tue Jan 23 01:38:28 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 19:38:28 -0500 (EST)
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122231329.A27785@xs4all.nl>
References: <14956.16758.68050.257212@localhost.localdomain>
	<LNBBLJKPBEHFEDALKOLCAEHFIKAA.tim.one@home.com>
	<20010122070057.A26575@glacier.fnational.com>
	<20010122231329.A27785@xs4all.nl>
Message-ID: <14956.53892.651549.493268@localhost.localdomain>

Thomas,

Your patch has the right diagnosis, although I would write it a tad
differently.  NCH(n) <= i + 2 should be NCH(n) < i + 2, because
CHILD(n, NCH(n)) is not valid.

I'll check it in.

Jeremy



From jeremy at alum.mit.edu  Tue Jan 23 02:23:56 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:23:56 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments 
In-Reply-To: <20010119232323.70B03116392@oratrix.oratrix.nl>
References: <guido@digicool.com>
	<200101191634.LAA29239@cj20424-a.reston1.va.home.com>
	<20010119232323.70B03116392@oratrix.oratrix.nl>
Message-ID: <14956.56620.706531.647341@localhost.localdomain>

>>>>> "JJ" == Jack Jansen <jack at oratrix.nl> writes:

  JJ> Recently, Guido van Rossum <guido at digicool.com> said:
  >> > I get the impression that I'm currently seeing a non-NULL third
  >> > argument in my (C) methods even though the method is called
  >> > without keyword arguments.
  >>
  >> > Is this new semantics that I missed the discussion about, or is
  >> > this a bug?
  >>
  >> [...]  Do you really need the NULL?

  JJ> The places that I know I was counting on the NULL now have "if (
  JJ> kw && PyObject_IsTrue(kw))", so I'll just have to hope there
  JJ> aren't any more lingering in there.

Guido,

Does your query ("Do you really need the NULL?") mean that you don't
care whether the argument is NULL or an empty dictionary?  I could
change the code to do either for 2.1a2, if you have a preference.

Jeremy



From guido at digicool.com  Tue Jan 23 02:33:20 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 20:33:20 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:23:56 EST."
             <14956.56620.706531.647341@localhost.localdomain> 
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl>  
            <14956.56620.706531.647341@localhost.localdomain> 
Message-ID: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>

> Guido,
> 
> Does your query ("Do you really need the NULL?") mean that you don't
> care whether the argument is NULL or an empty dictionary?  I could
> change the code to do either for 2.1a2, if you have a preference.
> 
> Jeremy

Robust code IMO should treat NULL and {} the same.  But since
traditionally we passed NULL, it's better to pass NULL rather than {}.
I believe that's the status quo now, right?
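
A Python-level analogue of that advice, as a sketch only: None plays the
role of the C-level NULL here, and the function name describe_call is
invented for the example. A single truth test covers both "no dictionary"
and "empty dictionary", much like the "kw && PyObject_IsTrue(kw)" check
quoted earlier in the thread.

```python
def describe_call(args, kw=None):
    """Treat a missing keyword dictionary and an empty one identically."""
    if not kw:   # true for both None and {}
        return "positional only: %r" % (args,)
    return "positional %r, keywords %r" % (args, sorted(kw))


print(describe_call((1, 2)))
print(describe_call((1, 2), {}))
print(describe_call((1, 2), {"sep": ","}))
```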

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Tue Jan 23 02:54:53 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 22 Jan 2001 20:54:53 -0500 (EST)
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: <200101230133.UAA04378@cj20424-a.reston1.va.home.com>
References: <guido@digicool.com>
	<200101191634.LAA29239@cj20424-a.reston1.va.home.com>
	<20010119232323.70B03116392@oratrix.oratrix.nl>
	<14956.56620.706531.647341@localhost.localdomain>
	<200101230133.UAA04378@cj20424-a.reston1.va.home.com>
Message-ID: <14956.58477.874472.190937@localhost.localdomain>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

  [Jeremy wrote:]
  >> Does your query ("Do you really need the NULL?") mean that you
  >> don't care whether the argument is NULL or an empty dictionary?
  >> I could change the code to do either for 2.1a2, if you have a
  >> preference.

  GvR> Robust code IMO should treat NULL and {} the same.  But since
  GvR> traditionally we passed NULL, it's better to pass NULL rather
  GvR> than {}.  I believe that's the status quo now, right?

The current status in CVS is to pass {}, because there appeared to be
some case where a PyCFunction was not expecting NULL.  I assumed,
without checking, that {} was required and changed the implementation
to always pass a dictionary to METH_KEYWORDS functions.  I could
change it back to NULL and see if I can reproduce the error I was
seeing.

Jeremy



From guido at digicool.com  Tue Jan 23 03:01:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:01:12 -0500
Subject: [Python-Dev] Keyword arg dictionary without keyword arguments
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:54:53 EST."
             <14956.58477.874472.190937@localhost.localdomain> 
References: <guido@digicool.com> <200101191634.LAA29239@cj20424-a.reston1.va.home.com> <20010119232323.70B03116392@oratrix.oratrix.nl> <14956.56620.706531.647341@localhost.localdomain> <200101230133.UAA04378@cj20424-a.reston1.va.home.com>  
            <14956.58477.874472.190937@localhost.localdomain> 
Message-ID: <200101230201.VAA15993@cj20424-a.reston1.va.home.com>

>   [Jeremy wrote:]
>   >> Does your query ("Do you really need the NULL?") mean that you
>   >> don't care whether the argument is NULL or an empty dictionary?
>   >> I could change the code to do either for 2.1a2, if you have a
>   >> preference.
> 
>   GvR> Robust code IMO should treat NULL and {} the same.  But since
>   GvR> traditionally we passed NULL, it's better to pass NULL rather
>   GvR> than {}.  I believe that's the status quo now, right?
> 
> The current status in CVS is to pass {}, because there appeared to be
> some case where a PyCFunction was not expecting NULL.  I assumed,
> without checking, that {} was required and changed the implementation
> to always pass a dictionary to METH_KEYWORDS functions.  I could
> change it back to NULL and see if I can reproduce the error I was
> seeing.

Yes, that's a good idea.  I hope that the {} in alpha 1 won't make
folks think that they will never see a NULL in the future and code
accordingly...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 03:15:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 21:15:11 -0500
Subject: [Python-Dev] 2.1a1 release tonight -- but no nested scopes or weak refs
Message-ID: <200101230215.VAA16577@cj20424-a.reston1.va.home.com>

We've decided to release 2.1a1 without further ado, but without two
big hopeful patches: Jeremy's nested scopes aren't finished and will
take considerably more time, and Fred's weak references need more
review (I haven't had the time to look at the code).  Rather than wait
longer, I've decided to try and release 2.1a1 tonight -- there's
nothing I'm waiting for now before I can cut a tarball.  There will be
an alpha2 release around February 1.

Please don't make any check-ins until I announce the 2.1a1 release
here.  (PythonLabs: please mail or phone me if you need to check in a
last-minute thing -- I'm tagging the tree now.)

More news as it happens,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan 23 03:36:24 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 22 Jan 2001 20:36:24 -0600 (CST)
Subject: [Python-Dev] test_grammar failing
Message-ID: <14956.60968.363878.643640@beluga.mojam.com>

At the end of this:

    make distclean ; ./configure ; make OPT='-g -pipe' ; make test

I get this:

    rm -f ./Lib/test/*.py[co]
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted
    PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
    test_grammar
    name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
    locals: {'x': 2, '[1]': 1, 'l': 0}
    globals: {}
    Fatal Python error: compiler did not label name as local or global
    make: *** [test] Aborted

Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
this evening.

Skip



From gvwilson at nevex.com  Tue Jan 23 03:47:33 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Mon, 22 Jan 2001 21:47:33 -0500 (EST)
Subject: [Python-Dev] re: I think my set module is ready for prime time
Message-ID: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>

> > Guido van Rossum:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

> Eric Raymond:
> Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> ...doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

I agree with Eric's point; I put the interface design on hold while I
went off to try to find an efficient implementation capable of
handling mutable values (i.e. one that would allow things like sets of
sets).  I'm still looking :-(, but would appreciate comments from this
list on Eric's interface.

Thanks,
Greg




From guido at digicool.com  Tue Jan 23 04:02:50 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:02:50 -0500
Subject: [Python-Dev] test_grammar failing
In-Reply-To: Your message of "Mon, 22 Jan 2001 20:36:24 CST."
             <14956.60968.363878.643640@beluga.mojam.com> 
References: <14956.60968.363878.643640@beluga.mojam.com> 
Message-ID: <200101230302.WAA27104@cj20424-a.reston1.va.home.com>

> At the end of this:
> 
>     make distclean ; ./configure ; make OPT='-g -pipe' ; make test
> 
> I get this:
> 
>     rm -f ./Lib/test/*.py[co]
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
>     PYTHONPATH=./build/lib.`cat platform` ./python -tt ./Lib/test/regrtest.py -l
>     test_grammar
>     name: None, in test_in_func, file './Lib/test/test_grammar.py', line 617
>     locals: {'x': 2, '[1]': 1, 'l': 0}
>     globals: {}
>     Fatal Python error: compiler did not label name as local or global
>     make: *** [test] Aborted
> 
> Any ideas?  I notice that Jeremy checked in some changes to test_grammar.py
> this evening.

Try another cvs update and rebuild.  The test that Jeremy checked in
is supposed to catch a bug in the compiler code that he checked in.
The latest compile.c is 103277 bytes long (in Unix).

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 04:33:02 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 22 Jan 2001 22:33:02 -0500
Subject: [Python-Dev] Python 2.1 alpha 1 released!
Message-ID: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>

Thanks to the PythonLabs developers and the many hard-working
volunteers, I'm proud to release Python 2.1a1 -- the first alpha
release of Python version 2.1.

The release mechanics are different than for previous releases: we're
only releasing through SourceForge for now.  The official source
tarball is already available from the download page:

  http://sourceforge.net/project/showfiles.php?group_id=5470

Additional files will be released soon: a Windows installer,
Linux RPMs, and documentation.

Please give it a good try!  The only way Python 2.1 can become a
rock-solid product is if people test the alpha releases.  Especially
if you are using Python for demanding applications or on extreme
platforms we are interested in hearing your feedback.  Are you
embedding Python or using threads?  Please test your application using
Python 2.1a1!  Please submit all bug reports through SourceForge:

  http://sourceforge.net/bugs/?group_id=5470

Here's the NEWS file:

What's New in Python 2.1 alpha 1?
=================================

Core language, builtins, and interpreter

- There is a new Unicode companion to the PyObject_Str() API
  called PyObject_Unicode(). It behaves in the same way as the
  former, but ensures that the returned value is a Unicode object
  (applying the usual coercion if necessary).

- The comparison operators support "rich comparison overloading" (PEP
  207).  C extension types can provide a rich comparison function in
  the new tp_richcompare slot in the type object.  The cmp() function
  and the C function PyObject_Compare() first try the new rich
  comparison operators before trying the old 3-way comparison.  There
  is also a new C API PyObject_RichCompare() (which also falls back on
  the old 3-way comparison, but does not constrain the outcome of the
  rich comparison to a Boolean result).

  The rich comparison function takes two objects (at least one of
  which is guaranteed to have the type that provided the function) and
  an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
  Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
  object, which may be NotImplemented (in which case the tp_compare
  slot function is used as a fallback, if defined).

  Classes can overload individual comparison operators by defining one
  or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
  __ge__.  There are no explicit "reflected argument" versions of
  these; instead, __lt__ and __gt__ are each other's reflection,
  likewise for __le__ and __ge__; __eq__ and __ne__ are their own
  reflection (similar at the C level).  No other implications are
  made; in particular, Python does not assume that == is the Boolean
  inverse of !=, or that < is the Boolean inverse of >=.  This makes
  it possible to define types with partial orderings.

  Classes or types that want to implement (in)equality tests but not
  the ordering operators (i.e. unordered types) should implement ==
  and !=, and raise an error for the ordering operators.

  It is possible to define types whose rich comparison results are not
  Boolean; e.g. a matrix type might want to return a matrix of bits
  for A < B, giving elementwise comparisons.  Such types should ensure
  that any interpretation of their value in a Boolean context raises
  an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
  at the C level) to always raise an exception.
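
  A minimal sketch of a partial ordering built on these methods.  The
  class name Subset is made up for illustration, and the code is
  written so that it also runs on current Python:

```python
class Subset:
    """Wraps a frozenset; '<' means 'is a proper subset of'."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items == other.items

    def __ne__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items != other.items

    def __hash__(self):
        return hash(self.items)

    def __lt__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items < other.items

    def __ge__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items >= other.items


a = Subset([1, 2])
b = Subset([2, 3])
# Neither a < b nor a >= b holds: the two sets are simply incomparable.
incomparable = not (a < b) and not (a >= b)
```

  Because < and >= are defined independently, both can be false for
  the same pair, which is what a partial ordering needs.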

- Complex numbers use rich comparisons to define == and != but raise
  an exception for <, <=, > and >=.  Unfortunately, this also means
  that cmp() of two complex numbers raises an exception when the two
  numbers differ.  Since it is not mathematically meaningful to compare
  complex numbers except for equality, I hope that this doesn't break
  too much code.

- Functions and methods now support getting and setting arbitrarily
  named attributes (PEP 232).  Functions have a new __dict__
  (a.k.a. func_dict) which holds the function attributes.  Methods get
  and set attributes on their underlying im_func.  It is a TypeError
  to set an attribute on a bound method.
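
  A short sketch of function attributes in use.  The function and the
  attribute names are invented for the example:

```python
def parse_expr(text):
    """Placeholder parser; only the attributes matter here."""
    return text.strip()


# Arbitrary attributes are stored in the function's __dict__.
parse_expr.grammar_rule = "expr: term ('+' term)*"
parse_expr.author = "someone"

rule = parse_expr.grammar_rule
attrs = sorted(parse_expr.__dict__.keys())
```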

- The xrange() object implementation has been improved so that
  xrange(sys.maxint) can be used on 64-bit platforms.  There's still a
  limitation that in this case len(xrange(sys.maxint)) can't be
  calculated, but the common idiom "for i in xrange(sys.maxint)" will
  work fine as long as the index i doesn't actually reach 2**31.
  (Python uses regular ints for sequence and string indices; fixing
  that is much more work.)

- Two changes to from...import:

  1) "from M import X" now works even if M is not a real module; it's
     basically a getattr() operation with AttributeError exceptions
     changed into ImportError.

  2) "from M import *" now looks for M.__all__ to decide which names to
     import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
     filters out names starting with '_' as before.  Whether or not
     __all__ exists, there's no restriction on the type of M.
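
  A rough sketch of the __all__ rule, using a throwaway in-memory
  module.  The module name demo_mod and its attributes are
  hypothetical, and the exec() call form assumes current Python:

```python
import sys
import types

# Build a module object by hand and register it so it can be imported.
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_name = 1
demo_mod.other_name = 2
demo_mod._private_name = 3
demo_mod.__all__ = ["public_name"]
sys.modules["demo_mod"] = demo_mod

namespace = {}
exec("from demo_mod import *", namespace)

# Ignore bookkeeping entries such as __builtins__ that exec() adds.
imported = sorted(name for name in namespace if not name.startswith("__"))
```

  Only the name listed in __all__ is imported; other_name is skipped
  even though it does not start with an underscore.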

- File objects have a new method, xreadlines().  This is the fastest
  way to iterate over all lines in a file:

  for line in file.xreadlines():
      ...do something to line...

  See the xreadlines module (mentioned below) for how to do this for
  other file-like objects.

- Even if you don't use file.xreadlines(), you may expect a speedup on
  line-by-line input.  The file.readline() method has been optimized
  quite a bit in platform-specific ways:  on systems (like Linux) that
  support flockfile(), getc_unlocked(), and funlockfile(), those are
  used by default.  On systems (like Windows) without getc_unlocked(),
  a complicated (but still thread-safe) method using fgets() is used by
  default.

  You can force use of the fgets() method by #define'ing 
  USE_FGETS_IN_GETLINE at build time (it may be faster than 
  getc_unlocked()).

  You can force fgets() not to be used by #define'ing 
  DONT_USE_FGETS_IN_GETLINE (this is the first thing to try if std test 
  test_bufio.py fails -- and let us know if it does!).

- In addition, the fileinput module, while still slower than the other
  methods on most platforms, has been sped up too, by using
  file.readlines(sizehint).

- Support for run-time warnings has been added, including a new
  command line option (-W) to specify the disposition of warnings.
  See the description of the warnings module below.

- Extensive changes have been made to the coercion code.  This mostly
  affects extension modules (which can now implement mixed-type
  numerical operators without having to use coercion), but
  occasionally, in boundary cases the coercion semantics have changed
  subtly.  Since this was a terrible gray area of the language, this
  is considered an improvement.  Also note that __rcmp__ is no longer
  supported -- instead of calling __rcmp__, __cmp__ is called with
  reflected arguments.

- In connection with the coercion changes, a new built-in singleton
  object, NotImplemented is defined.  This can be returned for
  operations that wish to indicate they are not implemented for a
  particular combination of arguments.  From C, this is
  Py_NotImplemented.
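
  A minimal sketch of how NotImplemented can be used for mixed-type
  arithmetic.  The Meters and Feet classes are invented for the
  example, and the code is written against current Python:

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        # Signal "not handled here" so Python can try other.__radd__.
        return NotImplemented


class Feet:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        if isinstance(other, Meters):
            return Meters(other.value + self.value * 0.3048)
        return NotImplemented


total = Meters(1.0) + Feet(10)
```

  Meters.__add__ declines the Feet operand, so the reflected method on
  Feet gets a chance to produce the result.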

- The interpreter now accepts bytecode files on the command line even
  if they do not have a .pyc or .pyo extension. On Linux, after executing

  echo ':pyc:M::\x87\xc6\x0d\x0a::/usr/local/bin/python:' > /proc/sys/fs/binfmt_misc/register

  any byte code file can be used as an executable (i.e. as an argument
  to execve(2)).

- %[xXo] formats of negative Python longs now produce a sign
  character.  In 1.6 and earlier, they never produced a sign,
  and raised an error if the value of the long was too large
  to fit in a Python int.  In 2.0, they produced a sign if and
  only if too large to fit in an int.  This was inconsistent
  across platforms (because the size of an int varies across
  platforms), and inconsistent with hex() and oct().  Example:

  >>> "%x" % -0x42L
  '-42'      # in 2.1
  'ffffffbe' # in 2.0 and before, on 32-bit machines
  >>> hex(-0x42L)
  '-0x42L'   # in all versions of Python

  The behavior of %d formats for negative Python longs remains
  the same as in 2.0 (although in 1.6 and before, they raised
  an error if the long didn't fit in a Python int).

  %u formats don't make sense for Python longs, but are allowed
  and treated the same as %d in 2.1.  In 2.0, a negative long
  formatted via %u produced a sign if and only if too large to
  fit in an int.  In 1.6 and earlier, a negative long formatted
  via %u raised an error if it was too big to fit in an int.

- Dictionary objects have an odd new method, popitem().  This removes
  an arbitrary item from the dictionary and returns it (in the form of
  a (key, value) pair).  This can be useful for algorithms that use a
  dictionary as a bag of "to do" items and repeatedly need to pick one
  item.  Such algorithms normally end up running in quadratic time;
  using popitem() they can usually be made to run in linear time.
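
  A small sketch of the "to do" pattern described above.  The drain()
  helper and the sample data are made up for illustration:

```python
def drain(todo):
    """Process every entry in `todo`, emptying the dictionary as we go."""
    done = []
    while todo:
        key, value = todo.popitem()  # removes and returns an arbitrary pair
        done.append((key, value))
    return done


pending = {"parse": 1, "compile": 2, "link": 3}
finished = drain(pending)
```

  The order in which items come out is not something to rely on; the
  point is that each pick takes a single call.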

Standard library

- In the time module, the time argument to the functions strftime,
  localtime, gmtime, asctime and ctime is now optional, defaulting to
  the current time (in the local timezone).
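
  For example, these calls can now omit the time argument (a sketch;
  the variable names are arbitrary):

```python
import time

stamp = time.strftime("%Y-%m-%d %H:%M:%S")  # no time tuple passed
now_tuple = time.localtime()                # no seconds value passed
readable = time.ctime()                     # no seconds value passed
```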

- The ftplib module now defaults to passive mode, which is deemed a
  more useful default given that clients are often inside firewalls
  these days.  Note that this could break if ftplib is used to connect
  to a *server* that is inside a firewall, from outside; this is
  expected to be a very rare situation.  To fix that, you can call
  ftp.set_pasv(0).
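
  A sketch of switching back to active mode.  No host is given here,
  so nothing is actually connected; a real script would go on to
  connect and log in:

```python
import ftplib

ftp = ftplib.FTP()  # no host given, so no connection is made yet
ftp.set_pasv(0)     # request active mode for later transfers
```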

- The site module now uses .pth files not only for path configuration
  but also to extend the initialization code: lines starting with
  import are executed.

- There's a new module, warnings, which implements a mechanism for
  issuing and filtering warnings.  There are some new built-in
  exceptions that serve as warning categories, and a new command line
  option, -W, to control warnings (e.g. -Wi ignores all warnings, -We
  turns warnings into errors).  warnings.warn(message[, category])
  issues a warning message; this can also be called from C as
  PyErr_Warn(category, message).
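
  A rough sketch of issuing and filtering a warning from Python code.
  The function old_api() is hypothetical, and the catch_warnings()
  helper used to scope the filters is an assumption about current
  Python rather than something described in this NEWS file:

```python
import warnings


def old_api():
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return 42


# Similar in spirit to running with -We: the warning becomes an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True

# Similar in spirit to running with -Wi: the warning is ignored.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = old_api()
```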

- A new module xreadlines was added.  This exports a single factory
  function, xreadlines().  The intention is that this code is the
  absolutely fastest way to iterate over all lines in an open
  file(-like) object:

  import xreadlines
  for line in xreadlines.xreadlines(file):
      ...do something to line...

  This is equivalent to the previous speed record holder using
  file.readlines(sizehint).  Note that if file is a real file object
  (as opposed to a file-like object), this is equivalent:

  for line in file.xreadlines():
      ...do something to line...

- The bisect module has new functions bisect_left, insort_left,
  bisect_right and insort_right.  The old names bisect and insort
  are now aliases for bisect_right and insort_right.  XXX_right
  and XXX_left methods differ in what happens when the new element
  compares equal to one or more elements already in the list:  the
  XXX_left methods insert to the left, the XXX_right methods to the
  right.  Code that doesn't care where equal elements end up should
  continue to use the old, short names ("bisect" and "insort").
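
  A short sketch of the difference, using made-up sample data:

```python
import bisect

scores = [10, 20, 20, 30]
left = bisect.bisect_left(scores, 20)    # insertion point before the equal run
right = bisect.bisect_right(scores, 20)  # insertion point after the equal run

bisect.insort_left(scores, 20)           # inserts at the position `left`
```

  Here left is 1 and right is 3, and scores ends up as
  [10, 20, 20, 20, 30] whichever variant does the insertion, because
  equal ints are indistinguishable.  The choice matters when equal
  elements carry different payloads.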

- The new curses.panel module wraps the panel library that forms part
  of SYSV curses and ncurses.  Contributed by Thomas Gellekum.

- The SocketServer module now sets the allow_reuse_address flag by
  default in the TCPServer class.

- A new function, sys._getframe(), returns the stack frame pointer of
  the caller.  This is intended only as a building block for
  higher-level mechanisms such as string interpolation.
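
  A minimal sketch of using it as a building block.  The helper names
  are invented, and passing a depth of 1 to reach the caller's frame
  is shown as it works in current Python:

```python
import sys


def caller_name():
    """Return the name of the function that called us."""
    return sys._getframe(1).f_code.co_name


def interpolate_demo():
    return caller_name()


name = interpolate_demo()
```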

Build issues

- For Unix (and Unix-compatible) builds, configuration and building of
  extension modules is now greatly automated.  Rather than having to
  edit the Modules/Setup file to indicate which modules should be
  built and where their include files and libraries are, a
  distutils-based setup.py script now takes care of building most
  extension modules.  All extension modules built this way are built
  as shared libraries.  Only a few modules that must be linked
  statically are still listed in the Setup file; you won't need to
  edit their configuration.

- Python should now build out of the box on Cygwin.  If it doesn't,
  mail to Jason Tishler (jlt63 at users.sourceforge.net).

- Python now always uses its own (renamed) implementation of getopt()
  -- there's too much variation among C library getopt()
  implementations.

- C++ compilers are better supported; the CXX macro is always set to a
  C++ compiler if one is found.

Windows changes

- select module:  By default under Windows, a select() call
  can specify no more than 64 sockets.  Python now boosts
  this Microsoft default to 512.  If you need even more than
  that, see the MS docs (you'll need to #define FD_SETSIZE
  and recompile Python from source).

- Support for Windows 3.1, DOS and OS/2 is gone.  The Lib/dos-8x3
  subdirectory is no more!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping at lfw.org  Tue Jan 23 05:11:09 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:11:09 -0800 (PST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>

Guido van Rossum wrote:
> Yes, wow!

Paul Prescod wrote:
> I apologize but I'm not clear on my responsibilities here, if any. I
> wrote a PEP for online help. I submitted a partial implementation.

Hi, guys.  Sorry i haven't been sending updates on what i'm doing.
Here's the current picture as i see it.

> Ping wrote a full implementation that basically supersedes mine.

My implementation is "full" in that it deploys and seems to work on
arbitrary modules as it stands, but it doesn't really supersede Paul's
because it leaves out the big piece of Paul's work that did conversion
from packaged HTML docs to plain text.

It also has the deficiency that it imports modules live; for untrusted
modules, this is a security risk.  I know Paul has been working on
stuff to compile a module into a kind of skeleton object that has all
the same name bindings but no live contents, and if that works reliably,
we should definitely try plugging that in.

> There are various ideas for improving it, but I think that we agree
> that the core is solid.

Yes.  I believe that as it stands, pydoc is useful enough to be a net
positive addition to the core.  inspect.py alone has been stable and
alpha-ready for some time, i believe.

Here is a summary of its status and work that remains.  pydoc has:

    inspecting live objects
    generating text docs from live objects
    generating HTML docs from live objects
    serving HTML docs from a little web server
    showing docs from the command line
    showing docs from within the interactive interpreter
    apropos-style module listing

It's missing the following, and Paul had stuff for this:

    inspecting unsafe modules
    generating text docs from packaged HTML (e.g. language reference)

It also needs these:

    generating docs from a file given on the command line (easy)
    more Windows and Mac testing and decisions
    various small bugfixes

This past week i've been messing around with Windows and Mac stuff,
trying to see whether it's possible to reliably spawn a webserver
and launch a web browser at the same time (this would seem to be a
good default action to do on GUI platforms).

In trying to do the latter i've found the webbrowser module pretty
unreliable, by the way.  For example, it relies on a constant delay
of 4 seconds to launch a new browser that can't be expected on all
platforms, and fails to launch Netscape 3 because it supplies an
illegal command-line option.  When i've found good cross-platform
ways to make this work i'll suggest some patches.

I've so far considered this project blocked only on cross-platform
testing -- do you agree?  While i know that inspecting unsafe modules
and processing packaged HTML are important features, i don't consider
them essential.


-- ?!ng




From ping at lfw.org  Tue Jan 23 05:14:50 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:14:50 -0800 (PST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <Pine.LNX.4.10.10101221953190.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> In trying to do the latter i've found the webbrowser module pretty
> unreliable, by the way.  For example, it relies on a constant delay
> of 4 seconds to launch a new browser that can't be expected on all
> platforms, and fails to launch Netscape 3 because it supplies an
> illegal command-line option.  When i've found good cross-platform
> ways to make this work i'll suggest some patches.

Oh, and i forgot to mention... i was pretty disappointed that:

    setenv BROWSER my_browser_program
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

doesn't execute "my_browser_program http://python.org/" as i would
have hoped.  Even for a known browser type:

    setenv BROWSER lynx
    python -c 'import webbrowser; webbrowser.open("http://python.org/")'

does not work as expected, either.  (Red Hat Linux here.)


-- ?!ng




From ping at lfw.org  Tue Jan 23 05:22:56 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 20:22:56 -0800 (PST)
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>

We can implement abstract interfaces (sequence, mapping, number) in
Python with the appropriate __special__ methods, but i don't see an
easy way to test if something supports one of these abstract interfaces
in Python.

At the moment, to see if something is a sequence i believe i have to
say something like

    try:
        x[0]
    except:
        # not a sequence
    else:
        # okay, it's a sequence

or

    if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
        ...

Is there, or should there be, a better way to do this?



-- ?!ng




From greg at cosc.canterbury.ac.nz  Tue Jan 23 05:46:26 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Tue, 23 Jan 2001 17:46:26 +1300 (NZDT)
Subject: [Python-Dev] re: I think my set module is ready for prime time
In-Reply-To: <Pine.LNX.4.10.10101222146150.20319-100000@akbar.nevex.com>
Message-ID: <200101230446.RAA01992@s454.cosc.canterbury.ac.nz>

Greg Wilson <gvwilson at nevex.com>:

> an efficient implementation capable of
> handling mutable values (i.e. one that would allow things like sets of
> sets)

I suspect that such a thing is impossible. To avoid a
linear search you have to take advantage of some kind
of hashing or ordering, which you can't do if your
objects can change their values out from under you.

Also, there's nothing to stop someone from mutating
two previously unequal elements so that they're equal.
Then you have a "set" with two identical elements,
which isn't a set any more, it's just a collection.

So, I submit that the very concept of a set only
makes sense for immutable values.
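
A small sketch of the second point, using a plain list as a stand-in
for a "set" of mutable values.  The has_duplicates() helper is made
up for the example and is not part of any proposed set module:

```python
def has_duplicates(items):
    """Linear scan; works for unhashable elements such as lists."""
    for i, item in enumerate(items):
        if item in items[i + 1:]:
            return True
    return False


first, second = [1], [2]
collection = [first, second]  # two unequal elements, so a valid "set"
before = has_duplicates(collection)

first[0] = 2                  # mutate an element behind the container's back
after = has_duplicates(collection)
```

The container never saw an insertion, yet it now holds two equal
elements.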

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From tim.one at home.com  Tue Jan 23 06:03:18 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 00:03:18 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <Pine.LNX.4.10.10101222016150.1568-100000@skuld.kingmanhall.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJBIKAA.tim.one@home.com>

[?!ng]
> ...
> At the moment, to see if something is a sequence i believe i have to
> say something like
>
>     try:
>         x[0]
>     except:
>         # not a sequence
>     else:
>         # okay, it's a sequence
>
> or
>
>     if hasattr(x, '__getitem__') or type(x) in [type(()), type([])]:
>         ...
>
> Is there, or should there be, a better way to do this?

Dunno.  What's a sequence?  If you want to know whether x[0] will blow up,
trying x[0] is the most obvious way.  BTW, I expect trying x[:0] is a better
idea:  doesn't succeed for dicts, and doesn't blow up for an irrelevant
reason if x is an empty sequence.  BTW2, your second method suggests an
uncomfortable truth:  many contexts that want "a sequence" don't want
strings to pass the test, despite that strings are as much sequences as
lists in Python, no matter how "a sequence" is defined.
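
A sketch of the x[:0] probe.  The function name is invented, and the
broad except clause is a deliberate simplification:

```python
def looks_like_sequence(x):
    """Heuristic only: true if x accepts an empty slice."""
    try:
        x[:0]
    except Exception:
        return False
    return True
```

Note that strings pass this test too, which is exactly the
uncomfortable truth above.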

afraid that-what-you-want-to-do-with-it-is-more-important-than-what-
    python-calls-it-ly y'rs  - tim




From ping at lfw.org  Tue Jan 23 06:27:30 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Mon, 22 Jan 2001 21:27:30 -0800 (PST)
Subject: [Python-Dev] I think my set module is ready for prime time;
 comments?
In-Reply-To: <20010122124159.A14999@thyrsus.com>
Message-ID: <Pine.LNX.4.10.10101222125340.1568-100000@skuld.kingmanhall.org>

On Mon, 22 Jan 2001, Eric S. Raymond wrote:
> \section{\module{set} ---
>          Basic set algebra for Python}

I'd like to look at the module.  Did you actually show us the code
for this, or am i a blind doofus?

(Please, no answers to the unasked question of whether i am a doofus.)


-- ?!ng




From tim.one at home.com  Tue Jan 23 07:05:26 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 01:05:26 -0500
Subject: [Python-Dev] Worse news
In-Reply-To: <20010122064400.A26543@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEJEIKAA.tim.one@home.com>

In finding and repairing the test_extcall bug, Neil and Thomas have once
again contributed beyond the call of duty.  Thank you!  It took some doing
to convince Guido to release his Dutch Death Grip on the PythonLabs coffers,
but in the end he was overcome by the moral necessity of rewarding you
sterling fellows for your golden deeds:  you're both entitled to free(*)--
yes, FREE(*)! --copies of all Python 2.1 alpha, *and* beta, releases(*)!

you-wouldn't-believe-how-much-he-charges-us-ly y'rs  - tim


(*) Does not apply to Jython releases.  All applicable taxes are the
responsibility of the recipient.  No warranty is expressed or implied.  This
offer has not been reviewed or approved by CWI, CNRI, BeOpen.com, or Digital
Creations 2.  Export restrictions may apply.  By acceptance of this offer,
recipient grants perpetual license to use their name, image and likeness in
Python promotional materials without compensation.  Packaging, handling,
shipping and insurance costs to be borne by recipient, but in no case to
exceed 1 (one) US$/byte.  This offer may be withdrawn at any time, including
but not limited to retroactively, at the sole discretion of Guido van
Rossum, or such of his heirs and successors as he may designate from time to
time.




From martin at mira.cs.tu-berlin.de  Tue Jan 23 09:14:32 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Tue, 23 Jan 2001 09:14:32 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
Message-ID: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>

> i don't see an easy way to test if something supports one of these
> abstract interfaces in Python.

Why do you want to test for that? If you have an algorithm that only
operates on integer-indexed things, what can you do if the test fails?

So it is always better to just use the object in the algorithm, and
let it break with an exception if somebody passes a bad object.

Regards,
Martin



From mal at lemburg.com  Tue Jan 23 10:08:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:08:24 +0100
Subject: [Python-Dev] webbrowser.py
References: <Pine.LNX.4.10.10101222011470.1568-100000@skuld.kingmanhall.org>
Message-ID: <3A6D4A08.B3806984@lemburg.com>

Ka-Ping Yee wrote:
> 
> On Mon, 22 Jan 2001, Ka-Ping Yee wrote:
> > In trying to do the latter i've found the webbrowser module pretty
> > unreliable, by the way.  For example, it relies on a constant delay
> > of 4 seconds to launch a new browser that can't be expected on all
> > platforms, and fails to launch Netscape 3 because it supplies an
> > illegal command-line option.  When i've found good cross-platform
> > ways to make this work i'll suggest some patches.
> 
> Oh, and i forgot to mention... i was pretty disappointed that:
> 
>     setenv BROWSER my_browser_program
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> doesn't execute "my_browser_program http://python.org/" as i would
> have hoped.  Even for a known browser type:
> 
>     setenv BROWSER lynx
>     python -c 'import webbrowser; webbrowser.open("http://python.org/")'
> 
> does not work as expected, either.  (Red Hat Linux here.)

Hmm, lynx should work (the module has explicit support for it)
and yes, I agree, webbrowser should trust BROWSER and use a
generic calling mechanism (program <url>) for opening the
URL.

Too late for 2.1a1, but maybe for a2 ?!

BTW, I think that the second line here is causing the problem:

class CommandLineBrowser:
    _browsers = [] # <- this overrides the global of the same name
    if os.environ.get("DISPLAY"):
        _browsers.extend([
            ("netscape", "netscape %s >/dev/null &"),
            ("mosaic", "mosaic %s >/dev/null &"),
            ])
    _browsers.extend([
        ("lynx", "lynx %s"),
        ("w3m", "w3m %s"),
        ])
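
A stripped-down sketch of that shadowing effect, with made-up names
and values rather than the real webbrowser code:

```python
_browsers = ["global-entry"]


class CommandLineBrowserDemo:
    # This assignment creates a new class attribute; the module-level
    # list above is left untouched by everything in the class body.
    _browsers = []
    _browsers.extend(["lynx", "w3m"])


module_level = _browsers
class_level = CommandLineBrowserDemo._browsers
```

Code that later reads the global _browsers never sees the entries
added inside the class body.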


-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Tue Jan 23 10:15:11 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 10:15:11 +0100
Subject: [Python-Dev] Is X a (sequence|mapping)?
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>
Message-ID: <3A6D4B9F.38B17046@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > i don't see an easy way to test if something supports one of these
> > abstract interfaces in Python.
> 
> Why do you want to test for that? If you have an algorithm that only
> operates on integer-indexed things, what can you do if the test fails?
> 
> So it is always better to just use the object in the algorithm, and
> let it break with an exception if somebody passes a bad object.

Right. 

Polymorphic code will usually get you more out of an
algorithm than type-safe or interface-safe code.

BTW, there are Python interfaces to PySequence_Check() and
PyMapping_Check() buried in the builtin operator module in case
you really do care ;) ...

	operator.isSequenceType()
	operator.isMappingType()
	+ some other C style _Check() APIs

These only look at the type slots, though, so Python instances
will appear to support everything, but will fail with an exception
when used if they don't provide the proper __xxx__ hooks.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 10:17:30 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 04:17:30 -0500
Subject: [Python-Dev] webbrowser.py
Message-ID: <20010123041730.A25165@thyrsus.com>

Ping's complaints are justified -- I've been looking at and testing
webbrowser.py and it's a mess.  Among other things:

1. The BROWSER variable is not interpreted properly.

2. The code is stupid about loading platform support it doesn't need.

3. It's not possible to specify lynx as a browser under Unix, because the
   computation of available browsers is split in two and partly done inside
   the CommandLineBrowser class.

4. The module code is excessively hard to read, obscuring these bugs.

Our mistake was hurriedly merging the launcher code from IDLE with the
browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
code is a bad, overcomplicated architecture with a nasty seam in it.

As co-designer/implementor I should have caught this sooner, but I was
in a hurry to get a CML2 prototype out the door and didn't test
anything but the case I needed.  My apologies to all.

I'm rewriting to fix these problems now.  Documented semantics of entry
points will be preserved.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The politician attempts to remedy the evil by increasing the very thing
that caused the evil in the first place: legal plunder.
	-- Frederick Bastiat



From mal at lemburg.com  Tue Jan 23 11:26:16 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 11:26:16 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com>
Message-ID: <3A6D5C48.A076DA0@lemburg.com>

"Eric S. Raymond" wrote:
> 
> Guido van Rossum <guido at digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.
> 
> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized
> one way or the other for no good reason.
> 
> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.

There's also a kjSet.py available at Aaron's site:

	http://www.chordate.com/kwParsing/index.html

which is a pure Python version of the C extension's kjSet type.
 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)
> 
> 2. It's simple for application programmers to use.  No extension module
> to integrate.
> 
> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for
> the fact that you can't introduce duplicates with the mutators.
> 
> 4. It's already completely documented in a form suitable for the library.
> 
> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

All very well, but are sets really that essential to every
day Python programming ? If we include sets then we ought to
also include graphs, tries, btrees and all those other goodies
we have in computer science. All of these types are available
out there, but I believe the audience who really cares for these
types is also capable of downloading the extensions and installing
them.

It would be nice if all of these extensions could go into a SUMO
edition of Python though... together with your set module.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 12:08:06 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 06:08:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D5C48.A076DA0@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 11:26:16AM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com>
Message-ID: <20010123060806.A25436@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> All very well, but are sets really that essential to everyday
> Python programming? If we include sets then we ought to
> also include graphs, tries, btrees and all those other goodies
> we have in computer science.

I use sets a lot.  And there was enough demand to generate a PEP.

But the wider question here is how seriously we take "batteries are
included" as a design principle.  Does a facility have to be useful
*every day* to be worth being in the standard library?  And if so,
what are things like the POP3 and IMAP libraries (or, for that matter,
my own shlex and netrc modules) doing there?

I don't think so.  I think there are at least four different
possible reasons for something to be in the standard library:

1. It's useful every day.

2. It's useful less frequently than every day, but is a stable
cross-platform implementation of a wheel that would otherwise have to
be reinvented frequently.  That is, you can solve it *once* and have a
zero-maintenance increment to the power of the language.

3. It's a technique that's not often used, and not necessarily stable 
in the face of platform variations, but nothing else will do
when you need it and it's notably difficult to get right.  (popen2 and
BaseHTTPServer would be good examples of this.)

4. It's a developer checklist feature that improves Python's competitive
position against Perl, Tcl, and other contenders for the same ecological
niche.

IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
even if not under 1 and 3.  

This question keeps coming up in different guises.  I'm often the one to
raise it, because I favor an aggressive interpretation of "batteries
are included" that would pull in a lot of stuff.  Yes, this makes more
work for us -- but I think it's work we should be doing.  

While minimalism is an excellent design heuristic for the core language,
I think it's a bad one for the libraries.  Python is a high-level language
and programmers using it both expect and deserve high-level libraries --
yes, including graphs/tries/btrees and all that computer science stuff.

Just as much to the point, Python is competing against languages like
Perl that frequently get design wins against it because of the
richness of the environment *they* are willing to carry around.

Guido and Tim and others are more conservative than I, which would be
OK -- but it seems to me that the conservatives do not have consistent
or well-thought-out criteria for what to include, which is *not* OK.
We need to solve this problem.

Some time back I initiated a library guidelines PEP, then dropped it
due to press of overwork.  But the general question is going to keep
coming up and we ought to have policy guidelines that potential 
library developers can understand.  

Should I pick this up again?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

I do not find in orthodox Christianity one redeeming feature.
	-- Thomas Jefferson



From mal at lemburg.com  Tue Jan 23 12:50:39 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 12:50:39 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com>
Message-ID: <3A6D700F.7A9E2509@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > All very well, but are sets really that essential to everyday
> > Python programming? If we include sets then we ought to
> > also include graphs, tries, btrees and all those other goodies
> > we have in computer science.
> 
> I use sets a lot.  And there was enough demand to generate a PEP.

Sure, but sets are fairly easy to implement using Python dictionaries
-- at least at the level normally needed by Python programs. Sets, queues
and graphs are examples of data types which can have many
different faces; it is hard to design APIs for these which meet 
everyone's needs.
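
A minimal sketch of the dictionary approach mentioned above, for
illustration only.  The class name DictSet and its method names are
hypothetical, not taken from any module discussed in this thread, and
it assumes every element is hashable:

```python
class DictSet:
    """Set of hashable items stored as dictionary keys (values are unused)."""

    def __init__(self, items=()):
        self._items = {}
        for item in items:
            self._items[item] = None

    def add(self, item):
        # Re-adding an existing key is harmless, so duplicates never appear.
        self._items[item] = None

    def __contains__(self, item):
        # Dictionary lookup: constant time on average.
        return item in self._items

    def __len__(self):
        return len(self._items)

    def union(self, other):
        result = DictSet(self._items)
        for item in other._items:
            result.add(item)
        return result

    def intersection(self, other):
        return DictSet([item for item in self._items if item in other])
```

Even this small class has to pick one spelling for each operation
(add vs. append, methods vs. operators), which is the API design
question raised above.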
 
> But the wider question here is how seriously we take "batteries are
> included" as a design principle.  Does a facility have to be useful
> *every day* to be worth being in the standard library?  And if so,
> what are things like the POP3 and IMAP libraries (or, for that matter,
> my own shlex and netrc modules) doing there?

You can argue the same way for all kinds of extensions and
packages you find in the Vaults. That's why there's demand for
a different packaging of Python and this is what Moshe's
PEP 206 addresses:

	http://python.sourceforge.net/peps/pep-0206.html

> I don't think so. I think there are at least four different
> possible reasons for something to be in the standard library:
> 
> 1. It's useful every day.
> 
> 2. It's useful less frequently than every day, but is a stable
> cross-platform implementation of a wheel that would otherwise have to
> be reinvented frequently.  That is, you can solve it *once* and have a
> zero-maintenance increment to the power of the language.
> 
> 3. It's a technique that's not often used, and not necessarily stable
> in the face of platform variations, but nothing else will do
> when you need it and it's notably difficult to get right.  (popen2 and
> BaseHTTPServer would be good examples of this.)
> 
> 4. It's a developer checklist feature that improves Python's competitive
> position against Perl, Tcl, and other contenders for the same ecological
> niche.
> 
> IMO a lightweight set facility, like POP3 and IMAP, qualifies under 2 and 4
> even if not under 1 and 3.
> 
> This question keeps coming up in different guises.  I'm often the one to
> raise it, because I favor an aggressive interpretation of "batteries
> are included" that would pull in a lot of stuff.  Yes, this makes more
> work for us -- but I think it's work we should be doing.
> 
> While minimalism is an excellent design heuristic for the core language,
> I think it's a bad one for the libraries.  Python is a high-level language
> and programmers using it both expect and deserve high-level libraries --
> yes, including graphs/tries/btrees and all that computer science stuff.
> 
> Just as much to the point, Python is competing against languages like
> Perl that frequently get design wins against it because of the
> richness of the environment *they* are willing to carry around.
> 
> Guido and Tim and others are more conservative than I, which would be
> OK -- but it seems to me that the conservatives do not have consistent
> or well-thought-out criteria for what to include, which is *not* OK.
> We need to solve this problem.
> 
> Some time back I initiated a library guidelines PEP, then dropped it
> due to press of overwork.  But the general question is going to keep
> coming up and we ought to have policy guidelines that potential
> library developers can understand.
> 
> Should I pick this up again?

Hmm, we already have PEP 206, which focuses on the topic.
Perhaps you could work with Moshe to sort out the "which
batteries do we need" sub-topic ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Tue Jan 23 13:20:46 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 07:20:46 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D700F.7A9E2509@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 12:50:39PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com>
Message-ID: <20010123072046.A25593@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> > But the wider question here is how seriously we take "batteries are
> > included" as a design principle.  Does a facility have to be useful
> > *every day* to be worth being in the standard library?  And if so,
> > what are things like the POP3 and IMAP libraries (or, for that matter,
> > my own shlex and netrc modules) doing there?
> 
> You can argue the same way for all kinds of extensions and
> packages you find in the Vaults. That's why there's demand for
> a different packaging of Python and this is what Moshe's
> PEP 206 addresses:
> 
> 	http://python.sourceforge.net/peps/pep-0206.html

Muttering "PEP 206" evades the fundamental problem rather than solving it.

Not that I'm saying Moshe hasn't made a valiant effort, within the political
constraint that the BDFL and others seem unwilling to confront the deeper 
issue.  But PEP 206 is not enough.  Here is why:

1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
Guido issues will quickly become of mostly theoretical interest -- because
Red Hat and everybody else will move to Sumo instantly, figuring they have
nothing to lose by including more features.

2. If by some chance I'm wrong about 1, the outcome will be worse;
we'll in effect have fragmented the language, because there won't be
consistency in what library stuff is available between Sumo and
non-Sumo builds on the same platform.

3. There are documentation issues as well.  It's already a blot on
Python that the standard documentation set doesn't cover Tkinter.  In
the Sumo distribution, the gap between what's installed and what's
documented is likely to widen further.  Developers will see this as
pointlessly irritating -- and they'll be right.

The stock distribution should *be* the Sumo distribution.  If we're really
so terrified of the extra maintenance load, then the right fix is to
mark some modules and documentation as "externally maintained" with 
prominent pointers back to the responsible people.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The day will come when the mystical generation of Jesus by the Supreme
Being as his father, in the womb of a virgin, will be classed with the
fable of the generation of Minerva in the brain of Jupiter.
	-- Thomas Jefferson, 1823



From mal at lemburg.com  Tue Jan 23 13:48:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 13:48:09 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com>
Message-ID: <3A6D7D89.A6BE1B74@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > > But the wider question here is how seriously we take "batteries are
> > > included" as a design principle.  Does a facility have to be useful
> > > *every day* to be worth being in the standard library?  And if so,
> > > what are things like the POP3 and IMAP libraries (or, for that matter,
> > > my own shlex and netrc modules) doing there?
> >
> > You can argue the same way for all kinds of extensions and
> > packages you find in the Vaults. That's why there's demand for
> > a different packaging of Python and this is what Moshe's
> > PEP 206 addresses:
> >
> >       http://python.sourceforge.net/peps/pep-0206.html
> 
> Muttering "PEP 206" evades the fundamental problem rather than solving it.
> 
> Not that I'm saying Moshe hasn't made a valiant effort, within the political
> constraint that the BDFL and others seem unwilling to confront the deeper
> issue.  But PEP 206 is not enough.  Here is why:
> 
> 1. If the "Sumo" packaging ever happens, the vanilla non-Sumo version that
> Guido issues will quickly become of mostly theoretical interest -- because
> Red Hat and everybody else will move to Sumo instantly, figuring they have
> nothing to lose by including more features.
> 
> 2. If by some chance I'm wrong about 1, the outcome will be worse;
> we'll in effect have fragmented the language, because there won't be
> consistency in what library stuff is available between Sumo and
> non-Sumo builds on the same platform.
> 
> 3. There are documentation issues as well.  It's already a blot on
> Python that the standard documentation set doesn't cover Tkinter.  In
> the Sumo distribution, the gap between what's installed and what's
> documented is likely to widen further.  Developers will see this as
> pointlessly irritating -- and they'll be right.
> 
> The stock distribution should *be* the Sumo distribution.  If we're really
> so terrified of the extra maintenance load, then the right fix is to
> mark some modules and documentation as "externally maintained" with
> prominent pointers back to the responsible people.

That's your POV; others think differently, and since this is not
a democracy, the Sumo distribution is a feasible way of satisfying
both needs.

There are a few other issues to consider as well:

* licensing is a problem (and this is also mentioned in the PEP 206)
  since some of the nicer additions are GPLed and thus not
  in the spirit of Python's closed-source friendliness which
  has provided it with a large user base in the commercial field

* package authors are not all the same and some may not want
  to split their distribution due to the integration of their
  package in a Sumo-distribution

* the packages mentioned in PEP 206 are very complex and usually
  largish; maintaining them will cause much more effort compared
  to the standard lib modules and extensions

* the build process varies widely between packages; even though
  we have distutils, some of the packages extend it to fit
  their specific needs (which is OK, but causes extra efforts
  in getting the build process combined)

I'm not objecting to the Sumo-distribution project; on the
contrary -- I tried a similar project a few years ago:
the Python PowerTools distribution which you can download
from:

	http://www.lemburg.com/python/PowerTools-0.2.zip

The project died quickly though, as I wasn't able to keep
up with the maintenance effort.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From akuchlin at cnri.reston.va.us  Tue Jan 23 14:40:06 2001
From: akuchlin at cnri.reston.va.us (Andrew Kuchling)
Date: Tue, 23 Jan 2001 08:40:06 -0500
Subject: [Python-Dev] What does "batteries are included" mean?
In-Reply-To: <3A6D7D89.A6BE1B74@lemburg.com>; from mal@lemburg.com on Tue, Jan 23, 2001 at 01:48:09PM +0100
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com>
Message-ID: <20010123084006.A23485@newcnri.cnri.reston.va.us>

On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
>There are a few other issues to consider as well:
>   <good list deleted>

To add a few:

* The larger the amount of code in the distribution, the more effort it is
  to maintain it all.

* Minor fixes aren't available until the next Python release.  For example,
  to drag out the XML code again: there have been two PyXML releases since
  Python 2.0 fixing various bugs, but someone who sticks to installing just 
  Python will not be able to get at those bugfixes until April (when 2.1
  is supposed to get finalized). 

If there were a core Python distribution and a sumo distribution, and the
sumo distribution was the one that most people downloaded and used, that
would be perfectly OK.  Practically no one assembles their own Linux
distribution, and that's not considered a problem.  To some degree, if
you're using a well-packaged Linux distribution such as Debian, you also
have a Python distribution mechanism with intermodule dependencies; we just
have to reinvent the wheel for people on other platforms.

>The project died quickly though, as I wasn't able to keep
>up with the maintenance effort.

Interesting.  Did you get much feedback indicating that people used it much?
Perhaps when you were doing that effort the Python community was composed
more of self-reliant early adopter types; there are probably more newbies
around now.

--amk



From mal at lemburg.com  Tue Jan 23 15:05:13 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Tue, 23 Jan 2001 15:05:13 +0100
Subject: [Python-Dev] What does "batteries are included" mean?
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <3A6D5C48.A076DA0@lemburg.com> <20010123060806.A25436@thyrsus.com> <3A6D700F.7A9E2509@lemburg.com> <20010123072046.A25593@thyrsus.com> <3A6D7D89.A6BE1B74@lemburg.com> <20010123084006.A23485@newcnri.cnri.reston.va.us>
Message-ID: <3A6D8F99.53A0F411@lemburg.com>

Andrew Kuchling wrote:
> 
> On Tue, Jan 23, 2001 at 01:48:09PM +0100, M.-A. Lemburg wrote:
> >There are a few other issues to consider as well:
> >   <good list deleted>
> 
> To add a few:
> 
> * The larger the amount of code in the distribution, the more effort it is
>   to maintain it all.
> 
> * Minor fixes aren't available until the next Python release.  For example,
>   to drag out the XML code again: there have been two PyXML releases since
>   Python 2.0 fixing various bugs, but someone who sticks to installing just
>   Python will not be able to get at those bugfixes until April (when 2.1
>   is supposed to get finalized).
> 
> If there were a core Python distribution and a sumo distribution, and the
> sumo distribution was the one that most people downloaded and used, that
> would be perfectly OK.  Practically no one assembles their own Linux
> distribution, and that's not considered a problem.  To some degree, if
> you're using a well-packaged Linux distribution such as Debian, you also
> have a Python distribution mechanism with intermodule dependencies; we just
> have to reinvent the wheel for people on other platforms.
> 
> >The project died quickly though, as I wasn't able to keep
> >up with the maintenance effort.
> 
> Interesting.  Did you get much feedback indicating that people used it much?

Not much -- the interested parties were mostly Python experts (the
lib started out as a project called expert-lib).

> Perhaps when you were doing that effort the Python community was composed
> more of self-reliant early adopter types; there are probably more newbies
> around now.

True. The included packages are dated 1997-1998 -- at that time
Starship was just starting to get off the ground (things are moving
at a much faster pace now).

The PowerTools package still uses the Makefile.pre.in mechanism
(with much success though) as distutils wasn't even considered
at the time. Perhaps Moshe could pick this up to have a head
start for Sumo-Python ?!

Some of the included packages are not available elsewhere, AFAIK,
so it may well be worthwhile having a look (e.g. the LGPLed trie and
btree implementations donated by John W. M. Stevens).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Tue Jan 23 15:06:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 09:06:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 04:17:30 EST."
             <20010123041730.A25165@thyrsus.com> 
References: <20010123041730.A25165@thyrsus.com> 
Message-ID: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>

> Ping's complaints are justified -- I've been looking at and testing
> webbrowser.py and it's a mess.  Among other things:
> 
> 1. The BROWSER variable is not interpreted properly.
> 
> 2. The code is stupid about loading platform support it doesn't need.
> 
> 3. It's not possible to specify lynx as a browser under Unix, because the
>    computation of available browsers is split in two and partly done inside
>    the CommandLineBrowser class.
> 
> 4. The module code is excessively hard to read, obscuring these bugs.
> 
> Our mistake was hurriedly merging the launcher code from IDLE with the
> browser-finder hack I wrote (the guts of CommandLineBrowser).  The resulting
> code is a bad, overcomplicated architecture with a nasty seam in it.
> 
> As co-designer/implementor I should have caught this sooner, but I was
> in a hurry to get a CML2 prototype out the door and didn't test
> anything but the case I needed.  My apologies to all.
> 
> I'm rewriting to fix these problems now.  Documented semantics of entry
> points will be preserved.

Excellent, Eric!  That's the spirit.

Can you point me to docs explaining the meaning of the BROWSER
environment variable?  I've never heard of it...  The last new
environment variables I learned were PAGER and EDITOR, probably 15
years ago when 4.1BSD was released... :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 23 15:22:26 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 09:22:26 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101231406.JAA04765@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:06:47AM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>
Message-ID: <20010123092226.A25968@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

You've never heard of BROWSER because I invented it and have not
widely popularized it yet :-).  Ping knew about it either because he
read the module code and saw that it was supposed to work, or because
he remembered the design discussion when webbrowser.py was first
implemented.

I've had conversations with some key Perl and Tcl people (Larry Wall,
Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
agree it's a good idea.  I'll probably hack support for it into Perl's
browser launcher next.

It's documented in the version of libwebbrowser.tex now in the CVS tree.
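
A rough sketch of how a launcher might read the variable.  It assumes
BROWSER holds a list of browser commands separated by os.pathsep,
which is how the webbrowser documentation in current Python releases
describes it; the helper name browser_candidates is hypothetical and
is not part of webbrowser.py:

```python
import os


def browser_candidates(environ=os.environ):
    """Return the browser commands named in BROWSER, in order of preference."""
    value = environ.get("BROWSER", "")
    # Skip empty entries such as those left by a stray leading separator.
    return [name for name in value.split(os.pathsep) if name]
```

A caller would try each candidate in turn and fall back to the
platform defaults when the list is empty.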
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Power concedes nothing without a demand. It never did, and it never will.
Find out just what people will submit to, and you have found out the exact
amount of injustice and wrong which will be imposed upon them; and these will
continue until they are resisted with either words or blows, or with both.
The limits of tyrants are prescribed by the endurance of those whom they
oppress.
	-- Frederick Douglass, August 4, 1857



From nas at arctrix.com  Tue Jan 23 09:30:56 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 00:30:56 -0800
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
Message-ID: <20010123003056.A28309@glacier.fnational.com>

Why is the configure.in file set to always use "install-sh"?
There is a comment that says:

    # Install just never works :-(

I don't think that statement is accurate.  /usr/bin/install works
quite well on my machine.  The only comments I can find in the
changelog are:

    revision 1.16
    date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
    add INSTALL_PROGRAM and INSTALL_DATA; check for getopt

and:

    revision 1.5
    date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
    Simplify value of INSTALL (always 'cp').

Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
documentation seems to indicate that it does what we want.

 Neil



From guido at digicool.com  Tue Jan 23 16:31:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:31:39 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: Your message of "Tue, 23 Jan 2001 10:15:11 +0100."
             <3A6D4B9F.38B17046@lemburg.com> 
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>  
            <3A6D4B9F.38B17046@lemburg.com> 
Message-ID: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>

> Polymorphic code will usually get you more out of an 
> algorithm, than type-safe or interface-safe code.

Right.

But there are times when people want to write methods that take
e.g. either a sequence or a mapping, and need to distinguish between
the two.  That's not easy in Python!  Java and C++ support it very
well though, and thus we'll always keep seeing this kind of
complaint.  Not sure what to do, except to recommend "find out which
methods you expect in one case but not in the other (e.g. keys()) and
do a hasattr() test for that."
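
The recommendation above might be sketched like this.  The function
name describe and the choice of __getitem__ as the sequence probe are
assumptions made for illustration; only the keys() test comes from
the text:

```python
def describe(container):
    """Classify an argument as mapping-like or sequence-like."""
    # Mappings are expected to offer keys(); sequences are not.
    if hasattr(container, "keys"):
        return "mapping"
    # Anything else that supports indexing is treated as a sequence.
    if hasattr(container, "__getitem__"):
        return "sequence"
    raise TypeError("expected a sequence or a mapping")
```

Because the test looks only for the method, any instance that defines
keys() counts as a mapping, whatever its type.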

> BTW, there are Python interfaces to PySequence_Check() and
> PyMapping_Check() buried in the builtin operator module in case
> you really do care ;) ...
> 
> 	operator.isSequenceType()
> 	operator.isMappingType()
> 	+ some other C style _Check() APIs
> 
> These only look at the type slots though, so Python instances
> will appear to support everything but when used fail with
> an exception if they don't provide the proper __xxx__ hooks.

Yes, these should probably be deprecated.  I certainly have never used
them!  (The operator module doesn't seem to get much use in
general...  Was it a bad idea?)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 23 16:49:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 10:49:23 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Mon, 22 Jan 2001 15:13:09 EST."
             <20010122151309.C15236@thyrsus.com> 
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com>  
            <20010122151309.C15236@thyrsus.com> 
Message-ID: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

Actually, I thought that Greg's proposal has some charm: it seems to
be using a natural extension of the existing dictionary syntax, where
a set is a dictionary without the values.  I haven't thought about
this deeply enough, but I see a lot of potential here.

I understand that you have probably given this more thought than I
have recently, so I'd like to see your more detailed analysis of what
you do and don't like about Greg's proposal!

> The only other set module I can find in the Vaults or anywhere else is
> kjBuckets (which I knew about before).  Looks like a good design, but
> complicated -- and requires installation of an extension.
> 
> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:
> 
> 1. It supports a set of operations that are both often useful and
> fiddly to get right, thus enhancing the "batteries are included"
> effect.  (I used its ancestor for representing seen-message numbers in
> a specialized mailreader, for example.)

I haven't read your docs yet (and no time because Digital Creations is
requiring my attention all of today), but I expect that designing a
universal set type, one that is good enough to be used in all sorts of
applications, is very difficult.  

> 2. It's simple for application programmers to use.  No extension module
> to integrate.

This is a silly argument for wanting something to be added to the
core.  If it's part of the core, the need for an extension is
immaterial because that extension will always be available.  So
I conclude that your module is set up perfectly for a popular module
in the Vaults. :-)

> 3. It's unsurprising.  My set objects behave almost exactly like other
> mutable sequences, with all the same built-in methods working, except for 
> the fact that you can't introduce duplicates with the mutators.

Ah, so you see a set as an extension of a sequence.  That may be the
big rift between your version and Greg's PEP: are sets more like
sequences or more like dictionaries?

> 4. It's already completely documented in a form suitable for the library.

Much appreciated.

> 5. It's simple enough not to cause you maintenance hassles down the
> road, and even if it did the maintainer is unlikely to disappear :-).

I'll be the judge of that, and since you prefer not to show your
source code (why is that?), I can't tell yet.

[...time flows...]

Having just skimmed your docs, I'm disappointed that you chose lists
as your fundamental representation type -- this makes it slow to test
for membership and hence makes intersection and union slow.  I suppose
that you have evidence from using this that those operations aren't
used much, or not for large sets?  This is one of the problems with
coming up with a set type for the core: it has to work for (nearly)
everybody.  It's no big deal if the Vaults contain three or more set
modules -- perfect even, people can choose the best one for their
purpose.  But in the core, there's only room for one set type or
module.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 23 17:30:50 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 11:30:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:49:23AM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <20010123113050.A26162@thyrsus.com>

Guido van Rossum <guido at digicool.com>: 
> I understand that you have probably given this more thought than I
> have recently, so I'd like to see your more detailed analysis of what
> you do and don't like about Greg's proposal!

I've already covered my big objection, the fact that it doesn't
support the degree of polymorphic crossover one might expect with
sequence types (and Greg has agreed that I have a point there).
Another problem is the lack of support for mutable elements (and yes,
I'm quite aware of the problems with this.)

One thing I do like is the proposal for an actual set input syntax.  Of course
this would require that the set type become one of the builtins, with 
compiler support.

> I haven't read your docs yet (and no time because Digital Creations is
> requiring my attention all of today), but I expect that designing a
> universal set type, one that is good enough to be used in all sorts of
> applications, is very difficult.  

For "difficult" read "can't be done".  This is one of those cases where
no matter what implementation you choose, some of the operations you want
to be cheap will be worst-case quadratic.  Life is like that.  So I chose
a dead-simple representation and accepted quadratic times for 
union/intersection.

> > 2. It's simple for application programmers to use.  No extension module
> > to integrate.
> 
> This is a silly argument for wanting something to be added to the
> core.  If it's part of the core, the need for an extension is
> immaterial because that extension will always be available.  So
> I conclude that your module is set up perfectly for a popular module
> in the Vaults. :-)

Reasonable point.
 
> > 3. It's unsurprising.  My set objects behave almost exactly like other
> > mutable sequences, with all the same built-in methods working, except for 
> > the fact that you can't introduce duplicates with the mutators.
> 
> Ah, so you see a set as an extension of a sequence.  That may be the
> big rift between your version and Greg's PEP: are sets more like
> sequences or more like dictionaries?

Indeed it is.  

> > 5. It's simple enough not to cause you maintenance hassles down the
> > road, and even if it did the maintainer is unlikely to disappear :-).
> 
> I'll be the judge of that, and since you prefer not to show your
> source code (why is that?), I can't tell yet.

No nefarious concealment going on here :-); I've sent versions of
the code to Greg and Ping already.  I'll shoot you a copy too.
 
> Having just skimmed your docs, I'm disappointed that you choose lists
> as your fundamental representation type -- this makes it slow to test
> for membership and hence makes intersection and union slow.

Not quite.  Membership test is still linear-time; so is adding and deleting
elements.  It's true that union and intersection are quadratic, but see below.

>                                                      I suppose
> that you have evidence from using this that those operations aren't
> used much, or not for large sets?

Exactly!  In my experience the usage pattern of a class like this runs
heavily to small sets (usually < 64 elements); membership tests
dominate usage, with addition and deletion of elements running second
and the "classical" boolean operations like union and intersection
being uncommon.

What you get by going with a dictionary representation is that
membership test becomes close to constant-time, while insertion and
deletion become sometimes cheap and sometimes quite expensive
(depending of course on whether you have to allocate a new 
hash bucket).  Given the usage pattern I described, the overall
difference in performance is marginal.
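
To make the tradeoff above concrete, here is a minimal sketch in
present-day Python (it is not code from either set implementation under
discussion).  It counts equality comparisons for a single membership test
against a 64-element list and against a dictionary keyed on the same
elements.  The Probe class and the comparisons_for() helper are
hypothetical names introduced only for this illustration.

```python
class Probe:
    """Hashable wrapper that counts equality comparisons (illustrative only)."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        Probe.eq_calls += 1
        return self.value == other.value


def comparisons_for(container, needle):
    """Return (found, number of __eq__ calls) for one membership test."""
    Probe.eq_calls = 0
    found = needle in container
    return found, Probe.eq_calls


elements = [Probe(i) for i in range(64)]
as_list = list(elements)
as_dict = dict.fromkeys(elements, 1)   # dictionary used as a set

# Look for an element equal to the last one stored (worst case for the list).
list_found, list_cost = comparisons_for(as_list, Probe(63))
dict_found, dict_cost = comparisons_for(as_dict, Probe(63))
print("list:", list_found, list_cost, "comparisons")
print("dict:", dict_found, dict_cost, "comparisons")
```

The list has to walk every element before it reaches the match, while the
dictionary hashes once and compares only against whatever shares the slot.
Whether that gap matters for sets of a few dozen elements is exactly the
judgment call being argued here.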

>                              This is one of the problems with
> coming up with a set type for the core: it has to work for (nearly)
> everybody.

As I pointed out above (and someone else on the list had made the same point
earlier), "works for everybody" isn't really possible here.  So my solution
does the next best thing -- pick a choice of tradeoffs that isn't obviously
worse than the alternatives and keeps things bog-simple.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Alcohol still kills more people every year than all `illegal' drugs put
together, and Prohibition only made it worse.  Oppose the War On Some Drugs!
-------------- next part --------------
"""
A set-algebra module for Python.

The functions work on any sequence type and return lists.
The set methods can take a set or any sequence type as an argument.
They are insensitive to the types of the elements.

Lists are used rather than dictionaries so the elements can be mutable.

"""
# Design and implementation by ESR, January 2001.

def setify(list1):		# Used by set constructor
    "Remove duplicates in sequence."
    res = []
    for i in range(len(list1)):
	duplicate = 0
        for j in range(i):
	    if list1[i] == list1[j]:
		duplicate = 1
		break
	if not duplicate:
	    res.append(list1[i])
    return res

def union(list1, list2):		# Used for |
    "Compute set union of sequences."
    res = list1[:]
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def intersection(list1, list2):		# Used for &
    "Compute set intersection of sequences."
    res = []
    for x in list1:
	if x in list2:
	    res.append(x)
    return res

def difference(list1, list2):		# Used for -
    "Compute set difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    return res

def symmetric_difference(list1, list2):	# Used for ^
    "Compute set symmetric-difference of sequences."
    res = []
    for x in list1:
	if not x in list2:
	    res.append(x)
    for x in list2:
	if not x in list1:
	    res.append(x)
    return res

def cartesian(list1, list2):		# Used for *
    "Cartesian product of sequences considered as sets."
    res = []
    for x in list1:
	for y in list2:
	    res.append((x,y))
    return res

def equality(list1, list2):
    "Test sequences considered as sets for equality."
    if len(list1) != len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    for x in list2:
        if not x in list1:
            return 0
    return 1

def proper_subset(list1, list2):
    "Return 1 if first argument is a proper subset of second, 0 otherwise."
    if not len(list1) < len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def subset(list1, list2):
    "Return 1 if first argument is a subset of second, 0 otherwise."
    if not len(list1) <= len(list2):
        return 0
    for x in list1:
        if not x in list2:
            return 0
    return 1

def powerset(base):
    "Compute the set of all subsets of a set."
    powerset = []
    for n in xrange(2 ** len(base)):
	subset = []
	for e in xrange(len(base)):
	     if n & 2 ** e:
		subset.append(base[e])
	powerset.append(subset)
    return powerset

class set:
    "Lists with set-theoretic operations."

    def __init__(self, value):
        self.elements = setify(value)

    def __len__(self):
	return len(self.elements)

    def __getitem__(self, ind):
	return self.elements[ind]

    def __setitem__(self, ind, val):
        if val not in self.elements:
            self.elements[ind] = val

    def __delitem__(self, ind):
	del self.elements[ind]

    def list(self):
        return self.elements

    def append(self, new):
        if new not in self.elements:
            self.elements.append(new)

    def extend(self, new):
	self.elements.extend(new)
        self.elements = setify(self.elements)

    def count(self, x):
        return self.elements.count(x)

    def index(self, x):
        return self.elements.index(x)

    def insert(self, i, x):
        if x not in self.elements:
            self.elements.insert(i, x)

    def pop(self, i=-1):
        return self.elements.pop(i)

    def remove(self, x):
	self.elements.remove(x)

    def reverse(self):
	self.elements.reverse()

    def sort(self, cmp=None):
	self.elements.sort(cmp)

    def __or__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(union(self.elements, other))

    __add__ = __or__

    def __and__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(intersection(self.elements, other))

    def __sub__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(difference(self.elements, other))

    def __xor__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(symmetric_difference(self.elements, other))

    def __mul__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return set(cartesian(self.elements, other))

    def __eq__(self, other):
        # Compare as sets: element order must not matter.
        if type(other) == type(self):
            other = other.elements
        return equality(self.elements, other)

    def __ne__(self, other):
        if type(other) == type(self):
            other = other.elements
        return not equality(self.elements, other)

    def __lt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(self.elements, other)

    def __le__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(self.elements, other)

    def __gt__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return proper_subset(other, self.elements)

    def __ge__(self, other):
	if type(other) == type(self):
	    other = other.elements
        return subset(other, self.elements)

    def __str__(self):
        res = "{"
        for x in self.elements:
            res = res + str(x) + ", "
        res = res[0:-2] + "}"
        return res

    def __repr__(self):
        return repr(self.elements)

if __name__ == '__main__':
    a = set([1, 2, 3, 4])
    b = set([1, 4])
    c = set([5, 6])
    d = [1, 1, 2, 1]
    print `d`, "setifies to", set(d)
    print `a`, "|", `b`, "is", `a | b`
    print `a`, "^", `b`, "is", `a ^ b`
    print `a`, "&", `b`, "is", `a & b`
    print `b`, "*", `c`, "is", `b * c`
    print `a`, '<', `b`, "is", `a < b`
    print `a`, '>', `b`, "is", `a > b`
    print `b`, '<', `c`, "is", `b < c`
    print `b`, '>', `c`, "is", `b > c`
    print "Power set of", `c`, "is", powerset(c)

# end

From sdm7g at virginia.edu  Tue Jan 23 18:12:22 2001
From: sdm7g at virginia.edu (Steven D. Majewski)
Date: Tue, 23 Jan 2001 12:12:22 -0500 (EST)
Subject: [Python-Dev] libraries=['m'] in config.py [Re: Python 2.1 alpha 1 released!]
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.NXT.4.21.0101231204010.227-100000@localhost.virginia.edu>


Is there a simple way (other than editing config.py) to remove the
effect of all of the "libraries=['m']" options from config.py ? 

This breaks the MacOSX build as there's no libm -- that functionality
is built into the System.framework.

Shouldn't these types of flags be acquired from configure or the
make environment somehow?

-- Steve Majewski 


( BTW: OSX build also needs a "-traditional-cpp" flag to get thru 
  compiling classobject.c without error. ) 







From uche.ogbuji at fourthought.com  Tue Jan 23 18:28:18 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 10:28:18 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: Message from Martin von Loewis <loewis@informatik.hu-berlin.de> 
   of "Mon, 22 Jan 2001 15:46:39 +0100." <200101221446.PAA05164@pandora.informatik.hu-berlin.de> 
Message-ID: <200101231728.KAA03408@localhost.localdomain>

> > This has nothing to do with Python. UTF-8 marks the codes 
> > from 128-191 as illegal prefix. 
> [...]
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> Right on both accounts. If no encoding is specified, and if the
> document appears not to be UTF-16 in any endianness, an XML processor
> shall assume it is UTF-8. As Marc-Andre explains, your document is not
> proper UTF-8, hence the error.
> 
> The confusing thing is that expat itself does not care about it not
> being UTF-8; that is only detected when the callback is invoked in
> pyexpat, and therefore conversion to a Unicode object is attempted.

Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover" 
from well-formedness errors.  And I would classify blithely reporting the 
character data as "recovery".

However, I'm amazed that this wouldn't have come up before, considering the 
pedigree of expat.

I'll poke around, and raise a bug on the expat site if need be.
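
For readers following the quoted point about codes 128-191: in UTF-8 those
byte values carry the bit pattern 10xxxxxx, which is reserved for trailing
bytes of a multi-byte sequence, so none of them may start a character.  A
minimal sketch in present-day Python (the helper names are hypothetical,
introduced only for illustration):

```python
def is_continuation_byte(value):
    """True for 0x80..0xBF, the 10xxxxxx pattern reserved for trailing bytes."""
    return (value & 0xC0) == 0x80


def decodes_as_utf8(data):
    """Report whether a byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


continuation = [v for v in range(256) if is_continuation_byte(v)]
print(continuation[0], continuation[-1])              # first and last such byte
print(decodes_as_utf8("caf\u00e9".encode("utf-8")))   # proper two-byte sequence
print(decodes_as_utf8(b"\xa9 stray byte"))            # trailing byte used as a prefix
```

A document written in Latin-1 but not declared as such will typically hit
this as soon as it contains an accented character.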


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From tismer at tismer.com  Tue Jan 23 18:35:08 2001
From: tismer at tismer.com (Christian Tismer)
Date: Tue, 23 Jan 2001 18:35:08 +0100
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
References: <200101231728.KAA03408@localhost.localdomain>
Message-ID: <3A6DC0CC.C4FF83DF@tismer.com>


uche.ogbuji at fourthought.com wrote:
> 
> > > This has nothing to do with Python. UTF-8 marks the codes
> > > from 128-191 as illegal prefix.
> > [...]
> > > Perhaps the parser should catch the UnicodeError and
> > > instead return a not-wellformed exception ?!
> >
> > Right on both accounts. If no encoding is specified, and if the
> > document appears not to be UTF-16 in any endianness, an XML processor
> > shall assume it is UTF-8. As Marc-Andre explains, your document is not
> > proper UTF-8, hence the error.
> >
> > The confusing thing is that expat itself does not care about it not
> > being UTF-8; that is only detected when the callback is invoked in
> > pyexpat, and therefore conversion to a Unicode object is attempted.
> 
> Pyexpat violates the XML spec here.  XML parsers are not allowed to "recover"
> from well-formedness errors.  And I would classify blithely reporting the
> character data as "recovery".
> 
> However, I'm amazed that this wouldn't have come up before, considering the
> pedigree of expat.

Well, I had to write a preprocessor which turns some "xml-like"
but not well-formed stuff into something useable. This was a
bulk of 100 MB of data, partially hand-written, partially
machine-generated, but not really well-formed. Some
special characters appeared very late in the data set, raising
an error in Python 2.0, but not in 1.5.2, so I perceived
it as an error in the parser first, not the data. :-)

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From uche.ogbuji at fourthought.com  Tue Jan 23 18:55:12 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 10:55:12 -0700
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: Message from Christian Tismer <tismer@tismer.com> 
   of "Mon, 22 Jan 2001 16:05:24 +0100." <3A6C4C34.4D1252C9@tismer.com> 
Message-ID: <200101231755.KAA03471@localhost.localdomain>

> "M.-A. Lemburg" wrote:
> ...
> > > The codes from 192 to 236, 238-243 produce
> > > "UTF-8 decoding error: invalid data",
> > > the rest gives "not well-formed".
> > >
> > > I would like to know if this happens with your (Tim) modified
> > > version as well. I'm using plain vanilla BeOpen Python 2.0 .
> > 
> > This has nothing to do with Python. UTF-8 marks the codes
> > from 128-191 as illegal prefix. See Object/unicodeobject.c:
> ...
> 
> Schade.
> 
> > Perhaps the parser should catch the UnicodeError and
> > instead return a not-wellformed exception ?!
> 
> I believe it would be better.

Yes, and given there is not much time before the 2.1 release, doing so is an 
acceptable stop-gap.  However, I think the real fix has to lie in expat.

I just had a *very* quick and dirty perusal of expat 1.2 and 1.95.1, and not 
only do the UTF-8 validity checks (at the top of xmltok.c) seem wrong, but it 
doesn't look as if they're ever invoked.

I'll try to find some time to look into this more closely, or perhaps someone will 
straighten me out if I'm on the wrong trail.
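
The stop-gap discussed above (catch the UnicodeError and report a
well-formedness error instead) could look roughly like the following.  This
is only a sketch in present-day Python: NotWellFormedError and
decode_character_data() are hypothetical names, not part of pyexpat or
expat.

```python
class NotWellFormedError(Exception):
    """Hypothetical stand-in for a parser's 'not well-formed' error."""


def decode_character_data(raw):
    """Decode bytes handed to a character-data callback as UTF-8.

    Invalid UTF-8 is reported as a well-formedness error instead of
    leaking a UnicodeError to the caller.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeError as exc:
        raise NotWellFormedError(
            "not well-formed (invalid UTF-8): %s" % exc) from exc


print(decode_character_data(b"plain ascii"))
try:
    decode_character_data(b"caf\xe9")      # Latin-1 byte, not valid UTF-8
except NotWellFormedError as err:
    print("rejected:", err)
```

This only changes how the failure is reported; it does nothing about the
underlying question of whether the tokenizer should have rejected the bytes
in the first place.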


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From fredrik at effbot.org  Tue Jan 23 19:03:42 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 19:03:42 +0100
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <013901c08566$d2a8f360$e46940d5@hagrid>

It's probably just me, but the names of the two unicode
modules tend to irritate me:

> ls u*.pyd
ucnhash.pyd      unicodedata.pyd

(the former contains names, the latter data)

I've been meaning to rename the former, but I just realized
that it might be better to get rid of it completely, and move
its functionality into the unicodedata module.

The result is a single 200k unicodedata module, which con-
tains the name database as well as two new functions:

    name(character [, default]) => map unicode
    character to name.  if the name doesn't exist,
    return the default object, or raise ValueError.

    lookup(name) => unicode character
    (or raise KeyError if it doesn't exist)

Should I check it in now, change the names/semantics and check
it in, or post it to sourceforge?
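
As a sketch of the semantics described above, written against the
unicodedata module as it exists in current Python 3 (where name() and
lookup() behave this way):

```python
import unicodedata

# name(character [, default]): map a character to its Unicode name.
print(unicodedata.name("A"))                  # LATIN CAPITAL LETTER A

# Control characters have no name: the default is returned if given...
print(unicodedata.name("\x00", "<unnamed>"))

# ...and ValueError is raised if it is not.
try:
    unicodedata.name("\x00")
except ValueError as err:
    print("ValueError:", err)

# lookup(name): map a name back to the character, or raise KeyError.
e_acute = unicodedata.lookup("LATIN SMALL LETTER E WITH ACUTE")
print(repr(e_acute))

try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
except KeyError as err:
    print("KeyError:", err)
```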

Cheers /F





From uche.ogbuji at fourthought.com  Tue Jan 23 19:00:19 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:00:19 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Mon, 22 Jan 2001 12:41:59 EST." <20010122124159.A14999@thyrsus.com> 
Message-ID: <200101231800.LAA03515@localhost.localdomain>

> \section{\module{set} ---
>          Basic set algebra for Python}

Looks good.  Are you making this available for download?  I could put this to 
experimental use right away (experimental since, IIRC, you are using the new 
rich comparisons).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From uche.ogbuji at fourthought.com  Tue Jan 23 19:16:27 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:16:27 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Mon, 22 Jan 2001 15:13:09 EST." <20010122151309.C15236@thyrsus.com> 
Message-ID: <200101231816.LAA03551@localhost.localdomain>

> Guido van Rossum <guido at digicool.com>:
> > There's already a PEP on a set object type, and everybody and their
> > aunt has already implemented a set datatype.

Tim mentioned that he had one, and he also claimed that every other dodder had 
a set class, but the only one listed in the vaults is kjBuckets, which I'm not 
sure is maintained any more.  (Is Aaron Watters hereabouts?)

> I've just read the PEP.  Greg's proposal has a couple of problems.
> The biggest one is that the interface design isn't very Pythonic --
> it's formally adequate, but doesn't exploit the extent to which sets
> naturally have common semantics with existing Python sequence types.
> This is bad; it means that a lot of code that could otherwise ignore
> the difference between lists and sets would have to be specialized 
> one way or the other for no good reason.

IMO, Eric's Set interface is close to perfect.

PEP 218 is interesting, but I'm not sure it's worth slogging through the 
inevitable uproar over an entirely new syntactic construct (the "{}" notation) 
before getting something as useful as a set class into the standard library.


> > If *your* set module is ready for prime time, why not publish it in
> > the Vaults of Parnassus?
> 
> I suppose that's what I'll do if you don't bless it for the standard
> library.  But here are the reasons I suggest you should do so:

For what it's worth, I'm +1 on adding this to the standard library.  I've seen 
so many set hacks with dictionaries (memory ouch) and list hacks (speed ouch) 
in Python code out there, that I'm convinced it would meet much more common 
usage than, say zlib, xdr, or even expat.

On this hacker list everyone's aunt might whip up set extensions on boring 
weekends, but I doubt this describes the overall Python populace.


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From uche.ogbuji at fourthought.com  Tue Jan 23 19:29:36 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:29:36 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "M.-A. Lemburg" <mal@lemburg.com> 
   of "Tue, 23 Jan 2001 11:26:16 +0100." <3A6D5C48.A076DA0@lemburg.com> 
Message-ID: <200101231829.LAA03575@localhost.localdomain>

> All very well, but are sets really that essential to every
> day Python programming ?

Not everyday, but as I said, the standard library has zlib, expat, tkinter, 
colorsys, and a whole lot of other stuff that is undoubtedly less useful than 
a set class.

> If we include sets then we ought to
> also include graphs, tries, btrees

I see all of these as far less commonly useful than sets (at least in 
situations where implementations using existing data structures won't suffice).

I run into needs for sets all the time.  I don't have as much trouble with 
your other examples, though I've always considered tries as a possible 
performance boost in XPath.  Oddly enough another data structure I often wish 
I had is a splay tree, and I hope to wrap my old C++ splay tree implementation 
for Python one of these days.

> and all those other goodies
> we have in computer science. All of these types are available
> out there, but I believe the audience who really cares for these
> types is also capable of downloading the extensions and installing
> them.
> 
> It would be nice if all of these extension could go into a SUMO
> edition of Python though... together with your set module.

Considering "batteries included", it's worth considering these very important 
"batteries".


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From skip at mojam.com  Tue Jan 23 19:35:04 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:35:04 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <14957.52952.48739.53360@beluga.mojam.com>

    Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
    Guido>   locals now don't have to start with underscore.

Thanks.  I have just been incredibly short on time lately.

    Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
    Guido>   are more like this?)

Alpha testing should pick those up, yes? ;-)

    Guido> ! try:
    Guido> !     import bsddb
    Guido> ! except ImportError:
    Guido> !     if verbose:
    Guido> !         print "can't import bsddb, so skipping dbhash"
    Guido> ! else:
    Guido> !     check_all("dbhash")

Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
the module that's imported here?
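
A rough sketch of what that would look like, written in present-day Python
with a hypothetical helper (check_all_if_importable is not part of
test___all__.py, and check_all is passed in rather than assumed):

```python
import importlib


def check_all_if_importable(modname, check_all, verbose=False):
    """Run check_all(modname) only if the module itself can be imported.

    The test then needs no knowledge of which lower-level module
    (bsddb, in the dbhash case) the import actually depends on.
    """
    try:
        importlib.import_module(modname)
    except ImportError:
        if verbose:
            print("can't import %s, so skipping it" % modname)
        return False
    check_all(modname)
    return True


checked = []
check_all_if_importable("json", checked.append, verbose=True)
check_all_if_importable("no_such_module_for_this_sketch", checked.append,
                        verbose=True)
print(checked)
```

Whether a failed import of the higher-level module leaves anything
undesirable behind is a separate question that this sketch does not address.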

Skip



From uche.ogbuji at fourthought.com  Tue Jan 23 19:36:59 2001
From: uche.ogbuji at fourthought.com (uche.ogbuji at fourthought.com)
Date: Tue, 23 Jan 2001 11:36:59 -0700
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
In-Reply-To: Message from "Eric S. Raymond" <esr@thyrsus.com> 
   of "Tue, 23 Jan 2001 11:30:50 EST." <20010123113050.A26162@thyrsus.com> 
Message-ID: <200101231836.LAA03655@localhost.localdomain>

> """
> A set-algebra module for Python.
> 
> The functions work on any sequence type and return lists.
> The set methods can take a set or any sequence type as an argument.
> They are insensitive to the types of the elements.
> 
> Lists are used rather than dictionaries so the elements can be mutable.
> 
> """

Hmm.  I was hoping this was actually a C extension for the performance boost, 
esp. given the number of __foo__ methods in the set class.

Implementation in Python makes my interest in adding it to the standard lib 
more tepid (not to cast the least bit of aspersion on your work).


-- 
Uche Ogbuji                               Principal Consultant
uche.ogbuji at fourthought.com               +1 303 583 9900 x 101
Fourthought, Inc.                         http://Fourthought.com 
4735 East Walnut St, Ste. C, Boulder, CO 80301-2537, USA
Software-engineering, knowledge-management, XML, CORBA, Linux, Python





From skip at mojam.com  Tue Jan 23 19:37:44 2001
From: skip at mojam.com (Skip Montanaro)
Date: Tue, 23 Jan 2001 12:37:44 -0600 (CST)
Subject: [Python-Dev] pydoc - put it in the core
In-Reply-To: <3A6CBBEF.4732BFF2@ActiveState.com>
References: <14945.59192.400783.403810@beluga.mojam.com>
	<200101142055.PAA13041@cj20424-a.reston1.va.home.com>
	<3A6CBBEF.4732BFF2@ActiveState.com>
Message-ID: <14957.53112.119272.797494@beluga.mojam.com>

    Paul> I apologize but I'm not clear on my responsibilities here, if
    Paul> any. I wrote a PEP for online help. I submitted a partial
    Paul> implementation. 

Perhaps I am the one who should apologize.  I started the thread.  I tried
Ping's code and was simply amazed at how useful it was.  I didn't bother
checking the list of PEPs to see if it overlapped with something there, and
I suspect any discussion of this stuff has taken place in the doc sig, where
I don't hang out.

Skip



From esr at thyrsus.com  Tue Jan 23 19:39:04 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:39:04 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231816.LAA03551@localhost.localdomain>; from uche.ogbuji@fourthought.com on Tue, Jan 23, 2001 at 11:16:27AM -0700
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain>
Message-ID: <20010123133904.B26487@thyrsus.com>

uche.ogbuji at fourthought.com <uche.ogbuji at fourthought.com>:
> I've seen so many set hacks with dictionaries (memory ouch) and list
> hacks (speed ouch) in Python code out there, that I'm convinced it
> would meet much more common usage than, say zlib, xdr, or even
> expat.

Uche brings up a point I meant to make in my reply to Guido.  The dict-
vs.-list choice in set representation is indeed a choice between 
memory ouch and speed ouch.  

I believe most uses of sets are small sets.  That reduces the speed ouch
of using a list representation and increases the proportional memory
ouch of a dictionary implementation.
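
One way to look at the memory side on a current interpreter is
sys.getsizeof().  This is a sketch only: the figures are specific to the
Python build it runs on, they measure the container alone and not the
elements, and they say nothing about the 2001 implementations.  The
container_sizes() helper is a hypothetical name.

```python
import sys


def container_sizes(n):
    """Return (list size, dict size) in bytes for n small integer elements."""
    items = list(range(n))
    as_list = list(items)
    as_dict = dict.fromkeys(items, 1)   # dictionary used as a set
    return sys.getsizeof(as_list), sys.getsizeof(as_dict)


for n in (4, 16, 64):
    list_size, dict_size = container_sizes(n)
    print("%2d elements: list %d bytes, dict %d bytes" % (n, list_size, dict_size))
```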
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Question with boldness even the existence of a God; because, if there
be one, he must more approve the homage of reason, than that of
blindfolded fear.... Do not be frightened from this inquiry from any
fear of its consequences. If it ends in the belief that there is no
God, you will find incitements to virtue in the comfort and
pleasantness you feel in its exercise...
	-- Thomas Jefferson, in a 1787 letter to his nephew



From jeremy at alum.mit.edu  Tue Jan 23 19:41:23 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 13:41:23 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
Message-ID: <14957.53331.342827.462297@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Guido van Rossum <guido at digicool.com>:
  >> Having just skimmed your docs, I'm disappointed that you choose
  >> lists as your fundamental representation type -- this makes it
  >> slow to test for membership and hence makes intersection and
  >> union slow.

  ESR> Not quite.  Membership test is still linear-time; so is adding
  ESR> and deleting elements.  It's true that union and intersection
  ESR> are quadratic, but see below.

  >> I suppose that you have evidence from using this that those
  >> operations aren't used much, or not for large sets?

  ESR> Exactly!  In my experience the usage pattern of a class like
  ESR> this runs heavily to small sets (usually < 64 elements);
  ESR> membership tests dominate usage, with addition and deletion of
  ESR> elements running second and the "classical" boolean operations
  ESR> like union and intersection being uncommon.

I use a Set type in the compiler package (Tools/compiler/compiler) to
collect the names for a code block.  I implemented a trivial Set type
using a dictionary, because it supported the operations I was most
interested in: addition, membership tests, intersection, and get
elements as sequence (in arbitrary order).  Those are the only
operations the compiler uses.

I think I use sets for this purpose frequently, although I can't think
of any other good examples at the moment.  I usually just use a
dictionary explicitly.  In the compiler, I chose an explicit Set class
with unique method names (add, has_elt, elements) to make it obvious
for readers that I was using a set.
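
For illustration, a dictionary-backed class with that interface might look
like the sketch below.  It is a reconstruction from the description above in
present-day Python, not the code in Tools/compiler; only the method names
add, has_elt and elements come from this message, and the rest is assumed.

```python
class Set:
    """Dictionary-backed set; elements must be hashable."""

    def __init__(self, items=()):
        self._elts = {}
        for item in items:
            self.add(item)

    def __len__(self):
        return len(self._elts)

    def add(self, elt):
        # Re-adding an existing element is a no-op, so no duplicates arise.
        self._elts[elt] = 1

    def has_elt(self, elt):
        return elt in self._elts

    def elements(self):
        """Return the elements as a list, in arbitrary order."""
        return list(self._elts)

    def intersection(self, other):
        result = Set()
        for elt in self._elts:
            if other.has_elt(elt):
                result.add(elt)
        return result


names = Set(["x", "y", "x"])
names.add("z")
print(len(names), names.has_elt("y"), names.has_elt("q"))
print(sorted(names.intersection(Set(["y", "z", "w"])).elements()))
```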

  ESR> What you get by going with a dictionary representation is that
  ESR> membership test becomes close to constant-time, while insertion
  ESR> and deletion become sometimes cheap and sometimes quite
  ESR> expensive (depending of course on whether you have to allocate
  ESR> a new hash bucket).  Given the usage pattern I described, the
  ESR> overall difference in performance is marginal.

The cost of insertion would presumably be dominated by the frequency
of dictionary resizes.  I don't know how often they occur, but I
assume the dictionary type is designed to accommodate efficient
insert.

I did a quick and dirty performance comparison of dictionary-based and
list-based sets.  (I'll include the code below.)  It uses sample data
collected from running the compiler; so it is measuring actual usage.

The tests showed that dictionary-based sets were always faster.  For
small tests (3 operations), the difference was about 10 percent.  For
larger tests (88 operations), the difference ranged from 180 to almost
700 percent.
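
The attached benchmark is not reproduced in this archive.  The following is
a separate, minimal sketch of the same kind of comparison in present-day
Python, using a synthetic workload (an assumption, not the compiler's sample
data) and hypothetical function names.  It makes no claim to reproduce the
percentages quoted above.

```python
import timeit


def unique_with_list(names):
    """Collect distinct names using a list (linear membership test per add)."""
    seen = []
    for name in names:
        if name not in seen:
            seen.append(name)
    return seen


def unique_with_dict(names):
    """Collect distinct names using a dictionary as the set."""
    seen = {}
    for name in names:
        seen[name] = 1
    return list(seen)


def time_both(n_ops, repeat=1000):
    """Time n_ops add operations against both representations."""
    # Synthetic workload: n_ops distinct names, so every list lookup scans fully.
    names = ["name%d" % i for i in range(n_ops)]
    t_list = timeit.timeit(lambda: unique_with_list(names), number=repeat)
    t_dict = timeit.timeit(lambda: unique_with_dict(names), number=repeat)
    return t_list, t_dict


for n_ops in (3, 88):
    t_list, t_dict = time_both(n_ops)
    print("%3d ops: list %.6fs, dict %.6fs" % (n_ops, t_list, t_dict))
```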

  >> This is one of the problems with coming up with a set type for
  >> the core: it has to work for (nearly) everybody.

  ESR> As I pointed out above (and someone else on the list had made
  ESR> the same point earlier), "works for everybody" isn't really
  ESR> possible here.  So my solution does the next best thing -- pick
  ESR> a choice of tradeoffs that isn't obviously worse than the
  ESR> alternatives and keeps things bog-simple.

For my applications, the dictionary-based approach is faster and
offers a natural interface.  If a set implementation were included in
the standard library, I would like to see either (1) the
implementation that favors my needs <wink> or (2) multiple
implementations tuned for different uses.  I think it would be just as
easy to make set implementations available separately, though.

Jeremy

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: sets.tar
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010123/99167672/attachment-0001.txt>

From loewis at informatik.hu-berlin.de  Tue Jan 23 19:51:37 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 19:51:37 +0100 (MET)
Subject: Partial victory (was RE: [Python-Dev] RE: test_sax failing 
 (Windows))
In-Reply-To: <200101231755.KAA03471@localhost.localdomain>
	(uche.ogbuji@fourthought.com)
References: <200101231755.KAA03471@localhost.localdomain>
Message-ID: <200101231851.TAA19488@pandora.informatik.hu-berlin.de>

> I'll try to some time to look into this more closely, or perhaps
> someone will straighten me out if I'm on the wrong trail.

Spending only a little time myself, either, I'd agree with your
conclusions.

Regards,
Martin



From esr at thyrsus.com  Tue Jan 23 19:55:30 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 13:55:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>; from jeremy@alum.mit.edu on Tue, Jan 23, 2001 at 01:41:23PM -0500
References: <20010122124159.A14999@thyrsus.com> <200101221910.OAA01218@cj20424-a.reston1.va.home.com> <20010122151309.C15236@thyrsus.com> <200101231549.KAA05172@cj20424-a.reston1.va.home.com> <20010123113050.A26162@thyrsus.com> <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <20010123135530.A26565@thyrsus.com>

Jeremy Hylton <jeremy at alum.mit.edu>:
> The tests showed that dictionary-based sets were always faster.  For
> small tests (3 operations), the difference was about 10 percent.  For
> larger tests (88 operations), the difference ranged from 180 to almost
> 700 percent.

Not surprising.  88 elements is getting pretty large.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Hoplophobia (n.): The irrational fear of weapons, correctly described by 
Freud as "a sign of emotional and sexual immaturity".  Hoplophobia, like
homophobia, is a displacement symptom; hoplophobes fear their own
"forbidden" feelings and urges to commit violence.  This would be
harmless, except that they project these feelings onto others.  The
sequelae of this neurosis include irrational and dangerous behaviors
such as passing "gun-control" laws and trashing the Constitution.



From petrilli at amber.org  Tue Jan 23 20:06:05 2001
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 14:06:05 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123133904.B26487@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 01:39:04PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com>
Message-ID: <20010123140604.E18796@trump.amber.org>

Eric S. Raymond [esr at thyrsus.com] wrote:
> I believe most uses of sets are small sets.  That reduces the speed ouch
> of using a list representation and increases the proportional memory
> ouch of a dictionary implementation.

The problem is that there are a lot of uses for large sets, especially 
when you begin to introduce intersections and unions.  If an
implementation is only useful for a few dozen (or a hundred) items in
the set, that eliminates a lot of places where the real use of set
types is useful---optimizing large scale manipulations.

Zope, for example, manipulates sets with 10,000 items in them on a
regular basis when doing text index manipulation.  The data structures 
are heavily optimized for this kind of behaviour, without a major
sacrifice in space.  I think Jim perhaps can talk to this. 

Unfortunately, for me, a Python implementation of Sets is only
interesting academically.  Any time I've needed to work with them at a
large scale, I've needed them *much* faster than Python could achieve
without a C extension.

Perhaps the difference is in problem domain.  In the "scripting"
problem domain, I would agree that Sets would rarely reach large sizes,
and so an algorithm which performed in quadratic time might be fine,
because the actual resultant time is small.  However, in more
full-blown applications, this would be counterproductive, and the
user would be forced to implement their own (or use Aaron's excellent
kjBuckets).

Just my opinion, of course.
Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From ping at lfw.org  Tue Jan 23 20:27:38 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:27:38 -0800 (PST)
Subject: [Python-Dev] Sets: elt in dict, lst.include
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>

On Tue, 23 Jan 2001, Jeremy Hylton wrote:
> For my applications, the dictionary-based approach is faster and
> offers a natural interface.

The only change that needs to be made to support sets of immutable
elements is to provide "in" on dictionaries.  The rest is then all
quite natural:

    dict[key] = 1
    if key in dict: ...
    for key in dict: ...

(Then we can also get rid of the ugly has_key method.)

For those that need mutable set elements badly enough to sacrifice
a little speed, we can add two methods to lists:

    lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
    lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

(These are generally useful methods to have anyway.)
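
As a rough illustration of the two halves of the proposal: lists have no
include or exclude methods, so the sketch below uses free functions with
those names (hypothetical helpers that simply follow the proposal), and
it assumes a current Python 3 interpreter, where "key in dict" is
accepted.  The variable names and sample data are made up.

```python
def include(lst, elt):
    """Append elt unless an equal element is already present."""
    if elt not in lst:
        lst.append(elt)


def exclude(lst, elt):
    """Remove every element equal to elt."""
    while elt in lst:
        lst.remove(elt)


# Immutable (hashable) elements: a dictionary whose values are ignored.
seen = {}
seen["spam"] = 1
seen["spam"] = 1            # storing the same key again changes nothing
has_spam = "spam" in seen   # key membership, as proposed

# Mutable elements: lists cannot be dictionary keys, so use a list-set.
groups = []
include(groups, [1, 2])
include(groups, [1, 2])     # equal element, not added again
include(groups, [3])
exclude(groups, [3])
```

The dictionary half costs a hash lookup per operation; the list half
scans the list each time, which is the speed sacrifice mentioned above.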


This proposal has the following advantages:

    1. You still get to choose which implementation best suits your needs.

    2. No new types are introduced; lists and dicts are well understood.

    3. Both features are extremely simple to understand and explain.

    4. Both features are useful in their own right, and could stand as
       independent proposals to improve lists and dicts respectively.
       (For instance, i spotted about 10 places in the std library where
       the 'include' method could be used, and i know i would use it
       myself -- certainly more often than pop or reverse!)

    5. In all cases this is faster than a new Python class.  (For instance,
       Jeremy's implementation even contained a commented-out optimization
       that stored self.elts.has_key as self.has_elt to speed things up a
       bit.  Using straight dicts would see this optimization and raise it
       one, with no effort at all.)

    6. Either feature can be independently approved or rejected without
       affecting the other.


-- ?!ng




From loewis at informatik.hu-berlin.de  Tue Jan 23 20:33:00 2001
From: loewis at informatik.hu-berlin.de (Martin von Loewis)
Date: Tue, 23 Jan 2001 20:33:00 +0100 (MET)
Subject: [Python-Dev] getting rid of ucnhash
Message-ID: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>

> Should I check it in now, change the names/semantics and check it
> in, or post it to sourceforge?

Is that two or three options? If three, what change in semantics did
you propose?

Anyway, I feel it could go in right now; the only breakage would be to
applications that use ucnhash.ucnhashAPI, right?

Regards,
Martin



From fredrik at effbot.org  Tue Jan 23 20:49:09 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 23 Jan 2001 20:49:09 +0100
Subject: [Python-Dev] Re:  getting rid of ucnhash
References: <200101231933.UAA02223@pandora.informatik.hu-berlin.de>
Message-ID: <01e801c08575$8f71c680$e46940d5@hagrid>

martin wrote:

> > Should I check it in now, change the names/semantics and check it
> > in, or post it to sourceforge?
> 
> Is that two or three options?

three, I think.

> If three, what change in semantics did you propose?

none -- but maybe someone else has a better name for "lookup"?

(the "name" function behaves like the existing property methods
in 2.0's unicodedata)

> Anyway, I feel it could go in right now; the only breakage would be to
> applications that use ucnhash.ucnhashAPI, right?

yup -- and those applications are already broken, since the CObject
was renamed in 2.1a1.

(well, any code using 2.1a1's new ucnhash.getcode/getname functions
will of course also break.  but I think we can live with that ;-)

Cheers /F




From ping at lfw.org  Tue Jan 23 20:43:50 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 11:43:50 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101231135570.1568-100000@skuld.kingmanhall.org>

Christopher Petrilli wrote:
> The problem is that there are a lot of uses for large sets, especially 
> when you begin to introduce intersections and unions.
[...]
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

On Tue, 23 Jan 2001, Ka-Ping Yee wrote:
> This proposal has the following advantages:
[six nice things about 'in dict' and 'lst.include']

I forgot to mention an important seventh advantage:

    7. The list and dictionary data structures are implemented
       in the C core, so we leave open the possibility of a
       wizard going and optimizing the snot out of them later.

Just as there's e.g. a boundary on recursion levels before Python
invokes the cycle detection algorithm during comparison, if we
decide we need more speed for big sets, Python could notice when
a list or dictionary gets very big and invoke more powerful
optimizations.  We don't have to do this now, but the important
thing is that we will always have the option to make Christopher's
dream come true.  (A wizard can do this once, and every Python
script on the planet benefits.)

In general i support Python deciding on the Right Thing to do
under the hood, performance-wise, so that the programmer doesn't
have to think too hard about what data structure to choose.


-- ?!ng




From nas at arctrix.com  Tue Jan 23 14:08:07 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 23 Jan 2001 05:08:07 -0800
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>; from petrilli@amber.org on Tue, Jan 23, 2001 at 02:06:05PM -0500
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org>
Message-ID: <20010123050807.A29115@glacier.fnational.com>

On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

I think this argues that if sets are added to the core they
should be implemented as an extension type with the speed of
dictionaries and the memory usage of lists.  Basically, we would
use the implementation of PyDict but drop the values.
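
To make the idea concrete, here is a minimal sketch of a hash table that
stores keys and nothing else.  It is written in Python purely for
readability; the class name KeyOnlyTable, the initial size, the 2/3 load
limit, and the use of linear probing are assumptions chosen for brevity
and are not taken from PyDict, whose probing scheme differs.  Deletion is
left out.

```python
class KeyOnlyTable:
    """Open-addressing hash table that stores keys and no values."""

    _EMPTY = object()  # marker for an unused slot

    def __init__(self, size=8):
        self._slots = [self._EMPTY] * size
        self._used = 0

    def _probe(self, key, slots):
        """Return the index holding key, or of the free slot where it goes."""
        i = hash(key) % len(slots)
        while slots[i] is not self._EMPTY and slots[i] != key:
            i = (i + 1) % len(slots)  # linear probing
        return i

    def add(self, key):
        i = self._probe(key, self._slots)
        if self._slots[i] is self._EMPTY:
            self._slots[i] = key
            self._used += 1
            # Keep the table at most two-thirds full so probes stay short.
            if self._used * 3 >= len(self._slots) * 2:
                self._resize()

    def _resize(self):
        """Double the table and reinsert every key."""
        new_slots = [self._EMPTY] * (len(self._slots) * 2)
        for key in self._slots:
            if key is not self._EMPTY:
                new_slots[self._probe(key, new_slots)] = key
        self._slots = new_slots

    def __contains__(self, key):
        return self._slots[self._probe(key, self._slots)] is not self._EMPTY

    def __len__(self):
        return self._used
```

Each slot holds one reference instead of a key/value pair, which is
where the memory saving over a dictionary would come from.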

  Neil



From jeremy at alum.mit.edu  Tue Jan 23 20:48:18 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:48:18 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <14957.53331.342827.462297@localhost.localdomain>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
	<14957.53331.342827.462297@localhost.localdomain>
Message-ID: <14957.57346.248852.656387@localhost.localdomain>

Sorry about the garbled attachment on the previous message; I think I
got the content-type wrong.  Here's a second try.

Jeremy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: sets.tar
Type: application/octet-stream
Size: 20480 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20010123/9dda92b6/attachment-0001.obj>

From petrilli at amber.org  Tue Jan 23 21:06:16 2001
From: petrilli at amber.org (Christopher Petrilli)
Date: Tue, 23 Jan 2001 15:06:16 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>; from nas@arctrix.com on Tue, Jan 23, 2001 at 05:08:07AM -0800
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com>
Message-ID: <20010123150616.F18796@trump.amber.org>

Neil Schemenauer [nas at arctrix.com] wrote:
> On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > Unfortunately, for me, a Python implementation of Sets is only
> > interesting academically.  Any time I've needed to work with them at a
> > large scale, I've needed them *much* faster than Python could achieve
> > without a C extension.
> 
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

This is effectively the implementation that Zope has for Sets.  In
addition we have "buckets" that have scores on them (which are
implemented as a modified BTree).  

Unfortunately Jim Fulton (who wrote all the code for that level) is in 
a meeting, but I hope he'll comment on the implementation that was
chosen for our software.

Chris
-- 
| Christopher Petrilli
| petrilli at amber.org



From jeremy at alum.mit.edu  Tue Jan 23 20:56:05 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 23 Jan 2001 14:56:05 -0500 (EST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123135530.A26565@thyrsus.com>
References: <20010122124159.A14999@thyrsus.com>
	<200101221910.OAA01218@cj20424-a.reston1.va.home.com>
	<20010122151309.C15236@thyrsus.com>
	<200101231549.KAA05172@cj20424-a.reston1.va.home.com>
	<20010123113050.A26162@thyrsus.com>
	<14957.53331.342827.462297@localhost.localdomain>
	<20010123135530.A26565@thyrsus.com>
Message-ID: <14957.57813.23072.723418@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy at alum.mit.edu>: Content-Description:
  ESR> message body text
  >> The tests showed that dictionary-based sets were always faster.
  >> For small tests (3 operations), the difference was about 10
  >> percent.  For larger tests (88 operations), the difference ranged
  >> from 180 to almost 700 percent.

  ESR> Not surprising.  88 elements is getting pretty large.

Large for what?  I've got directories with that many files and modules
with that many names defined at the top-level :-).  I'm just reporting
the range of set sizes I've encountered for a real application.  In
general, I expect a few hundred elements should be handled without
trouble by most Python containers.

Jeremy



From gvwilson at nevex.com  Tue Jan 23 21:26:22 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Tue, 23 Jan 2001 15:26:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123200601.87817EF68@mail.python.org>
Message-ID: <001101c0857a$c0dce420$770a0a0a@nevex.com>

Greg Wilson:
Meta-question: do people want to continue to discuss sets on the
general python-dev list, or take it out-of-line (e.g. to an egroups
list)?  I'm finding all of the discussion very useful, but I realize
that many readers might prefer to concentrate on the 2.1 release...

> Jeremy Hylton <jeremy at alum.mit.edu>:
> > The tests showed that dictionary-based sets were always faster.
> > small tests (3 operations), the difference was about 10 percent.
> > larger tests (88 operations), the difference ranged from 
> > 180 to almost 700 percent.

> Eric Raymond <esr at thyrsus.com>:
> Not surprising.  88 elements is getting pretty large.

Greg Wilson:
Really?  I was testing my implementation with sets of email addresses
grep'd out of old mail folders --- typical sizes were several thousand
elements.

> From: Christopher Petrilli <petrilli at amber.org>
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

Greg Wilson:
I had been expecting to implement this in C, not in pure Python, for
performance.

> From: Christopher Petrilli <petrilli at amber.org>
> In the "scripting" problem domain, I would agree that Sets would
> rarely reach large sizes,
> and so an algorithm which performed in quadratic time might be fine,

Greg Wilson:
I strongly disagree (see the email address example above --- it was
the first thing that occurred to me to try).  I am still hoping to
find a sub-quadratic (preferably sub-linear) implementation.  I can
do it in C++ with observer/observable (contained items notify containers
of changes in value, sets store all equivalent items in the same bucket),
but that doesn't really help...

> From: Ka-Ping Yee <ping at lfw.org>
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries...

and:

> From: Neil Schemenauer <nas at arctrix.com>
> ...if sets are added to the core...we would
> use the implementation of PyDict but drop the values.

Unfortunately, if values are required to be immutable, then sets of
sets aren't possible... :-(
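
The restriction is easy to demonstrate.  The sketch below assumes a
current Python 3 interpreter and uses frozenset, a later addition to the
language, as one possible way to take an immutable snapshot of a set's
members; the variable names are illustrative only.

```python
inner_a = {"x": 1, "y": 1}      # dict used as a set of names
inner_b = {"y": 1, "x": 1}      # same members, different insertion order

outer = {}
try:
    outer[inner_a] = 1           # a dict is mutable, hence unhashable
    stored_mutable = True
except TypeError:
    stored_mutable = False

# Freezing the members gives a hashable, order-independent key.
outer[frozenset(inner_a)] = 1
outer[frozenset(inner_b)] = 1    # equal to the first key, so no new entry
```

A frozen inner set can no longer change, so this sidesteps the problem
rather than solving it for mutable members.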

Thanks, everyone,
Greg




From esr at thyrsus.com  Tue Jan 23 21:38:39 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 23 Jan 2001 15:38:39 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>; from ping@lfw.org on Tue, Jan 23, 2001 at 11:27:38AM -0800
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org>
Message-ID: <20010123153839.B26676@thyrsus.com>

Ka-Ping Yee <ping at lfw.org>:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.  The rest is then all
> quite natural:
> 
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

Independently of implementation issues about sets, I think this is a
damn fine idea. +1.

> (Then we can also get rid of the ugly has_key method.)
> 
> For those that need mutable set elements badly enough to sacrifice
> a little speed, we can add two methods to lists:
> 
>     lst.include(elt)   # same as - if elt not in lst: lst.append(elt)
>     lst.exclude(elt)   # same as - while elt in lst: lst.remove(elt)

+1 on the concept, -0 on the names.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

[The disarming of citizens] has a double effect, it palsies the hand
and brutalizes the mind: a habitual disuse of physical forces totally
destroys the moral [force]; and men lose at once the power of
protecting themselves, and of discerning the cause of their
oppression.
        -- Joel Barlow, "Advice to the Privileged Orders", 1792-93



From tim.one at home.com  Tue Jan 23 23:02:41 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 17:02:41 -0500
Subject: [Python-Dev] Is X a (sequence|mapping)?
In-Reply-To: <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEELFIKAA.tim.one@home.com>

>> 	operator.isMappingType()
>> 	+ some other C style _Check() APIs

[Guido]
> Yes, these should probably be deprecated.  I certainly have never
> used them!  (The operator module doesn't seem to get much use in
> general...

It's used heavily by test_operator.py <wink>.  Outside of that, it's used
maybe three times in the std distribution, nowhere essential; the

    return map(operator.__div__, rgbtuple, _maxtuple)

in Pynche's ColorDB.py is typical.  2.0's

    return [x / 256. for x in rgbtuple]

does the same thing more clearly (_maxtuple is a module constant).

It appeals to functional-language fans and extreme micro-optimizers, so they
don't have to type "lambda" in the simplest cases.  At least
operator.truth(x) is *clearer* than "not not x".
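
For comparison, the two spellings look like this under a current
Python 3 interpreter.  operator.__div__ is not available there, so the
sketch assumes operator.truediv as its stand-in; the sample rgbtuple and
the definition of _maxtuple are assumptions, not copied from ColorDB.py.

```python
import operator

rgbtuple = (255, 128, 0)
_maxtuple = (256.0,) * 3   # assumed stand-in for the module constant

# Functional style: element-wise division without writing a lambda.
scaled_map = list(map(operator.truediv, rgbtuple, _maxtuple))

# List comprehension doing the same thing.
scaled_comp = [x / 256.0 for x in rgbtuple]

# operator.truth(x) spells out what "not not x" means.
flags = [operator.truth(x) for x in ("", "a", 0, [0])]
```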

> Was it a bad idea?)

Mixed, but I'd say more bad than good overall.




From thomas at xs4all.net  Wed Jan 24 00:38:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 00:38:14 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010123153839.B26676@thyrsus.com>; from esr@thyrsus.com on Tue, Jan 23, 2001 at 03:38:39PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>
Message-ID: <20010124003814.F27785@xs4all.nl>

On Tue, Jan 23, 2001 at 03:38:39PM -0500, Eric S. Raymond wrote:

> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:

> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> Independently of implementation issues about sets, I think this is a
> damn fine idea. +1.

It's come up before. The problem with it is that it's not quite obvious
whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
example it's obvious what you *expect*, but I suspect that 'for x in dict'
will result in a 40/60 split in expectations, and like American voters, the
20% middle section will change their vote each recount :-)

Now, if only there was a terribly obvious way to spell it... so that it's
immediately obvious which of the two you wanted.... something like, oh, I
donno, this, maybe:

  if key in dict.keys: ...
  if value in dict.values: ...

Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Wed Jan 24 01:13:20 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 01:13:20 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>
Message-ID: <02f401c0859a$765d07c0$e46940d5@hagrid>

> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

you forgot "if (key, value) in dict"

on the other hand, it's not quite obvious that "list.sort"
doesn't return the sorted list, "print >>None" prints to
standard output, "except KeyError, ValueError" doesn't
catch a ValueError exception, etc, etc, etc.

(nor that it's "has_key" and "hasattr", and not "has_key"
and "has_attr" or "haskey" and "hasattr" ;-)

let's just say that "in" is the same thing as "has_key",
and be done with it.
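
Under that rule the three readings stay distinguishable, because the
other two can be spelled out explicitly.  A small sketch, assuming a
current Python 3 interpreter (where "in" on a dictionary tests keys) and
a made-up dictionary:

```python
d = {"spam": 1, "eggs": 2}

key_hit = "spam" in d                    # keys only
value_hit = 1 in d.values()              # values, spelled explicitly
item_hit = ("spam", 1) in d.items()      # (key, value) pairs
value_is_not_key = 1 in d                # False: 1 is a value, not a key
```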

Cheers /F




From tim.one at home.com  Wed Jan 24 02:51:22 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 20:51:22 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123140604.E18796@trump.amber.org>
Message-ID: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>

[Christopher Petrilli]
> ....
> Unfortunately, for me, a Python implementation of Sets is only
> interesting academically.  Any time I've needed to work with them at a
> large scale, I've needed them *much* faster than Python could achieve
> without a C extension.

How do you know that?  I've used large sets in Python happily without
resorting to C or kjbuckets (which is really aiming at fast operations on
*graphs*, in which area it has no equal).

Everyone (except Eric <wink>) uses dicts to implement sets in Python, and
"most" set operations can work at full C speed then; e.g., assuming both
sets have N elements:

    membership testing
        O(1) -- it's just dict.has_key()
    element insertion
        O(1) -- dict[element] = 1
    element removal
        O(1) -- del dict[element]
    union
        O(N), but at full C speed -- dict1.update(dict2)
    intersection
        O(N), but at Python speed (the only 2.1 dog in the bunch!)
    choose some element and remove it
        took O(N) time and additional space in 2.0, but
        is O(1) in both since dict.popitem() was introduced
    iteration
        O(N), with O(N) additional space using dict.keys(),
        or O(1) additional space using dict.popitem() repeatedly
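
The operations listed above can be sketched as follows.  The helper names
make_set and intersection and the sample data are assumptions for
illustration; the syntax is that of a current Python 3 interpreter, so
membership is written with "in" rather than has_key().

```python
def make_set(elements):
    """Build a dict-set: the keys are the members, the values are ignored."""
    s = {}
    for e in elements:
        s[e] = 1
    return s


def intersection(s1, s2):
    """Keys present in both dict-sets; an explicit Python-level loop."""
    if len(s1) > len(s2):
        s1, s2 = s2, s1            # iterate over the smaller one
    result = {}
    for e in s1:
        if e in s2:
            result[e] = 1
    return result


a = make_set([1, 2, 3, 4])
b = make_set([3, 4, 5])

is_member = 3 in a          # membership test
a[6] = 1                    # insertion
del a[1]                    # removal

union = dict(a)             # copy, then merge in one C-level call
union.update(b)

common = intersection(a, b)

# Destructive iteration: popitem() removes and returns one entry at a time.
drained = []
scratch = dict(a)
while scratch:
    key, _ = scratch.popitem()
    drained.append(key)
```

Only the intersection needs a loop written in Python; everything else is
a single dictionary operation.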

What are you going to do in C that's faster than using a Python dict for
this purpose?  Most key set operations are straightforward Python dict
1-liners then, and Python dicts are very fast.  kjbuckets sets were slower
last time I timed them (several years ago, but Python dicts have gotten
faster since then while kjbuckets has been stagnant).

There's a long tradition in the Lisp world of using unordered lists to
represent sets (when the only tool you have is a hammer ... <0.5 wink>), but
it's been easy to do much better than that in Python almost since the start.
Even in the Python list world, enormous improvements for large sets can be
gotten by maintaining lists in sorted order (then most O(N) operations drop
to O(log2(N)), and O(N**2) to O(N)).  Curiously, though, in 2.1 we can still
use a dict-set for complex numbers, but no longer a sorted-list-set!
Requiring a total ordering can get in the way more than requiring
hashability (and vice versa -- that's a tough one).
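
A minimal sketch of the sorted-list representation, assuming a current
Python 3 interpreter and the standard bisect module.  The function names
contains and union are illustrative, and both expect sorted lists with
no duplicates.

```python
import bisect


def contains(sorted_list, elt):
    """Binary search: O(log N) comparisons instead of a linear scan."""
    i = bisect.bisect_left(sorted_list, elt)
    return i < len(sorted_list) and sorted_list[i] == elt


def union(a, b):
    """Merge two sorted, duplicate-free lists in O(len(a) + len(b))."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            result.append(a[i])
            i += 1
        elif b[j] < a[i]:
            result.append(b[j])
            j += 1
        else:                       # equal: keep one copy
            result.append(a[i])
            i += 1
            j += 1
    result.extend(a[i:])
    result.extend(b[j:])
    return result


# Complex numbers hash but do not order, so only the dict-set works.
dict_set = {1 + 2j: 1, 3 - 1j: 1}
try:
    sorted([1 + 2j, 3 - 1j])
    complex_sortable = True
except TypeError:
    complex_sortable = False
```

The last few lines show the trade-off from the paragraph above: the
dictionary accepts complex keys, while sorting them raises TypeError.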

measurement-is-the-measure-of-all-measurable-things-ly y'rs  - tim




From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:45:01 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:45:01 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <200101240245.PAA02098@s454.cosc.canterbury.ac.nz>

Thomas Wouters <thomas at xs4all.net>:

> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted...

Well, in the case of

  for key in d:

or

  for value in d:

it's immediately obvious to a *human* reader what is meant,
so all we need to do is make the compiler a bit smarter. This
can easily be done by the use of a small table, containing
the equivalents of the words 'key' and 'value' in all known
natural languages, against which the target variable name is
matched using some suitable fuzzy matching algorithm.
Soundex could be used for this, if we can decide on which
version to use...

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 24 03:46:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:46:37 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: Your message of "Tue, 23 Jan 2001 19:03:42 +0100."
             <013901c08566$d2a8f360$e46940d5@hagrid> 
References: <013901c08566$d2a8f360$e46940d5@hagrid> 
Message-ID: <200101240246.VAA06336@cj20424-a.reston1.va.home.com>

> It's probably just me, but the names of the two unicode
> modules tend to irritate me:
> 
> > ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
> 
> (the former contains names, the latter data)
> 
> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
> 
> The result is a single 200k unicodedata module, which
> contains the name database as well as two new functions:
> 
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
> 
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
> 
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

To me, both of these are irrelevant details of the Unicode
implementation. :-)   IOW, feel free to check it in.
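
For reference, this is how the two functions described in the quoted
message behave in the unicodedata module of a current Python 3
interpreter.  The sample characters, the "<unnamed>" default, and the
variable names are illustrative choices.

```python
import unicodedata

char_name = unicodedata.name("a")                  # character -> name
fallback = unicodedata.name("\x00", "<unnamed>")   # default instead of an error
char = unicodedata.lookup("LATIN SMALL LETTER A")  # name -> character

# Without a default, a character that has no name raises ValueError.
try:
    unicodedata.name("\x00")
    name_error = None
except ValueError as exc:
    name_error = type(exc).__name__

# An unknown name raises KeyError.
try:
    unicodedata.lookup("NO SUCH CHARACTER NAME")
    lookup_error = None
except KeyError as exc:
    lookup_error = type(exc).__name__
```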

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:49:21 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:49:21 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMELKIKAA.tim.one@home.com>
Message-ID: <200101240249.PAA02101@s454.cosc.canterbury.ac.nz>

Tim Peters <tim.one at home.com>:

> Requiring a total ordering can get in the way more than requiring
> hashability

Often it's useful to have *some* total ordering, and
you don't really care what it is as long as it's consistent.

Maybe all types should be required to support cmp(x,y) even 
if doing x < y via the rich comparison route raises a
NotOrderable exception.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From greg at cosc.canterbury.ac.nz  Wed Jan 24 03:52:43 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 24 Jan 2001 15:52:43 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>

Neil Schemenauer <nas at arctrix.com>:

> Basically, we would
> use the implementation of PyDict but drop the values.

This could be incorporated into PyDict. Instead of storing keys and
values in the same array, keep them in separate arrays and only
allocate the values array the first time someone stores a value other
than 1.
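
A toy model of that scheme, written in Python only to show the control
flow.  The class name LazyValueDict and its methods are hypothetical, an
ordinary dict stands in for the key array, and "other than 1" is decided
by equality; none of this reflects how PyDict is actually laid out.

```python
class LazyValueDict:
    """Mapping that allocates value storage only when a value != 1 is stored."""

    def __init__(self):
        self._keys = {}        # key -> slot index; stands in for the key array
        self._values = None    # parallel value list, allocated on demand

    def __setitem__(self, key, value):
        if key not in self._keys:
            self._keys[key] = len(self._keys)
            if self._values is not None:
                self._values.append(1)
        if value != 1 and self._values is None:
            # First "real" value: materialise the value array.
            self._values = [1] * len(self._keys)
        if self._values is not None:
            self._values[self._keys[key]] = value

    def __getitem__(self, key):
        slot = self._keys[key]          # raises KeyError if absent
        return 1 if self._values is None else self._values[slot]

    def __contains__(self, key):
        return key in self._keys

    def has_value_array(self):
        return self._values is not None
```

While the mapping is used purely as a set, has_value_array() stays
false and no per-key value storage exists.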

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 24 03:58:59 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 21:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 01:13:20 +0100."
             <02f401c0859a$765d07c0$e46940d5@hagrid> 
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl>  
            <02f401c0859a$765d07c0$e46940d5@hagrid> 
Message-ID: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>

> let's just say that "in" is the same thing as "has_key",
> and be done with it.

You know, I've long resisted this, but I agree now -- this is the
right thing.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:11:30 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:11:30 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Tue, 23 Jan 2001 12:35:04 CST."
             <14957.52952.48739.53360@beluga.mojam.com> 
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>  
            <14957.52952.48739.53360@beluga.mojam.com> 
Message-ID: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>

>     Guido> - Use "exec ... in dict" to avoid having to walk on eggshells;
>     Guido>   locals now don't have to start with underscore.
> 
> Thanks.  I have just been incredibly short on time lately.

You're welcome.

>     Guido> - Only test dbhash if bsddb can be imported.  (Wonder if there
>     Guido>   are more like this?)
> 
> Alpha testing should pick those up, yes? ;-)

Yes. :-)

>     Guido> ! try:
>     Guido> !     import bsddb
>     Guido> ! except ImportError:
>     Guido> !     if verbose:
>     Guido> !         print "can't import bsddb, so skipping dbhash"
>     Guido> ! else:
>     Guido> !     check_all("dbhash")
> 
> Instead of having to know that dbhash includes bsddb, shouldn't dbhash be
> the module that's imported here?

I think I saw a complaint about this that specifically said that when
dbhash is imported when bsddb can't be imported, an incomplete dbhash
is left behind in sys.modules, and then a second import of dbhash will
succeed -- but of course it will define no objects.  Since dbhash may
be imported elsewhere, testing for bsddb is safer.
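
The guard described above can be sketched in general form as follows.  This
is only an illustration: check_if_available() and its parameters are
hypothetical names, not part of test___all__.py, and the sketch is written
for a current Python 3 rather than the 2.1 code in the checkin.

```python
import importlib


def check_if_available(prerequisite, check, target, verbose=False):
    """Run check(target) only when `prerequisite` can be imported.

    Probing the prerequisite rather than the target mirrors the reasoning
    above: the target may already sit half-initialised in sys.modules.
    """
    try:
        importlib.import_module(prerequisite)
    except ImportError:
        if verbose:
            print("can't import %s, so skipping %s" % (prerequisite, target))
        return False
    check(target)
    return True
```

With this shape, the checkin's logic would read roughly as
check_if_available("bsddb", check_all, "dbhash", verbose).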

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:22:14 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:22:14 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
In-Reply-To: Your message of "Tue, 23 Jan 2001 08:24:38 PST."
             <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net> 
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net> 
Message-ID: <200101240322.WAA06671@cj20424-a.reston1.va.home.com>

> A few miscellaneous helpers.
> 
> PyObject_Dump(): New function that is useful when debugging Python's C
> runtime.  In something like gdb it can be a pain to get some useful
> information out of PyObject*'s.  This function prints the str() of the
> object to stderr, along with the object's refcount and hex address.
> 
> PyGC_Dump(): Similar to PyObject_Dump() but knows how to cast from the
> garbage collector prefix back to the PyObject* structure.
> 
> [See Misc/gdbinit for some useful gdb hooks]
> 
> none_dealloc(): Rather than SEGV if we accidentally decref None out of
> existence, we assign None's and NotImplemented's destructor slot to
> this function, which just calls abort().

Barry, since these are only gdb helpers, would it perhaps be better if
their names started with "_Py" to indicate that they aren't part of
the regular API?  They violate an important rule: you shouldn't write
to stderr directly, but always to sys.stderr.  (There's a helper
routine to write to stderr: PySys_WriteStderr().)  I understand that
for the gdb helper it's important to use the real stderr, and I don't
object to having these functions present at all times (they're so
small), but I do think that we should make it clear (by a _Py name,
and also by a comment) that they should not be called!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From ping at lfw.org  Wed Jan 24 04:29:24 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 23 Jan 2001 19:29:24 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010124003814.F27785@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101231921030.1568-100000@skuld.kingmanhall.org>

I wrote:
> The only change that needs to be made to support sets of immutable
> elements is to provide "in" on dictionaries.

Thomas Wouters wrote:
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'.

Yes, and i've seen this objection before, and i think it's silly.

> Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations,

No way... it's at least 90/10.

How often do you write 'dict.has_key(x)'?          (std lib says: 206)
How often do you write 'for x in dict.keys()'?     (std lib says: 49)

How often do you write 'x in dict.values()'?       (std lib says: 0)
How often do you write 'for x in dict.values()'?   (std lib says: 3)

I rest my case.
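
A count like the one above could be reproduced with a few regular
expressions.  The message does not say how the numbers were obtained, so
this is only a rough sketch under that assumption; IDIOMS and count_idioms()
are hypothetical names, and the patterns are deliberately naive (they will
also match inside strings and comments).

```python
import re

# One naive pattern per idiom being tallied.
IDIOMS = {
    "dict.has_key(x)": re.compile(r"\.has_key\("),
    "for x in dict.keys()": re.compile(r"\bfor\s+\w+\s+in\s+[\w.]+\.keys\(\)"),
    "for x in dict.values()": re.compile(r"\bfor\s+\w+\s+in\s+[\w.]+\.values\(\)"),
}


def count_idioms(source):
    """Return {idiom label: number of matches} for one chunk of source text."""
    return {label: len(pattern.findall(source))
            for label, pattern in IDIOMS.items()}
```

Summing the result over every file in Lib/ would give a table comparable
to the one above.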


-- ?!ng




From barry at digicool.com  Wed Jan 24 04:44:31 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 23 Jan 2001 22:44:31 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Objects object.c,2.114,2.115
References: <E14L6FK-0001ZY-00@usw-pr-cvs1.sourceforge.net>
	<200101240322.WAA06671@cj20424-a.reston1.va.home.com>
Message-ID: <14958.20383.795064.832967@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Barry, since these are only gdb helpers, would it perhaps be
    GvR> better if their names started with "_Py" to indicate that
    GvR> they aren't part of the regular API?  They violate an
    GvR> important rule: you shouldn't write to stderr directly, but
    GvR> always to sys.stderr.  (There's a helper routine to write to
    GvR> stderr: PySys_WriteStderr().)  I understand that for the gdb
    GvR> helper it's important to use the real stderr, and I don't
    GvR> object to having these functions present at all times
    GvR> (they're so small), but I do think that we should make it
    GvR> clear (by a _Py name, and also by a comment) that they should
    GvR> not be called!

I thought about it, couldn't decide and figured I'd check it in
anyway, knowing that you'd let me know.  See how wise I was?  :)

I will rename them as _Py* and fix the gdbinit file accordingly.  One
note: these functions /ought/ to be useful for dbx or any other
command line debugger.  I just haven't used anything but gdb for
years.  If anybody's got a dbxinit equivalent I could add that to Misc
too.

nothing-an-adjacent-office-wouldn't-have-solved-much-more-quick-ly y'rs,
-Barry



From guido at digicool.com  Wed Jan 24 04:46:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:46:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: Your message of "Tue, 23 Jan 2001 09:22:26 EST."
             <20010123092226.A25968@thyrsus.com> 
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>  
            <20010123092226.A25968@thyrsus.com> 
Message-ID: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>

> Guido van Rossum <guido at digicool.com>:
> > Can you point me to docs explaining the meaning of the BROWSER
> > environment variable?  I've never heard of it...  The last new
> > environment variables I learned were PAGER and EDITOR, probably 15
> > years ago when 4.1BSD was released... :-)

ESR replies:
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).  Ping knew about it either because he
> read the module code and saw that it was supposed to work, or because
> he remembered the design discussion when webbrowser.py was first
> implemented.
> 
> I've had conversations with some key Perl and Tcl people (Larry Wall,
> Tom Christiansen, Clif Flynt) about the BROWSER convention, and they
> agree it's a good idea.  I'll probably hack support for it into Perl's
> browser launcher next.
> 
> It's documented in the version of libwebbrowser.tex now in the CVS
> tree.

Grumble.  That wasn't the kind of answer I expected.  I don't like it
if Python is used as a wedge to get a particular thing introduced to
the rest of the world, no matter how useful it may seem at the time.
If something is already a popular convention, I'll happily adopt it,
but I'm not comfortable being put in front of somebody else's cart.
There just are too many carts that would like to be pulled by a horse
as strong as Python, and I don't want to take sides if I can avoid it.
BROWSER seems unlikely to take the world by storm and I don't feel I
need to be involved in the effort to get it accepted.

(And yes, I know there are enough cases where I *did* take sides.
There are some cases where I *do* want to take a side, and there were
some mistakes -- which is one of the reasons why I'm shy about taking
sides now.)

Anyway, shouldn't you also talk to the developers of packages like KDE
and Gnome?  Surely their users would like to be able to configure the
default webbrowser.  Talking just to the scripting language people
seems like you're thinking too small.  There must be lots of C apps
with the desire to invoke a browser.  Also Emacs, which has an
extensive list of browse-url-* functions (you might even learn a few
tricks from it about how to invoke various external browsers) but
AFAIK no default browser selection.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 04:54:25 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 22:54:25 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 15:52:43 +1300."
             <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> 
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> 
Message-ID: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>

> Neil Schemenauer <nas at arctrix.com>:
> 
> > Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Not a bad idea!  (But shouldn't the default value be something else,
like None?)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 24 05:20:56 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 23 Jan 2001 23:20:56 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:38:14 +0100."
             <20010124003814.F27785@xs4all.nl> 
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com>  
            <20010124003814.F27785@xs4all.nl> 
Message-ID: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>

> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...
> 
> > Independently of implementation issues about sets, I think this is a
> > damn fine idea. +1.
> 
> It's come up before. The problem with it is that it's not quite obvious
> whether it is 'if key in dict' or 'if value in dict'. Sure, from the above
> example it's obvious what you *expect*, but I suspect that 'for x in dict'
> will result in a 40/60 split in expectations, and like American voters, the
> 20% middle section will change their vote each recount :-)
> 
> Now, if only there was a terribly obvious way to spell it... so that it's
> immediately obvious which of the two you wanted.... something like, oh, I
> donno, this, maybe:
> 
>   if key in dict.keys: ...
>   if value in dict.values: ...
> 
> Ponder-ponder--Guido-should-use-the-time-machine-for-this-one!-ly y'rs,

No chance of a time-machine escape, but I *can* say that I agree that
Ping's proposal makes a lot of sense.  This is a reversal of my
previous opinion on this matter.  (Take note -- those don't happen
very often! :-)

First to submit a working patch gets a free copy of 2.1a2 and
subsequent releases,

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 24 05:50:49 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 23 Jan 2001 23:50:49 -0500
Subject: [Python-Dev] getting rid of ucnhash
In-Reply-To: <013901c08566$d2a8f360$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEMCIKAA.tim.one@home.com>

[/F]
> It's probably just me, but the names of the two unicode
> modules tend to irritate me:

I don't care much about the names, but having two Unicode subprojects in the
MS build seems overkill <wink>.

> ls u*.pyd
> ucnhash.pyd      unicodedata.pyd
>
> (the former contains names, the latter data)

Maybe that's the reason:  the names don't get loaded at all unless you *use*
one of the name APIs?  Hard to say whether that's worth the bother; now that
everything has been nicely compressed, it's sure not as compelling as it may
have been earlier.

> I've been meaning to rename the former, but I just realized
> that it might be better to get rid of it completely, and move
> its functionality into the unicodedata module.
>
> The result is a single 200k unicodedata module, which contains
> the name database as well as two new functions:
>
>     name(character [, default]) => map unicode
>     character to name.  if the name doesn't exist,
>     return the default object, or raise ValueError.
>
>     lookup(name) => unicode character
>     (or raise KeyError if it doesn't exist)
>
> Should I check it in now, change the names/semantics and check
> it in, or post it to sourceforge?

I have no opinion on what's best:  you're working with it, you're the best
judge of that.  I only vote for checking in whatever you decide sooner
rather than later; I'll fiddle the MS project files and readmes accordingly
ASAP after that.
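
For reference, the two functions quoted above correspond to
unicodedata.name() and unicodedata.lookup() in current Python.  The sketch
below shows how they might be called; describe() and from_name() are
hypothetical wrappers added only for illustration.

```python
import unicodedata


def describe(char, default="<unnamed>"):
    """Map a character to its Unicode name, or `default` if it has none.

    Without a default, unicodedata.name() raises ValueError instead.
    """
    return unicodedata.name(char, default)


def from_name(name):
    """Map a Unicode name to its character, or None if the name is unknown."""
    try:
        return unicodedata.lookup(name)
    except KeyError:
        return None
```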




From moshez at zadka.site.co.il  Wed Jan 24 15:07:08 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:07:08 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001, Greg Ewing <greg at cosc.canterbury.ac.nz> wrote:

> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Cool idea, but even cooler (would catch more idioms, that is) is
"the first time someone stores a value that is not the same object
('is') as the value already in the dict, allocate the values array".
This would catch small numbers, None and identifier-looking strings,
for the measly cost of one pointer per dict object.
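
A toy model of that scheme, in Python rather than in the C of PyDict, might
look like the sketch below.  SharedValueDict and all of its attribute names
are hypothetical; a plain dict stands in for the key array, and nothing here
reflects how dictobject.c is actually laid out.

```python
_UNSET = object()  # marker: no value has been stored yet


class SharedValueDict:
    """Keeps keys only until a value that 'is not' the shared one arrives."""

    def __init__(self):
        self._slots = {}        # key -> slot index (stands in for the key array)
        self._shared = _UNSET   # the one extra pointer per dict
        self._values = None     # per-key values, allocated lazily

    @property
    def values_allocated(self):
        return self._values is not None

    def __setitem__(self, key, value):
        slot = self._slots.setdefault(key, len(self._slots))
        if self._values is None:
            if self._shared is _UNSET:
                self._shared = value          # first value ever stored
            if value is self._shared:
                return                        # still no values array needed
            # First differing value: materialise the values array.
            self._values = [self._shared] * len(self._slots)
        elif slot == len(self._values):
            self._values.append(value)        # new key after allocation
            return
        self._values[slot] = value

    def __getitem__(self, key):
        slot = self._slots[key]               # raises KeyError if missing
        return self._shared if self._values is None else self._values[slot]

    def __contains__(self, key):
        return key in self._slots

    def __len__(self):
        return len(self._slots)
```

Used as a set (d[key] = 1 everywhere), such an object never allocates the
values list; the first store of a different object triggers it once.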

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From moshez at zadka.site.co.il  Wed Jan 24 15:15:39 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 24 Jan 2001 16:15:39 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com>  
            <20010123092226.A25968@thyrsus.com>
Message-ID: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>

On Tue, 23 Jan 2001 22:46:47 -0500, Guido van Rossum <guido at digicool.com> wrote:

[ESR]
> You've never heard of BROWSER because I invented it and have not
> widely popularized it yet :-).

[Guido v. Rossum]
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Guido, I think you're being over-dramatic. BROWSER is right in the
tradition of PAGER and EDITOR, and a lot of other programs need it.
I know Eric uses RH and mutt, so probably RH's urlview program (which
mutt uses to jump to URLs) uses BROWSER. I was just about to submit
a bug report to Debian that their urlview doesn't respect it.

And if you really don't want to be a horse in front of a cart...

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.

Yes -- via GNOME/KDE specific mechanisms. I have 0 experience with KDE,
but I'm guessing the GNOME guys would do it via the GNOME "registry".
KDE probably has something similar. I'm sure you wouldn't want Python
to depend on GNOME, though it would be nice to make the browser-choosing
part pluggable so when "import gnome" is done, it automatically tries
to choose the user's browser.

On UNIX (as opposed to GNOME/KDE, which are pretty much operating systems
themselves), these things are done via environment variable. And $BROWSER
doesn't seem like that much of an innovation.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From skip at mojam.com  Wed Jan 24 07:28:21 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:28:21 -0600 (CST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: <200101240311.WAA06582@cj20424-a.reston1.va.home.com>
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net>
	<14957.52952.48739.53360@beluga.mojam.com>
	<200101240311.WAA06582@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30213.325584.373062@beluga.mojam.com>

    Guido> I think I saw a complaint about this that specifically said that
    Guido> when dbhash is imported when bsddb can't be imported, an
    Guido> incomplete dbhash is left behind in sys.modules, and then a
    Guido> second import of dbhash will succeed -- but of course it will
    Guido> define no objects.

So it does:

    % ./python
    Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
    [GCC 2.95.3 19991030 (prerelease)] on linux2
    Type "copyright", "credits" or "license" for more information.
    >>> import dbhash
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
        import bsddb
    ImportError: No module named bsddb
    >>> import dbhash
    >>>

Can that be construed as a bug?  If import fails, shouldn't the stub module
that was inserted in sys.modules be removed?
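
One way to probe this on any given interpreter is sketched below.  The
helper name and the throwaway module names are hypothetical; the code is
written for a current Python 3, where a failed import is expected to be
cleaned out of sys.modules, unlike the 2.1a1 session shown above.

```python
import importlib
import os
import sys
import tempfile


def failed_import_leaves_stub(modname="half_imported_demo"):
    """Return True if a module whose import fails stays in sys.modules."""
    with tempfile.TemporaryDirectory() as tmpdir:
        # Write a module that fails part-way through, as dbhash does
        # when bsddb is missing.
        path = os.path.join(tmpdir, modname + ".py")
        with open(path, "w") as handle:
            handle.write("import no_such_module_for_demo_xyz\n")
        sys.path.insert(0, tmpdir)
        try:
            try:
                importlib.import_module(modname)
            except ImportError:
                pass  # expected: the nested import cannot succeed
            return modname in sys.modules
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(tmpdir)
            sys.modules.pop(modname, None)
```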

Skip



From skip at mojam.com  Wed Jan 24 07:31:08 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 00:31:08 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
References: <20010123041730.A25165@thyrsus.com>
	<200101231406.JAA04765@cj20424-a.reston1.va.home.com>
	<20010123092226.A25968@thyrsus.com>
	<200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <14958.30380.851599.764535@beluga.mojam.com>

    Guido> BROWSER seems unlikely to take the world by storm and I don't
    Guido> feel I need to be involved in the effort to get it accepted.

Editors and web browsers are classes of tools which (one would hope) will
always come in several varieties.  Users have to have some way to specify
what to launch.  BROWSER seems analogous to the EDITOR environment variable
which is commonly used in Unix environments for just that purpose.
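
As a rough sketch of what honouring such a variable involves, the function
below reads BROWSER the way one would read EDITOR, treating it as an
os.pathsep-separated list of candidates.  browser_candidates() and the
fallback names are hypothetical and are not taken from webbrowser.py.

```python
import os


def browser_candidates(environ=None, fallback=("netscape", "lynx")):
    """Return the browsers to try, in order of preference.

    The user's BROWSER setting wins; the fallback list is used only
    when the variable is unset or empty.
    """
    env = os.environ if environ is None else environ
    raw = env.get("BROWSER", "")
    chosen = [name for name in raw.split(os.pathsep) if name]
    return chosen or list(fallback)
```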

Skip



From thomas at xs4all.net  Wed Jan 24 08:03:09 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 08:03:09 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 11:20:56PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <200101240420.XAA07153@cj20424-a.reston1.va.home.com>
Message-ID: <20010124080308.G27785@xs4all.nl>

On Tue, Jan 23, 2001 at 11:20:56PM -0500, Guido van Rossum wrote:

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Patch submitted. It only implements 'if key in dict', not 'for key in dict'.
The latter is kind of hard until we have a separate iteration protocol.
(PEP, anyone ?) Once we have it, we could consider 'for key, value in dict',
which is now easily explained with 'dict.popitem()'.
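
At the Python level the two halves can be sketched on a user-defined
class: membership via __contains__ and, in a current Python, iteration via
__iter__.  TinyMapping is a hypothetical class used only to illustrate the
semantics; it is not the submitted patch, which works on the C dict type.

```python
class TinyMapping:
    """A minimal mapping stored as a list of (key, value) pairs."""

    def __init__(self):
        self._pairs = []

    def __setitem__(self, key, value):
        for i, (existing, _) in enumerate(self._pairs):
            if existing == key:
                self._pairs[i] = (key, value)   # overwrite existing key
                return
        self._pairs.append((key, value))

    def has_key(self, key):
        return any(existing == key for existing, _ in self._pairs)

    def __contains__(self, key):
        # 'key in m' means the same thing as m.has_key(key): keys, not values.
        return self.has_key(key)

    def __iter__(self):
        # 'for key in m' yields keys, matching the membership test.
        for key, _ in self._pairs:
            yield key
```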

Does this mean I get a legally sound and thus empty legal statement with
every Python release for the rest of your, its or my life, Guido, or will
you just make me 'Free Python Release Receiver For Life' ? :-)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From pf at artcom-gmbh.de  Wed Jan 24 08:31:30 2001
From: pf at artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 08:31:30 +0100 (MET)
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> from Guido van Rossum at "Jan 23, 2001 11:20:56 pm"
Message-ID: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Guido van Rossum:
[...]
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)

It gives a warm and fuzzy feeling to see that happen sometimes at all. ;-)

> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

This repeated offer of free copies of Python becomes increasingly
boring.  For quite a while I myself have not contributed anything useful 
and I am nevertheless hoarding free copies of Python here. ;-)

What about offering another immaterial reward to potential contributors
instead?  What about "fame points"?  Anybody contributing something
useful to Python receives a certain number of "fame points":  These
fame points will be added and placed in front of the name of
the contributor into the ACKS file and the file will be sorted
accordingly turning the ACKS file effectively into some kind of
"Python contribution high score" ...   ;-)

Just kidding, Peter




From tim.one at home.com  Wed Jan 24 09:08:50 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:08:50 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123050807.A29115@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEMPIKAA.tim.one@home.com>

[Neil Schemenauer]
> I think this argues that if sets are added to the core they
> should be implemented as an extension type with the speed of
> dictionaries and the memory usage of lists.  Basically, we would
> use the implementation of PyDict but drop the values.

They'll be slower than dicts and take more memory than lists then.  WRT
memory, dicts cache the hash code with each entry for speed (so double the
memory of a list even without the value field), and are never more than 2/3
full anyway.  The dict implementation also gets low-level speed benefits out
of using both the key and value fields to characterize the nature of a slot
(the key field is NULL iff the slot is virgin; the value field is NULL iff
the slot is available (virgin or dummy)).

Dummy slots can be avoided (and so also the need for runtime code to
distinguish them from active slots) by using a hash table of pointers to
linked lists-- or flex vectors, or linked lists of small vectors --instead,
and in most ways that leads to much simpler code (no more fiddling with
dummies, no more probe-sequence hassles, no more boosting the size before
the table is full).  But without fine control over the internals of malloc,
that takes even more memory in the end.

Interesting twist:  "a dict" *is* "a set", but a set of (key, value) pairs
further constrained so that no two elements have the same key.  So any set
implementation can be used as-is to implement a dict as a set of 2-tuples,
customizing the hash and "is equal" functions to look at just the tuples'
first elements.  That was the view taken by SETL in 1969, although their
"map" (dict) type was eventually optimized to get away from actually
constructing 2-tuples.  Indeed, SETL eventually grew an elaborate optional
type declaration sublanguage, allowing the user to influence many details of
its many internal set-storage schemes; e.g., from pg 399 of "Programming
With Sets:  An Introduction to SETL":

    For example, we can declare [I'm putting their keywords in UPPERCASE
    for, umm, clarity]

        successors: LOCAL MMAP(ELMT b) REMOTE SET(ELMT b);

    This declaration specifies that for each x in b the image set
    successors{x} is stored in the element block of x, and that this
    image set is always to be represented as a bit vector.  Similarly,
    the declaration

        successors: LOCAL MMAP(ELMT b) SPARSE SET(ELMT b);

    specifies that for each x in b the image set successors{x} is to
    be stored as a hash table containing pointers to elements of b.
    Note that the attribute LOCAL cannot be used for image sets of
    multivalued maps.  This follows from the remarks in section 10.4.3
    on the awkwardness of making local objects into subparts of
    composite objects.

Clear?  Snort.  Here are some citations lifted from the web for their
experience in trying to make these kinds of decisions by magic:

@article{dewar:79,
title="Programming by Refinement, as Exemplified by the {SETL}
Representation Sublanguage",
author="Robert B. K. Dewar and Arthur Grand and Ssu-Cheng Liu and
Jacob T. Schwartz and Edmond Schonberg",
journal=toplas,
year=1979,
month=jul,
volume=1,
number=1,
pages="27--49"
}

@article{schonberg:81,
title="An Automatic Technique for Selection of Data Structures in
{SETL} Programs",
author="Edmond Schonberg and Jacob T. Schwartz and Micha Sharir",
journal=toplas,
year=1981,
month=apr,
volume=3,
number=2,
pages="126--143"
}

@article{freudenberger:83,
title="Experience with the {SETL} Optimizer",
author="Stefan M. Freudenberger and Jacob T. Schwartz and Micha Sharir",
pages="26--45",
journal=toplas,
year=1983,
month=jan,
volume=5,
number=1
}

If someone wanted to take sets seriously today, a better approach would be
to define a minimal "set interface" ("abstract base class" in C++ terms),
then supply multiple implementations of that interface, letting the user
choose directly which implementation strategy they want for each of their
sets.  And people are doing just that in the C++ and Java worlds; e.g.,

http://developer.java.sun.com/developer/onlineTraining/
    collections/Collection.html#SetInterface
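
In Python terms, that approach might be sketched as below: one small
abstract interface and two interchangeable implementations with different
trade-offs.  AbstractSet, ListSet and HashSet are hypothetical names for
illustration, not an existing or proposed library API.

```python
from abc import ABC, abstractmethod


class AbstractSet(ABC):
    """The minimal interface every implementation must provide."""

    @abstractmethod
    def add(self, elem):
        """Insert elem if it is not already present."""

    @abstractmethod
    def __contains__(self, elem):
        """Return True if elem is a member."""

    @abstractmethod
    def __len__(self):
        """Return the number of distinct members."""


class ListSet(AbstractSet):
    """Compact storage, linear-time membership; elements need only ==."""

    def __init__(self):
        self._items = []

    def add(self, elem):
        if elem not in self._items:
            self._items.append(elem)

    def __contains__(self, elem):
        return elem in self._items

    def __len__(self):
        return len(self._items)


class HashSet(AbstractSet):
    """Dict-backed storage: faster lookups, more memory, hashable elements."""

    def __init__(self):
        self._table = {}

    def add(self, elem):
        self._table[elem] = None

    def __contains__(self, elem):
        return elem in self._table

    def __len__(self):
        return len(self._table)
```

Client code written against AbstractSet can then pick ListSet for a handful
of elements and HashSet for large sets without changing anything else.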

Curiously, the newer Java Collections Framework (covering multiple
implementations of list, set, and dict interfaces) gave up on thread-safety
by default, because it cost too much at runtime.  Just another thing to
argue about <wink>.
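
Two of the ideas above, the chained hash table with no dummy slots and the
dict viewed as a set of (key, value) pairs, can be combined in one small
sketch.  ChainedSet and PairDict are hypothetical names; the table never
resizes and is meant only to show the shape of the idea, not SETL's or
Python's actual implementation.

```python
class ChainedSet:
    """Tiny hash set using per-bucket lists, with pluggable hash/equality."""

    def __init__(self, hash_fn=hash, eq_fn=lambda a, b: a == b, nbuckets=8):
        self._hash = hash_fn
        self._eq = eq_fn
        self._buckets = [[] for _ in range(nbuckets)]

    def _bucket(self, elem):
        return self._buckets[self._hash(elem) % len(self._buckets)]

    def find(self, elem):
        """Return the stored element equal to elem, or None."""
        for stored in self._bucket(elem):
            if self._eq(stored, elem):
                return stored
        return None

    def add(self, elem):
        bucket = self._bucket(elem)
        for i, stored in enumerate(bucket):
            if self._eq(stored, elem):
                bucket[i] = elem      # replace the equal element in place
                return
        bucket.append(elem)           # no probing, no dummy slots


class PairDict:
    """A mapping stored as a set of 2-tuples compared on element 0 only."""

    def __init__(self):
        self._pairs = ChainedSet(hash_fn=lambda pair: hash(pair[0]),
                                 eq_fn=lambda a, b: a[0] == b[0])

    def __setitem__(self, key, value):
        self._pairs.add((key, value))

    def __getitem__(self, key):
        pair = self._pairs.find((key, None))
        if pair is None:
            raise KeyError(key)
        return pair[1]

    def __contains__(self, key):
        return self._pairs.find((key, None)) is not None
```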

we're-not-exactly-pioneers-here-ly y'rs  - tim




From fredrik at effbot.org  Wed Jan 24 09:29:30 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 09:29:30 +0100
Subject: [Python-Dev] getting rid of ucnhash
References: <013901c08566$d2a8f360$e46940d5@hagrid>  <200101240246.VAA06336@cj20424-a.reston1.va.home.com>
Message-ID: <019801c085df$c7ee0540$e46940d5@hagrid>

guido wrote:
> > It's probably just me, but the names of the two unicode
> > modules tend to irritate me:
> > 
> > > ls u*.pyd
> > ucnhash.pyd      unicodedata.pyd
> 
> To me, both of these are irrelevant details of the Unicode
> implementation. :-)   IOW, feel free to check it in.

Done.

Note that Include/ucnhash.h is still there; it declares the
"ucnhash_CAPI" structure used to access names from the
unicodeobject module.

(and all name-related tests are still kept in test_ucn)

I'll leave it to Tim to update the MSVC build files.

Cheers /F




From tim.one at home.com  Wed Jan 24 09:28:34 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 03:28:34 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>

[Guido]
> Can you point me to docs explaining the meaning of the BROWSER
> environment variable?  I've never heard of it...  The last new
> environment variables I learned were PAGER and EDITOR, probably 15
> years ago when 4.1BSD was released... :-)

I gotta say, politics aside, BROWSER is a screamingly natural answer to the
question "what comes next in this sequence?":

    PAGER, EDITOR, ...

Dear Lord, even *I* use a browser almost every week <wink>.

explicit-is-better-than-implicit-ly y'rs  - tim




From esr at thyrsus.com  Wed Jan 24 10:02:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:02:59 -0500
Subject: OT: contribution rewards (was Re: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>; from pf@artcom-gmbh.de on Wed, Jan 24, 2001 at 08:31:30AM +0100
References: <200101240420.XAA07153@cj20424-a.reston1.va.home.com> <m14LKOw-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <20010124040259.A28086@thyrsus.com>

Peter Funk <pf at artcom-gmbh.de>:
> What about offering another immaterial reward to potential contributors
> instead?  What about "fame points"?  Anybody contributing something
> useful to Python receives a certain number of "fame points":  These
> fame points will be added and placed in front of the name of
> the contributor into the ACKS file and the file will be sorted
> accordingly turning the ACKS file effectively into some kind of
> "Python contribution high score" ...   ;-)
> 
> Just kidding, Peter

You may be joking, but as an observer of how gift cultures work I say this
isn't a bad idea.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"One of the ordinary modes, by which tyrants accomplish their purposes
without resistance, is, by disarming the people, and making it an
offense to keep arms."
        -- Constitutional scholar and Supreme Court Justice Joseph Story, 1840



From esr at thyrsus.com  Wed Jan 24 10:09:18 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:09:18 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101240258.VAA06479@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 09:58:59PM -0500
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid> <200101240258.VAA06479@cj20424-a.reston1.va.home.com>
Message-ID: <20010124040918.B28086@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> > let's just say that "in" is the same thing as "has_key",
> > and be done with it.
> 
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

I think we've just justified the time and energy that went into this 
discussion.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

What is a magician but a practicing theorist?
	-- Obi-Wan Kenobi, 'Return of the Jedi'



From esr at thyrsus.com  Wed Jan 24 10:14:27 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:14:27 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 23, 2001 at 10:46:47PM -0500
References: <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com>
Message-ID: <20010124041427.D28086@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Grumble.  That wasn't the kind of answer I expected.  I don't like it
> if Python is used as a wedge to get a particular thing introduced to
> the rest of the world, no matter how useful it may seem at the time.

Oh, stop!  I'm not using Python as an argument for other people to adopt
the BROWSER convention.  The idea sells itself quite nicely by analogy to
EDITOR and PAGER the second people hear it.

> Anyway, shouldn't you also talk to the developers of packages like KDE
> and Gnome?  Surely their users would like to be able to configure the
> default webbrowser.  Talking just to the scripting language people
> seems like you're thinking too small.  There must be lots of C apps
> with the desire to invoke a browser.  Also Emacs, which has an
> extensive list of browse-url-* functions (you might even learn a few
> tricks from it about how to invoke various external browsers) but
> AFAIK no default browser selection.

All on my TO-DO list.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

It is proper to take alarm at the first experiment on our
liberties. We hold this prudent jealousy to be the first duty of
citizens and one of the noblest characteristics of the late
Revolution. The freemen of America did not wait till usurped power had
strengthened itself by exercise and entangled the question in
precedents. They saw all the consequences in the principle, and they
avoided the consequences by denying the principle. We revere this
lesson too much ... to forget it
	-- James Madison.



From esr at thyrsus.com  Wed Jan 24 10:16:12 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:16:12 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124041612.E28086@thyrsus.com>

Tim Peters <tim.one at home.com>:
> I gotta say, politics aside, BROWSER is a screamingly natural answer to the
> question "what comes next in this sequence?":
> 
>     PAGER, EDITOR, ...

That's exactly what I thought when I was struck by the obvious.  Everybody
I spread this meme to seems to agree.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Government is actually the worst failure of civilized man. There has
never been a really good one, and even those that are most tolerable
are arbitrary, cruel, grasping and unintelligent.
	-- H. L. Mencken 



From esr at thyrsus.com  Wed Jan 24 10:21:56 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 04:21:56 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Wed, Jan 24, 2001 at 04:15:39PM +0200
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com>, <20010123041730.A25165@thyrsus.com> <200101231406.JAA04765@cj20424-a.reston1.va.home.com> <20010123092226.A25968@thyrsus.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <20010124141539.76C3FA83E@darjeeling.zadka.site.co.il>
Message-ID: <20010124042156.F28086@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> I know Eric uses RH and mutt, so probably RH's urlview program (which
> mutt uses to jump to URLs) uses BROWSER. I was just about to submit
> a bug report to Debian that their urlview doesn't respect it.

Oh, *do* that!  Note: BROWSER may consist of a colon-separated series
of parts, browser commands to be tried in order (this is useful so you
can put an X browser first, then a console browser, and have the right
thing happen).  If a part contains %s, the URL is substituted there;
otherwise, the URL is concatenated to the command after a space.
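
The rule above can be sketched roughly as follows, in present-day Python.
This is an illustration only, not the code in webbrowser.py; the helper
name browser_commands is made up, and it only builds the candidate command
lines without running any of them.

```python
import os


def browser_commands(url, browser_env=None):
    """Return the command lines to try, in order, for opening url.

    Hypothetical helper: splits a BROWSER-style value on colons and
    applies the %s rule described above.
    """
    if browser_env is None:
        browser_env = os.environ.get("BROWSER", "")
    commands = []
    for part in browser_env.split(":"):
        part = part.strip()
        if not part:
            continue  # ignore empty parts, e.g. from a trailing colon
        if "%s" in part:
            # The URL is substituted wherever %s appears.
            commands.append(part.replace("%s", url))
        else:
            # Otherwise the URL is appended after a space.
            commands.append(part + " " + url)
    return commands
```

A caller would try each returned command in turn and stop at the first one
that starts successfully.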
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Gun Control: The theory that a woman found dead in an alley, raped and
strangled with her panty hose, is somehow morally superior to a
woman explaining to police how her attacker got that fatal bullet wound.
	-- L. Neil Smith



From tim.one at home.com  Wed Jan 24 10:24:26 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 04:24:26 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCAENIIKAA.tim.one@home.com>

[Greg Ewing]
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

[Guido]
> Not a bad idea!

In theory, but if Vladimir were here he'd bust a gut over the possibly bad
cache effects on "real dicts" (by keeping everything together, simply
accessing the cached hash code brings both the key and value pointers into
L1 cache too).  We would need to quantify the effect of breaking that
connection.

> (But shouldn't the default value be something else,
> like none?)

Bleech.  I hate the idiom of using a false value to mean "present".

    d = {}
    for x in seq:
        d[x] = 1

runs faster too (None needs a LOAD_GLOBAL now).




From tim.one at home.com  Wed Jan 24 11:01:36 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:01:36 -0500
Subject: [Python-Dev] test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>

> python  ../lib/test/regrtest.py test___all__

test___all__
test test___all__ crashed -- exceptions.AttributeError:
     'locale' module has no attribute 'LC_MESSAGES'

And indeed it does not:

> python
Python 2.1a1 (#9, Jan 24 2001, 04:40:55) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import locale
>>> dir(locale)
['CHAR_MAX', 'Error', 'LC_ALL', 'LC_COLLATE', 'LC_CTYPE',
 'LC_MONETARY', 'LC_NUMERIC', 'LC_TIME', '__all__', '__builtins__',
 '__doc__', '__file__', '__name__', '_build_localename', '_group',
 '_parse_localename', '_print_locale', '_setlocale', '_test', 'atof',
 'atoi', 'encoding_alias', 'format', 'getdefaultlocale', 'getlocale',
 'locale_alias', 'localeconv', 'normalize', 'resetlocale', 'setlocale',
 'str', 'strcoll', 'string', 'strxfrm', 'sys', 'windows_locale']
>>>

Nor is LC_MESSAGES std C (the other LC_XXX guys are).

I pin the blame on

    from _locale import *

in locale.py -- who knows what that's supposed to export?  Certainly not
Skip <wink>.




From tim.one at home.com  Wed Jan 24 11:17:47 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 05:17:47 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>

Nevermind; checked in a hack to stop the error on Windows.




From mal at lemburg.com  Wed Jan 24 14:00:28 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 14:00:28 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <14957.53331.342827.462297@localhost.localdomain> <Pine.LNX.4.10.10101231112460.1568-100000@skuld.kingmanhall.org> <20010123153839.B26676@thyrsus.com> <20010124003814.F27785@xs4all.nl> <02f401c0859a$765d07c0$e46940d5@hagrid>
Message-ID: <3A6ED1EC.237B5B1D@lemburg.com>

Fredrik Lundh wrote:
> 
> > It's come up before. The problem with it is that it's not quite obvious
> > whether it is 'if key in dict' or 'if value in dict'.
> 
> you forgot "if (key, value) in dict"
> 
> on the other hand, it's not quite obvious that "list.sort"
> doesn't return the sorted list, "print >>None" prints to
> standard output, "except KeyError, ValueError" doesn't
> catch a ValueError exception, etc, etc, etc.
> 
> (nor that it's "has_key" and "hasattr", and not "has_key"
> and "has_attr" or "haskey" and "hasattr" ;-)
> 
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

+1 all the way :)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 24 15:01:33 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:01:33 +0100
Subject: [Python-Dev] Interfaces (Is X a (sequence|mapping)?)
References: <200101230814.f0N8EWQ00849@mira.informatik.hu-berlin.de>  
	            <3A6D4B9F.38B17046@lemburg.com> <200101231531.KAA05122@cj20424-a.reston1.va.home.com>
Message-ID: <3A6EE03D.4D5DFD17@lemburg.com>

Guido van Rossum wrote:
> 
> > Polymorphic code will usually get you more out of an
> > algorithm, than type-safe or interface-safe code.
> 
> Right.
> 
> But there are times when people want to write methods that take
> e.g. either a sequence or a mapping, and need to distinguish between
> the two.  That's not easy in Python!  Java and C++ support it very
> well though, and thus we'll always keep seeing this kind of
> complaint.  Not sure what to do, except to recommend "find out which
> methods you expect in one case but not in the other (e.g. keys()) and
> do a hasattr() test for that."

Perhaps we should provide a simple means for testing a set of
available methods and slots?!

E.g. hasinterface(obj, ('keys', 'items', '__len__'))

Objects could provide an __interface__ special attribute for this
purpose (since not all slots can be auto-detected and -verified
without side-effects).
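
A rough sketch of what such a helper might look like, written in
present-day Python. Neither hasinterface() nor __interface__ exists; how
__interface__ would be consulted is not spelled out above, so treating it
as the authoritative list of supported names is purely an assumption here.

```python
def hasinterface(obj, names):
    """Return True if obj supports every attribute named in names.

    Sketch only: an optional __interface__ attribute, if present, is
    assumed to list the names the object claims to support.
    """
    declared = getattr(obj, "__interface__", None)
    if declared is not None:
        return all(name in declared for name in names)
    # Fall back to plain attribute lookup, as with hasattr() tests.
    return all(hasattr(obj, name) for name in names)


class Declared:
    """Example object that declares its interface explicitly."""
    __interface__ = ("keys", "items")
```

With this sketch, hasinterface({}, ('keys', 'items', '__len__')) is true
and hasinterface([], ('keys',)) is false.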

> > BTW, there are Python interfaces to PySequence_Check() and
> > PyMapping_Check() buried in the builtin operator module in case
> > you really do care ;) ...
> >
> >       operator.isSequenceType()
> >       operator.isMappingType()
> >       + some other C style _Check() APIs
> >
> > These only look at the type slots though, so Python instances
> > will appear to support everything but when used fail with
> > an exception if they don't provide the proper __xxx__ hooks.
> 
> Yes, these should probably be deprecated.  I certainly have never used
> them!  (The operator module doesn't seem to get much use in
> general...  Was it a bad idea?)

Some of these are nice to have and provide a good performance
boost (e.g. the numeric slot access APIs). The type slot checking
APIs are not too useful though.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jim at digicool.com  Wed Jan 24 10:05:44 2001
From: jim at digicool.com (Jim Fulton)
Date: Wed, 24 Jan 2001 04:05:44 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org>
Message-ID: <3A6E9AE8.6C2D3CF0@digicool.com>

Christopher Petrilli wrote:
> 
> Neil Schemenauer [nas at arctrix.com] wrote:
> > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > Unfortunately, for me, a Python implementation of Sets is only
> > > interesting academically.  Any time I've needed to work with them at a
> > > large scale, I've needed them *much* faster than Python could achieve
> > > without a C extension.
> >
> > I think this argues that if sets are added to the core they
> > should be implemented as an extension type with the speed of
> > dictionaries and the memory usage of lists.  Basically, we would
> > use the implementation of PyDict but drop the values.
> 
> This is effectively the implementation that Zope has for Sets. 

Except we use sorted collections with binary search for sets.

I think that a simple hash-based set would make a lot of sense.

> In
> addition we have "buckets" that have scores on them (which are
> implemented as a modified BTree).
> 
> Unfortunately Jim Fulton (who wrote all the code for that level) is in
> a meeting, but I hope he'll comment on the implementation that was
> chosen for our software.

We have a number of special needs:

  - Scalability is critical. We make some special optimizations,
    like sets of integers and mapping objects with integer keys
    and values. In these cases, data are stored using C int arrays, 
    allowing very efficient data storage and manipulation, especially
    when using integer keys.

  - We need to spread data over multiple database records. Our data
    structures may be hundreds of megabytes in size. We have ZODB-aware
    structures that use multiple independently stored database objects.

  - Range searches are very common, and under some circumstances,
    sorted collections and BTrees can have very little overhead
    compared to dictionaries. For this reason, our mapping objects
    and sets have been based on BTrees and sorted collections.
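
To make the sorted-collection idea concrete, here is a minimal sketch of a
set kept as a sorted list and searched with the standard bisect module.
It is not the Zope implementation; the class name SortedSet and its
methods are made up for illustration, and a plain Python list stands in
for the C arrays mentioned above.

```python
import bisect


class SortedSet:
    """Hypothetical set stored as a sorted list, searched with bisect."""

    def __init__(self, items=()):
        self._items = sorted(set(items))

    def __contains__(self, item):
        # Binary search for the leftmost position where item could go.
        i = bisect.bisect_left(self._items, item)
        return i < len(self._items) and self._items[i] == item

    def add(self, item):
        i = bisect.bisect_left(self._items, item)
        if i == len(self._items) or self._items[i] != item:
            self._items.insert(i, item)

    def range(self, low, high):
        """Return the members m with low <= m < high, in sorted order."""
        lo = bisect.bisect_left(self._items, low)
        hi = bisect.bisect_left(self._items, high)
        return self._items[lo:hi]
```

A range search is just two binary searches and a slice, which is the kind
of query a hash-based set cannot answer without scanning every member.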

Unfortunately, our current BTree implementation has a flaw that
causes an excessive number of objects to be updated when items are
added and removed. (Each BTree internal node keeps track of the number
of objects contained in it.)  Also, our current sets are limited
to integers and cannot be spread over multiple database records.

We are completing a new BTree implementation that overcomes these
limitations.  In this implementation, we will provide sets as
value-less BTrees.

Jim

--
Jim Fulton           mailto:jim at digicool.com   Python Powered!        
Technical Director   (888) 344-4332            http://www.python.org  
Digital Creations    http://www.digicool.com   http://www.zope.org



From gvwilson at nevex.com  Wed Jan 24 15:10:41 2001
From: gvwilson at nevex.com (Greg Wilson)
Date: Wed, 24 Jan 2001 09:10:41 -0500
Subject: [Python-Dev] re: sets
In-Reply-To: <20010124032401.EB329F199@mail.python.org>
Message-ID: <000301c0860f$6fa29010$770a0a0a@nevex.com>

1. I did a poll overnight by email of 22 friends and colleagues,
none of whom are regular Python users (yet).  My question was,

   "Would you expect the interface of a set class to be like
    the interface of a vector or list, or like the interface
    of a map or hash?"

15 people have replied; all 15 have said, "map or hash".
Several respondents are Perl hackers, so I'm sure the answer
is influenced by previous exposure to the set-as-valueless-hash
idiom.  Still, I think 15-0 is a pretty convincing score...

Four, unprompted, said that they thought the STL's hierarchy of
containers was as good as it gets, and that other languages
should mirror it.  (One of those added that this makes teaching
much simpler --- students can transfer instincts from one language
to another.)

2. Is there enough interest in sets for a BOF at IPC9?  Please
reply to me point-to-point if you're interested; I'll summarize
and post the result.  I volunteer to bring the donuts...

> > Ka-Ping Yee:
> > The only change that needs to be made to support sets of immutable
> > elements is to provide "in" on dictionaries.  The rest is then all
> > quite natural:
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...

> > various:
> > [but what about 'value in dict' or '(key, value) in dict'?]

> Fredrik Lundh:
> let's just say that "in" is the same thing as "has_key",
> and be done with it.

> Guido van Rossum:
> You know, I've long resisted this, but I agree now -- this is the
> right thing.

Greg Wilson:
Woo hoo!  Now, on a related note, what is the status of the 'indices()'
proposal, as in:

    for i in indices(someList):

instead of:

    for i in range(len(someList)):

Would 'indices(dict)' be the same as 'dict.keys()', to allow
uniform iteration?  Or would it be more economical to introduce
a 'keys()' method on lists and tuples, so that:

    for i in collection.keys():

would work on dicts, lists, and tuples?  I know that 'keys()'
is the wrong name for lists and tuples, but dicts are already
using it, and it's completely unambiguous...
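
For concreteness, a small sketch of one way 'indices()' could behave, in
present-day Python. No such builtin exists; returning the keys for a
dictionary is only one of the options raised above, chosen here for
illustration.

```python
def indices(collection):
    """Return the valid subscripts of a sequence, or the keys of a mapping.

    Hypothetical: the behaviour for dictionaries is an assumption,
    not something the proposal had settled.
    """
    if hasattr(collection, "keys"):
        return list(collection.keys())
    return list(range(len(collection)))
```

With this, 'for i in indices(x): ... x[i] ...' would work the same way
whether x is a list, a tuple, or a dict.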

Thanks,
Greg



From mal at lemburg.com  Wed Jan 24 15:46:10 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 24 Jan 2001 15:46:10 +0100
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <esr@thyrsus.com> <200101231816.LAA03551@localhost.localdomain> <20010123133904.B26487@thyrsus.com> <20010123140604.E18796@trump.amber.org> <20010123050807.A29115@glacier.fnational.com> <20010123150616.F18796@trump.amber.org> <3A6E9AE8.6C2D3CF0@digicool.com>
Message-ID: <3A6EEAB2.5E6A4E83@lemburg.com>

Jim Fulton wrote:
> 
> Christopher Petrilli wrote:
> >
> > Neil Schemenauer [nas at arctrix.com] wrote:
> > > On Tue, Jan 23, 2001 at 02:06:05PM -0500, Christopher Petrilli wrote:
> > > > Unfortunately, for me, a Python implementation of Sets is only
> > > > interesting academically.  Any time I've needed to work with them at a
> > > > large scale, I've needed them *much* faster than Python could achieve
> > > > without a C extension.
> > >
> > > I think this argues that if sets are added to the core they
> > > should be implemented as an extension type with the speed of
> > > dictionaries and the memory usage of lists.  Basically, we would
> > > use the implementation of PyDict but drop the values.
> >
> > This is effectively the implementation that Zope has for Sets.
> 
> Except we use sorted collections with binary search for sets.
> 
> I think that a simple hash-based set would make a lot of sense.
> 
> > In
> > addition we have "buckets" that have scores on them (which are
> > implemented as a modified BTree).
> >
> > Unfortunately Jim Fulton (who wrote all the code for that level) is in
> > a meeting, but I hope he'll comment on the implementation that was
> > chosen for our software.
> 
> We have a number of special needs:
> 
>   - Scalability is critical. We make some special optimizations,
>     like sets of integers and mapping objects with integer keys
>     and values. In these cases, data are stored using C int arrays,
>     allowing very efficient data storage and manipulation, especially
>     when using integer keys.
> 
>   - We need to spread data over multiple database records. Our data
>     structures may be hundreds of megabytes in size. We have ZODB-aware
>     structures that use multiple independently stored database objects.
> 
>   - Range searches are very common, and under some circumstances,
>     sorted collections and BTrees can have very little overhead
>     compared to dictionaries. For this reason, our mapping objects
>     and sets have been based on BTrees and sorted collections.
> 
> Unfortunately, our current BTree implementation has a flaw that
> causes an excessive number of objects to be updated when items are
> added and removed. (Each BTree internal node keeps track of the number
> of objects contained in it.)  Also, our current sets are limited
> to integers and cannot be spread over multiple database records.
> 
> We are completing a new BTree implementation that overcomes these
> limitations.  In this implementation, we will provide sets as
> value-less BTrees.

You may want to check out a soon-to-be-released new mx
package: mxBeeBase. This is an on-disk B+tree implementation
which supports data files up to 2GB on 32-bit platforms.

Here's a preview:

	http://www.lemburg.com/python/mxBeeBase.html

(The links on that page are not functional.)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Wed Jan 24 15:42:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:42:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
Message-ID: <14958.59855.4855.52638@beluga.mojam.com>

    Tim> Nor is LC_MESSAGES std C (the other LC_XXX guys are).

    Tim> I pin the blame on

    Tim>     from _locale import *

    Tim> in locale.py -- who knows what that's supposed to export?
    Tim> Certainly not Skip <wink>.

Was that a roundabout way of complimenting me for having found a bug? ;-)

Skip






From skip at mojam.com  Wed Jan 24 15:50:02 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 08:50:02 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
Message-ID: <14958.60314.482226.825611@beluga.mojam.com>

    Tim> Nevermind; checked in a hack to stop the error on Windows.

Probably should file a bug report (if you haven't already) so the root
problem isn't forgotten because the hack obscures it.  I see this code in
localemodule.c:

    #ifdef LC_MESSAGES
	x = PyInt_FromLong(LC_MESSAGES);
	PyDict_SetItemString(d, "LC_MESSAGES", x);
	Py_XDECREF(x);
    #endif /* LC_MESSAGES */

Martin, looks like this module is your baby.  Care to hazard a guess about
whether LC_MESSAGES should always or never be there?

Skip




From fredrik at effbot.org  Wed Jan 24 16:11:33 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Wed, 24 Jan 2001 16:11:33 +0100
Subject: [Python-Dev] test___all__ failing; Windows
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <04de01c08617$f56216f0$e46940d5@hagrid>

Skip wrote:

> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
> 	x = PyInt_FromLong(LC_MESSAGES);
> 	PyDict_SetItemString(d, "LC_MESSAGES", x);
> 	Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

I think the correct answer is "sometimes":

    ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    LC_MONETARY, LC_NUMERIC, and LC_TIME

    Unix mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    LC_TIME

in other words, if it's supported, it should be exposed by
the Python bindings.

Cheers /F




From tismer at tismer.com  Wed Jan 24 15:40:04 2001
From: tismer at tismer.com (Christian Tismer)
Date: Wed, 24 Jan 2001 16:40:04 +0200
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>
Message-ID: <3A6EE944.C8CC6EF7@tismer.com>


Greg Ewing wrote:
> 
> Neil Schemenauer <nas at arctrix.com>:
> 
> > Basicly, we would
> > use the implementation of PyDict but drop the values.
> 
> This could be incorporated into PyDict. Instead of storing keys and
> values in the same array, keep them in separate arrays and only
> allocate the values array the first time someone stores a value other
> than 1.

Very good idea. It also fits my view of how dicts should be
implemented: keep keys and values apart, since this information
has different access patterns.
I think (or at least hope) that dictionaries become faster
when hashes, keys and values are in separate areas, giving more
cache hits. I'm not sure whether hashes and keys should be apart, but
I'm sure the values should be.

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Wed Jan 24 16:37:03 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:37:03 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Lib/test test___all__.py,1.3,1.4
In-Reply-To: Your message of "Wed, 24 Jan 2001 00:28:21 CST."
             <14958.30213.325584.373062@beluga.mojam.com> 
References: <E14KqWI-0007rN-00@usw-pr-cvs1.sourceforge.net> <14957.52952.48739.53360@beluga.mojam.com> <200101240311.WAA06582@cj20424-a.reston1.va.home.com>  
            <14958.30213.325584.373062@beluga.mojam.com> 
Message-ID: <200101241537.KAA27039@cj20424-a.reston1.va.home.com>

>     Guido> I think I saw a complaint about this that specifically said that
>     Guido> when dbhash is imported when bsddb can't be imported, an
>     Guido> incomplete dbhash is left behind in sys.modules, and then a
>     Guido> second import of dbhash will succeed -- but of course it will
>     Guido> define no objects.
> 
> So it does:
> 
>     % ./python
>     Python 2.1a1 (#2, Jan 23 2001, 23:30:41) 
>     [GCC 2.95.3 19991030 (prerelease)] on linux2
>     Type "copyright", "credits" or "license" for more information.
>     >>> import dbhash
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "/home/beluga/skip/src/python/dist/src/Lib/dbhash.py", line 3, in ?
> 	import bsddb
>     ImportError: No module named bsddb
>     >>> import dbhash
>     >>>
> 
> Can that be construed as a bug?  If import fails, shouldn't the stub module
> that was inserted in sys.modules be removed?

Yep, but not a very important bug -- typically this isn't caught.
Feel free to check in a change; I think you should be able to insert
something like

    import sys
    try:
	import bsddb
    except ImportError:
	del sys.modules[__name__]
	raise

into dbhash.

If this works for you in testing, forget the patch manager, just check
it in.  (I'm too busy to do much myself, the company needs me. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From pf at artcom-gmbh.de  Wed Jan 24 16:32:55 2001
From: pf at artcom-gmbh.de (Peter Funk)
Date: Wed, 24 Jan 2001 16:32:55 +0100 (MET)
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <14958.60314.482226.825611@beluga.mojam.com> from Skip Montanaro at "Jan 24, 2001  8:50: 2 am"
Message-ID: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>

Hi,

Skip Montanaro:
> 
>     Tim> Nevermind; checked in a hack to stop the error on Windows.
> 
> Probably should file a bug report (if you haven't already) so the root
> problem isn't forgotten because the hack obscures it.  I see this code in
> localemodule.c:
> 
>     #ifdef LC_MESSAGES
> 	x = PyInt_FromLong(LC_MESSAGES);
> 	PyDict_SetItemString(d, "LC_MESSAGES", x);
> 	Py_XDECREF(x);
>     #endif /* LC_MESSAGES */
> 
> Martin, looks like this module is your baby.  Care to hazard a guess about
> whether LC_MESSAGES should always or never be there?

As far as I could find out, LC_MESSAGES was added to the POSIX "standard"
in Posix.2.  Systems that are not Posix.2 compatible probably lack the
proper functionality behind 'setlocale()'.  So the best solution would be
to add a clever emulation/approximation of this feature, if the underlying
platform (here Windows) doesn't provide it.  This would require wrapping
'setlocale()'.  But I'm not sure how to emulate, for example,
'setlocale(LC_MESSAGES, 'DE_de')' on a Windows box.  Maybe it is
impossible to achieve.

What I would love to see is that the typical query
'setlocale(LC_MESSAGES)' would return 'DE_de' on a box running, for example,
the German version of Windows or MacOS.  This would eliminate the need for
ugly language selection menus on these platforms in a portable fashion.

Regards, Peter




From guido at digicool.com  Wed Jan 24 16:41:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 10:41:07 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Wed, 24 Jan 2001 16:07:08 +0200."
             <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> 
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> 
Message-ID: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>

> > This could be incorporated into PyDict. Instead of storing keys and
> > values in the same array, keep them in separate arrays and only
> > allocate the values array the first time someone stores a value other
> > than 1.
> 
> Cool idea, but even cooler (would catch more idioms, that is) is
> "the first time someone stores something not 'is'  something in the
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> dict, allocate the values array". This would catch small numbers,
> None and identifier-looking strings, for the measly cost of one
> pointer/dict object.

Sorry, but I don't understand what you mean by the ^^^ marked phrase.
Can you please elaborate?

Regarding storing one for "present", that's all well and fine, but it
suggests to me that storing a false value could mean "not present".
Do we really want that?

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Jan 25 01:50:13 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:50:13 +0200 (IST)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>
Message-ID: <20010125005013.58C12A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 10:41:07 -0500, Guido van Rossum <guido at digicool.com> wrote:

> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> > dict, allocate the values array". This would catch small numbers,
> > None and identifier-looking strings, for the measly cost of one
> > pointer/dict object.
> 
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I should really stop writing incomprehensible bits like that. Heck,
I can't even understand it on second reading.

I meant that the dictionary would keep a slot for "the one and only
value". The first time someone puts a value in the dict, it puts it
in the "one and only value" slot, and doesn't initialize the value
array. The second time someone puts a value, it checks for pointer
equality with that "one and only value". If it is the same, it
still doesn't initialize the value array. The only time when
the dictionary initializes the value array is when two pointer-different
values are put in.

This would let me code

a[key] = None

For my sets (but consistent in the same set!)

a[key] = 1

When the timbot codes (again, consistent in the same set)

and

a[key] = 'present'

If you're really weird.

(identifier-like strings get interned)

That's not *semantics*, that's *optimization* for a commonly
used (I think) idiom with dictionaries -- you can't predict
the value, but it will probably remain the same.
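
A toy model of that behaviour in present-day Python, just to pin the idea
down. This is not a patch to PyDict: the class name SharedValueDict is
made up, a Python set stands in for the key array, and a Python dict
stands in for the value array that is only allocated on demand.

```python
_EMPTY = object()  # sentinel: no value has been stored yet


class SharedValueDict:
    """Toy model: the per-key value table is built only when needed."""

    def __init__(self):
        self._keys = set()      # stands in for the key array
        self._shared = _EMPTY   # the "one and only value" slot
        self._values = None     # per-key values, allocated lazily

    def __setitem__(self, key, value):
        if self._values is not None:
            self._values[key] = value
            return
        if self._shared is _EMPTY:
            self._shared = value
        elif value is not self._shared:
            # A second, pointer-different value: build the value table.
            self._values = {k: self._shared for k in self._keys}
            self._values[key] = value
            return
        self._keys.add(key)

    def __getitem__(self, key):
        if self._values is not None:
            return self._values[key]
        if key in self._keys:
            return self._shared
        raise KeyError(key)

    def __contains__(self, key):
        if self._values is not None:
            return key in self._values
        return key in self._keys

    def values_allocated(self):
        """Report whether the per-key value table has been built."""
        return self._values is not None
```

As long as every store uses the same object (None, 1, or an interned
string), values_allocated() stays false and only the keys take up space.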

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From skip at mojam.com  Wed Jan 24 17:44:17 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:44:17 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
	<14958.60314.482226.825611@beluga.mojam.com>
	<04de01c08617$f56216f0$e46940d5@hagrid>
Message-ID: <14959.1633.163407.779930@beluga.mojam.com>

    Fredrik> I think the correct answer is "sometimes":

    Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME

    Fredrik>     Unix mandates LC_ALL, LC_COLLATE, LC_CTYPE,
    Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
    Fredrik>     LC_TIME

    Fredrik> in other words, if it's supported, it should be exposed by
    Fredrik> the Python bindings.

Then this suggests that either Tim's hack is the correct fix (leave it out
because we can't rely on it always being there) or I should add it to
__all__ at the bottom of the file if and only if it's present in the
module's namespace.

Skip






From moshez at zadka.site.co.il  Thu Jan 25 01:57:22 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 02:57:22 +0200 (IST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <04de01c08617$f56216f0$e46940d5@hagrid>
References: <04de01c08617$f56216f0$e46940d5@hagrid>, <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com><LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com>
Message-ID: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 16:11:33 +0100, "Fredrik Lundh" <fredrik at effbot.org> wrote:

> I think the correct answer is "sometimes":
> 
>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     LC_TIME
> 
> in other words, if it's supported, it should be exposed by
> the Python bindings.

In that case, the __all__ attribute in the module has to be calculated
dynamically. Say, adding code like

try:
    LC_MESSAGES
except NameError:
    pass
else:
    __all__.append('LC_MESSAGES')

Ditto for anything else.
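
A sketch of how the same check could be written once for several optional
names, assuming a helper called extend_all (hypothetical, not part of any
module) and a made-up namespace standing in for the module's globals:

```python
def extend_all(namespace, all_list, optional_names):
    """Append each optional name to all_list only if the namespace defines it.

    Hypothetical helper for illustration; 'namespace' would normally be
    globals() at the bottom of the module.
    """
    for name in optional_names:
        if name in namespace and name not in all_list:
            all_list.append(name)
    return all_list


# Stand-in for a platform whose C library lacks LC_MESSAGES.
# The numeric values are invented for the example.
fake_module = {"LC_ALL": 6, "LC_CTYPE": 0, "LC_MONETARY": 4}
exported = ["LC_ALL", "LC_CTYPE"]
extend_all(fake_module, exported, ["LC_MESSAGES", "LC_MONETARY"])
```

Checking membership in the namespace avoids one try/except per name.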

Should I check in a patch?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From trentm at ActiveState.com  Wed Jan 24 17:49:17 2001
From: trentm at ActiveState.com (Trent Mick)
Date: Wed, 24 Jan 2001 08:49:17 -0800
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 03:28:34AM -0500
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010124084917.C29977@ActiveState.com>

How will the expected adherence of apps to BROWSER jive with the current (and
poorly understood by me) Windows convention of specifying the "default"
browser somewhere in the registry? 

Trent


-- 
Trent Mick
TrentM at ActiveState.com



From skip at mojam.com  Wed Jan 24 17:49:23 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 10:49:23 -0600 (CST)
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <20010125005722.D2229A840@darjeeling.zadka.site.co.il>
References: <04de01c08617$f56216f0$e46940d5@hagrid>
	<LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com>
	<LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com>
	<14958.60314.482226.825611@beluga.mojam.com>
	<20010125005722.D2229A840@darjeeling.zadka.site.co.il>
Message-ID: <14959.1939.398029.896891@beluga.mojam.com>

    Moshe> In that case, the __all__ attribute in the module has to be
    Moshe> calculated dynamically. Say, adding code like

No need.  I've already got this exact change in my local copy and I'll be
adding a few more __all__ lists later today.

Skip



From paulp at ActiveState.com  Wed Jan 24 17:56:26 2001
From: paulp at ActiveState.com (Paul Prescod)
Date: Wed, 24 Jan 2001 08:56:26 -0800
Subject: [Python-Dev] I think my set module is ready for prime time; 
 comments?
References: <200101240252.PAA02105@s454.cosc.canterbury.ac.nz>  
	            <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il> <200101241541.KAA27082@cj20424-a.reston1.va.home.com>
Message-ID: <3A6F093A.A311C71E@ActiveState.com>

Guido van Rossum wrote:
> 
>...
> 
> > Cool idea, but even cooler (would catch more idioms, that is) is
> > "the first time someone stores something not 'is'  something in the
>
> Sorry, but I don't understand what you mean by the ^^^ marked phrase.
> Can you please elaborate?

I wasn't clear about that either. The idea is:

def add(self, new_value):
    # Pseudocode: NULL stands for "no value stored yet".
    if not self.values_array:
        if self.magic_value is NULL:
            self.magic_value = new_value
        elif new_value is not self.magic_value:
            self.values_array = [self.magic_value, new_value, ... ]
        else:
            pass  # new_value is self.magic_value: do nothing

I am neutral on this proposal myself. I think that even if we optimize
any code where you pass the same thing over and over again, we should
document a convention for consistency. So I'm not sure there is much
advantage.

 Paul Prescod



From esr at thyrsus.com  Wed Jan 24 17:53:31 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 11:53:31 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>; from trentm@ActiveState.com on Wed, Jan 24, 2001 at 08:49:17AM -0800
References: <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com>
Message-ID: <20010124115331.A15059@thyrsus.com>

Trent Mick <trentm at ActiveState.com>:
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

BROWSER overrides the registry setting.  Which is OK; under Windows, only
wizards are going to muck with it.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Ideology, politics and journalism, which luxuriate in failure, are
impotent in the face of hope and joy.
	-- P. J. O'Rourke



From guido at digicool.com  Wed Jan 24 17:59:00 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 11:59:00 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: Your message of "Wed, 24 Jan 2001 10:44:17 CST."
             <14959.1633.163407.779930@beluga.mojam.com> 
References: <LNBBLJKPBEHFEDALKOLCAENJIKAA.tim.one@home.com> <LNBBLJKPBEHFEDALKOLCEENLIKAA.tim.one@home.com> <14958.60314.482226.825611@beluga.mojam.com> <04de01c08617$f56216f0$e46940d5@hagrid>  
            <14959.1633.163407.779930@beluga.mojam.com> 
Message-ID: <200101241659.LAA27650@cj20424-a.reston1.va.home.com>

>     Fredrik> I think the correct answer is "sometimes":
> 
>     Fredrik>     ANSI C mandates LC_ALL, LC_COLLATE, LC_CTYPE,
>     Fredrik>     LC_MONETARY, LC_NUMERIC, and LC_TIME
> 
>     Fredrik>     Unix mandates LC_ALL, LC_COLLATE,LC_CTYPE,
>     Fredrik>     LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and
>     Fredrik>     LC_TIME
> 
>     Fredrik> in other words, if it's supported, it should be exposed by
>     Fredrik> the Python bindings.
> 
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

The latter.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Thu Jan 25 18:05:44 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Thu, 25 Jan 2001 19:05:44 +0200 (IST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124084917.C29977@ActiveState.com>
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
Message-ID: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>

On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm at ActiveState.com> wrote:
 
> How will the expected adherence of apps to BROWSER jive with the current (and
> poorly understood by me) Windows convention of specifying the "default"
> browser somewhere in the registry? 

The "webbrowser" module should prefer to take the setting from the
registry on windows.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From guido at digicool.com  Wed Jan 24 18:17:09 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 24 Jan 2001 12:17:09 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: Your message of "Thu, 25 Jan 2001 02:50:13 +0200."
             <20010125005013.58C12A840@darjeeling.zadka.site.co.il> 
References: <200101241541.KAA27082@cj20424-a.reston1.va.home.com>, <200101240252.PAA02105@s454.cosc.canterbury.ac.nz> <20010124140708.2B6A2A83E@darjeeling.zadka.site.co.il>  
            <20010125005013.58C12A840@darjeeling.zadka.site.co.il> 
Message-ID: <200101241717.MAA27852@cj20424-a.reston1.va.home.com>

> I meant that the dictionary would keep a slot for "the one and only
> value". First time someone puts a value in the dict, it puts it
> in the "one and only value" slot, and doesn't initialize the value
> array. The second time someone puts a value, it checks for pointer
> equality with that "one and only value". If it is the same, it
> still doesn't initialize the value array. The only time when
> the dictionary initializes the value array is when two pointer-different
> values are put in.
> 
> This would let me code
> 
> a[key] = None
> 
> For my sets (but consistent in the same set!)
> 
> a[key] = 1
> 
> When the timbot codes (again, consistent in the same set)
> 
> and
> 
> a[key] = 'present'
> 
> If you're really weird.
> 
> (identifier-like strings get interned)
> 
> That's not *semantics*, that's *optimization* for a commonly
> used (I think) idiom with dictionaries -- you can't predict
> the value, but it will probably remain the same.

This I like!

But note that a dict currently uses 12 bytes per slot in the hash
table (on a 32-bit platform: long me_hash; PyObject *me_key,
*me_value).  The hash table's fill factor is typically between 50 and
67%.

I think removing the hashes would slow down lookups too much, so
optimizing identical values out would only save 6-8 bytes per existing
key on average.  Not clear if it's worth enough.  I think I have to
agree with Tim's expectation that two (or three) separate parallel
arrays will reduce the cache locality and thus slow things down.  Once
you start probing, you jump through the hashtable at large random
strides, causing bad cache performance (for largeish hash tables); but
since often enough the first slot tried is right, you have the hash,
key and value right next to each other, typically on the same cache line.
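
The 6-8 byte figure follows from the numbers above: dropping me_value saves
one 4-byte pointer per slot, and at a fill factor f there are 1/f slots per
stored key. A small sketch of that arithmetic (the function name is made up
for the example):

```python
POINTER_BYTES = 4  # one PyObject* on a 32-bit platform, as stated above


def bytes_saved_per_key(fill_factor, saved_per_slot=POINTER_BYTES):
    """Bytes saved per stored key if each slot shrinks by saved_per_slot.

    A table with fill factor f holds 1/f slots per key on average.
    """
    return saved_per_slot / fill_factor


low = bytes_saved_per_key(2.0 / 3.0)   # table about 67% full
high = bytes_saved_per_key(0.5)        # table 50% full
```

Here low comes out at about 6 bytes and high at 8 bytes per key, against
the 18 to 24 bytes per key that the 12-byte slots cost today.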

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Wed Jan 24 18:31:55 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:31:55 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124123155.A15203@thyrsus.com>

Moshe Zadka <moshez at zadka.site.co.il>:
> > How will the expected adherence of apps to BROWSER jive with the
> > current (and poorly understood by me) Windows convention of
> > specifying the "default" browser somewhere in the registry?
> 
> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Um, that's not the way it works right now. The windows-default browser choice 
launches the registered default browser, but BROWSER may have something else
in its search list first.
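
A rough sketch of that search-list behaviour, not the module's actual code.
The function pick_browser, the ':' separator, and the set of available
names are assumptions made for illustration:

```python
def pick_browser(browser_env, available, sep=":"):
    """Return the first entry of a BROWSER-style search list that is usable.

    Hypothetical helper: 'available' is whatever set of browser names the
    caller considers launchable on this machine.
    """
    for name in browser_env.split(sep):
        name = name.strip()
        if name and name in available:
            return name
    return None


# On this imaginary Windows box only the registry default can be launched.
choice = pick_browser("netscape:windows-default", {"windows-default"})

# If an earlier entry is usable, it wins over windows-default.
earlier = pick_browser("netscape:windows-default",
                       {"netscape", "windows-default"})
```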
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The real point of audits is to instill fear, not to extract revenue;
the IRS aims at winning through intimidation and (thereby) getting
maximum voluntary compliance
	-- Paul Strassel, former IRS Headquarters Agent Wall St. Journal 1980



From esr at thyrsus.com  Wed Jan 24 18:52:11 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 12:52:11 -0500
Subject: [Python-Dev] BROWSER status
Message-ID: <20010124125211.A15276@thyrsus.com>

I spent the morning writing and testing patches to make urlview and GNU Emacs
BROWSER-aware, and have sent them off to the relevant maintainers.  I've
also sent a patch to Andries Brouwer for the environ(5) man page.  

Those of you interested in my latest bit of social engineering can
take a look at

	http://www.tuxedo.org/~esr/BROWSER/

A bow in Guido's direction -- if he hadn't been grouchy about this I
probably wouldn't have gotten to shipping those patches for a while.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A right is not what someone gives you; it's what no one can take from you. 
	-- Ramsey Clark



From thomas at xs4all.net  Wed Jan 24 19:33:27 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 24 Jan 2001 19:33:27 +0100
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>; from moshez@zadka.site.co.il on Thu, Jan 25, 2001 at 07:05:44PM +0200
References: <20010124084917.C29977@ActiveState.com>, <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010124084917.C29977@ActiveState.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010124193326.B962@xs4all.nl>

On Thu, Jan 25, 2001 at 07:05:44PM +0200, Moshe Zadka wrote:
> On Wed, 24 Jan 2001 08:49:17 -0800, Trent Mick <trentm at ActiveState.com> wrote:
>  
> > How will the expected adherence of apps to BROWSER jive with the current (and
> > poorly understood by me) Windows convention of specifying the "default"
> > browser somewhere in the registry? 

> The "webbrowser" module should prefer to take the setting from the
> registry on windows.

Why ? That's a lot harder to change, and not settable per
'shell'/'thread'/'process'.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Wed Jan 24 20:54:47 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:54:47 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124115331.A15059@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPDIKAA.tim.one@home.com>

Guys, while I like BROWSER, don't think it has anything to do with Windows!
Windows is not Unix; doesn't have PAGER or EDITOR either; and, in general,
use of envars is an abomination under Windows.  The old webbrowser.py uses
the Windows-specific os.startfile(url) because that's the *right* way to do
it on Windows, wizard or not.  And you would have to be a Windows wizard to
succeed in launching a browser under Windows in any other way anyway.  You
may as well try to sell the notion that, on Unix, Python should maintain a
dict mapping file extensions to the user's preferred ways of opening such
files <0.9 wink>.
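
A small sketch of the platform split argued for above: use os.startfile
where it exists (it is only defined on Windows) and fall back to
webbrowser.open elsewhere. The function name choose_opener is hypothetical.

```python
import os
import webbrowser


def choose_opener():
    """Return a callable that opens a URL the way the platform expects."""
    startfile = getattr(os, "startfile", None)  # present only on Windows
    if startfile is not None:
        return startfile        # defers to the user's registry association
    return webbrowser.open      # non-Windows: let webbrowser decide


opener = choose_opener()
# opener("http://www.python.org/") would launch a browser; not called here.
```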




From tim.one at home.com  Wed Jan 24 20:56:32 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 14:56:32 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124193326.B962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPEIKAA.tim.one@home.com>

>> The "webbrowser" module should prefer to take the setting from the
>> registry on windows.

> Why ? That's a lot harder to change, and not settable per
> 'shell'/'thread'/'process'.

A Windows user has a legitimate expectation that *every* time an .html file
is opened, it will come up in their browser of choice.  That choice is made
via the registry, and this is how *all* apps work under Windows.  Ditto for
.htm files (and that may be a different browser than is used for .html
files, but again the user has set up their registry to do what *they* want
done with it).  It's not supposed to be easy to change; it is supposed to be
consistent.  Using a different browser per shell/thread/process is a foreign
concept; it's also a useless concept on Windows <0.5 wink>.




From tim.one at home.com  Wed Jan 24 21:32:35 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:32:35 -0500
Subject: LC_MESSAGES (was Re: [Python-Dev] test___all__ failing; Windows)
In-Reply-To: <m14LRup-000CxUC@artcom0.artcom-gmbh.de>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEPGIKAA.tim.one@home.com>

[Peter Funk]
> ...
> AFAI found out, LC_MESSAGES was added to the POSIX "standard" in Posix.2.

FYI, it appears that C99 declined to adopt this extension to C89, but don't
know why (the C99 Rationale doesn't mention it).  That means the vendors who
don't already support it can (well, *will*) use the new C99 std as "a
reason" to continue leaving it out.




From tim.one at home.com  Wed Jan 24 21:15:28 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 15:15:28 -0500
Subject: [Python-Dev] test___all__ failing; Windows
In-Reply-To: <14959.1633.163407.779930@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEPFIKAA.tim.one@home.com>

[Skip]
> Then this suggests that either Tim's hack is the correct fix (leave it out
> because we can't rely on it always being there) or I should add it to
> __all__ at the bottom of the file if and only if it's present in the
> module's namespace.

What you suggest at the end *is* the hack I checked in.  That is, it's
already done.  The existence of LC_MESSAGES is clearly platform-specific; if
anyone can say for sure a priori *which* platforms it's available on, tell
Fred Drake so he can update the docs accordingly.




From skip at mojam.com  Wed Jan 24 22:25:45 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 24 Jan 2001 15:25:45 -0600 (CST)
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <20010124123155.A15203@thyrsus.com>
References: <20010124084917.C29977@ActiveState.com>
	<200101240346.WAA06790@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com>
	<20010125170544.A4C7DA840@darjeeling.zadka.site.co.il>
	<20010124123155.A15203@thyrsus.com>
Message-ID: <14959.18521.648454.488731@beluga.mojam.com>

>>>>> "Eric" == Eric S Raymond <esr at thyrsus.com> writes:

    Moshe Zadka <moshez at zadka.site.co.il>:

    >> The "webbrowser" module should prefer to take the setting from the
    >> registry on windows.

    Eric> Um, that's not the way it works right now. The windows-default
    Eric> browser choice launches the registered default browser, but
    Eric> BROWSER may have something else in its search list first.

Why not have a special REGISTRY token you can place in the BROWSER path to
tell it when to consult the registry?  On non-Windows platforms it can
simply be ignored:

    BROWSER=netscape:REGISTRY:explorer

Skip




From esr at thyrsus.com  Wed Jan 24 22:30:44 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 16:30:44 -0500
Subject: [Python-Dev] webbrowser.py
In-Reply-To: <14959.18521.648454.488731@beluga.mojam.com>; from skip@mojam.com on Wed, Jan 24, 2001 at 03:25:45PM -0600
References: <20010124084917.C29977@ActiveState.com> <200101240346.WAA06790@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCMENBIKAA.tim.one@home.com> <20010125170544.A4C7DA840@darjeeling.zadka.site.co.il> <20010124123155.A15203@thyrsus.com> <14959.18521.648454.488731@beluga.mojam.com>
Message-ID: <20010124163044.A15877@thyrsus.com>

Skip Montanaro <skip at mojam.com>:
> Why not have a special REGISTRY token you can place in the BROWSER path to
> tell it when to consult the registry?  On non-Windows platforms it can
> simply be ignored:

In effect, windows-default is that special token.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The Bible is not my book, and Christianity is not my religion.  I could never
give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln



From martin at mira.cs.tu-berlin.de  Wed Jan 24 22:41:11 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Wed, 24 Jan 2001 22:41:11 +0100
Subject: [Python-Dev] Tkinter documentation (Was:  What does "batteries are included" mean?)
Message-ID: <200101242141.f0OLfBT01812@mira.informatik.hu-berlin.de>

> It's already a blot on Python that the standard documentation set
> doesn't cover Tkinter.

Just point your friendly web browser to Ping's HTML generator and ask
for Tkinter, or invoke "pydoc.py Tkinter".

[I wouldn't have brought this up if it hadn't been the contribution of
my friend Nils Fischbeck:-]

Regards,
Martin



From nas at arctrix.com  Wed Jan 24 16:31:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Wed, 24 Jan 2001 07:31:55 -0800
Subject: [Python-Dev] Makefile changes
Message-ID: <20010124073155.B32266@glacier.fnational.com>

I've checked in my new makefile.  Hopefully everything goes well.
The following files are no longer used so please don't patch
them:

    Grammar/Makefile.in
    Include/Makefile
    Lib/Makefile
    Modules/Makefile.pre.in
    Objects/Makefile.in
    Parser/Makefile.in
    Python/Makefile.in
    Makefile.in

They will be removed in a few days assuming all goes well.  You
should re-run configure to use the new makefile.

I would appreciate it if people using platforms other than Linux
and GNU make could give me some feedback on the build process.
Do configure and make work okay?  Do "make test" and "make
install" work?  Thanks.

  Neil



From greg at cosc.canterbury.ac.nz  Wed Jan 24 23:55:00 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Thu, 25 Jan 2001 11:55:00 +1300 (NZDT)
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101240354.WAA06903@cj20424-a.reston1.va.home.com>
Message-ID: <200101242255.LAA02208@s454.cosc.canterbury.ac.nz>

Guido:

> But shouldn't the default value be something else,
> like none?

It should really be whatever is the first value that gets
stored after the dict is created. That way people can
use whatever they want for their dummy value and it will
Just Work. And it will probably catch most existing uses
of a dict as a set as well.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From ping at lfw.org  Wed Jan 24 21:33:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 24 Jan 2001 12:33:43 -0800 (PST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
Message-ID: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>

Hi -- after updating my CVS tree today with Python 2.1a1, i ran
the tests and test_inspect failed.  This revealed that the format
of code.co_varnames has changed.  At first i tried to update the
inspect.py module to check the Python version number and track the
change, but now i believe this is actually symptomatic of a real
interpreter problem.

Consider the function:

    def f(a, (b, c), *d):
        x = 1
        print a, b, c, d, x

Whereas in Python 1.5.2:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('x', 'a', 'b', 'c', 'd')
    f.func_code.co_varnames = ('a', '.2', 'd', 'b', 'c', 'x')

In Python 2.1a1:

    f.func_code.co_argcount = 2
    f.func_code.co_nlocals = 6
    f.func_code.co_names = ('b', 'c', 'x', 'a', 'd')
    f.func_code.co_varnames = ('a', '.2', 'b', 'c', 'd', 'x')

Notice how the ordering of the variable names has changed.
I went and looked at the CO_VARARGS clause in eval_code2 to
see if it put the varargs and kwdict arguments in different
slots, but it appears unchanged!  It still puts varargs at
locals[co_argcount] and kwdict at locals[co_argcount + 1].
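
For comparison, a sketch of the same slot convention as seen from a current
Python 3 interpreter. Tuple parameters such as (b, c) no longer exist
there, so this only shows the plain case: with no keyword-only parameters,
the *varargs name sits at co_varnames[co_argcount] and the **kwdict name
right after it.

```python
def f(a, b, *d, **k):
    x = 1
    return a, b, d, k, x


code = f.__code__

# co_argcount counts only the ordinary positional parameters (a and b).
varargs_name = code.co_varnames[code.co_argcount]
kwdict_name = code.co_varnames[code.co_argcount + 1]
```

Keyword-only parameters, if any, would come before the varargs name in
co_varnames, so the index would have to be shifted by co_kwonlyargcount.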

Please try:

    >>> def f(a, (b, c), *d):
    ...     x = 1
    ...     print a, b, c, d, x
    ...
    >>> f(1, (2, 3), 4)
    1 2 3
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "<stdin>", line 3, in f
    UnboundLocalError: local variable 'd' referenced before assignment
    >>> 

In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

I only have 1.5.2 and 2.1a1 to test.  I hope this problem
isn't present in 2.0...


Note that test_inspect was the only test to fail!  It might be the
only test that checks anonymous and *varargs at the same time.
(Yet another reason to put inspect in the core...)

I did recently check in additions to test_extcall that made the
test much beefier -- but that only tested combinations of regular,
keyword, varargs, and kwdict arguments; it neglected to test
anonymous (tuple) arguments as well.


-- ?!ng




From tim.one at home.com  Thu Jan 25 00:56:25 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 18:56:25 -0500
Subject: [Python-Dev] Re: test___all__ failing; Windows
Message-ID: <LNBBLJKPBEHFEDALKOLCMEPMIKAA.tim.one@home.com>

> In that case, the __all__ attribute in the module has to be calculated
> dynamically. Say, adding code like
>
> try:
>    LC_MESSAGES
> except NameError:
>    pass
> else:
>    __all__.append('LC_MESSAGES')
>
> Ditto for anything else.
>
> Should I check in a patch?

SourceForge CVS doesn't appear to be broken, so I can only conclude everyone
decided this was a bad day to stop taking drugs <0.9 wink>.




From tim.one at home.com  Thu Jan 25 01:04:50 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 19:04:50 -0500
Subject: [Python-Dev] (no subject)
Message-ID: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>

[Skip]
> Why not have a special REGISTRY token you can place in the BROWSER
> path to tell it when to consult the registry?  On non-Windows
> platforms it can simply be ignored:
>
>    BROWSER=netscape:REGISTRY:explorer

Because non-Windows platforms shouldn't be bothered with Windows silliness
any more than Windows users should be bothered with Unix silliness.  BROWSER
isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
may still *think* BROWSER is of use on Windows, but if so that's not really
a technical problem <wink>.




From thomas at xs4all.net  Thu Jan 25 01:25:54 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 01:25:54 +0100
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>; from nas@arctrix.com on Wed, Jan 24, 2001 at 07:31:55AM -0800
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <20010125012554.F962@xs4all.nl>

On Wed, Jan 24, 2001 at 07:31:55AM -0800, Neil Schemenauer wrote:

> I would appreciate it if people using platforms other than Linux
> and GNU make could give me some feedback on the build process.
> Do configure and make work okay?  Do "make test" and "make
> install" work?  Thanks.

Only have time for a quick check now, and no time whatsoever tomorrow, but
at first glance, it looks okay (read: it compiles Python) on BSDI 4.0.1,
BSDI 4.1 and FreeBSD 4.2.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From esr at thyrsus.com  Thu Jan 25 01:15:10 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 24 Jan 2001 19:15:10 -0500
Subject: [Python-Dev] (no subject)
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>; from tim.one@home.com on Wed, Jan 24, 2001 at 07:04:50PM -0500
References: <LNBBLJKPBEHFEDALKOLCEEPNIKAA.tim.one@home.com>
Message-ID: <20010124191510.A17782@thyrsus.com>

Tim Peters <tim.one at home.com>:
> Because non-Windows platforms shouldn't be bothered with Windows silliness
> any more than Windows users should be bothered with Unix silliness.  BROWSER
> isn't of any use on Windows, and REGISTRY isn't of any use on Unix.  Eric
> may still *think* BROWSER is of use on Windows, but if so that's not really
> a technical problem <wink>.

Actually that's not something I have an opinion on.  I addressed the
original question because I know it would be technically possible to set
a BROWSER variable under Windows.  Yes, an unlikely move, but possible.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

A man who has nothing which he is willing to fight for, nothing 
which he cares about more than he does about his personal safety, 
is a miserable creature who has no chance of being free, unless made 
and kept so by the exertions of better men than himself. 
	-- John Stuart Mill, writing on the U.S. Civil War in 1862



From tim.one at home.com  Thu Jan 25 05:38:54 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 24 Jan 2001 23:38:54 -0500
Subject: [Python-Dev] I think my set module is ready for prime time;  comments?
In-Reply-To: <3A6EE944.C8CC6EF7@tismer.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEADILAA.tim.one@home.com>

[Christian Tismer]
> ...
> Not sure if hashes and keys should be apart, but
> sure for values.

How so?  That is, under what assumptions?  Any savings from separation would
appear to require that I look up keys a lot more than I access the
associated values; while trivially true for dicts used as sets, it seems
dubious to me for use of dicts as mappings (count[word] += 1, etc).




From Jason.Tishler at dothill.com  Thu Jan 25 07:09:47 2001
From: Jason.Tishler at dothill.com (Jason Tishler)
Date: Thu, 25 Jan 2001 01:09:47 -0500
Subject: [Python-Dev] Re: Python 2.1 alpha 1 released!
In-Reply-To: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 22, 2001 at 10:33:02PM -0500
References: <200101230333.WAA28376@cj20424-a.reston1.va.home.com>
Message-ID: <20010125010947.M1256@dothill.com>

On Mon, Jan 22, 2001 at 10:33:02PM -0500, Guido van Rossum wrote:
> - Python should now build out of the box on Cygwin.  If it doesn't,
>   mail to Jason Tishler (jlt63 at users.sourceforge.net).

Although Python CVS built OOTB under Cygwin until 2001/01/17 18:54:54,
Python 2.1a1 needs a small patch in order to build cleanly under Cygwin.
If interested, please see the following for details:

    http://www.cygwin.com/ml/cygwin-apps/2001-01/msg00019.html

Thanks,
Jason

-- 
Jason Tishler
Director, Software Engineering       Phone: +1 (732) 264-8770 x235
Dot Hill Systems Corp.               Fax:   +1 (732) 264-8798
82 Bethany Road, Suite 7             Email: Jason.Tishler at dothill.com
Hazlet, NJ 07730 USA                 WWW:   http://www.dothill.com



From tim.one at home.com  Thu Jan 25 08:29:19 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:29:19 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <200101231549.KAA05172@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEBFILAA.tim.one@home.com>

[Guido]
> ...
> It's no big deal if the Vaults contain three or more set modules --
> perfect even, people can choose the best one for their purpose.

They really can't, not realistically, unless all the modules in question
conform to the same interface (which users can't control), and users
restrict themselves to methods defined only in the interface (which users
can control).  The problem is that "their purpose" changes over time, and in
some cases the effects of representation on performance simply can't be
out-guessed in advance of actual measurement.  If people need to change any
more than just the import statement, *then* a single implementation has to
be all things to all people.

I hate to say this (bet <wink>?), but I suspect the fact that Python's basic
types are all builtin and not classes has kept us from fully appreciating
the class-based "1 interface, N implementations" approach that C++ and Java
hackers are having so much fun with.  They're not all that easy to find, but
people who have climbed the steep STL learning curve often end up in the
same ecstatic trance I used to see only among fellow Pythoneers.

> But in the core, there's only room for one set type or module.

I don't like the conclusion:  it implies there's no room in the core for
more than one implementation of anything, yet one-size-fits-all doesn't.  I
have no problem with the idea that there's only room for one Set *interface*
in the core.  Then you only need Pronounce on a reasonable set of abstract
operations, and leave the implementation tradeoffs to be made by different
people in different ways (I've really got no use for Eric's list-based sets;
he's really got no use for my sets-of-sets).
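
A sketch of what "one interface, N implementations" could look like. The
names AbstractSet, ListSet and HashSet are invented for this example and
are not a proposed API; the point is only that client code written against
the interface doesn't change when the representation does.

```python
from abc import ABC, abstractmethod


class AbstractSet(ABC):
    """Hypothetical minimal Set interface."""

    @abstractmethod
    def add(self, item): ...

    @abstractmethod
    def __contains__(self, item): ...

    @abstractmethod
    def __len__(self): ...


class ListSet(AbstractSet):
    """Unordered-list representation: simple, linear-time membership."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


class HashSet(AbstractSet):
    """Dict-backed representation: elements must be hashable."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item] = True

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)


def unique_count(items, set_class):
    """Client code that depends only on the interface."""
    s = set_class()
    for item in items:
        s.add(item)
    return len(s)
```

Swapping ListSet for HashSet in the call to unique_count changes the
performance profile but not the calling code.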

That said, if there can be at most one, and must be at least one, a
hashtable based set is the best compromise there is, and mutable objects as
elements should not be supported (they add great implementation complexity
for the benefit of relatively few applications).

jeremy's-set-class-couldn't-be-accused-of-overkill<wink>-ly y'rs  - tim




From tim.one at home.com  Thu Jan 25 08:57:18 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 02:57:18 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <20010123113050.A26162@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCIEBLILAA.tim.one@home.com>

[Eric S. Raymond]
> ...
> What you get by going with a dictionary representation is that
> membership test becomes close to constant-time, while insertion and
> deletion become sometimes cheap and sometimes quite expensive
> (depending of course on whether you have to allocate a new
> hash bucket).

Note that Python's dicts aren't vulnerable to that:  they use open
addressing in a contiguous, preallocated vector.  There are no mallocs() or
free()s going on for lookups, deletes, or inserts, unless an insert happens
to hit a "time to double the size of the vector" boundary.  Deletes never
cost more than a lookup; inserts never more unless the table-size boundary
is hit (one in 2**N unique inserts, at which point N goes up too).
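
A toy illustration of that scheme, in present-day Python 3: one preallocated
slot vector, linear probing, and a rebuild only when the table crosses a size
boundary.  The class name, the linear probe sequence and the two-thirds load
threshold are assumptions of this sketch, not a description of CPython's
actual dict code.

```python
class OpenAddressingSet:
    """Toy hash set: open addressing in a single preallocated list."""

    _EMPTY = object()      # slot never used
    _DELETED = object()    # tombstone left behind by discard()

    def __init__(self, size=8):          # size must be a power of two
        self._slots = [self._EMPTY] * size
        self._used = 0                   # live keys plus tombstones
        self._live = 0
        self.resizes = 0                 # how often the vector was doubled

    def _find(self, key):
        """Return (index of key or None, first reusable slot seen)."""
        mask = len(self._slots) - 1
        i = hash(key) & mask
        free = None
        while True:
            slot = self._slots[i]
            if slot is self._EMPTY:
                return None, (i if free is None else free)
            if slot is self._DELETED:
                if free is None:
                    free = i
            elif slot == key:
                return i, free
            i = (i + 1) & mask           # linear probe, wraps around

    def add(self, key):
        found, free = self._find(key)
        if found is not None:
            return
        if self._slots[free] is self._EMPTY:
            self._used += 1
        self._slots[free] = key
        self._live += 1
        if self._used * 3 >= len(self._slots) * 2:
            self._resize()               # the only place allocation happens

    def discard(self, key):
        found, _ = self._find(key)       # a delete costs one lookup
        if found is not None:
            self._slots[found] = self._DELETED
            self._live -= 1

    def _resize(self):
        old = self._slots
        self._slots = [self._EMPTY] * (len(old) * 2)
        self._used = self._live = 0
        self.resizes += 1
        for slot in old:
            if slot is not self._EMPTY and slot is not self._DELETED:
                self.add(slot)

    def __contains__(self, key):
        return self._find(key)[0] is not None

    def __len__(self):
        return self._live
```

Inserting many keys and then reading the resizes counter shows how rarely the
expensive path runs compared with the number of inserts.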

> ...
> "works for everybody" isn't really possible here.  So my solution
> does the next best thing -- pick a choice of tradeoffs that isn't
> obviously worse than the alternatives and keeps things bog-simple.

I agree that this shouldn't be an either/or choice, but if it's going to be
forced into that mold I have to protest that the performance of unordered
lists would kill most of the set applications I've ever had.  I typically
have a small number of very large sets (and I'm talking not 100s, but often
100s of 1000s of elements).  The relatively large memory burden of a dict
representation wouldn't bother me unless I instead had 100s of 1000s of very
small sets.

which-may-happen-in-my-next-life-but-not-in-this-one-ly y'rs  - tim




From tim.one at home.com  Thu Jan 25 09:08:30 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 03:08:30 -0500
Subject: [Python-Dev] I think my set module is ready for prime time; comments?
In-Reply-To: <001101c0857a$c0dce420$770a0a0a@nevex.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBMILAA.tim.one@home.com>

[Greg Wilson]
> ...
> Unfortunately, if values are required to be immutable, then sets of
> sets aren't possible... :-(

Sure they are.  I wrote about how before, and Moshe put up a simple
implementation as a SourceForge patch.  Not bulletproof, though:  "consenting
adults".  No matter *what* you implement, I'll find *some* way to trick it
into believing my sets are immutable <wink>, so don't worry about that.
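
One way to get sets of sets out of immutable-only elements, sketched in
present-day Python 3: a set that freezes itself the first time it is hashed.
This is an illustrative guess at the general approach, not Moshe's patch; the
class name FreezableSet is invented here.

```python
class FreezableSet:
    """Mutable until first hashed (e.g. inserted into another set)."""

    def __init__(self, items=()):
        self._items = set(items)
        self._frozen = False

    def add(self, item):
        if self._frozen:
            raise TypeError("set has been hashed and is now frozen")
        self._items.add(item)

    def __contains__(self, item):
        return item in self._items

    def __len__(self):
        return len(self._items)

    def __eq__(self, other):
        return isinstance(other, FreezableSet) and self._items == other._items

    def __hash__(self):
        # Hashing is the point of no return: by-value hash, then freeze.
        self._frozen = True
        return hash(frozenset(self._items))
```

It is deliberately not bulletproof: reaching into inner._items still mutates
a frozen set, which is exactly the "consenting adults" caveat.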

Bulletproof is very hard, and is a minority distraction at best.  IIRC, SETL
had "by value" semantics when inserting a set into another set as an
element, and had some exceedingly hairy copy-on-write scheme under the
covers to make that bearably quick.  That may be wrong, though.  Herman
Venter's Slim (Sets, Lists and Maps) language does work that way (Guido,
Herman was a friend of the departed Stoffel Erasmus, who you may recall
fondly from Python's very early days -- if *that* doesn't make sets
attractive to you, nothing will <wink>).

Ah!  Meant to post this before:

    http://birch.eecs.lehigh.edu/~bacon/setlprog.ps.gz

That's a readable and very good intro to SETL Classic.  People pondering
computerized sets should at least catch up with what was common knowledge 30
years ago <wink>.




From thomas at xs4all.net  Thu Jan 25 10:24:24 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 10:24:24 +0100
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>; from ping@lfw.org on Wed, Jan 24, 2001 at 12:33:43PM -0800
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
Message-ID: <20010125102424.G962@xs4all.nl>

On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:

> Please try:

>     >>> def f(a, (b, c), *d):
>     ...     x = 1
>     ...     print a, b, c, d, x
>     ...
>     >>> f(1, (2, 3), 4)
>     1 2 3
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>       File "<stdin>", line 3, in f
>     UnboundLocalError: local variable 'd' referenced before assignment
>     >>> 

> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

> I only have 1.5.2 and 2.1a1 to test.  I hope this problem
> isn't present in 2.0...

It isn't present in 2.0. This is probably related to Jeremy's changes
in the call mechanism or the compiler track, though Jeremy himself is the
best person to claim that for sure :)

> Note that test_inspect was the only test to fail!  It might be the
> only test that checks anonymous and *varargs at the same time.
> (Yet another reason to put inspect in the core...)

Well, this is not an inspect-specific test, so it shouldn't *be* in
test_inspect, it should be in test_extcall :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Thu Jan 25 10:45:31 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 10:45:31 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
References: <E14Limt-0002Rf-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <003801c086b3$8ff41560$e46940d5@hagrid>

tim accidentally wrote:

>     \versionadded{1.5.3} % XXX fix this version number when release is scheduled!

1.5.3?  time for a 1.5.3 => 1.6 query replace?

> fgrep 1.5.3 doc/*/*.tex
doc/lib/libcmp.tex:\deprecated{1.5.3}{Use the \module{filecmp} module inste
doc/lib/libcmpcache.tex:\deprecated{1.5.3}{Use the \module{filecmp} module
ad.}
doc/lib/libwinsound.tex:  \versionadded{1.5.3} % XXX fix this version number

or am I missing something?

Cheers /F




From tim.one at home.com  Thu Jan 25 12:20:18 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 06:20:18 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Doc/lib libwinsound.tex,1.5,1.6
In-Reply-To: <003801c086b3$8ff41560$e46940d5@hagrid>
Message-ID: <LNBBLJKPBEHFEDALKOLCAECFILAA.tim.one@home.com>

Gotta ask Fred about this one!

> or am I missing something?

Yes, the Python 1.5.3 release.  I use it all the time <wink>.




From tismer at tismer.com  Thu Jan 25 13:22:32 2001
From: tismer at tismer.com (Christian Tismer)
Date: Thu, 25 Jan 2001 14:22:32 +0200
Subject: [Python-Dev] Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
Message-ID: <3A701A88.F2C68635@tismer.com>

In a function like this:

def f(x):
  return eval("x")

, eval uses the local function namespace, and the above works.
This is according to chapter 2.3 of the Python library ref.

Now to my problem: When eval() is used with map, the same
mechanism takes place:

def f(x):
  return map(eval,["x"])

It works the same as the above, because map is a builtin function
that does not modify the frame chain, so eval finds the local
namespace.
Not so with Stackless Python (at the moment), since Stackless map
assigns its own frame to map without passing the correct namespaces
to it. (Reported by Bernd Rinn)

Question: Is this by chance, or is eval() *meant* to function with
the local namespace, even if it is executed in the context of
a function like map() ?

The description of map() does not state whether it has to pass
its surrounding namespace to the mapped function, and if one
simulates map() by writing one's own python implementation,
it will fail exactly like Stackless does today. The same
applies to apply().

I think I should fix Stackless here, anyway?
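
The difference can be seen side by side in a small sketch (present-day
Python 3, where map() is lazy and therefore wrapped in list()).  The names
py_map, with_builtin, with_simulation and x_local are made up for this
example.

```python
def py_map(func, seq):
    """Pure-Python stand-in for map(); it runs in its own frame."""
    results = []
    for item in seq:
        results.append(func(item))
    return results


def with_builtin(x_local):
    # The built-in map adds no Python frame, so eval() sees this
    # function's locals.
    return list(map(eval, ["x_local"]))


def with_simulation(x_local):
    # Here eval() runs with py_map's frame as the current one, where
    # x_local is not defined.
    return py_map(eval, ["x_local"])
```

Calling with_builtin(2) finds the argument, while with_simulation(2) raises
NameError (assuming no global called x_local exists).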

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com



From guido at digicool.com  Thu Jan 25 14:35:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 08:35:12 -0500
Subject: [Python-Dev] Re: Intended to work? (lambda x,y:map(eval, ["x", "y"]))(2,3)
In-Reply-To: Your message of "Thu, 25 Jan 2001 14:22:32 +0200."
             <3A701A88.F2C68635@tismer.com> 
References: <3A701A88.F2C68635@tismer.com> 
Message-ID: <200101251335.IAA16713@cj20424-a.reston1.va.home.com>

> In a function like this:
> 
> def f(x):
>   return eval("x")
> 
> , eval uses the local function namespace, and the above works.
> This is according to chapter 2.3 of the Python library ref.
> 
> Now to my problem: When eval() is used with map, the same
> mechanism takes place:
> 
> def f(x):
>   return map(eval,["x"])
> 
> It works the same as the above, because map is a builtin function
> that does not modify the frame chain, so eval finds the local
> namespace.
> Not so with Stackless Python (at the moment), since Stackless map
> assigns its own frame to map without passing the correct namespaces
> to it. (Reported by Bernd Rinn)
> 
> Question: Is this by chance, or is eval() *meant* to function with
> the local namespace, even if it is executed in the context of
> a function like map() ?

Map, being a built-in, is transparent to namespaces.

> The description of map() does not state whether it has to pass
> its surrounding namespace to the mapped function, and if one
> simulates map() by writing one's own python implementation,
> it will fail exactly like Stackless does today. The same
> applies to apply().

So you can't simulate a built-in.

> I think I should fix Stackless here, anyway?

Yes.

Note: beware of Jeremy's nested scopes.  That adds a whole slew of
namespaces!  (But eval() is more crippled there.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From jeremy at alum.mit.edu  Thu Jan 25 16:20:45 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 10:20:45 -0500 (EST)
Subject: [Python-Dev] Anonymous + varargs: possible serious breakage -- please confirm!
In-Reply-To: <20010125102424.G962@xs4all.nl>
References: <Pine.LNX.4.10.10101241222270.483-100000@skuld.kingmanhall.org>
	<20010125102424.G962@xs4all.nl>
Message-ID: <14960.17485.549337.5476@localhost.localdomain>

>>>>> "TW" == Thomas Wouters <thomas at xs4all.net> writes:

  TW> On Wed, Jan 24, 2001 at 12:33:43PM -0800, Ka-Ping Yee wrote:
  >> Please try:

  >> >>> def f(a, (b, c), *d):
  >> ...     x = 1
  >> ...     print a, b, c, d, x
  >> ...
  >> >>> f(1, (2, 3), 4)
  >> 1 2 3
  >> Traceback (most recent call last):
  >>   File "<stdin>", line 1, in ?
  >>   File "<stdin>", line 3, in f
  >> UnboundLocalError: local variable 'd' referenced before assignment
  >> >>>

  >> In Python 1.5.2, this prints "1 2 3 (4,)" as expected.

  >> I only have 1.5.2 and 2.1a1 to test.  I hope this problem isn't
  >> present in 2.0...

  TW> It isn't present in 2.0. This is probably related to Jeremy's
  TW> changes in the call mechanism or the compiler track, though
  TW> Jeremy himself is the best person to claim that for sure :)

The bug is in the compiler.  It creates varnames while it is parsing
the argument list.  While I got the handling of the anonymous tuples
right, I forgot to insert *varargs or **kwargs in varnames *before*
the names defined in the tuple.

I will fix it real soon now.
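
The ordering requirement can be inspected from Python itself.  A small
sketch in present-day Python 3 follows; tuple parameters such as (b, c) were
later removed from the language (PEP 3113), so it only shows the general
rule that argument slots, including *varargs and **kwargs, come first in
co_varnames, ahead of ordinary locals.

```python
def f(a, *d, **k):
    x = 1
    return a, d, k, x


code = f.__code__
# Positional arguments, then *varargs, then **kwargs, then other locals.
print(code.co_varnames)
print(code.co_argcount)
```

If a plain local ended up ahead of the *varargs slot, the call machinery
would store the extra arguments in the wrong place, which is consistent with
the UnboundLocalError for 'd' reported above.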

  >> Note that test_inspect was the only test to fail!  It might be
  >> the only test that checks anonymous and *varargs at the same
  >> time.  (Yet another reason to put inspect in the core...)

  TW> Well, this is not an inspect-specific test, so it shouldn't *be*
  TW> in test_inspect, it should be in test_extcall :)

It should probably be in test_grammar.  The ext call mechanism is only
invoked when the caller uses a form like 'f(*arg)'.  Perhaps the name
"ext call" isn't very clear.

Jeremy



From esr at thyrsus.com  Thu Jan 25 17:19:36 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:19:36 -0500
Subject: [Python-Dev] Waiting method for file objects
Message-ID: <20010125111936.A23512@thyrsus.com>

I have been researching the question of how to ask a file descriptor how much
data it has waiting for the next sequential read, with a view to discovering
what cross-platform behavior we could count on for a hypothetical `waiting'
method in Python's built-in file class.

1:  Why bother?

I have these main applications in mind:

1. Detecting EOF on a static plain file.
2. Non-blocking poll of a socket opened in non-blocking mode.
3. Non-blocking poll of a FIFO opened in non-blocking mode.
4. Non-blocking poll of a terminal device opened in non-blocking mode.

These are all frequently requested capabilities on C newsgroups -- how
often have *you* seen the "how do I detect an individual keypress"
question from beginning programmers?  I believe having these
capabilities would substantially enhance Python's appeal.

2: What would be under the hood?

Summary: We can do this portably, and we can do it with only one (1)
new #ifdef.  Our tools for this purpose will be the fstat(2) st_size
field and the FIONREAD ioctl(2) call.  They are complementary.

In all supposedly POSIX-conformant environments I know of, the st_size
field has a documented meaning for plain files (S_IFREG) and may or
may not give a meaningful number for FIFOs, sockets, and tty devices.
The Single Unix Specification is silent on the meaning of st_size for
file types other than regular files (S_IFREG).  I have filed a defect
report about this with OpenGroup and am discussing appropriate language
with them.

(The last sentence of the Inferno operating system's language on
stat(2) is interesting: "If the file resides on permanent storage and
is not a directory, the length returned by stat is the number of bytes
in the file. For directories, the length returned is zero. Some
devices report a length that is the number of bytes that may be read
from the device without blocking.")

The FIONREAD ioctl(2) call, on the other hand, returns bytes waiting
on character devices such as FIFOs, sockets, or ttys -- but does not
return a useful value for files or directories or block devices. The
FIONREAD ioctl was supported in both SVr4 and 4.2BSD.  It's present in
all the open-source Unixes, SunOS, Solaris, and AIX.  Via Google
search I have discovered that it's also supported in the Windows
Sockets API and the GUSI POSIX libraries for the Macintosh.  Thus, it
can be considered portable for Python's purposes even though it's
rather sparsely documented.

I was able to obtain confirming information on Linux from Linus
Torvalds himself. My information on Windows and the Mac is from
Gavriel State, formerly a lead developer on Corel's WINE team and a
programmer with extensive cross-platform experience.  Gavriel reported
on the MSCRT POSIX environment, on the Metrowerks Standard Library
POSIX implementation for the Mac, and on the GUSI POSIX implementation
for the Mac.

2.1: Plain files

Torvalds and State confirm that for plain files (S_IFREG) the st_size
field is reliable on all three platforms.  On the Mac it gives the
file's data fork size.

One apparent difficulty with the plain-file case is that POSIX does
not guarantee anything about off_t quantities such as lseek(2)
returns and the st_size field except that they can be compared for
equality.  Thus, under the strict letter of POSIX law, `waiting' can
be used to detect EOF but not to get a reliable read-size return in
any other file position.

Fortunately, this is less an issue than it appears.  The weakness of
the POSIX language was a 1980s-era concession to a generation of
mainframe operating systems with record-oriented file structures --
all of which are now either thoroughly obsolete or (in the case of IBM
VM/CMS) have become Linux emulators :-).  On modern operating systems
under which files have character granularity, stat(2) emulations can
be and are written to give the right result.

2.2: Directories and block devices

The directory case (S_IFDIR) is a complete loss.  Under Unixes,
including Linux, the fstat(2) size field gives the allocated size of
the directory as if it were a plain file.  Under MSCRT POSIX the
meaning is undocumented and unclear.  Metrowerks returns garbage.
GUSI POSIX returns the number of files in the directory!  FIONREAD
cannot be used on directories.

Block devices (S_IFBLK) are a mess again.  Linus points out that a
system with removable or unmountable volumes *cannot* return a useful
st_size field -- what happens when the device is dismounted?

2.3: Character devices

Pipes and FIFOs (S_IFIFO) look better.  On MSCRT the fstat(2) size
field returns the number of bytes waiting to be read.  This is also
true under current Linuxes, though Torvalds says it is "an
implementation detail" and recommends polling with the FIONREAD ioctl
instead.  Fortunately, FIONREAD is available under Unix, Windows, and
the Mac.

Sockets (S_IFSOCK) look better too.  Under Linux, the fstat(2) size
field gives number of bytes waiting.  Torvalds again says this is "an
implementation detail" and recommends polling with the FIONREAD ioctl.
Neither MSCRT POSIX nor Metrowerks has direct support for sockets.
GUSI POSIX returns 1 (!) in the st_size field. But FIONREAD is
available under Unix, Windows, and the GUSI POSIX libraries on the
Mac.

Character devices (S_IFCHR) can be polled with FIONREAD.  This technique
has a long history of use with tty devices under Unix.  I don't know whether
it will work with the equivalents of terminal devices for Windows and the Mac.
Fortunately this is not a very important question, as those are GUI
environments in which terminal devices are rarely if ever used.

3: How does this turn into Python?

The upshot of our portability analysis is that by using FIONREAD and
fstat(2), we can get useful results for plain files, pipes, and
sockets on all three platforms.  Directories and block devices are a
complete loss.  Character devices (in particular, ttys) we can poll
reliably under Unix.  What we'll get polling the equivalents of tty or
character devices under Windows and the Mac is presently unknown, but
also unimportant.

My proposed semantics for a Python `waiting' method is that it reports
the amount of data that would be returned by a read() call at the time
of the waiting-method invocation.  The interpreter throws OSError if
such a report is impossible or forbidden.
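
The same decision tree can be sketched in pure Python for Unix-like systems.
This is an approximation in present-day Python 3, not the attached C patch:
it works on raw file descriptors (so it ignores any user-space buffering),
and the function name waiting() and its messages are chosen here for
illustration.

```python
import fcntl
import os
import stat
import struct
import termios


def waiting(fd):
    """Return the number of bytes a read on fd could deliver right now."""
    st = os.fstat(fd)
    mode = st.st_mode
    if stat.S_ISDIR(mode) or stat.S_ISBLK(mode):
        raise OSError("can't poll a block device or directory")
    if stat.S_ISREG(mode):
        # Plain file: size minus the current seek position.
        return st.st_size - os.lseek(fd, 0, os.SEEK_CUR)
    if stat.S_ISFIFO(mode) or stat.S_ISSOCK(mode) or stat.S_ISCHR(mode):
        # Stream device: ask the kernel through the FIONREAD ioctl.
        raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
        return struct.unpack("i", raw)[0]
    raise OSError("unknown file type")
```

For example, after writing five bytes to the write end of os.pipe(), calling
waiting() on the read end should report five bytes; on a regular file a
return of zero indicates EOF.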

I have enclosed a patch against the current CVS sources, including
documentation.  This patch is tested and working against plain files,
sockets, and FIFOs under Linux.  I have also attached the
Python test program I used under Linux.

I would appreciate it if those of you on Windows and Macintosh
machines would test the waiting method. The test program will take
some porting, because it needs to write to a FIFO in background.
Under Linux I do it this way:

	(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &

I don't know how to do the equivalent under Windows or Mac.
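
One possible way to drop the shell dependency is to do the background write
from a thread instead.  The sketch below is present-day Python 3 and still
assumes a platform with os.mkfifo() and the FIONREAD ioctl, so it removes
only the `echo ... &' part; the function name fifo_bytes_waiting is
hypothetical.

```python
import fcntl
import os
import struct
import tempfile
import termios
import threading


def fifo_bytes_waiting(payload):
    """Write payload to a FIFO from a thread; return what FIONREAD reports."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "testfifo")
    os.mkfifo(path)
    written = threading.Event()
    checked = threading.Event()

    def writer():
        # Opening for write blocks until the reader has opened the FIFO.
        with open(path, "wb") as out:
            out.write(payload)
            out.flush()
            written.set()
            checked.wait()      # keep the write end open until polled

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        with open(path, "rb", buffering=0) as fifo:
            written.wait(timeout=5)
            raw = fcntl.ioctl(fifo.fileno(), termios.FIONREAD,
                              struct.pack("i", 0))
            return struct.unpack("i", raw)[0]
    finally:
        checked.set()
        thread.join()
        os.remove(path)
        os.rmdir(workdir)
```

Using events rather than a fixed sleep avoids the race the test program
works around with time.sleep(3).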

When you run this program, it will try to mail me your test results.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes it is said that man cannot be trusted with the government
of himself.  Can he, then, be trusted with the government of others?
	-- Thomas Jefferson, in his 1801 inaugural address
-------------- next part --------------
Index: fileobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/fileobject.c,v
retrieving revision 2.108
diff -c -r2.108 fileobject.c
*** fileobject.c	2001/01/18 03:03:16	2.108
--- fileobject.c	2001/01/25 16:16:10
***************
*** 35,40 ****
--- 35,44 ----
  #include <errno.h>
  #endif
  
+ #ifndef DONT_HAVE_IOCTL_H
+ #include <sys/ioctl.h>
+ #endif
+ 
  
  typedef struct {
  	PyObject_HEAD
***************
*** 423,428 ****
--- 427,513 ----
  }
  
  static PyObject *
+ file_waiting(PyFileObject *f, PyObject *args)
+ {
+ 	struct stat stbuf;
+ #ifdef HAVE_FSTAT
+ 	int ret;
+ #endif
+ 
+ 	if (f->f_fp == NULL)
+ 		return err_closed();
+ 	if (!PyArg_NoArgs(args))
+ 		return NULL;
+ #ifndef HAVE_FSTAT
+ 	PyErr_SetString(PyExc_OSError, "fstat(2) is not available.");
+ 	clearerr(f->f_fp);
+ 	return NULL;
+ #else
+ 	Py_BEGIN_ALLOW_THREADS
+ 	errno = 0;
+ 	ret = fstat(fileno(f->f_fp), &stbuf);
+ 	Py_END_ALLOW_THREADS
+ 	    if (ret == -1) {			/* the fstat failed */
+ 		PyErr_SetFromErrno(PyExc_IOError);
+ 		clearerr(f->f_fp);
+ 		return NULL;
+        	} else if (S_ISDIR(stbuf.st_mode) || S_ISBLK(stbuf.st_mode)) {
+ 		PyErr_SetString(PyExc_IOError, 
+ 				"Can't poll a block device or directory.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	} else if (S_ISREG(stbuf.st_mode)) {	/* plain file */
+ #if defined(HAVE_LARGEFILE_SUPPORT) && SIZEOF_OFF_T < 8 && SIZEOF_FPOS_T >= 8
+ 		fpos_t pos;
+ #else
+ 		off_t pos;
+ #endif
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		pos = _portable_ftell(f->f_fp);
+ 		Py_END_ALLOW_THREADS
+ 		if (pos == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ #if !defined(HAVE_LARGEFILE_SUPPORT)
+ 		return PyInt_FromLong(stbuf.st_size - pos);
+ #else
+ 		return PyLong_FromLongLong(stbuf.st_size - pos);
+ #endif
+ 	} else if (S_ISFIFO(stbuf.st_mode) 
+ 		    || S_ISSOCK(stbuf.st_mode) 
+ 		    || S_ISCHR(stbuf.st_mode)) {	/* stream device */
+ #ifndef FIONREAD
+ 		PyErr_SetString(PyExc_OSError, 
+ 				"FIONREAD is not available.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ #else
+ 		int waiting;
+ 
+ 		Py_BEGIN_ALLOW_THREADS
+ 		errno = 0;
+ 		ret = ioctl(fileno(f->f_fp), FIONREAD, &waiting);
+ 		Py_END_ALLOW_THREADS
+ 		if (ret == -1) {
+ 			PyErr_SetFromErrno(PyExc_IOError);
+ 			clearerr(f->f_fp);
+ 			return NULL;
+ 		}
+ 
+ 		return Py_BuildValue("i", waiting);
+ #endif /* FIONREAD */
+ 	} else {				/* should never happen! */
+ 		PyErr_SetString(PyExc_OSError, "Unknown file type.");
+ 		clearerr(f->f_fp);
+ 		return NULL;
+ 	}
+ #endif /* HAVE_FSTAT */
+ }
+ 
+ static PyObject *
  file_fileno(PyFileObject *f, PyObject *args)
  {
  	if (f->f_fp == NULL)
***************
*** 1263,1268 ****
--- 1348,1354 ----
  	{"truncate",	(PyCFunction)file_truncate, 1},
  #endif
  	{"tell",	(PyCFunction)file_tell, 0},
+ 	{"waiting",	(PyCFunction)file_waiting, 0},
  	{"readinto",	(PyCFunction)file_readinto, 0},
  	{"readlines",	(PyCFunction)file_readlines, 1},
  	{"xreadlines",	(PyCFunction)file_xreadlines, 1},
-------------- next part --------------
#!/usr/bin/env python
import sys, os, random, string, time, socket, smtplib, readline

print "This program tests the `waiting' method of file objects."

fp = open("waiting_test.py")
if hasattr(fp, "waiting"):
    print "Good, you're running a patched Python with `waiting' available."
else:
    print "You haven't installed the `waiting' patch yet.  This won't work."
    sys.exit(1)

successes = ""
failures = ""
nogo = ""

print ""
print "First, plain files:"

filesize = fp.waiting()
print "There are %d bytes waiting to be read in this file." % filesize
if os.name == 'posix':
    os.system("ls -l waiting_test.py")
    print "That should match the number in the ls listing above."
else:
    print "Please check this with your OS's directory tools."

get = random.randrange(fp.waiting())
print "I'll now read a random number (%d) of bytes." % get
fp.read(get)
print "The waiting method sees %d bytes left." % fp.waiting()
if get + fp.waiting() == filesize:
    print  "%d + %d = %d.  That's consistent.  Test passed." % \
          (get, fp.waiting(), filesize)
    successes += "Plain file random-read test passed.\n"
else:
    print "That's not consistent. Test failed."
    failures += "Plain file random-read test failed\n"

print "Now let's see if we can detect EOF reliably."
fp.read()
left = fp.waiting()
print "I'll do a read()...the waiting method now returns %d" % left
if left == 0:
    print "That looks like EOF."
    successes += "Plain file EOF test passed.\n"
else:
    print "%d bytes left. Test failed." % left
    failures += "Plain file EOF test failed\n"
fp.close()

print ""
print "Now sockets:"
print "Connecting to imap.netaxs.com's IMAP server now..."
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
file = sock.makefile('rb')
sock.connect(("imap.netaxs.com", 143))
print "Waiting a few seconds to avoid a race condition..."
time.sleep(3)
greetsize = file.waiting()
print "There appear to be %d bytes waiting..." % greetsize
greeting = file.readline()
print "I just read the greeting line..."
sys.stdout.write(greeting)
if len(greeting) == greetsize:
    print "...and the size matches.  Test passed."
    successes += "Socket test passed.\n"
else:
    print "That's not right.  Test failed."
    failures += "Socket test failed.\n"
sock.close()

print ""
if not hasattr(os, "mkfifo"):
    print "Your platform doesn't have FIFOs (mkfifo() is absent), so I can't test them."
    nogo = "FIFO test could not be performed.\n"
else:
    print "Now FIFOs:"
    print "I'm making a FIFO named testfifo."; os.mkfifo("testfifo")
    str = string.letters[:random.randrange(len(string.letters))]
    print "I'm going to send it the following string '%s' of random length %d:" \
          % (str, len(str),)
    # Note: Unix dependency here!
    os.system("(echo -n '%s' >testfifo; echo 'Data written to FIFO.') &" % str)
    fp = open("testfifo", "r")
    print "Waiting a few seconds to avoid a race condition..."
    time.sleep(3)
    ready = fp.waiting()
    print "I see %d bytes waiting in the FIFO." % ready
    if ready == len(str):
        print "That's consistent.  Test passed."
        successes += "FIFO test passed.\n"
    else:
        print "That's not consistent. Test failed."
        failures += "FIFO test failed\n"
    os.remove("testfifo")

print "\nSummary:"
report = "Platform is: %s, version is %s\n" % (sys.platform, sys.version)
if successes:
    report += "The following tests succeeded:\n" + successes
if failures:
    report += "The following tests failed:\n" + failures
if nogo:
    report += "The following tests could not be performed:\n" + nogo
if not nogo:
    report += "No tests were skipped.\n"
if not failures:
    report += "All tests succeeded.\n"
print report

if os.name == 'posix':
    me = os.environ["USER"] + "@" + socket.getfqdn()
else:
    me = raw_input("Enter your email address, please?")

try:
    server = smtplib.SMTP('localhost')
    report = ("From: %s\nTo: esr@thyrsus.com\nSubject: waiting_test\n\n" % me) + report
    server.sendmail(me, ["esr@thyrsus.com"], report)
    server.quit()
except:
    print "The attempt to mail your test result failed.\n"

From esr at snark.thyrsus.com  Thu Jan 25 17:46:20 2001
From: esr at snark.thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 11:46:20 -0500
Subject: [Python-Dev] Documentation patch for waiting method.
Message-ID: <200101251646.f0PGkKM23567@snark.thyrsus.com>

Index: libstdtypes.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libstdtypes.tex,v
retrieving revision 1.50
diff -u -r1.50 libstdtypes.tex
--- libstdtypes.tex	2001/01/17 01:18:00	1.50
+++ libstdtypes.tex	2001/01/25 16:46:40
@@ -1142,6 +1142,24 @@
   \UNIX{} versions support this operation).
 \end{methoddesc}
 
+\begin{methoddesc}[file]{waiting}{}
+  Return the number of bytes waiting to be read from this file object.
+  For regular files, this returns the size of the file in bytes minus
+  the current seek address, as would be returned by \method{tell()}; a
+  zero return can be used to detect EOF.  For streams such as FIFOs,
+  sockets, Unix ttys, and other Unix character devices, this method
+  returns the number of bytes currently buffered up and waiting to be
+  read.  Attempts to call this method on Unix block devices or
+  on directories will raise an error.
+	\footnote{The \method{waiting()} method uses
+  	\cfunction{fstat(2)} and \cfunction{lseek(2)} on plain files;
+  	these should be reliable on all of Unix, Windows, and MacOS.
+  	It uses the FIONREAD ioctl(2) call to query FIFOs, sockets,
+  	Unix ttys, and other POSIX character devices; FIFO and socket
+  	behavior should be consistent across all three platforms, but
+  	the results from querying other character devices may vary.}
+\end{methoddesc}
+
 \begin{methoddesc}[file]{write}{str}
   Write a string to the file.  There is no return value.  Note: Due to
   buffering, the string may not actually show up in the file until

-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"To disarm the people... was the best and most effectual way to enslave them."
        -- George Mason, speech of June 14, 1788



From fredrik at effbot.org  Thu Jan 25 20:23:50 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 20:23:50 +0100
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution)
Message-ID: <00f701c08704$59bde510$e46940d5@hagrid>

I'm pretty sure Tim's seen this already, but just
in case...

----- Original Message ----- 
From: "Ivan Frohne" <frohne at gci.net>
Newsgroups: comp.lang.python
Sent: Thursday, January 25, 2001 5:20 PM
Subject: Re: random.py gives wrong results (+ a solution)


> 
> "Janne Sinkkonen" <janne at oops.nnets.fi> wrote in message
> news:m3u26oy1rw.fsf at kinos.nnets.fi...
> >
> > At least in Python 2.0 and earlier, the samples returned by the
> > function betavariate() of random.py are not from a beta distribution
> > although the function name misleadingly suggests so.
> >
> > The following would give beta-distributed samples:
> >
> > def betavariate(alpha, beta):
> >      y = gammavariate(alpha,1)
> >      if y==0: return 0.0
> >      else: return  y/(y+gammavariate(beta,1))
> >
> > This is from matlab. A comment in the original matlab code refers to
> > Devroye, L. (1986) Non-Uniform Random Variate Generation, theorem 4.1A
> > (p. 430). Another reference would be Gelman, A. et al. (1995) Bayesian
> > data analysis, p. 481, which I have checked and found to agree with
> > the code above.
> 
> 
> I'm convinced that Janne Sinkkonen is right:  The beta distribution
> generator in module random.py does not return Beta-distributed
> random numbers.  Janne's suggested fix should work just fine.
> 
> Here's my guess on how and why this bug bit -- it won't be of interest to
> most, but this subject is so obscure sometimes that there needs to be a
> detailed analysis.
> 
> The probability density function of the gamma distribution with (positive)
> parameters A and B is usually written
> 
>     g(x; A, B) = (x**(A-1) * exp(-x/B)) / (Gamma(A) * B**A), where x, A, B > 0.
> 
> Here Gamma(A) is the gamma function -- for A a positive integer, Gamma(A) is
> the factorial of A - 1, Gamma(A) = (A-1)!.  In fact, this is the definition
> used by the authors of random.py in defining gammavariate(alpha, beta), the
> gamma distribution random number generator.
> 
> Now it happens that a gamma-distributed random variable with parameters
> A = 1 and B has the (much simpler) exponential distribution with density
> function
> 
>     g(x; 1, B) = exp(-x/B) / B.
> 
> Keep that in mind.
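
Spelled out, the reduction is a direct substitution of A = 1.  The block
below takes the gamma density with the usual negative exponent, consistent
with the exponential form just given:

```latex
g(x; A, B) = \frac{x^{A-1}\, e^{-x/B}}{\Gamma(A)\, B^{A}}
\quad\Longrightarrow\quad
g(x; 1, B) = \frac{x^{0}\, e^{-x/B}}{\Gamma(1)\, B^{1}}
           = \frac{e^{-x/B}}{B},
\qquad \text{since } \Gamma(1) = 0! = 1 .
```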
> 
> The reference "Discrete Event Simulation in ," by Kevin Watkins
> (McGraw-Hill, 1993) was consulted by the random.py authors.  But this
> reference defines the gamma probability distribution a little differently,
> as
> 
>     g1(x; A, B) = (B**A * x**(A-1) * exp(-B*x)) / Gamma(A), where x, A, B > 0.
> 
> (See p. 85).  On page 87, Watkins states (incorrectly) that if grv(A, B) is
> a function which returns a gamma random variable with parameters A and B
> (using his definition on p. 85), then the function
> 
>     brv(A, B) = grv(1, 1/B) / ( grv(1, 1/B) + grv(1, A) )      [not true!]
> 
> will return a random variable which has the beta distribution with
> parameters A and B.
> 
> Believing Watkins to be correct, the random.py authors remembered that a
> gamma random variable with parameter A = 1 is just an exponential random
> variable and further simplified their beta generator to
> 
>    brv(A, B) = erv(1/B) / (erv(1/B) + erv(A)),
> 
> where erv(K) is a random variable having the exponential distribution with
> parameter K.
> 
> The corrected equation for a beta random variable, using Watkins' definition
> of the gamma density, is
> 
>     brv(A, B) = grv(A, 1) / ( grv(A, 1) + grv(1/B, 1) ),
> 
> which translates to
> 
>     brv(A, B) = grv(A, 1) / (grv(A, 1) + grv(B, 1))
> 
> using the more common gamma density definition (the one used in random.py).
> Many standard statistical references give this equation -- two are
> "Non-Uniform Random Variate Generation," by Luc Devroye, Springer-Verlag,
> 1986, p. 432, and "Monte Carlo Concepts, Algorithms and Applications," by
> George S. Fishman, Springer, 1996, p. 200.
> 
> --Ivan Frohne
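
The corrected relation above, written with the random.py convention
gammavariate(shape, scale), can be sketched in a few lines of Python.
This is only an illustration of the gamma-ratio construction under that
assumption; the name beta_from_gammas and its rng parameter are made up
for the example and are not names from random.py.

```python
import random


def beta_from_gammas(alpha, beta, rng=random):
    """Draw one Beta(alpha, beta) variate as X / (X + Y).

    X ~ Gamma(alpha, 1) and Y ~ Gamma(beta, 1) are drawn independently.
    The function name and the rng parameter are hypothetical.
    """
    x = rng.gammavariate(alpha, 1.0)
    if x == 0:
        # Guard against 0/0 in the unlikely case that x underflows to zero.
        return 0.0
    return x / (x + rng.gammavariate(beta, 1.0))
```

A cheap sanity check is that the sample mean should approach
alpha / (alpha + beta), and that every draw falls in [0, 1].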




From jeremy at alum.mit.edu  Thu Jan 25 18:13:03 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 12:13:03 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010124073155.B32266@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
Message-ID: <14960.24223.599357.388059@localhost.localdomain>

Neil,

What would it take to add useful dependency information to the
Makefile?  Or does it already exist?

When I was working on the nested scopes, building was tedious at times
because a change to funcobject.h meant that, e.g., newmodule.c needed
to be recompiled.  The Makefiles didn't capture that information, so I
had been adding it to the individual Makefiles, e.g.

newmodule.o: newmodule.c ../Include/funcobject.h

(I think this worked.)

It would be great if the Makefile captured all the dependencies.
Could we just use makedepend?

Jeremy



From MarkH at ActiveState.com  Thu Jan 25 20:43:35 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Thu, 25 Jan 2001 11:43:35 -0800
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <20010125111936.A23512@thyrsus.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPOEBFDAAA.MarkH@ActiveState.com>

> I would appreciate it if those of you on Windows and Macintosh
> machines would test the waiting method. The test program will take
> some porting, because it needs to write to a FIFO in background.

This didn't compile under Windows.  I have a patch (against CVS) that
compiles, but doesn't appear to work (and will be forwarded to Eric under
separate cover) [news flash :-)  Changing the open call to add "rb" as the
mode makes it work - text v binary bites again]

I didn't try any sort of fifo test.

The sockets test failed with a socket error, but would certainly have failed
had the socket connected, as my patch includes:

#ifndef S_ISSOCK
#	define S_ISSOCK(mode) (0)
#endif

I have no idea if it managed to mail the results, but I guess not, so the
output is below.  The test file (after some small mods, including the "rb"
param) is indeed 4252 bytes long.
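
For the plain-file case, the arithmetic the test output reports
(3091 + 1161 = 4252) can be reproduced without the patch.  The sketch
below is not the patched `waiting' method; bytes_waiting is a
hypothetical helper that assumes a regular file opened with "rb", so
that positions are byte offsets on every platform.

```python
import os


def bytes_waiting(f):
    """Return the number of unread bytes in a regular file object.

    Hypothetical helper, not the patched `waiting' method: it compares
    the size reported by fstat with the current position, which only
    makes sense for plain files opened in binary mode ("rb").
    """
    return os.fstat(f.fileno()).st_size - f.tell()
```

It says nothing about FIFOs or sockets, where the file size is not a
meaningful notion.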

Hope this is useful!

Mark.

This program tests the `waiting' method of file objects.
Good, you're running a patched Python with `waiting' available.

First, plain files:
There are 4252 bytes waiting to be read in this file.
Please check this with your OS's directory tools.
I'll now read a random number (3091) of bytes.
The waiting method sees 1161 bytes left.
3091 + 1161 = 4252.  That's consistent.  Test passed.
Now let's see if we can detect EOF reliably.
I'll do a read()...the waiting method now returns 0
That looks like EOF.

Now sockets:
Connecting to imap.netaxs.com's IMAP server now...
Traceback (most recent call last):
  File "c:\temp\waiting_test.py", line 57, in ?
    sock.connect(("imap.netaxs.com", 143))
  File "<string>", line 1, in connect
socket.error: (10060, 'Operation timed out')




From nas at arctrix.com  Thu Jan 25 14:07:53 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 05:07:53 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14960.24223.599357.388059@localhost.localdomain>; from jeremy@alum.mit.edu on Thu, Jan 25, 2001 at 12:13:03PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain>
Message-ID: <20010125050753.A1573@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:13:03PM -0500, Jeremy Hylton wrote:
> What would it take to add useful dependency information to the
> Makefile?  Or does it already exist?

Some of it exists but I don't think it's complete.

> When I was working on the nested scopes, building was tedious at times
> because a change to funcobject.h meant that, e.g., newmodule.c needed
> to be recompiled.  The Makefiles didn't capture that information, so I
> had been adding it to the individual Makefiles, e.g.
> 
> newmodule.o: newmodule.c ../Include/funcobject.h
> 
> (I think this worked.)


Hmm, I don't think so.  Which makefile did you add this to?  Are
you using the new makefile?  The Makefile.pre.in file contains a
line like:

    $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

but newmodule.o is not in LIBRARY_OBJS.  By default it's compiled
not by make but by distutils.  If you add newmodule to Setup then a
line like:

    Modules/newmodule.o: $(PYTHON_HEADERS)

would do the trick.  I think I will add a line like:

    $(MODOBJS): $(PYTHON_HEADERS)

to fix the problem.

I could easily restore the mkdep target but my feeling right now is
that explicitly including the header dependencies is better.
What do you think?  

  Neil



From jeremy at alum.mit.edu  Thu Jan 25 21:02:46 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:02:46 -0500 (EST)
Subject: [Python-Dev] PEP 227 checkins to follow
Message-ID: <14960.34406.342961.834827@localhost.localdomain>

I am about to check in the changes that implement PEP 227.  There
are many changes, which I will make via separate commits.  You might
want to wait until the checkins are done to do an update.  I'll send a
note when I'm done.

I also wanted to mention that the PEP has fallen a little out of
date.  There are a few wrinkles that it doesn't deal with, e.g.
    def f(x):
        def g(y):
            return x + y
        del x
        return g

For now, this raises a SyntaxError.
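
For comparison, the ordinary case that nested scopes are meant to
support looks like the sketch below: the inner function reads x from the
enclosing call even after that call has returned.  The name add_ten is
made up for illustration, and the sketch leaves out the `del x' wrinkle
entirely.

```python
def f(x):
    def g(y):
        # x is not local to g; it is looked up in the enclosing scope of f.
        return x + y
    return g


# Each call to f produces a g bound to its own value of x.
add_ten = f(10)
```

Calling add_ten(5) adds the remembered 10 to 5.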

I'll flesh out the PEP to reflect the current implementation and spec
out some of the less obvious cases.

I'd welcome any comments on the code itself.  I know there are a
number of rough edges and also, most likely, a bunch of memory leaks.
I'll be working to clean things up before 2.1a2, but wanted to get the
code into CVS ASAP.

Jeremy



From jeremy at alum.mit.edu  Thu Jan 25 21:15:01 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 15:15:01 -0500 (EST)
Subject: [Python-Dev] checkins done for PEP 227
Message-ID: <14960.35141.237252.468467@localhost.localdomain>

It looks like python-dev is very slow, so you'll see my original
warning well after the checkins occurred.  Oh, well.  They're done.

Jeremy




From tim.one at home.com  Thu Jan 25 21:58:03 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 15:58:03 -0500
Subject: [Python-Dev] Fw: random.py gives wrong results (+ a solution) 
Message-ID: <LNBBLJKPBEHFEDALKOLCCEDDILAA.tim.one@home.com>

[/F, fwds a c.l.py claim that random.betavariate is dead wrong]

Not to worry; I had already entered that into the SF bug database and
assigned it to me (hmm:  why would you send it to Python-Dev instead of
putting it in the database?).  I suspect he's correct, and, more
importantly, so does Ivan Frohne.  We'll settle it before 2.1a2, but perhaps
not today.  Alas, I have no idea where the original code came from ("Guido"
isn't a useful answer -- he was just converting somebody else's C++ code to
Python).




From fredrik at effbot.org  Thu Jan 25 21:42:05 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Thu, 25 Jan 2001 21:42:05 +0100
Subject: [Python-Dev] Waiting method for file objects
References: <20010125111936.A23512@thyrsus.com>
Message-ID: <01fb01c0870f$48517110$e46940d5@hagrid>

eric wrote:

> Fortunately, this is less an issue than it appears.

only if you ignore Windows...

-1 on making this a file method

+0 on adding it as an optional support function to
the os module.

</F>




From martin at mira.cs.tu-berlin.de  Thu Jan 25 21:42:39 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Thu, 25 Jan 2001 21:42:39 +0100
Subject: [Python-Dev] jeremy@alum.mit.edu
Message-ID: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>

> It would be great if the Makefile captured all the dependencies.

That would be great, yes. However, setup.py should probably also
consider dependencies.

> Could we just use makedepend?

Not sure. Certainly not in the build process. I dislike distributions
which, as the first thing, perform dependency generation. Dependencies
change less often than the actual source, so it should be
sufficient to update them manually. Furthermore, generated files as
part of the CVS repository fail to work properly unless everybody uses
the exact same generator. For autoconf alone, that's a problem because
of multiple autoconf versions. I don't know how many different
makedepend versions are in use.

Regards,
Martin




From tim.one at home.com  Thu Jan 25 22:02:11 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 16:02:11 -0500
Subject: [Python-Dev] Windows compile broken
Message-ID: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>

Linking...
   Creating library ./python21.lib and object ./python21.exp
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
./python21.dll : fatal error LNK1120: 3 unresolved externals
Error executing link.exe.


Sorry if this has already been discussed.  I don't see mention of it in the
Python-Dev archive, and my email is almost worse than useless (random delays
of minutes to days, due to what appears to be the simultaneous worldwide
wedging of every email server servicing every email account I have).




From esr at thyrsus.com  Thu Jan 25 22:12:25 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:12:25 -0500
Subject: [Python-Dev] Waiting method for file objects
In-Reply-To: <01fb01c0870f$48517110$e46940d5@hagrid>; from fredrik@effbot.org on Thu, Jan 25, 2001 at 09:42:05PM +0100
References: <20010125111936.A23512@thyrsus.com> <01fb01c0870f$48517110$e46940d5@hagrid>
Message-ID: <20010125161225.A24305@thyrsus.com>

Fredrik Lundh <fredrik at effbot.org>:
> > Fortunately, this is less an issue than it appears.
> 
> only if you ignore Windows...

I don't understand this.  Explain?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Sometimes the law defends plunder and participates in it. Sometimes
the law places the whole apparatus of judges, police, prisons and
gendarmes at the service of the plunderers, and treats the victim --
when he defends himself -- as a criminal.
	-- Frederic Bastiat, "The Law"



From esr at thyrsus.com  Thu Jan 25 22:13:31 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 25 Jan 2001 16:13:31 -0500
Subject: [Python-Dev] jeremy@alum.mit.edu
In-Reply-To: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>; from martin@mira.cs.tu-berlin.de on Thu, Jan 25, 2001 at 09:42:39PM +0100
References: <200101252042.f0PKgd101532@mira.informatik.hu-berlin.de>
Message-ID: <20010125161331.B24305@thyrsus.com>

Martin v. Loewis <martin at mira.cs.tu-berlin.de>:
> Not sure. Certainly not in the build process. I dislike distributions
> which, as the first thing, perform dependency generation. Dependencies
> change less often than the actual source, so it should be
> sufficient to update them manually. Furthermore, generated files as
> part of the CVS repository fail to work properly unless everybody uses
> the exact same generator. For autoconf alone, that's a problem because
> of multiple autoconf versions. I don't know how many different
> makedepend versions are in use.

Easily solved -- there are script versions of makedepend we can just ship
with the distribution.
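
To give a rough idea of how small such a script can be, here is a
minimal sketch in Python.  It is not makedepend and not any script
shipped with Python: the names local_includes and dependency_rule and
the default "../Include" prefix are assumptions for the example, and it
only looks at direct, quoted #include lines (no recursion, no
preprocessor conditionals).

```python
import os
import re

# Matches only quoted includes such as: #include "funcobject.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*"([^"]+)"')


def local_includes(source_text):
    """Return the quoted header names included directly by the source."""
    headers = []
    for line in source_text.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            headers.append(match.group(1))
    return headers


def dependency_rule(source_path, source_text, include_dir="../Include"):
    """Build one make rule of the form 'target.o: source.c headers...'."""
    target = os.path.splitext(os.path.basename(source_path))[0] + ".o"
    prereqs = [source_path]
    prereqs += [include_dir + "/" + name for name in local_includes(source_text)]
    return target + ": " + " ".join(prereqs)
```

Fed the text of a source file, dependency_rule returns a single line in
the same shape as the hand-written newmodule.o rule quoted earlier in
this thread.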
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Morality is always the product of terror; its chains and
strait-waistcoats are fashioned by those who dare not trust others,
because they dare not trust themselves, to walk in liberty.
	-- Aldous Huxley 



From mal at lemburg.com  Thu Jan 25 22:26:04 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Thu, 25 Jan 2001 22:26:04 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
Message-ID: <3A7099EC.81689EA5@lemburg.com>

Tim Peters wrote:
> 
> Linking...
>    Creating library ./python21.lib and object ./python21.exp
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Set
> ceval.obj : error LNK2001: unresolved external symbol _PyCell_Get
> frameobject.obj : error LNK2001: unresolved external symbol _PyCell_New
> ./python21.dll : fatal error LNK1120: 3 unresolved externals
> Error executing link.exe.
> 
> Sorry if this has already been discussed.  I don't see mention of it in the
> Python-Dev archive, and my email is almost worse than useless (random delays
> of minutes to days, due to what appears to be the simultaneous worldwide
> wedging of every email server servicing every email account I have).

These must be related to checkins by Jeremy and his nested
scopes... (I knew these would get us into trouble ;-)

I think Jeremy forgot to check in the needed change for 
Objects/Makefile.in and probably the Windows project file is
missing the new object type too (Objects/cellobject.c).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Thu Jan 25 22:14:52 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:14:52 -0500 (EST)
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com>
	<3A7099EC.81689EA5@lemburg.com>
Message-ID: <14960.38732.773129.793360@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Tim Peters wrote:
  >>
  >> Linking...  Creating library ./python21.lib and object
  >> ./python21.exp ceval.obj : error LNK2001: unresolved external
  >> symbol _PyCell_Set ceval.obj : error LNK2001: unresolved external
  >> symbol _PyCell_Get frameobject.obj : error LNK2001: unresolved
  >> external symbol _PyCell_New ./python21.dll : fatal error LNK1120:
  >> 3 unresolved externals Error executing link.exe.
  >>
  >> Sorry if this has already been discussed.  I don't see mention of
  >> it in the Python-Dev archive, and my email is almost worse than
  >> useless (random delays of minutes to days, due to what appears to
  >> be the simultaneous worldwide wedging of every email server
  >> servicing every email account I have).

  MAL> These must be related to checkins by Jeremy and his nested
  MAL> scopes... (I knew these would get us into trouble ;-)

Just you wait and see!

  MAL> I think Jeremy forgot to check in the needed change for
  MAL> Objects/Makefile.in and probably the Windows project file is
  MAL> missing the new object type too (Objects/cellobject.c).

That's right.  I didn't change the Makefile in Objects or do anything
with Windows.  Don't know how to do the latter, but perhaps Tim will
stop by my desk next week and show me.  As for the Makefile, I thought
I saw a message from Neil saying not to update those anymore.

Jeremy



From nas at arctrix.com  Thu Jan 25 16:10:56 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:10:56 -0800
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Thu, Jan 25, 2001 at 12:04:16PM -0800
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125071056.A2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
> A cell contains a reference to a single PyObject.  It could be
> implemented as a mutable, one-element sequence, but the separate type
> has less overhead.

Can this object be involved in reference cycles?  If so, it
should probably have the GC methods added to it.

  Neil



From jeremy at alum.mit.edu  Thu Jan 25 22:42:04 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Thu, 25 Jan 2001 16:42:04 -0500 (EST)
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include cellobject.h,NONE,2.1 Python.h,2.30,2.31
In-Reply-To: <20010125071056.A2390@glacier.fnational.com>
References: <E14Lscy-00065x-00@usw-pr-cvs1.sourceforge.net>
	<20010125071056.A2390@glacier.fnational.com>
Message-ID: <14960.40364.594582.353511@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

  NS> On Thu, Jan 25, 2001 at 12:04:16PM -0800, Jeremy Hylton wrote:
  >> A cell contains a reference to a single PyObject.  It could be
  >> implemented as a mutable, one-element sequence, but the separate
  >> type has less overhead.

  NS> Can this object be involved in reference cycles?  If so, it
  NS> should probably have the GC methods added to it.

It's already there.  (Last five lines of cellobject.c quoted as
proof.) 

>	Py_TPFLAGS_DEFAULT | Py_TPFLAGS_GC,	/* tp_flags */
> 	0,					/* tp_doc */
> 	(traverseproc)cell_traverse,		/* tp_traverse */
> 	(inquiry)cell_clear,			/* tp_clear */
>};
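
Why the traverse and clear slots matter can be seen from Python code: a
nested function that refers to itself sits in a cycle that runs through
a cell.  The sketch below assumes a current CPython, where the cells are
visible as __closure__; the names Marker and
make_self_referencing_closure are invented for the illustration.

```python
import gc
import weakref


class Marker:
    """Plain object used only so that a weak reference can watch it."""


def make_self_referencing_closure():
    marker = Marker()

    def inner():
        # `inner` and `marker` are both free variables here, so each is
        # stored in a cell; the function holds the cells, and the `inner`
        # cell holds the function, which forms a reference cycle.
        return inner, marker

    return inner, weakref.ref(marker)
```

Once the caller drops its reference to the returned function, reference
counting alone cannot free it; only the cycle collector, which has to be
able to traverse cells, can reclaim the function and the marker.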



From nas at arctrix.com  Thu Jan 25 16:19:22 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 07:19:22 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A7099EC.81689EA5@lemburg.com>; from mal@lemburg.com on Thu, Jan 25, 2001 at 10:26:04PM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com>
Message-ID: <20010125071922.B2390@glacier.fnational.com>

On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> I think Jeremy forgot to check in the needed change for 
> Objects/Makefile.in

That file is dead.  Should I remove it now?  I haven't heard any
major complaints about Makefile.pre.in yet.  Maybe the messages
are all sitting in the python.org mail spool.  Barry, what the
hell is going on?  You need to drop that Postfix crap and get
qmail. :-)

  Neil



From thomas at xs4all.net  Thu Jan 25 23:19:37 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Thu, 25 Jan 2001 23:19:37 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>; from fdrake@users.sourceforge.net on Thu, Jan 25, 2001 at 02:13:36PM -0800
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010125231937.I962@xs4all.nl>

On Thu, Jan 25, 2001 at 02:13:36PM -0800, Fred L. Drake wrote:

> The addition of new parameters to functions in the Python/C API requires
> that PYTHON_API_VERSION be incremented.

When we update the API version, isn't it time to clean up the TP_HASFEATURE
stuff ? Since we updated the API, all the current slots should be there,
right ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Thu Jan 25 23:32:32 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 17:32:32 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include modsupport.h,2.35,2.36
In-Reply-To: Your message of "Thu, 25 Jan 2001 23:19:37 +0100."
             <20010125231937.I962@xs4all.nl> 
References: <E14Lue8-0006SF-00@usw-pr-cvs1.sourceforge.net>  
            <20010125231937.I962@xs4all.nl> 
Message-ID: <200101252232.RAA20013@cj20424-a.reston1.va.home.com>

> > The addition of new parameters to functions in the Python/C API requires
> > that PYTHON_API_VERSION be incremented.
> 
> When we update the API version, isn't it time to clean up the TP_HASFEATURE
> stuff ? Since we updated the API, all the current slots should be there,
> right ?

No, we're issuing a warning about old API versions but still try to
work with them.  After all most extensions don't create frame or code
objects.

I added the flags for the tp_richcompare field when I tried 2.1a1 with
Zope's ExtensionClasses and Acquisition modules.  Turns out I got a
core dump, while 2.1 ran flawlessly.  The reason: they have their own
type struct which has the same lay-out as the Python 1.5.2 (or even
older) type struct, followed by fields of their own.  They have the
tp_flags field set to 0, so up to 2.0, it was compatible.  I expect
that 2.1a2 will work with the unchanged Zope code because of the flag
I added.
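
The idea of guarding a newer slot with a flag bit can be modelled in a
few lines of Python.  This is a toy model, not the interpreter's C code:
the dict stands in for a type struct, get_richcompare is a hypothetical
helper, and the bit value is illustrative only.

```python
# Illustrative bit value; the real constant lives in CPython's object.h.
HAVE_RICHCOMPARE = 1 << 5


def get_richcompare(type_struct):
    """Return the rich-comparison slot only if the type declares it.

    `type_struct` is a hypothetical dict standing in for a C type struct.
    """
    if type_struct.get("tp_flags", 0) & HAVE_RICHCOMPARE:
        return type_struct.get("tp_richcompare")
    # Old-layout struct: whatever sits at that offset is not ours to read.
    return None
```

A struct with tp_flags set to 0, like the ones described above, is
treated as having no such slot, so data that an extension placed after
the old layout is never misread as a function pointer.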

--Guido van Rossum (home page: http://www.python.org/~guido/)



From mal at lemburg.com  Fri Jan 26 00:04:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 00:04:54 +0100
Subject: [Python-Dev] Windows compile broken
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>
Message-ID: <3A70B116.12BF756B@lemburg.com>

Neil Schemenauer wrote:
> 
> On Thu, Jan 25, 2001 at 10:26:04PM +0100, M.-A. Lemburg wrote:
> > I think Jeremy forgot to check in the needed change for
> > Objects/Makefile.in
> 
> That file is dead.  Should I remove it now?  I haven't heard any
> major complaints about Makefile.pre.in yet.

What about that file ? Are you saying that Makefile.pre.in
will no longer work in 2.1 ??? 

Please don't remove that mechanism -- it has been in use for
quite a while and is much more stable than distutils. We should
at least wait a few more distutils releases for the dust to
settle before removing the old fallback solution.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Fri Jan 26 00:06:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Thu, 25 Jan 2001 18:06:40 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: Your message of "Fri, 26 Jan 2001 00:04:54 +0100."
             <3A70B116.12BF756B@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com>  
            <3A70B116.12BF756B@lemburg.com> 
Message-ID: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>

> > That file is dead.  Should I remove it now?  I haven't heard any
> > major complaints about Makefile.pre.in yet.
> 
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 
> 
> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

Let's at least mark it clearly as obsolete though -- it's a pain to
maintain.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Thu Jan 25 17:31:28 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:31:28 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <3A70B116.12BF756B@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 12:04:54AM +0100
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com>
Message-ID: <20010125083128.A2699@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> What about that file ? Are you saying that Makefile.pre.in
> will no longer work in 2.1 ??? 

I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
you talking about?  Modules/Makefile.pre.in is dead too.  There
is a Makefile.pre.in in the toplevel directory which does the
same thing.  There is also Misc/Makefile.pre.in.  That file gets
installed into lib and still works as it always did.  The toplevel
Makefile.pre.in can use Modules/Setup* just like the old
Modules/Makefile.pre.in could.  Does this address your concerns?

> Please don't remove that mechanism -- it has been in use for
> quite a while and is much more stable than distutils. We should
> at least wait a few more distutils releases for the dust to
> settle before removing the old fallback solution.

No doubt.

  Neil



From nas at arctrix.com  Thu Jan 25 17:33:48 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Thu, 25 Jan 2001 08:33:48 -0800
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <200101252306.SAA20173@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Thu, Jan 25, 2001 at 06:06:40PM -0500
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <200101252306.SAA20173@cj20424-a.reston1.va.home.com>
Message-ID: <20010125083348.B2699@glacier.fnational.com>

On Thu, Jan 25, 2001 at 06:06:40PM -0500, Guido van Rossum wrote:
> Let's at least mark it clearly as obsolete though -- it's a pain to
> maintain.

Are you talking about Misc/Makefile.pre.in?  If so, how do you
suggest we mark it?

I don't think Modules/Setup should go away any time soon.  I
often like to build lots of modules statically into the
interpreter.  setup.py has no support for building static
modules.

  Neil



From tim.one at home.com  Fri Jan 26 00:27:52 2001
From: tim.one at home.com (Tim Peters)
Date: Thu, 25 Jan 2001 18:27:52 -0500
Subject: [Python-Dev] Windows compile broken
In-Reply-To: <14960.38732.773129.793360@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEBILAA.tim.one@home.com>

Thanks for the clues, everyone!  I'll fix it for Windows.  Note that I'm
getting email in wild bursts, and most often delayed.  So I'm generally not
seeing any checkin msgs, or SF bug email, or Python-Dev email, ..., anywhere
near the time (or, alas, sometimes even day) they're generated.  So I simply
didn't see the checkin msg introducing cellobject.c.

all's-well-that-looks-like-it-may-end-ly y'rs  - tim




From mal at lemburg.com  Fri Jan 26 10:32:14 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:32:14 +0100
Subject: [Python-Dev] Makefile.pre.in (Windows compile broken)
References: <LNBBLJKPBEHFEDALKOLCGEDDILAA.tim.one@home.com> <3A7099EC.81689EA5@lemburg.com> <20010125071922.B2390@glacier.fnational.com> <3A70B116.12BF756B@lemburg.com> <20010125083128.A2699@glacier.fnational.com>
Message-ID: <3A71441E.4584A5C8@lemburg.com>

Neil Schemenauer wrote:
> 
> On Fri, Jan 26, 2001 at 12:04:54AM +0100, M.-A. Lemburg wrote:
> > What about that file ? Are you saying that Makefile.pre.in
> > will no longer work in 2.1 ???
> 
> I'm talking about Objects/Makefile.in.  Which Makefile.pre.in are
> you talking about?  Modules/Makefile.pre.in is dead too.  There
> is a Makefile.pre.in in the toplevel directory which does the
> same thing.  There is also Misc/Makefile.pre.in.  That file gets
> installed into lib and still works as it always did.  The toplevel
> Makefile.pre.in can use Modules/Setup* just like the old
> Modules/Makefile.pre.in could.  Does this address your concerns?

Yes. Thanks. I was talking about the Misc/Makefile.pre.in mechanism
which was used in the past by many Python C extensions to provide
a portable way of compiling the extension into a shared module or
statically into the Python interpreter.
 
I have been using that mechanism for years now and with much
success. Even though I am currently moving to distutils I have
no idea how stable distutils is on exotic platforms or ones which
have special needs (like e.g. AIX).

> > Please don't remove that mechanism -- it has been in use for
> > quite a while and is much more stable than distutils. We should
> > at least wait a few more distutils releases for the dust to
> > settle before removing the old fallback solution.
> 
> No doubt.

Ok.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Fri Jan 26 10:37:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 10:37:12 +0100
Subject: [Python-Dev] setup.py
Message-ID: <3A714548.C487DCC9@lemburg.com>

I have posted two messages here regarding the new setup.py
mechanism for building Modules/ but have received no comments
on them so far. Here's another go:

1. I think that setup.py should output warnings about modules 
   which cannot be built for some reason rather than
   aborting the build process completely.

2. I suggest adding -L/usr/lib/termcap to the readline extension.
   This doesn't hurt anywhere and will get this extension to compile
   on SuSE Linux too.

Thoughts ?
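
As a sketch of point 1, the build loop could catch a failure per module,
warn, and carry on.  The code below does not use distutils and is not
taken from setup.py; build_extensions and the build_one callback are
hypothetical names for the illustration.

```python
import warnings


def build_extensions(names, build_one):
    """Try to build every module; warn about failures instead of aborting.

    `build_one` is a hypothetical callable that builds a single module
    and raises an exception if the build fails.
    """
    built, skipped = [], []
    for name in names:
        try:
            build_one(name)
        except Exception as exc:
            warnings.warn("could not build module %s: %s" % (name, exc))
            skipped.append(name)
        else:
            built.append(name)
    return built, skipped
```

The caller gets both lists back, so a summary of skipped modules can be
printed at the end of the run.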

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Fri Jan 26 13:27:56 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 07:27:56 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A714548.C487DCC9@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 10:37:12AM +0100
References: <3A714548.C487DCC9@lemburg.com>
Message-ID: <20010126072756.A5013@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> 1. I think that setup.py should output warnings about modules 
>    which cannot be built for some reason rather than
>    aborting the build process completely.
> 
> 2. I suggest adding -L/usr/lib/termcap to the readline extension.
>    This doesn't hurt anywhere and will get this extension to compile
>    on SuSE Linux too.

Both good ideas.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.



From mal at lemburg.com  Fri Jan 26 15:13:45 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:13:45 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com>
Message-ID: <3A718619.6278AF41@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > 1. I think that setup.py should output warnings about modules
> >    which cannot be built for some reason rather than
> >    aborting the build process completely.
> >
> > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> >    This doesn't hurt anywhere and will get this extension to compile
> >    on SuSE Linux too.
> 
> Both good ideas.

Should I implement the two and check these in ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Fri Jan 26 15:25:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 26 Jan 2001 09:25:59 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <3A718619.6278AF41@lemburg.com>; from mal@lemburg.com on Fri, Jan 26, 2001 at 03:13:45PM +0100
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com>
Message-ID: <20010126092559.A5623@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> "Eric S. Raymond" wrote:
> > 
> > M.-A. Lemburg <mal at lemburg.com>:
> > > 1. I think that setup.py should output warnings about modules
> > >    which cannot be built for some reason rather than having
> > >    it abort the build process completely.
> > >
> > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > >    This doesn't hurt anywhere and will get this extension to compile
> > >    on SuSE Linux too.
> > 
> > Both good ideas.
> 
> Should I implement the two and check these in ?

I may not channel Guido the way Tim does, but I suspect he gave you
developer privileges because he trusts you to do routine stuff like this.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

The saddest life is that of a political aspirant under democracy. His
failure is ignominious and his success is disgraceful.
        -- H.L. Mencken



From mal at lemburg.com  Fri Jan 26 15:29:18 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 15:29:18 +0100
Subject: [Python-Dev] setup.py
References: <3A714548.C487DCC9@lemburg.com> <20010126072756.A5013@thyrsus.com> <3A718619.6278AF41@lemburg.com> <20010126092559.A5623@thyrsus.com>
Message-ID: <3A7189BE.C6C2806E@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > "Eric S. Raymond" wrote:
> > >
> > > M.-A. Lemburg <mal at lemburg.com>:
> > > > 1. I think that setup.py should output warnings about modules
> > > >    which cannot be built for some reason rather than having
> > > >    it abort the build process completely.
> > > >
> > > > 2. I suggest adding -L/usr/lib/termcap to the readline extension.
> > > >    This doesn't hurt anywhere and will get this extension to compile
> > > >    on SuSE Linux too.
> > >
> > > Both good ideas.
> >
> > Should I implement the two and check these in ?
> 
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Just asking because setup.py is Andrew's baby. I'll add the above
two later today.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mwh21 at cam.ac.uk  Fri Jan 26 17:40:47 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 26 Jan 2001 16:40:47 +0000
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
Message-ID: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>

Following discussion on c.l.py I've just submitted:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103441&group_id=5470

which implements a syntax for adding function attributes inline:

>>> def f(a) having (publish=1):
...  print 1
... 
>>> f.publish
1

It uses an "import-as" like strategy to avoid making "having" a
keyword (which interacts a bit badly with error reporting, as it
happens).  Obviously, it would be easy to change "having" to a
different word.

Another idea I had was:

>>> def f(a) having (.publish=1):
...  print 1
... 
>>> f.publish
1

to emphasize the attributeness of what's going on, but I didn't like
this as much in practice (I always forgot the period!).

Emile van Sebille also suggested

>>> d = {'a':1}
>>> def f(a) having (**d):
...  print 1
... 
>>> f.a
1

which I haven't implemented, because I didn't really like it, but I
thought I'd mention.

I'll do test suites and documentation in time, but I thought I'd call
in here to check the idea wasn't DOA.  What do you all think?

Cheers,
M.

-- 
  surely, somewhere, somehow, in the history of computing, at least
  one manual has been written that you could at least remotely
  attempt to consider possibly glancing at.              -- Adam Rixey





From nas at arctrix.com  Fri Jan 26 10:55:57 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 01:55:57 -0800
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
In-Reply-To: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>; from mwh21@cam.ac.uk on Fri, Jan 26, 2001 at 04:40:47PM +0000
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <20010126015556.A4215@glacier.fnational.com>

I don't see what's wrong with:

    def f(a):
        print 1
    f.publish = 1

It's perfectly clear to me.  As a bonus it works already.  I'm -1
on inventing more syntax.
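
For reference, a small sketch of the no-new-syntax approach in present-day
Python.  Updating the function's __dict__ from a mapping is an assumed
equivalent of the "having (**d)" idea from the original post, not
something proposed in this thread:

```python
def f(a):
    return a


# Plain attribute assignment after the definition, as in the message above.
f.publish = 1

# Bulk assignment from a mapping: function attributes live in f.__dict__.
d = {"a": 1}
f.__dict__.update(d)

if __name__ == "__main__":
    print(f.publish, f.a)
```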

  Neil



From evan at digicool.com  Fri Jan 26 18:12:43 2001
From: evan at digicool.com (Evan Simpson)
Date: Fri, 26 Jan 2001 12:12:43 -0500
Subject: [Python-Dev] [PEP 232] Syntactic support for function attributes strawman.
References: <m3ofwuw9kg.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <00c001c087bb$322a9720$3e48a4d8@digicool.com>

From: Michael Hudson <mwh21 at cam.ac.uk>
> >>> def f(a) having (publish=1):
> ...  print 1

This doesn't really need special syntax.  I would much rather have this (or
something like it) as a way of spelling initialized local variables.  That
is, when I want static local variables, instead of corrupting the function
signature by writing:

def f(x, marker=[], foo=foo)

...I could write:

def f(x) having (marker=[], foo)
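
A sketch of the idiom being described, in present-day Python: the mutable
default argument acts as a static local but also shows up in the
signature, where a caller can override it.  The second function keeps the
state in a function attribute instead.  Both function names are
hypothetical:

```python
def count_default(x, _seen=[]):
    """Static state smuggled in as a default argument (part of the signature)."""
    _seen.append(x)
    return len(_seen)


def count_attr(x):
    """Same state kept as a function attribute; the signature stays clean."""
    count_attr.seen.append(x)
    return len(count_attr.seen)


count_attr.seen = []
```

Calling count_default("c", []) silently bypasses the shared list, which
is the signature corruption the message complains about.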

Cheers,

Evan @ digicool




From jeremy at alum.mit.edu  Fri Jan 26 18:58:24 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 12:58:24 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <20010125050753.A1573@glacier.fnational.com>
References: <20010124073155.B32266@glacier.fnational.com>
	<14960.24223.599357.388059@localhost.localdomain>
	<20010125050753.A1573@glacier.fnational.com>
Message-ID: <14961.47808.315324.734238@localhost.localdomain>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

  >> When I was working the nested scopes, building was tedious at
  >> times because a change to funcobject.h meant that, e.g.,
  >> newmodule.c needed to be recompiled.  The Makefiles didn't
  >> capture that information, so I had been adding it to the
  >> individual Makefiles, e.g.
  >>
  >> newmodule.o: newmodule.c ../Include/funcobject.h
  >>
  >> (I think this worked.)

  NS> Hmm, I don't think so.  Which makefile did you add this to?

Just to clarify: I added this line to the old Makefile before you
checked the new one in.

  NS> Hmm, I don't think so.  Which makefile did you add this to?  Are
  NS> you using the new makefile?  The Makefile.pre.in file contains a
  NS> line like:

  NS>     $(LIBRARY_OBJS) $(MAINOBJ): $(PYTHON_HEADERS)

  NS> but newmodule.o is not in LIBRARY_OBJS.  By default it's not
  NS> compiled by make but with distutils.  If you add newmodule to
  NS> Setup then a line like:

  NS>     Modules/newmodule.o: $(PYTHON_HEADERS)

  NS> would do the trick.  I think I will add a line like:

  NS>     $(MODOBJS): $(PYTHON_HEADERS)

  NS> to fix the problem.

  NS> I could easily restore the mkdep target but my feeling right now
  NS> is that explicitly including the header dependencies is better.
  NS> What do you think?

Isn't it overkill to have every .o file depend on all the .h files?
If I change cobject.h, there are very few .o files that depend on this
change.  I suppose, however, it's not worth the effort to get it right
at a finer granularity, e.g. that the only files that depend on
cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
and unicodeobject.
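
A rough sketch of what tracking dependencies at a finer granularity would
involve, in present-day Python.  It only finds headers that a source file
includes directly with double quotes; it does not follow the subheaders
pulled in through Python.h.  The helper names and the sample source text
are hypothetical:

```python
import re

# Matches lines such as:  #include "funcobject.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


def direct_includes(source_text):
    """Return the sorted, de-duplicated quoted headers a C source includes."""
    return sorted(set(INCLUDE_RE.findall(source_text)))


def make_rule(obj, src, source_text, include_dir="Include"):
    """Format a Makefile dependency rule for one object file."""
    headers = ["%s/%s" % (include_dir, h) for h in direct_includes(source_text)]
    return "%s: %s" % (obj, " ".join([src] + headers))


SAMPLE = '#include "Python.h"\n#include "funcobject.h"\n#include <stdio.h>\n'

if __name__ == "__main__":
    print(make_rule("newmodule.o", "newmodule.c", SAMPLE, "../Include"))
```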

Jeremy






From fdrake at acm.org  Fri Jan 26 21:36:18 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Fri, 26 Jan 2001 15:36:18 -0500 (EST)
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>
References: <20010124073155.B32266@glacier.fnational.com>
	<14960.24223.599357.388059@localhost.localdomain>
	<20010125050753.A1573@glacier.fnational.com>
	<14961.47808.315324.734238@localhost.localdomain>
Message-ID: <14961.57282.880552.358709@cj42289-a.reston1.va.home.com>

Jeremy Hylton writes:
 > Isn't it overkill to have every .o file depend on all the .h files?
 > If I change cobject.h, there are very few .o files that depend on this
 > change.  I suppose, however, it's not worth the effort to get it right

  Perhaps.  It's definitely easier to maintain than tracking it more
specifically and better than what we had, so I'll live with it.  ;)

 > at a finer granularity, e.g. that the only files that depend on
 > cobject.h are cobject, cStringIO, unicodedata, _cursesmodule, object,
 > and unicodeobject.

  And py_curses.h, which is also used in _curses_panel.c.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From nas at arctrix.com  Fri Jan 26 14:58:50 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 05:58:50 -0800
Subject: [Python-Dev] Makefile changes
In-Reply-To: <14961.47808.315324.734238@localhost.localdomain>; from jeremy@alum.mit.edu on Fri, Jan 26, 2001 at 12:58:24PM -0500
References: <20010124073155.B32266@glacier.fnational.com> <14960.24223.599357.388059@localhost.localdomain> <20010125050753.A1573@glacier.fnational.com> <14961.47808.315324.734238@localhost.localdomain>
Message-ID: <20010126055850.C4918@glacier.fnational.com>

On Fri, Jan 26, 2001 at 12:58:24PM -0500, Jeremy Hylton wrote:
> Isn't it overkill to have every .o file depend on all the .h files?

Maybe, but Python compiles pretty fast anyhow.  I'd rather err
on the safe side (ie. compiling too much).  Trying to figure out
which of the subheaders a .c file uses when it imports Python.h
would be a lot of work and error prone.  More power to you if you
want to do it.  ;-)

  Neil



From dgoodger at atsautomation.com  Fri Jan 26 22:46:13 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 16:46:13 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>

[CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]

I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
rusty (long live Python!), I don't know my way around configure, and am not
familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
tweaks), but I'm getting caught by the new way of building things. Please
help if you can! Many thanks in advance.

Here's an excerpt of my efforts:

    # cd /tmp/py
    # gunzip -c < python-2.1a1.tgz | tar -rf -
    # cd Python-2.1a1
    # ./configure 2>&1 | tee ../configure.1
    # make 2>&1 | tee ../make.1
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    'import site' failed; use -v for traceback
    Traceback (most recent call last):
      File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
        import sys, os, string, getopt
    ImportError: No module named string

Running ./python results in stack overflow. The old QNX instructions in
README recommend editing Modules/Makefile:
    LDFLAGS=    -N 64k

    # make 2>&1 | tee ../make.2

Same error as first make. But now the stack doesn't overflow.

    # python
    'import site' failed; use -v for traceback
    Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
    Type "copyright", "credits" or "license" for more information.
    >>> import sys
    >>> sys.path
    ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
    '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
    '/tmp/py/Python-2.1a1/Modules']
    >>> ^D

    # fullpath .
    . is //5/tmp/py/Python-2.1a1

The QNX node number prefix '//5' (machine or host number, equivalent to a
'hostname:' prefix for network paths) is being reduced somehow (path
normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
required at the head of the path. Is this something that can be fixed?

I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
to correct this:

    # prefix -A /5=//5

Now /5 points to //5, similar to a link.

    # make 2>&1 | tee ../make.3
    ...
    ./python //5/tmp/py/Python-2.1a1/setup.py build
    unable to execute ld: No such file or directory
    running build
    running build_ext
    building 'struct' extension
    creating build
    creating build/temp.qnx-J-PCI-2.1
    cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
-I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
build/temp.qnx-J-PCI-2.1/structmodule.o
    creating build/lib.qnx-J-PCI-2.1
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

QNX doesn't have an 'ld' command. Is configure not getting its info to
setup.py? (Is it supposed to?)

What should I check? I have logs of each of the configure & make runs.
Should I submit this as a bug on SourceForge?

Hope to hear from somebody soon.


David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger at atsautomation.com



From guido at digicool.com  Fri Jan 26 22:52:47 2001
From: guido at digicool.com (Guido van Rossum)
Date: Fri, 26 Jan 2001 16:52:47 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 16:46:13 EST."
             <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> 
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> 
Message-ID: <200101262152.QAA26624@cj20424-a.reston1.va.home.com>

> [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> 
> I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> rusty (long live Python!), I don't know my way around configure, and am not
> familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> tweaks), but I'm getting caught by the new way of building things. Please
> help if you can! Many thanks in advance.
> 
> Here's an excerpt of my efforts:
> 
>     # cd /tmp/py
>     # gunzip -c < python-2.1a1.tgz | tar -rf -
>     # cd Python-2.1a1
>     # ./configure 2>&1 | tee ../configure.1
>     # make 2>&1 | tee ../make.1
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     'import site' failed; use -v for traceback
>     Traceback (most recent call last):
>       File "//5/tmp/py/Python-2.1a1/setup.py", line 4, in ?
>         import sys, os, string, getopt
>     ImportError: No module named string
> 
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2
> 
> Same error as first make. But now the stack doesn't overflow.
> 
>     # python
>     'import site' failed; use -v for traceback
>     Python 2.1a1 (#2, Jan 26 2001, 11:38:55) [C] on qnxJ
>     Type "copyright", "credits" or "license" for more information.
>     >>> import sys
>     >>> sys.path
>     ['', '/usr/local/lib/python', '/home/dgoodger/lib/python', 
>     '/5/tmp/py/Python-2.1a1/Lib', '/5/tmp/py/Python-2.1a1/Lib/plat-qnxJ', 
>     '/tmp/py/Python-2.1a1/Modules']
>     >>> ^D
> 
>     # fullpath .
>     . is //5/tmp/py/Python-2.1a1
> 
> The QNX node number prefix '//5' (machine or host number, equivalent to a
> 'hostname:' prefix for network paths) is being reduced somehow (path
> normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> required at the head of the path. Is this something that can be fixed?

Aha -- you may need QNX-specific path manipulation functions.  What's
going on is that site.py normalizes the entries in sys.path, using
this function:

    def makepath(*paths):
	dir = os.path.join(*paths)
	return os.path.normcase(os.path.abspath(dir))

I've got a feeling that os.path.abspath(dir) here is the culprit in
posixpath.py:

def abspath(path):
    """Return an absolute path."""
    if not isabs(path):
        path = join(os.getcwd(), path)
    return normpath(path)

And here I think that normpath(path) is the routine that actually gets
rid of the double leading /.

Feel free to submit a patch that leaves double leading slashes in if
on QNX.
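
One possible shape for such a patch, sketched in present-day Python as a
wrapper around posixpath.normpath.  The name normpath_keep_double_slash
is hypothetical, and keeping exactly two leading slashes (while
collapsing three or more) is an assumption about the desired behaviour:

```python
import posixpath


def normpath_keep_double_slash(path):
    """Normalize path but preserve a leading '//' (e.g. a QNX node prefix)."""
    stripped = path.lstrip("/")
    leading = len(path) - len(stripped)
    if leading == 0:
        # Relative path: nothing special to preserve.
        return posixpath.normpath(path)
    # Normalize with a single leading slash, then restore '//' if needed.
    body = posixpath.normpath("/" + stripped)
    if leading == 2:
        return "/" + body
    return body


if __name__ == "__main__":
    print(normpath_keep_double_slash("//5/tmp/py/../py/Python-2.1a1"))
```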

> I added a prefix (QNX virtual-to-real path mapping on the filesystem tree)
> to correct this:
> 
>     # prefix -A /5=//5
> 
> Now /5 points to //5, similar to a link.
> 
>     # make 2>&1 | tee ../make.3
>     ...
>     ./python //5/tmp/py/Python-2.1a1/setup.py build
>     unable to execute ld: No such file or directory
>     running build
>     running build_ext
>     building 'struct' extension
>     creating build
>     creating build/temp.qnx-J-PCI-2.1
>     cc -O -I. -I/5/tmp/py/Python-2.1a1/./Include -IInclude/
> -I/usr/local/include -c /5/tmp/py/Python-2.1a1/Modules/structmodule.c -o
> build/temp.qnx-J-PCI-2.1/structmodule.o
>     creating build/lib.qnx-J-PCI-2.1
>     ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o
> build/lib.qnx-J-PCI-2.1/struct.so
>     error: command 'ld' failed with exit status 1
>     make: *** [sharedmods] Error 1
> 
> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)
> 
> What should I check? I have logs of each of the configure & make runs.
> Should I submit this as a bug on SourceForge?
> 
> Hope to hear from somebody soon.

This is probably in the realm of the distutils.  I have no idea how to
teach it to build on QNX, sorry!

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at cnri.reston.va.us  Fri Jan 26 23:01:01 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:01:01 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126170101.B2762@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
>    ImportError: No module named string

The 'import string' in setup.py actually seems to be redundant now,
since nothing seems to actually refer to the string module.  I've
removed it from CVS.

>The QNX node number prefix '//5' (machine or host number, equivalent to a
>'hostname:' prefix for network paths) is being reduced somehow (path
>normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
>required at the head of the path. Is this something that can be fixed?

Ooh, very likely:
>>> os.path.normpath('//5/foo/bar')
'/5/foo/bar'

Isn't // at the root a Unix convention of some sort for some 
network filesystems?  Probably normpath() should just leave it alone.

>QNX doesn't have an 'ld' command. Is configure not getting its info to
>setup.py? (Is it supposed to?)

setup.py should be parsing the Makefile.  The old QNX instructions say
Modules/Makefile should be edited, but with Neil's non-recursive
Makefile patch (committed after alpha1's release), editing
Modules/Makefile will have no effect.  Try editing just the top-level
Makefile, which should affect setup.py.
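
To illustrate what "parsing the Makefile" means here, a deliberately
simplified sketch in present-day Python.  It is not the distutils
implementation: it handles only plain NAME=value lines and does not
expand $(VAR) references or continuation lines.  The function name and
the sample Makefile text are hypothetical:

```python
import re

# NAME = value, where NAME is a typical Makefile variable identifier.
ASSIGN_RE = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(.*)$")


def parse_makefile_vars(text):
    """Collect simple variable assignments from Makefile text into a dict."""
    result = {}
    for line in text.splitlines():
        match = ASSIGN_RE.match(line)
        if match:
            result[match.group(1)] = match.group(2).strip()
    return result


SAMPLE = "CC=\t\tcc\nLDSHARED=\tld\nLDFLAGS=    -N 64k\n# comment\nall: python\n"

if __name__ == "__main__":
    print(parse_makefile_vars(SAMPLE))
```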

--amk
 



From mal at lemburg.com  Fri Jan 26 23:15:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:15:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us>
Message-ID: <3A71F6ED.D6D642A7@lemburg.com>

"Andrew M. Kuchling" wrote:
> >The QNX node number prefix '//5' (machine or host number, equivalent to a
> >'hostname:' prefix for network paths) is being reduced somehow (path
> >normalization?) to '/5', so paths don't resolve. 2 slashes ('//') are
> >required at the head of the path. Is this something that can be fixed?
> 
> Ooh, very likely:
> >>> os.path.normpath('//5/foo/bar')
> '/5/foo/bar'
> 
> Isn't // at the root a Unix convention of some sort for some
> network filesystems?  Probably normpath() should just leave it alone.

Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
should probably leave the leading '//' untouched (having too
many of those in the path doesn't do any harm, AFAIK).
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From nas at arctrix.com  Fri Jan 26 16:26:12 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:26:12 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 04:46:13PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE>
Message-ID: <20010126072611.A5345@glacier.fnational.com>

On Fri, Jan 26, 2001 at 04:46:13PM -0500, Goodger, David wrote:
> Running ./python results in stack overflow. The old QNX instructions in
> README recommend editing Modules/Makefile:
>     LDFLAGS=    -N 64k
> 
>     # make 2>&1 | tee ../make.2

The README should be changed to say edit the toplevel Makefile.
Should those flags be the default?  If you can give me the
MACHDEP from your Makefile I can add it to configure.in.

> QNX doesn't have an 'ld' command. Is configure not getting its info to
> setup.py? (Is it supposed to?)

I'm not sure how distutils figures out what to use for ld.  It
doesn't appear in the Makefile.  I think this is probably some
distutils thing.  Andrew?

  Neil



From fredrik at effbot.org  Fri Jan 26 23:25:34 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Fri, 26 Jan 2001 23:25:34 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com>
Message-ID: <001a01c087e6$ec3b9710$e46940d5@hagrid>

mal wrote:
> > Ooh, very likely:
> > >>> os.path.normpath('//5/foo/bar')
> > '/5/foo/bar'
> > 
> > Isn't // at the root a Unix convention of some sort for some
> > network filesystems?  Probably normpath() should just leave it alone.
> 
> Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> should probably leave the leading '//' untouched (having too
> many of those in the path doesn't do any harm, AFAIK).

from 1.5.2's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    import string
    # Treat initial slashes specially
    slashes = ''
    while path[:1] == '/':
        slashes = slashes + '/'
        path = path[1:]
    ...
    return slashes + string.joinfields(comps, '/')

from 2.0's posixpath:

def normpath(path):
    """Normalize path, eliminating double slashes, etc."""
    if path == '':
        return '.'
    import string
    initial_slash = (path[0] == '/')
    ...
    if initial_slash:
        path = '/' + path
    return path or '.'

interesting...

Cheers /F




From akuchlin at cnri.reston.va.us  Fri Jan 26 23:28:03 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Fri, 26 Jan 2001 17:28:03 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126072611.A5345@glacier.fnational.com>; from nas@arctrix.com on Fri, Jan 26, 2001 at 07:26:12AM -0800
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com>
Message-ID: <20010126172803.A2817@amarok.cnri.reston.va.us>

On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
>I'm not sure how distutils figures out what to use for ld.  It
>doesn't appear in the Makefile.  I think this is probably some
>distutils thing.  Andrew?

It looks at LDSHARED.  See customize_compiler in
Lib/distutils/sysconfig.py.  Looking in Modules/Makefile, LDFLAGS is
only used for the final link to produce a Python executable, so I
think this is up to the Makefile, not setup.py.
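
A quick way to inspect these build variables from a running interpreter,
sketched with the sysconfig module of present-day Python rather than the
Lib/distutils/sysconfig.py discussed above.  The helper name
linker_settings is hypothetical, and a variable comes back as None on
platforms where it is not defined:

```python
import sysconfig


def linker_settings():
    """Return the recorded build-time values of CC, LDSHARED and LDFLAGS."""
    names = ("CC", "LDSHARED", "LDFLAGS")
    return {name: sysconfig.get_config_var(name) for name in names}


if __name__ == "__main__":
    for name, value in linker_settings().items():
        print("%-9s %r" % (name, value))
```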

--amk



From nas at arctrix.com  Fri Jan 26 16:56:41 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 07:56:41 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <20010126172803.A2817@amarok.cnri.reston.va.us>; from akuchlin@cnri.reston.va.us on Fri, Jan 26, 2001 at 05:28:03PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126072611.A5345@glacier.fnational.com> <20010126172803.A2817@amarok.cnri.reston.va.us>
Message-ID: <20010126075641.A5534@glacier.fnational.com>

On Fri, Jan 26, 2001 at 05:28:03PM -0500, Andrew M. Kuchling wrote:
> On Fri, Jan 26, 2001 at 07:26:12AM -0800, Neil Schemenauer wrote:
> >I'm not sure how distutils figures out what to use for ld.
> 
> It looks at LDSHARED.

Okay.  David, what should LDSHARED say for QNX?  I can add the
magic to configure.in.

  Neil



From mal at lemburg.com  Fri Jan 26 23:51:09 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Fri, 26 Jan 2001 23:51:09 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>
Message-ID: <3A71FF5D.DC609775@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> > > Ooh, very likely:
> > > >>> os.path.normpath('//5/foo/bar')
> > > '/5/foo/bar'
> > >
> > > Isn't // at the root a Unix convention of some sort for some
> > > network filesystems?  Probably normpath() should just leave it alone.
> >
> > Samba uses //<hostname>/<mountname>/<path>. os.path.normpath()
> > should probably leave the leading '//' untouched (having too
> > many of those in the path doesn't do any harm, AFAIK).
> 
> from 1.5.2's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     import string
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     ...
>     return slashes + string.joinfields(comps, '/')
> 
> from 2.0's posixpath:
> 
> def normpath(path):
>     """Normalize path, eliminating double slashes, etc."""
>     if path == '':
>         return '.'
>     import string
>     initial_slash = (path[0] == '/')
>     ...
>     if initial_slash:
>         path = '/' + path
>     return path or '.'
> 
> interesting...

Here's the log message:

revision 1.34
date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
added rewritten normpath from Moshe Zadka that does the right thing with
paths containing ..

and the diff:

diff -r1.34 -r1.33
349,350d348
<     if path == '':
<         return '.'
352,367c350,372
<     initial_slash = (path[0] == '/')
<     comps = string.split(path, '/')
<     new_comps = []
<     for comp in comps:
<         if comp in ('', '.'):
<             continue
<         if (comp != '..' or (not initial_slash and not new_comps) or 
<              (new_comps and new_comps[-1] == '..')):
<             new_comps.append(comp)
<         elif new_comps:
<             new_comps.pop()
<     comps = new_comps
<     path = string.join(comps, '/')
<     if initial_slash:
<         path = '/' + path
<     return path or '.'
---
>     # Treat initial slashes specially
>     slashes = ''
>     while path[:1] == '/':
>         slashes = slashes + '/'
>         path = path[1:]
>     comps = string.splitfields(path, '/')
>     i = 0
>     while i < len(comps):
>         if comps[i] == '.':
>             del comps[i]
>             while i < len(comps) and comps[i] == '':
>                 del comps[i]
>         elif comps[i] == '..' and i > 0 and comps[i-1] not in ('', '..'):
>             del comps[i-1:i+1]
>             i = i-1
>         elif comps[i] == '' and i > 0 and comps[i-1] <> '':
>             del comps[i]
>         else:
>             i = i+1
>     # If the path is now empty, substitute '.'
>     if not comps and not slashes:
>         comps.append('.')
>     return slashes + string.joinfields(comps, '/')

Revision 1.33 clearly leaves initial slashes untouched.
I guess we should restore this...
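
To make the difference concrete, here is a port of both versions from the
diff above to present-day Python (string.split, string.splitfields and
string.joinfields become str methods, and <> becomes !=).  The function
names normpath_r133 and normpath_r134 are mine; the logic is meant to
follow the diff line for line:

```python
def normpath_r133(path):
    """Revision 1.33: keeps however many leading slashes the path had."""
    slashes = ""
    while path[:1] == "/":
        slashes = slashes + "/"
        path = path[1:]
    comps = path.split("/")
    i = 0
    while i < len(comps):
        if comps[i] == ".":
            del comps[i]
            while i < len(comps) and comps[i] == "":
                del comps[i]
        elif comps[i] == ".." and i > 0 and comps[i - 1] not in ("", ".."):
            del comps[i - 1:i + 1]
            i = i - 1
        elif comps[i] == "" and i > 0 and comps[i - 1] != "":
            del comps[i]
        else:
            i = i + 1
    # If the path is now empty, substitute '.'
    if not comps and not slashes:
        comps.append(".")
    return slashes + "/".join(comps)


def normpath_r134(path):
    """Revision 1.34: collapses any run of leading slashes to a single one."""
    if path == "":
        return "."
    initial_slash = (path[0] == "/")
    new_comps = []
    for comp in path.split("/"):
        if comp in ("", "."):
            continue
        if (comp != ".." or (not initial_slash and not new_comps) or
                (new_comps and new_comps[-1] == "..")):
            new_comps.append(comp)
        elif new_comps:
            new_comps.pop()
    path = "/".join(new_comps)
    if initial_slash:
        path = "/" + path
    return path or "."


if __name__ == "__main__":
    print(normpath_r133("//5/foo/bar"), normpath_r134("//5/foo/bar"))
```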

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From nas at arctrix.com  Fri Jan 26 17:12:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Fri, 26 Jan 2001 08:12:15 -0800
Subject: [Python-Dev] LINKCC defaults to CXX
Message-ID: <20010126081215.B5534@glacier.fnational.com>

Dear lord why?  So people can develop extensions using C++?  It's
not worth the pain inflicted on everyone else.  Let them
recompile with LINKCC=CXX.

Linking with CXX opens a huge can of stinky worms.  First of all,
just because configure found a value for CXX doesn't mean it
works.  Even if it does that doesn't mean that using it is a good
idea.  Linking with CXX will bring in the C++ runtime.  There are
a large number of platforms where the C++ ABI has not been
standardized; for example, anything that used g++.

Can we please leave LINKCC default to CXX?  It's easy enough for
the crazies to override if they like.  I'll even create a
configure option for them.

  Neil



From barry at digicool.com  Sat Jan 27 00:09:57 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Fri, 26 Jan 2001 18:09:57 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
References: <20010126081215.B5534@glacier.fnational.com>
Message-ID: <14962.965.464326.794431@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

    NS> Can we please leave LINKCC default to CXX?

I think you mean default it to CC, eh?  +1



From mal at lemburg.com  Sat Jan 27 01:16:01 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 01:16:01 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <3A721341.3F348E51@lemburg.com>

I just got a request from someone who wants to test the latest
CVS version but unfortunately can't because he's behind a 
firewall.

Is there any chance of reactivating the nightly tarball generation
that was once in place ?

	http://www.python.org/download/cvs.html

Thanks,
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From dgoodger at atsautomation.com  Sat Jan 27 01:30:21 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Fri, 26 Jan 2001 19:30:21 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>

Thank you all for your prompt replies. (Guido's was within seconds! Well,
minutes, certainly.)

I'll give it another go on Monday. I've got renovations to fill my weekend.

/David



From thomas at xs4all.net  Sat Jan 27 01:35:41 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sat, 27 Jan 2001 01:35:41 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>; from dgoodger@atsautomation.com on Fri, Jan 26, 2001 at 07:30:21PM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE011587BC@INTERGATE>
Message-ID: <20010127013541.N962@xs4all.nl>

On Fri, Jan 26, 2001 at 07:30:21PM -0500, Goodger, David wrote:

> Thank you all for your prompt replies. (Guido's was within seconds! Well,
> minutes, certainly.)

Oh, the wonderful things one can do with a time machine....

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From jeremy at alum.mit.edu  Fri Jan 26 23:14:26 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Fri, 26 Jan 2001 17:14:26 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A721341.3F348E51@lemburg.com>
References: <3A721341.3F348E51@lemburg.com>
Message-ID: <14961.63170.394043.790610@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> I just got a request from someone who wants to test the latest
  MAL> CVS version but unfortunately can't because he's behind a
  MAL> firewall.

  MAL> Is there any chance of reactivating the nightly tarball
  MAL> generation that was once in place ?

  MAL> 	http://www.python.org/download/cvs.html

I plan to set up nightly cvs snapshots soon.  We should be moving into
our new office next week; I hope to have a machine that is on the net
24x7 shortly after that.

Jeremy



From bckfnn at worldonline.dk  Sat Jan 27 08:58:38 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Sat, 27 Jan 2001 07:58:38 GMT
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <14961.63170.394043.790610@localhost.localdomain>
References: <3A721341.3F348E51@lemburg.com> <14961.63170.394043.790610@localhost.localdomain>
Message-ID: <3a727e79.835771@smtp.worldonline.dk>

>>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
>
>  MAL> I just got a request from someone who wants to test the latest
>  MAL> CVS version but unfortunately can't because he's behind a
>  MAL> firewall.
>
>  MAL> Is there any chance of reactivating the nightly tarball
>  MAL> generation that was once in place ?
>
>  MAL> 	http://www.python.org/download/cvs.html

[Jeremy]

>I plan to set up nightly cvs snapshots soon.  We should be moving into
>our new office next week; I hope to have a machine that is on the net
>24x7 shortly after that.

FWIW, I have been using this cron and shell script running on
shell.sourceforge.net. This way I don't need 24x7 in order to make a cvs
tarball (and .zip) available.


22 2 * * * $HOME/bin/jython-snap



SHOTLABEL=`date +%Y%m%d`
LOGLABEL=log.`date +%Y%m%d`
cd /home/groups/jython/htdocs/cvssnaps
(cvs -Qd :pserver:anonymous@cvs1:/cvsroot/jython checkout -d \
  jython-$SHOTLABEL jython && \
  tar zcf jython-nightly.tar.gz jython-$SHOTLABEL && \
  rm -fr jython-nightly.zip && \
  zip -qr9 jython-nightly.zip jython-$SHOTLABEL && \
  rm -fr jython-$SHOTLABEL) >$LOGLABEL 2>&1
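
The packaging half of the script above (tar, then zip, of a dated
checkout directory) can be sketched in Python as well.  This is only an
illustration under stated assumptions: the function name
package_snapshot, the "snapshot-" naming scheme and the throwaway
directory are hypothetical, and the cvs checkout step is not reproduced.

```python
import os
import tarfile
import tempfile
import time
import zipfile


def package_snapshot(checkout_dir, out_dir, label=None):
    """Write <name>.tar.gz and <name>.zip for an already checked-out tree."""
    label = label or time.strftime("%Y%m%d")
    name = "snapshot-" + label          # hypothetical naming scheme
    tar_path = os.path.join(out_dir, name + ".tar.gz")
    zip_path = os.path.join(out_dir, name + ".zip")

    # Equivalent of "tar zcf": gzip-compressed tarball rooted at <name>/.
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(checkout_dir, arcname=name)

    # Equivalent of "zip -r": walk the tree and store each file under <name>/.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(checkout_dir):
            for fname in files:
                full = os.path.join(root, fname)
                rel = os.path.relpath(full, checkout_dir)
                zf.write(full, os.path.join(name, rel))
    return tar_path, zip_path


# Demonstration on a throwaway directory standing in for a CVS checkout.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "checkout")
    os.makedirs(src)
    with open(os.path.join(src, "README"), "w") as f:
        f.write("placeholder\n")
    tar_path, zip_path = package_snapshot(src, tmp, label="20010127")
    with tarfile.open(tar_path) as tar:
        tar_names = tar.getnames()
    with zipfile.ZipFile(zip_path) as zf:
        zip_names = zf.namelist()
    print(tar_names, zip_names)
```

Unlike the shell version, this sketch does not remove the checkout
directory afterwards or redirect output to a dated log file.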


regards,
finn



From tim.one at home.com  Sat Jan 27 10:35:14 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 27 Jan 2001 04:35:14 -0500
Subject: [Python-Dev] setup.py
In-Reply-To: <20010126092559.A5623@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>

[Eric S. Raymond]
> I may not channel Guido the way Tim does, but I suspect he gave you
> developer privileges because he trusts you to do routine stuff like this.

Excellent, Eric!  You're batting 1%.  Here's how to boost it to 93%:
whenever a new idea comes up, just grumble "no".  You'll be right 92% of the
time <wink>.

Reminds me of a friend who got sucked into working at a neural-net startup
trying to build a black box to predict whether the daily close of the S&P
500 would be above or below the previous day's.  He was greatly impressed by
the research they had done, showing that the prototype got the right answer
more than half the time when fed historical data, and at a very high
significance level (i.e., it almost certainly did better than flipping a
coin).  What he didn't realize at the time is that if they had written the
prototype in Python:

    # S&P close daily direction predictor
    print "higher"

it would have been right about 2/3rds the time <0.33 wink>.
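
The same baseline check can be spelled out as a sketch, in modern
Python 3 syntax rather than the Python 2 of the two-liner above.  The
closing prices are invented for illustration, and the helper names
directions and constant_accuracy are hypothetical; the point is only
that a learned predictor should be compared against the constant guess,
not against a coin flip.

```python
# Baseline check: how often does a constant "higher" guess match the data?
# The closing prices below are invented for illustration only.

def directions(closes):
    """Label each day 'higher' or 'lower' relative to the previous close."""
    return ["higher" if today > prev else "lower"
            for prev, today in zip(closes, closes[1:])]


def constant_accuracy(closes, guess="higher"):
    """Fraction of days on which the fixed guess matches the actual direction."""
    actual = directions(closes)
    return sum(1 for d in actual if d == guess) / len(actual)


closes = [100.0, 101.0, 102.5, 101.8, 103.0, 104.2, 103.9]
baseline = constant_accuracy(closes)
print("always-'higher' accuracy: %.2f" % baseline)
```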

never-ascribe-to-insight-what-can-be-explained-by-idiocy-ly y'rs  - tim




From martin at mira.cs.tu-berlin.de  Sat Jan 27 10:38:41 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sat, 27 Jan 2001 10:38:41 +0100
Subject: [Python-Dev] Nightly CVS tarballs
Message-ID: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>

> Is there any chance of reactivating the nightly tarball generation
> that was once in place ?

What's wrong with

http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

?

Regards,
Martin



From fredrik at effbot.org  Sat Jan 27 11:43:50 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Sat, 27 Jan 2001 11:43:50 +0100
Subject: [Python-Dev] setup.py
References: <LNBBLJKPBEHFEDALKOLCOEJPILAA.tim.one@home.com>
Message-ID: <008c01c0884e$09bd2030$e46940d5@hagrid>

tim wrote:
> Reminds me of a friend who got sucked into working at a neural-net startup
> trying to build a black box to predict whether the daily close of the S&P
> 500 would be above or below the previous day's.  /.../
> 
>     # S&P close daily direction predictor
>     print "higher"

replace "higher" with "same", and you have a pretty
decent weather predictor.

Cheers /F




From mal at lemburg.com  Sat Jan 27 13:01:30 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 13:01:30 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
Message-ID: <3A72B89A.E03C1912@lemburg.com>

"Martin v. Loewis" wrote:
> 
> > Is there any chance of reactivating the nightly tarball generation
> > that was once in place ?
> 
> What's wrong with
> 
> http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> ?

I didn't realize that SF does this automagically. Could someone
please redirect the link on the python.org cvs page to the
above address (David Ascher's tarball generation stopped in
February 2000 !).

Thanks for the hint, Martin.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fdrake at acm.org  Sat Jan 27 14:16:01 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Sat, 27 Jan 2001 08:16:01 -0500 (EST)
Subject: [Python-Dev] Nightly CVS tarballs
In-Reply-To: <3A72B89A.E03C1912@lemburg.com>
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
	<3A72B89A.E03C1912@lemburg.com>
Message-ID: <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>

"Martin v. Loewis" wrote:
 > What's wrong with
 > 
 > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz

M.-A. Lemburg writes:
 > I didn't realize that SF does this automagically. Could someone
 > please redirect the link on the python.org cvs page to the
 > above address (David Ascher's tarball generation stopped in
 > February 2000 !).

  Did you want a "snapshot" or a copy of the repository?  What SF
produces is a tarball of the repository, not a snapshot.  We still
need to do something to create snapshots.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From mal at lemburg.com  Sat Jan 27 14:28:40 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Sat, 27 Jan 2001 14:28:40 +0100
Subject: [Python-Dev] Nightly CVS tarballs
References: <200101270938.f0R9cfU08311@mira.informatik.hu-berlin.de>
		<3A72B89A.E03C1912@lemburg.com> <14962.51729.905084.154359@cj42289-a.reston1.va.home.com>
Message-ID: <3A72CD08.F47DAA69@lemburg.com>

"Fred L. Drake, Jr." wrote:
> 
> "Martin v. Loewis" wrote:
>  > What's wrong with
>  >
>  > http://cvs.sourceforge.net/cvstarballs/python-cvsroot.tar.gz
> 
> M.-A. Lemburg writes:
>  > I didn't realize that SF does this automagically. Could someone
>  > please redirect the link on the python.org cvs page to the
>  > above address (David Ascher's tarball generation stopped in
>  > February 2000 !).
> 
>   Did you want a "snapshot" or a copy of the repository?  What SF
> produces is a tarball of the repository, not a snapshot. 

I meant a copy of what you get when you check out the Python
CVS tree wrapped into a .tar.gz file. The size of the above
archive (16MB) suggests that a lot more is going into the
.tar.gz file. A .tar.gz of the CVS checkout is around 4MB in
size. Looks like we still need to do something after all ;)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From armin at steinhoff.de  Sat Jan 27 17:24:57 2001
From: armin at steinhoff.de (Armin Steinhoff)
Date: Sat, 27 Jan 2001 17:24:57 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <4.3.2.7.2.20010127170125.00b2ee80@mail.secureweb.de>

Hello Guido,

nice to see the first 2.1 version :)

At 16:52 26.01.01 -0500, you wrote:
> > [CC'ing to Armin Steinhoff, who maintains pyqnx on SourceForge.]
> >
> > I'm having trouble building Python 2.1a1 on QNX 4.25. Caveat: my C is very
> > rusty (long live Python!), I don't know my way around configure, and am not
> > familiar with Python's Makefile. Python 2.0 compiled fine (with a couple of
> > tweaks), but I'm getting caught by the new way of building things. Please
> > help if you can! Many thanks in advance.
> >
> > Here's an excerpt of my efforts:
> >
> >     # cd /tmp/py
> >     # gunzip -c < python-2.1a1.tgz | tar -rf -
> >     # cd Python-2.1a1
> >     # ./configure 2>&1 | tee ../configure.1

I did a fast hack with the new 2.1 version:

CC=cc LINKCC=cc configure --without-gcc --shared=no --without-threads

(Hope '--shared=no' works ... QNX4 doesn't support dynamic loading)
Please replace all references to g++ with cc in the main Makefile and in
Modules/Makefile.
In the Modules/Makefile set LDFLAGS=250K  ... the default stacksize of 32K 
seems to be too small.

> >     # make 2>&1 | tee ../make.1
> >     ...
> >     ./python //5/tmp/py/Python-2.1a1/setup.py build
> >     'import site' failed; use -v for traceback

'python -v' shows that the module 'distutils.util' isn't there ....  it 
seems not to be included in the source distribution.

'import site' failed; traceback:
Traceback (most recent call last):
File "//1/Python-2.1a1/Lib/site.py", line 85, in ?
from distutils.util import get_platform
ImportError: No module named distutils.util
                                              ^^^^^^^^^^^^^^
[ clip ..]

>This is probably in the realm of the distutils.  I have no idea how to
>teach it to build on QNX, sorry!

IMHO ... it is not a path problem.

At the moment I have no time left to go into these details. A 
clean port will happen in a few weeks. Please check out PyQNX for news 
regarding QNX4.25 and QNX6.0  (aka  QNX Neutrino).

Greetings

Armin Steinhoff

Life-Demo of PyDACHS
http://www.dachs.net/PyDACHS_python-tilcon.htm
in our booth at
Embedded Systems 2001, Nuremberg, GER
http://www.embedded-systems-messe.de
Febr. 14-16, 2000            Hall 11, Booth P 04






From guido at digicool.com  Sat Jan 27 17:50:50 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:50:50 -0500
Subject: [Python-Dev] LINKCC defaults to CXX
In-Reply-To: Your message of "Fri, 26 Jan 2001 08:12:15 PST."
             <20010126081215.B5534@glacier.fnational.com> 
References: <20010126081215.B5534@glacier.fnational.com> 
Message-ID: <200101271650.LAA30720@cj20424-a.reston1.va.home.com>

> Dear lord why?  So people can develop extensions using C++?  It's
> not worth the pain inflicted on everyone else.  Let them
> recompile with LINKCC=CXX.
> 
> Linking with CXX opens a huge can of stinky worms.  First of all,
> just because configure found a value for CXX doesn't mean it
> works.  Even if it does that doesn't mean that using it is a good
> idea.  Linking with CXX will bring in the C++ runtime.  There are
> a large number of platforms where the C++ ABI has not been
> standardized; for example, anything that used g++.
> 
> Can we please leave LINKCC default to CXX?  It's easy enough for
> the crazies to override if they like.  I'll even create a
> configure option for them.

Arg.  My bad.  I did this as an experiment; it didn't break on my
machine, but I didn't intend this to become the standard!  Thanks for
changing it back.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 27 17:52:23 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:52:23 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: Your message of "Fri, 26 Jan 2001 23:51:09 +0100."
             <3A71FF5D.DC609775@lemburg.com> 
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>  
            <3A71FF5D.DC609775@lemburg.com> 
Message-ID: <200101271652.LAA30750@cj20424-a.reston1.va.home.com>

> revision 1.34
> date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> added rewritten normpath from Moshe Zadka that does the right thing with
> paths containing ..
[...]
> Revision 1.33 clearly leaves initial slashes untouched.
> I guess we should restore this...

Yes, please!  (Just the "leading extra slashes stay" behavior.)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Sat Jan 27 17:57:40 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 11:57:40 -0500
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: Your message of "Fri, 26 Jan 2001 17:02:09 EST."
             <list-760656@digicool.com> 
References: <list-760656@digicool.com> 
Message-ID: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>

Barry noticed:

> Anyway, did you know that you can use functions as keys to a
> dictionary, but that you can mutate them to "lose" the element?
> 
> -------------------- snip snip --------------------
> Python 2.0 (#13, Jan 10 2001, 13:06:39) 
> [GCC egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)] on linux2
> Type "copyright", "credits" or "license" for more information.
> >>> d = {}
> >>> def foo(): pass
> ... 
> >>> def bar(): pass
> ... 
> >>> d[foo] = 1
> >>> d[foo]
> 1
> >>> foocode = foo.func_code
> >>> foo.func_code = bar.func_code
> >>> d[foo]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> KeyError: <function foo at 0x81ef474>
> >>> d[bar] = 2
> >>> d[bar]
> 2
> >>> d[foo]
> 2
> >>> foo.func_code = foocode
> >>> d[foo]
> 1
> -------------------- snip snip --------------------
> 
> It's because a function's func_code attribute is used in its hash
> calculation, but func_code is writable!

Clearly, something changed.  I'm pretty sure it's the function
attributes.  Either the function attributes shouldn't be used in
comparing function objects, or hash() on functions should be
unimplemented, or comparison on functions should use simple pointer
compares.

What's the right solution?  Do people use functions as dict keys?  If
not, we can remove the hash() implementation.  But I suspect they
*are* used as dict keys.  Not using the __dict__ on comparisons
appears ugly, so probably the best solution is to change function
comparisons to use simple pointer compares.  That removes the
possibility to see whether two different functions implement the same
code -- but does anybody really use that?
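
The failure mode can be reproduced without touching function objects
at all.  In the sketch below (modern Python 3), a hypothetical class Key
stands in for a 2.0-era function: its hash and equality follow a
mutable field, just as func_hash followed func_code.  Small integers
are used for the field so that the hash values are predictable.

```python
class Key:
    """Stand-in for an object whose hash and equality follow a mutable field."""

    def __init__(self, code):
        self.code = code

    def __hash__(self):
        return hash(self.code)

    def __eq__(self, other):
        return isinstance(other, Key) and self.code == other.code


foo = Key(1)
d = {foo: 1}
found_before = foo in d      # the entry was stored under hash(1)

foo.code = 2                 # mutate the field the hash depends on
found_after = foo in d       # the lookup now probes with hash(2)

foo.code = 1                 # restoring the field makes the entry reachable again
found_restored = foo in d

print(found_before, found_after, found_restored)
```

The entry never leaves the dictionary; it is merely filed under a hash
that the key no longer produces.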

--Guido van Rossum (home page: http://www.python.org/~guido/)



From moshez at zadka.site.co.il  Sat Jan 27 18:17:50 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 19:17:50 +0200 (IST)
Subject: [Python-Dev] New bug in function object hash() and comparisons
In-Reply-To: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>
References: <200101271657.LAA30782@cj20424-a.reston1.va.home.com>, <list-760656@digicool.com>
Message-ID: <20010127171750.91412A840@darjeeling.zadka.site.co.il>

On Sat, 27 Jan 2001 11:57:40 -0500, Guido van Rossum <guido at digicool.com> wrote:

(about function hash doing the wrong thing)
> What's the right solution?

I have no idea...

>  Do people use functions as dict keys?  If
> not, we can remove the hash() implementation.

...but this ain't it.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From gvwilson at ca.baltimore.com  Sat Jan 27 18:23:42 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Sat, 27 Jan 2001 12:23:42 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1119 - 17 msgs
In-Reply-To: <20010127170103.DA6DEEA44@mail.python.org>
Message-ID: <000001c08885$e5418c40$770a0a0a@nevex.com>

> Guido wrote:
> What's the right solution?  Do people use functions as dict keys?

Yup --- even use this as an example in the course (part of drumming
home to students that functions are just a special kind of data).

Greg



From barry at digicool.com  Sat Jan 27 18:43:43 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:43:43 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
	<200101271657.LAA30782@cj20424-a.reston1.va.home.com>
Message-ID: <14963.2255.268933.615456@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Clearly, something changed.  I'm pretty sure it's the
    GvR> function attributes.

Actually no.  func_code is used in func_hash() but somewhere in the
Python 1.6 cycle, func_code was made assignable.
    
    GvR> Either the function attributes shouldn't be used in comparing
    GvR> function objects, or hash() on functions should be
    GvR> unimplemented, or comparison on functions should use simple
    GvR> pointer compares.

    GvR> What's the right solution?

We should definitely continue to allow functions as keys to
dictionaries, but probably just remove func_code as an input to the
function's hash.
    
-Barry



From barry at digicool.com  Sat Jan 27 18:48:33 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Sat, 27 Jan 2001 12:48:33 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
References: <list-760656@digicool.com>
	<200101271657.LAA30782@cj20424-a.reston1.va.home.com>
	<14963.2255.268933.615456@anthem.wooz.org>
Message-ID: <14963.2545.14600.667505@anthem.wooz.org>

    Me> We should definitely continue to allow functions as keys to
    Me> dictionaries, but probably just remove func_code as an input
    Me> to the function's hash.
    
But of course, func_globals won't be sufficient as a hash for
functions.  Probably changing the hash to a pointer compare is the
best thing after all.

-Barry



From guido at digicool.com  Sat Jan 27 18:49:16 2001
From: guido at digicool.com (Guido van Rossum)
Date: Sat, 27 Jan 2001 12:49:16 -0500
Subject: [Python-Dev] Re: New bug in function object hash() and comparisons
In-Reply-To: Your message of "Sat, 27 Jan 2001 12:43:43 EST."
             <14963.2255.268933.615456@anthem.wooz.org> 
References: <list-760656@digicool.com> <200101271657.LAA30782@cj20424-a.reston1.va.home.com>  
            <14963.2255.268933.615456@anthem.wooz.org> 
Message-ID: <200101271749.MAA32025@cj20424-a.reston1.va.home.com>

> >>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:
> 
>     GvR> Clearly, something changed.  I'm pretty sure it's the
>     GvR> function attributes.
> 
> Actually no.  func_code is used in func_hash() but somewhere in the
> Python 1.6 cycle, func_code was made assignable.

Argh!  You're right.

>     GvR> Either the function attributes shouldn't be used in comparing
>     GvR> function objects, or hash() on functions should be
>     GvR> unimplemented, or comparison on functions should use simple
>     GvR> pointer compares.
> 
>     GvR> What's the right solution?
> 
> We should definitely continue to allow functions as keys to
> dictionaries, but probably just remove func_code as an input to the
> function's hash.

OK, that settles it.  There's not much point in having a function
compare do anything besides a pointer comparison when the code objects
aren't compared.  (Two completely different functions could compare
equal e.g. if they have the same attribute dict.)  So we should just
punt, and compare functions by object pointer.

The proper way to do this is to *delete* func_hash and func_compare
from funcobject.c -- the default comparison will take care of this.
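
For comparison, here is a sketch of what identity-based comparison and
hashing look like from the Python side.  It uses modern Python 3, where
func_code is spelled __code__ and plain functions compare and hash by
identity; the function names are just the ones from the session quoted
earlier in the thread.

```python
def foo():
    return "foo"


def bar():
    return "bar"


d = {foo: 1}
foo.__code__ = bar.__code__   # the modern spelling of func_code
still_found = d[foo]          # the lookup is unaffected by the swap
equal = (foo == bar)          # still distinct objects, despite identical code
result = foo()                # foo now runs bar's code

print(still_found, equal, result)
```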

--Guido van Rossum (home page: http://www.python.org/~guido/)



From akuchlin at mems-exchange.org  Sat Jan 27 19:58:30 2001
From: akuchlin at mems-exchange.org (A.M. Kuchling)
Date: Sat, 27 Jan 2001 13:58:30 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org>
Message-ID: <200101271858.NAA04898@mira.erols.com>

On Sat, 27 Jan 2001 18:28:02 +0100, 
	Andreas Jung <andreas at andreas-jung.com> wrote:
>Is there a reason why 2.1 runs significantly slower ?
>Both Python versions were compiled with -g -O2 only.

[CC'ing to python-dev]  Confirmed:

[amk at mira Python-2.0]$ ./python Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.14
This machine benchmarks at 3184.71 pystones/second
[amk at mira Python-2.0]$ python2.1 Lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 3.81
This machine benchmarks at 2624.67 pystones/second

The ceval.c changes seem a likely candidate to have caused this.
Anyone want to run Marc-Andre's microbenchmarks and see how the
numbers have changed?

--amk




From moshez at zadka.site.co.il  Sat Jan 27 20:14:28 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Sat, 27 Jan 2001 21:14:28 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
Message-ID: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>

Attached is an example Python session after I patched the interpreter.
The test-suite passes all right.

I want an OK to check this in.

Here is the patch:
Index: Objects/funcobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/funcobject.c,v
retrieving revision 2.33
diff -c -r2.33 funcobject.c
*** Objects/funcobject.c        2001/01/25 20:06:59     2.33
--- Objects/funcobject.c        2001/01/27 19:13:08
***************
*** 347,358 ****
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       (cmpfunc)func_compare, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       (hashfunc)func_hash, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/
--- 347,358 ----
        0,              /*tp_print*/
        0, /*tp_getattr*/
        0, /*tp_setattr*/
!       0, /*tp_compare*/
        (reprfunc)func_repr, /*tp_repr*/
        0,              /*tp_as_number*/
        0,              /*tp_as_sequence*/
        0,              /*tp_as_mapping*/
!       0, /*tp_hash*/
        0,              /*tp_call*/
        0,              /*tp_str*/
        (getattrofunc)func_getattro,         /*tp_getattro*/

Python 2.1a1 (#1, Jan 27 2001, 21:01:24)
[GCC 2.95.3 20010111 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> def foo():
...     pass
...
>>> def bar():
...     pass
...
>>> hash(foo)
135484636
>>> hash(bar)
135481676
>>> foo == bar
0
>>> d = {}
>>> d[foo] =1
>>> def temp():
...     print "baz"
...
>>> foo.func_code = temp.func_code
>>> d[foo]
1

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Sat Jan 27 21:06:20 2001
From: tim.one at home.com (Tim Peters)
Date: Sat, 27 Jan 2001 15:06:20 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <200101271858.NAA04898@mira.erols.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGELGILAA.tim.one@home.com>

[A.M. Kuchling]
> [CC'ing to python-dev]  Confirmed:
>
> [amk at mira Python-2.0]$ ./python Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.14
> This machine benchmarks at 3184.71 pystones/second
> [amk at mira Python-2.0]$ python2.1 Lib/test/pystone.py
> Pystone(1.1) time for 10000 passes = 3.81
> This machine benchmarks at 2624.67 pystones/second
>
> The ceval.c changes seem a likely candidate to have caused this.
> Anyone want to run Marc-Andre's microbenchmarks and see how the
> numbers have changed?

Want to, yes, but it looks hopeless on my box:

**** 2.0

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.851013
This machine benchmarks at 11750.7 pystones/second

C:\Python20>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.24279
This machine benchmarks at 8046.41 pystones/second

**** 2.1a1

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 0.823313
This machine benchmarks at 12146 pystones/second

C:\Python21a1>python lib/test/pystone.py
Pystone(1.1) time for 10000 passes = 1.27046
This machine benchmarks at 7871.15 pystones/second

**** CVS

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.836391
This machine benchmarks at 11956.1 pystones/second

C:\Code\python\dist\src\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 1.3055
This machine benchmarks at 7659.9 pystones/second


That's after a reboot:  no matter which Python I use, it gets about 12000 on
the first run with a given python.exe, and about 8000 on the second.  Not
shown is that it *stays* at about 8000 until the next reboot.

So there's a Windows (W98SE) Mystery, but also no evidence that timings have
changed worth spit under the MS compiler.  The eval loop is very touchy, and
I suspect you won't track this down on your box until staring at the code
gcc (I presume you're using gcc) generates.  May be sensitive to which
release of gcc you're using too.
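
Given how much the numbers move between runs, one hedged way to
compare builds is to repeat the measurement and look at the spread
rather than at a single figure.  The sketch below uses the standard
timeit module (modern Python 3); workload is a hypothetical stand-in
for the benchmark body and is not pystone itself.

```python
import timeit


def workload():
    """Hypothetical stand-in for a benchmark body such as pystone's main loop."""
    total = 0
    for i in range(10000):
        total += i * i
    return total


def timing_samples(func, repeats=5, number=20):
    """Time func several times and return the samples, fastest first."""
    samples = timeit.repeat(func, repeat=repeats, number=number)
    return sorted(samples)


samples = timing_samples(workload)
print("fastest %.4fs, slowest %.4fs" % (samples[0], samples[-1]))
```

A wide gap between the fastest and slowest sample suggests that the
machine, not the interpreter, accounts for the difference.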

switch-to-windows-and-you'll-have-easier-things-to-worry-about<wink>-ly
    y'rs  - tim




From fredrik at pythonware.com  Sun Jan 28 10:37:45 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Sun, 28 Jan 2001 10:37:45 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>              <3A71FF5D.DC609775@lemburg.com>  <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>

guido wrote:

> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

just looked this up in the specs, and POSIX seems to
require that leading slashes are preserved only if there
are exactly two of them:

    A pathname that begins with two successive slashes
    may be interpreted in an implementation-dependent
    manner, although more than two leading slashes are
    treated as a single slash.
    (from susv2)

maybe we should add a if len(slashes) > 2: slashes = "/"
test to the patch?
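
As a self-contained sketch of that rule (modern Python 3; the function
name normalize_leading_slashes is made up for illustration and is not
the actual posixpath patch):

```python
def normalize_leading_slashes(path):
    """Keep one or two leading slashes as-is; collapse three or more to one."""
    stripped = path.lstrip("/")
    count = len(path) - len(stripped)
    if count > 2:
        count = 1
    return "/" * count + stripped


for p in ("/usr/lib", "//5/tmp/py", "///etc/passwd", "relative/path"):
    print(p, "->", normalize_leading_slashes(p))
```

Only the leading run of slashes is touched here; collapsing "." and
".." components is left to the rest of normpath.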

Cheers /F




From thomas at xs4all.net  Sun Jan 28 18:39:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Sun, 28 Jan 2001 18:39:58 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>; from fredrik@pythonware.com on Sun, Jan 28, 2001 at 10:37:45AM +0100
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid> <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com> <00ed01c0890e$e3bf5ad0$e46940d5@hagrid>
Message-ID: <20010128183958.Q962@xs4all.nl>

On Sun, Jan 28, 2001 at 10:37:45AM +0100, Fredrik Lundh wrote:
> guido wrote:

> > > Revision 1.33 clearly leaves initial slashes untouched.
> > > I guess we should restore this...
> > 
> > Yes, please!  (Just the "leading extra slashes stay" behavior.)

> just looked this up in the specs, and POSIX seems to
> require that leading slashes are preserved only if there
> are exactly two of them:

>     A pathname that begins with two successive slashes
>     may be interpreted in an implementation-dependent
>     manner, although more than two leading slashes are
>     treated as a single slash.
>     (from susv2)

> maybe we should add a if len(slashes) > 2: slashes = "/"
> test to the patch?

How strictly do we need (or want, for that matter) to follow POSIX here ?
I'm aware the module is called 'posixpath', but it's used in a bit more than
just POSIX environments (or POSIX behaviours) so it might make sense to
ignore this particular tidbit. What if there is a system that attaches a
special meaning to ///, should we create a new path module for it ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From martin at mira.cs.tu-berlin.de  Sun Jan 28 21:50:35 2001
From: martin at mira.cs.tu-berlin.de (Martin v. Loewis)
Date: Sun, 28 Jan 2001 21:50:35 +0100
Subject: [Python-Dev] XSLT parser interface
Message-ID: <200101282050.f0SKoZr08809@mira.informatik.hu-berlin.de>

Based on my previous IDL interface for XPath parsers, I've defined an
API for a parser that parses XSLT pattern expressions. It is an
extension to the XPath API, so I attach only the additional functions.

Any comments are appreciated.

Martin

module XPath{
  // XSLT exprType values
  const unsigned short PATTERN = 17;
  const unsigned short LOCATION_PATTERN = 18;
  const unsigned short RELATIVE_PATH_PATTERN = 19;
  const unsigned short STEP_PATTERN = 20;

  interface Pattern;
  interface LocationPathPattern;
  interface RelativePathPattern;
  interface StepPattern;

  interface PatternFactory:ExprFactory{
    Pattern createPattern(in LocationPathPattern first);
    // idkey may be null, represents IdKeyPattern
    // if parent is true, it is '/', else '//'
    // rel may be null
    LocationPathPattern createLocationPathPattern(in FunctionCall idkey,
						  in boolean parent,
						  in RelativePathPattern rel);
    // if parent is true, it is /, else //
    RelativePathPattern createRelativePathPattern(in RelativePathPattern rel,
						  in boolean parent,
						  in StepPattern step);
    StepPattern createStepPattern(in AxisSpecifier axis,
				  in NodeTest test,
				  in PredicateList predicates);
  };

  typedef sequence<LocationPathPattern> LocationPathPatterns;
  interface Pattern:Expr{
    readonly attribute LocationPathPatterns patterns;
    void append(in LocationPathPattern pattern);
  };

  interface LocationPathPattern:Expr{
    readonly attribute FunctionCall idkey;
    readonly attribute boolean parent;
    readonly attribute RelativePathPattern relative_pattern;
  };

  interface RelativePathPattern:Expr{
    readonly attribute RelativePathPattern relative;
    readonly attribute boolean parent;
    readonly attribute StepPattern step;
  };

  interface StepPattern:Expr{
    readonly attribute AxisSpecifier axis;
    readonly attribute NodeTest test;
    readonly attribute PredicateList predicates;
  };

  interface XSLTParser:Parser{
    Pattern parsePattern(in DOMString pattern);
  };
};
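
To make the shape of the proposed tree concrete, here is a minimal Python
sketch of the four pattern node types. It is only an illustration under
stated assumptions: the dataclasses and the plain-string stand-ins for
AxisSpecifier, NodeTest, PredicateList and FunctionCall are hypothetical and
are not part of the IDL or of any existing binding.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepPattern:
    axis: str                                            # stand-in for AxisSpecifier
    test: str                                            # stand-in for NodeTest
    predicates: List[str] = field(default_factory=list)  # stand-in for PredicateList


@dataclass
class RelativePathPattern:
    relative: Optional["RelativePathPattern"]  # preceding steps, None for the first step
    parent: bool                               # True for '/', False for '//'
    step: StepPattern


@dataclass
class LocationPathPattern:
    idkey: Optional[str]                              # stand-in for FunctionCall, may be None
    parent: bool                                      # True for '/', False for '//'
    relative_pattern: Optional[RelativePathPattern]   # may be None


@dataclass
class Pattern:
    patterns: List[LocationPathPattern] = field(default_factory=list)

    def append(self, pattern: LocationPathPattern) -> None:
        self.patterns.append(pattern)


# One plausible encoding of "/doc//para". The parent flag on the first step
# carries no information here; that reading of the IDL is an assumption.
doc = RelativePathPattern(None, True, StepPattern("child", "doc"))
para = RelativePathPattern(doc, False, StepPattern("child", "para"))
pattern = Pattern()
pattern.append(LocationPathPattern(None, True, para))
```

The nesting mirrors the factory signatures: each RelativePathPattern wraps the
steps to its left, so the last step of the path sits at the top of the tree.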



From skip at mojam.com  Sun Jan 28 22:40:28 2001
From: skip at mojam.com (Skip Montanaro)
Date: Sun, 28 Jan 2001 15:40:28 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
Message-ID: <14964.37324.642566.602319@beluga.mojam.com>

I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
been zeroed out.  I then copied a version of it over from another machine
and reran make a couple times.  Makesetup ran but nothing mentioned in
Setup.local got built.

I don't think 2.1 can be released without providing a way for users to
recover from this change.  I didn't see anything obvious in setup.py.  Am I
missing something?

Skip




From thomas at xs4all.net  Mon Jan 29 01:39:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 01:39:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20001104001415.A2093@53b.hoffleit.de>; from gregor@hoffleit.de on Sat, Nov 04, 2000 at 12:14:15AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de>
Message-ID: <20010129013904.R962@xs4all.nl>

On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> has been fixed in glibc 2.96.

Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
versioning for glibc that I was unaware of ? :)

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From barry at digicool.com  Mon Jan 29 06:03:45 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:03:45 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63921.966960.445548@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> Attached is an example Python session after I patched the
    MZ> interpreter.  The test-suite passes all right.

    MZ> I want an OK to check this in.

Moshe, please remove the func_hash() and func_compare() functions, and
if the patch passes the test suite, go ahead and check it all in.
Please also check in a test case.

Thanks,
-Barry



From barry at digicool.com  Mon Jan 29 06:04:12 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 00:04:12 -0500
Subject: [Python-Dev] Function Hash: Check it in?
References: <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <14964.63948.492662.775413@anthem.wooz.org>

Oh yeah, please also add an entry to the NEWS file.

Thanks,
-Barry



From moshez at zadka.site.co.il  Mon Jan 29 07:26:25 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 08:26:25 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <14964.63948.492662.775413@anthem.wooz.org>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 00:04:12 -0500, barry at digicool.com (Barry A. Warsaw) wrote:
 
> Oh yeah, please also add an entry to the NEWS file.

Done. The checkin to the NEWS file will be done in about a million years,
when my antique of a modem finishes sending the data.
I had to change test_opcodes since it tested that functions with the
same code compare equal.
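
A minimal sketch of the resulting behaviour as it can be observed in
present-day CPython (an assumption about how the change settled, not a
transcript of the patch): functions compare by identity and can be used as
dictionary keys. The names f, g and registry are purely illustrative.

```python
def f():
    return 1


def g():
    return 1


# Two functions with the same body are still distinct objects.
funcs_equal = (f == g)
same_func = (f == f)

# Functions are hashable, so they can serve as dictionary keys.
registry = {f: "first", g: "second"}
```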
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From gregor at mediasupervision.de  Mon Jan 29 12:13:39 2001
From: gregor at mediasupervision.de (Gregor Hoffleit)
Date: Mon, 29 Jan 2001 12:13:39 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Include pyport.h,2.20,2.21
In-Reply-To: <20010129013904.R962@xs4all.nl>; from thomas@xs4all.net on Mon, Jan 29, 2001 at 01:39:04AM +0100
References: <200010050142.SAA08326@slayer.i.sourceforge.net> <20001104001415.A2093@53b.hoffleit.de> <20010129013904.R962@xs4all.nl>
Message-ID: <20010129121339.A1166@mediasupervision.de>

On Mon, Jan 29, 2001 at 01:39:04AM +0100, Thomas Wouters wrote:
> On Sat, Nov 04, 2000 at 12:14:15AM +0100, Gregor Hoffleit wrote:
> > FYI: This misdefinition with LONG_BIT was due to a bug in glibc's limits.h. It
> > has been fixed in glibc 2.96.
> 
> Do you mean gcc 2.96, or glibc 2.(1|2).96 ? Or is 2.96 some internal
> versioning for glibc that I was unaware of ? :)

Sorry, it was fixed in glibc 2.1.96.

    Gregor
    



From mal at lemburg.com  Mon Jan 29 12:31:11 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 12:31:11 +0100
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
References: <B4A8F5EFA7E4D41184A00003470D35DE01158776@INTERGATE> <20010126170101.B2762@amarok.cnri.reston.va.us> <3A71F6ED.D6D642A7@lemburg.com> <001a01c087e6$ec3b9710$e46940d5@hagrid>  
	            <3A71FF5D.DC609775@lemburg.com> <200101271652.LAA30750@cj20424-a.reston1.va.home.com>
Message-ID: <3A75547F.A601E219@lemburg.com>

Guido van Rossum wrote:
> 
> > revision 1.34
> > date: 2000/07/19 17:09:51;  author: montanaro;  state: Exp;  lines: +18 -23
> > added rewritten normpath from Moshe Zadka that does the right thing with
> > paths containing ..
> [...]
> > Revision 1.33 clearly leaves initial slashes untouched.
> > I guess we should restore this...
> 
> Yes, please!  (Just the "leading extra slashes stay" behavior.)

Checked in a patch which preserves '/' and '//' but converts
three or more initial slashes into one (see Fredrik's note about
the POSIX standard on this).
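
A short sketch of that rule, assuming a present-day posixpath.normpath that
still behaves this way: one or exactly two leading slashes are kept, while
three or more collapse to a single slash. The sample paths are made up.

```python
import posixpath

examples = ["/usr//lib", "//server/share", "///usr/lib", "////usr/lib"]

# Map each sample path to its normalized form.
results = {path: posixpath.normpath(path) for path in examples}

for original, normalized in results.items():
    print(repr(original), "->", repr(normalized))
```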

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 13:24:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:24:15 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com>
Message-ID: <3A7560EF.39D6CF@lemburg.com>

Here are the results of my micro benchmark pybench 0.7:

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1102.30 ms    8.65 us   +7.56%
           BuiltinMethodLookup:     966.75 ms    1.84 us   +4.56%
                 ConcatStrings:    1198.55 ms    7.99 us  +11.63%
                 ConcatUnicode:    1835.60 ms   12.24 us  +19.29%
               CreateInstances:    1556.40 ms   37.06 us   +2.49%
       CreateStringsWithConcat:    1396.70 ms    6.98 us   +5.44%
       CreateUnicodeWithConcat:    1895.80 ms    9.48 us  +31.61%
                  DictCreation:    1760.50 ms   11.74 us   +2.43%
                      ForLoops:    1426.90 ms  142.69 us   -7.51%
                    IfThenElse:    1155.25 ms    1.71 us   -6.24%
                   ListSlicing:     555.40 ms  158.69 us   -4.14%
                NestedForLoops:     784.55 ms    2.24 us   -6.33%
          NormalClassAttribute:    1052.80 ms    1.75 us  -10.42%
       NormalInstanceAttribute:    1053.80 ms    1.76 us   +0.89%
           PythonFunctionCalls:    1127.50 ms    6.83 us  +12.56%
             PythonMethodCalls:     909.10 ms   12.12 us   +9.70%
                     Recursion:     942.40 ms   75.39 us  +23.74%
                  SecondImport:     924.20 ms   36.97 us   +3.98%
           SecondPackageImport:     951.10 ms   38.04 us   +6.16%
         SecondSubmoduleImport:    1211.30 ms   48.45 us   +7.69%
       SimpleComplexArithmetic:    1635.30 ms    7.43 us   +5.58%
        SimpleDictManipulation:     963.35 ms    3.21 us   -0.57%
         SimpleFloatArithmetic:     877.00 ms    1.59 us   -2.92%
      SimpleIntFloatArithmetic:     851.10 ms    1.29 us   -5.89%
       SimpleIntegerArithmetic:     850.05 ms    1.29 us   -6.41%
        SimpleListManipulation:    1168.50 ms    4.33 us   +8.14%
          SimpleLongArithmetic:    1231.15 ms    7.46 us   +1.52%
                    SmallLists:    2153.35 ms    8.44 us  +10.77%
                   SmallTuples:    1314.65 ms    5.48 us   +3.80%
         SpecialClassAttribute:    1050.80 ms    1.75 us   +1.48%
      SpecialInstanceAttribute:    1248.75 ms    2.08 us   -2.32%
                StringMappings:    1702.60 ms   13.51 us  +19.69%
              StringPredicates:    1024.25 ms    3.66 us  -25.49%
                 StringSlicing:    1093.35 ms    6.25 us   +4.35%
                     TryExcept:    1584.85 ms    1.06 us  -10.90%
                TryRaiseExcept:    1239.50 ms   82.63 us   +4.64%
                  TupleSlicing:     983.00 ms    9.36 us   +3.36%
               UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
             UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
             UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
                UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
------------------------------------------------------------------------
            Average round time:   58001.00 ms              +3.30%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)

The benchmark is available here in case someone wants to verify
the results on different platforms:

	http://www.lemburg.com/python/pybench-0.7.zip

The above tests were done on a Linux 2.2 system, AMD K6 233MHz. 
The figures shown compare CVS Python (2.1a1) against stock
Python 2.0. 

As you can see, Python function calls have suffered
a lot for some reason. Unicode mappings and other Unicode database
related methods show the effect of the compression of the Unicode
database -- a clear space/speed tradeoff. 

I can't really explain why Unicode concatenation has had a 
slowdown -- perhaps the new coercion logic has something to
do with this ?!

On the nice side: attribute lookups are faster; probably due to
the string key optimizations in the dictionary implementation.
Loops and exceptions are also a tad faster.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From fredrik at pythonware.com  Mon Jan 29 13:30:32 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Mon, 29 Jan 2001 13:30:32 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com>
Message-ID: <01fc01c089ef$48072230$0900a8c0@SPIFF>

mal wrote:
>                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
>              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
>              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
>                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
>
> Unicode mappings and other Unicode database related methods
> show the effect of the compression of the Unicode database -- a
> clear space/speed tradeoff.

umm.  the tests don't seem to test the "\N{name}" escapes, so the
only thing that has changed in 2.1 is the "decomposition" method
(used in the UnicodeProperties test).

are you sure you're comparing against 2.0 final?

Cheers /F




From mal at lemburg.com  Mon Jan 29 13:52:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 13:52:12 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF>
Message-ID: <3A75677C.E4FA82A0@lemburg.com>

Fredrik Lundh wrote:
> 
> mal wrote:
> >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> >
> > Unicode mappings and other Unicode database related methods
> > show the effect of the compression of the Unicode database -- a
> > clear space/speed tradeoff.
> 
> umm.  the tests don't seem to test the "\N{name}" escapes, so the
> only thing that has changed in 2.1 is the "decomposition" method
> (used in the UnicodeProperties test).

The mappings figure surprised me too: the code has not changed,
but the unicodetype_db.h looks different. Don't know how this
affects performance though.

The differences could also be explained by an increase in Unicode
object creation time (the concatenation is also a lot slower),
so perhaps that's where we should look...

> are you sure you're comparing against 2.0 final?

Yes... after a check of the Makefile I found that I had compiled
Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
a difference w/r to inlining of code. I'll recompile and rerun
the benchmark.
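
For anyone wanting to rule out the same mismatch without digging through the
Makefile, a hedged sketch: present-day Pythons record their build flags in the
sysconfig module (which did not exist in the 2.0/2.1 era). On POSIX builds the
values come from the Makefile used to compile the interpreter; elsewhere they
may simply be None.

```python
import sysconfig

# Collect the optimisation-related build variables of the running interpreter.
build_flags = {name: sysconfig.get_config_var(name) for name in ("OPT", "CFLAGS")}

for name, value in build_flags.items():
    print(name, "=", value)
```

Running this under both interpreters being compared shows whether one was
built with -O3 and the other with -O2.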
 
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From tim.one at home.com  Mon Jan 29 13:56:49 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 07:56:49 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
Message-ID: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>

[Ping]
>     dict[key] = 1
>     if key in dict: ...
>     for key in dict: ...

[Guido]
> No chance of a time-machine escape, but I *can* say that I agree that
> Ping's proposal makes a lot of sense.  This is a reversal of my
> previous opinion on this matter.  (Take note -- those don't happen
> very often! :-)
>
> First to submit a working patch gets a free copy of 2.1a2 and
> subsequent releases,

Thomas since submitted a patch to do the "if key in dict" part (which I
reviewed and accepted, pending resolution of doc issues).

It does not do the "for key in dict" part.  It's not entirely clear whether
you intended to approve that part too (I've simplified away many layers of
quoting in the above <wink>).  In any case, nobody is working on that part.

WRT that part, Ping produced some stats in:

http://mail.python.org/pipermail/python-dev/2001-January/012106.html

> How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> How often do you write 'for x in dict.keys()'?     (std lib says: 49)
>
> How often do you write 'x in dict.values()'?       (std lib says: 0)
> How often do you write 'for x in dict.values()'?   (std lib says: 3)

However, he did not report on occurrences of

    for k, v in dict.items()

I'm not clear exactly which files he examined in the above, or how the
counts were obtained.  So I don't know how this compares:  I counted 188
instances of the string ".items(" in 122 .py files, under the dist/ portion
of current CVS.  A number of those were assignment and return stmts, others
were dict.items() in an arglist, and at least one was in a comment.  After
weeding those out, I was left with 153 legit "for" loops iterating over
x.items().  In all:

    153 iterating over x.items()
    118     "     over x.keys()
     17     "     over x.values()

So I conclude that iterating over .items() is significantly more common
than iterating over .keys().

On c.l.py about an hour ago, Thomas complained that two (out of two) of his
coworkers guessed wrong about what

    for x in dict:

would do, but didn't say what they *did* think it would do.  Since Thomas
doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
over either values or the lines of a freshly-opened file named "dict"
<wink>.

So if you did intend to approve "for x in dict" iterating over dict.keys(),
maybe you want to call me out on that "approval post" I forged under your
name.

falls-on-swords-so-often-there's-nothing-left-to-puncture<wink>-ly y'rs
    - tim




From mal at lemburg.com  Mon Jan 29 14:18:52 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:18:52 +0100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com>
Message-ID: <3A756DBC.8EAC42F5@lemburg.com>

"M.-A. Lemburg" wrote:
> 
> Fredrik Lundh wrote:
> >
> > mal wrote:
> > >                UnicodeMappings:    1631.65 ms   90.65 us  +42.76%
> > >              UnicodePredicates:    1762.10 ms    7.83 us  +15.99%
> > >              UnicodeProperties:    1410.80 ms    7.05 us  +19.57%
> > >                 UnicodeSlicing:    1366.20 ms    7.81 us  +19.23%
> > >
> > > Unicode mappings and other Unicode database related methods
> > > show the effect of the compression of the Unicode database -- a
> > > clear space/speed tradeoff.
> >
> > umm.  the tests don't seem to test the "\N{name}" escapes, so the
> > only thing that has changed in 2.1 is the "decomposition" method
> > (used in the UnicodeProperties test).
> 
> The mappings figure surprised me too: the code has not changed,
> but the unicodetype_db.h looks different. Don't know how this
> affects performance though.
> 
> The differences could also be explained by an increase in Unicode
> object creation time (the concatenation is also a lot slower),
> so perhaps that's where we should look...
> 
> > are you sure you're comparing against 2.0 final?
> 
> Yes... after a check of the Makefile I found that I had compiled
> Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this makes
> a difference w/r to inlining of code. I'll recompile and rerun
> the benchmark.

Looks like there is an effect of choosing -O3 over -O2 (even though
not necessarily positive all the way); what results do you get on
Windows ?

--

PYBENCH 0.7

Benchmark: /home/lemburg/tmp/pybench-2.1a1.pyb (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:    1065.10 ms    8.35 us   +3.93%
           BuiltinMethodLookup:    1286.30 ms    2.45 us  +39.12%
                 ConcatStrings:    1243.30 ms    8.29 us  +15.80%
                 ConcatUnicode:    1449.10 ms    9.66 us   -5.83%
               CreateInstances:    1639.25 ms   39.03 us   +7.95%
       CreateStringsWithConcat:    1453.45 ms    7.27 us   +9.73%
       CreateUnicodeWithConcat:    1558.45 ms    7.79 us   +8.19%
                  DictCreation:    1869.35 ms   12.46 us   +8.77%
                      ForLoops:    1526.85 ms  152.69 us   -1.03%
                    IfThenElse:    1381.00 ms    2.05 us  +12.09%
                   ListSlicing:     547.40 ms  156.40 us   -5.52%
                NestedForLoops:     824.50 ms    2.36 us   -1.56%
          NormalClassAttribute:    1233.55 ms    2.06 us   +4.96%
       NormalInstanceAttribute:    1215.50 ms    2.03 us  +16.37%
           PythonFunctionCalls:    1107.30 ms    6.71 us  +10.55%
             PythonMethodCalls:    1047.00 ms   13.96 us  +26.34%
                     Recursion:     940.35 ms   75.23 us  +23.47%
                  SecondImport:     894.05 ms   35.76 us   +0.59%
           SecondPackageImport:     915.05 ms   36.60 us   +2.14%
         SecondSubmoduleImport:    1131.10 ms   45.24 us   +0.56%
       SimpleComplexArithmetic:    1652.05 ms    7.51 us   +6.67%
        SimpleDictManipulation:    1150.25 ms    3.83 us  +18.72%
         SimpleFloatArithmetic:     889.65 ms    1.62 us   -1.52%
      SimpleIntFloatArithmetic:     900.80 ms    1.36 us   -0.40%
       SimpleIntegerArithmetic:     901.75 ms    1.37 us   -0.72%
        SimpleListManipulation:    1125.40 ms    4.17 us   +4.15%
          SimpleLongArithmetic:    1305.15 ms    7.91 us   +7.62%
                    SmallLists:    2102.85 ms    8.25 us   +8.18%
                   SmallTuples:    1329.55 ms    5.54 us   +4.98%
         SpecialClassAttribute:    1234.60 ms    2.06 us  +19.23%
      SpecialInstanceAttribute:    1422.55 ms    2.37 us  +11.28%
                StringMappings:    1585.55 ms   12.58 us  +11.46%
              StringPredicates:    1241.35 ms    4.43 us   -9.69%
                 StringSlicing:    1206.20 ms    6.89 us  +15.12%
                     TryExcept:    1764.35 ms    1.18 us   -0.81%
                TryRaiseExcept:    1217.40 ms   81.16 us   +2.77%
                  TupleSlicing:     933.00 ms    8.89 us   -1.90%
               UnicodeMappings:    1137.35 ms   63.19 us   -0.49%
             UnicodePredicates:    1632.05 ms    7.25 us   +7.43%
             UnicodeProperties:    1244.05 ms    6.22 us   +5.44%
                UnicodeSlicing:    1252.10 ms    7.15 us   +9.27%
------------------------------------------------------------------------
            Average round time:   58804.00 ms              +4.73%

*) measured against: /home/lemburg/tmp/pybench-2.0.pyb (rounds=10, warp=20)

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 14:28:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 14:28:24 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
Message-ID: <3A756FF8.B7185FA2@lemburg.com>

Tim Peters wrote:
> 
> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().
> 
> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.
> 
> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

Dictionaries are not sequences. I wonder what order a user of
for k,v in dict: (or whatever other variant of this proposal you choose)
will expect...

Please also take into account that dictionaries are *mutable*
and their internal state is not guaranteed to stay unchanged, e.g. across
lookups (take the string optimization for example...), so exposing
PyDict_Next() in any way to Python will cause trouble. In the end,
you will need to create a list or tuple to iterate over one way
or another, so why bother overloading for-loops w/r to dictionaries ?
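
The mutation concern can be sketched with present-day Python, which did later
gain direct dict iteration: looping over a snapshot list is safe, while
resizing the dict during direct iteration is detected and rejected. The
variable names are illustrative only.

```python
d = {"a": 1, "b": 2, "c": 3}

# Iterating over a snapshot makes it safe to delete entries in the loop body.
for key in list(d.keys()):
    if d[key] % 2:
        del d[key]

# Iterating the dict directly while adding keys raises RuntimeError.
e = {"a": 1, "b": 2}
error = None
try:
    for key in e:
        e[key + "x"] = 0
except RuntimeError as exc:
    error = str(exc)
```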

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From bckfnn at worldonline.dk  Mon Jan 29 14:48:44 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 13:48:44 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
References: <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <3a75747e.17414620@smtp.worldonline.dk>

On Mon, 29 Jan 2001 08:26:25 +0200 (IST), you wrote:

>I had to change test_opcodes since it tested that functions with the
>same code compare equal.

Thanks. With this change, Jython too can complete the test_opcodes. In
Jython a code object can never compare equal to anything but itself.

regards,
finn



From moshez at zadka.site.co.il  Mon Jan 29 15:04:47 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 16:04:47 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <3a75747e.17414620@smtp.worldonline.dk>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>
Message-ID: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn at worldonline.dk (Finn Bock) wrote:
 
> Thanks. With this change, Jython too can complete the test_opcodes. In
> Jython a code object can never compare equal to anything but itself.

Great! I'm happy to have helped.
I'm starting to wonder what the tests really test: the language definition,
or accidents of the implementation?
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From MarkH at ActiveState.com  Mon Jan 29 15:35:25 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Tue, 30 Jan 2001 01:35:25 +1100
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPGEGHDAAA.MarkH@ActiveState.com>

"M.-A. Lemburg" wrote:
> what results do you get on Windows ?

Win2k, dual 800, relatively quiet!

Python 2.0

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.847605
This machine benchmarks at 11798 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.845104
This machine benchmarks at 11832.9 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.846069
This machine benchmarks at 11819.4 pystones/second

F:\src\Python-2.0\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.849447
This machine benchmarks at 11772.4 pystones/second

Python from CVS today:

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.885801
This machine benchmarks at 11289.2 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.889048
This machine benchmarks at 11248 pystones/second

F:\src\python-cvs\PCbuild>python ..\lib\test\pystone.py
Pystone(1.1) time for 10000 passes = 0.892422
This machine benchmarks at 11205.5 pystones/second


Although I deleted Tim's earlier mail, from memory this is pretty similar in
terms of performance lost.  I'm afraid I have no idea what your benchmarks
are or how to build them <wink>, but did check that the optimizer is set for
"maximize for speed" (/O2).  Other compiler options gave significantly
smaller results (no optimizations around 8500, and "optimize for space"
(/O1) at around 10000).  Other fiddling with the optimizer couldn't get
better results than the existing settings.
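
To put a number on the loss, the pystone rates above can be averaged and
compared. This is just arithmetic on the figures quoted in this message; the
helper name mean is illustrative.

```python
# Pystones/second from the runs above.
py20_runs = [11798, 11832.9, 11819.4, 11772.4]
cvs_runs = [11289.2, 11248, 11205.5]


def mean(values):
    return sum(values) / len(values)


# Relative slowdown of the CVS build versus 2.0, in percent.
slowdown = (1 - mean(cvs_runs) / mean(py20_runs)) * 100
print("CVS is %.1f%% slower than 2.0" % slowdown)
```

That comes out at a little under 5%.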

Mark.




From guido at digicool.com  Mon Jan 29 15:48:22 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 09:48:22 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 07:56:49 EST."
             <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> 
Message-ID: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>

> [Ping]
> >     dict[key] = 1
> >     if key in dict: ...
> >     for key in dict: ...
> 
> [Guido]
> > No chance of a time-machine escape, but I *can* say that I agree that
> > Ping's proposal makes a lot of sense.  This is a reversal of my
> > previous opinion on this matter.  (Take note -- those don't happen
> > very often! :-)
> >
> > First to submit a working patch gets a free copy of 2.1a2 and
> > subsequent releases,
> 
> Thomas since submitted a patch to do the "if key in dict" part (which I
> reviewed and accepted, pending resolution of doc issues).
> 
> It does not do the "for key in dict" part.  It's not entirely clear whether
> you intended to approve that part too (I've simplified away many layers of
> quoting in the above <wink>).  In any case, nobody is working on that part.
> 
> WRT that part, Ping produced some stats in:
> 
> http://mail.python.org/pipermail/python-dev/2001-January/012106.html
> 
> > How often do you write 'dict.has_key(x)'?          (std lib says: 206)
> > How often do you write 'for x in dict.keys()'?     (std lib says: 49)
> >
> > How often do you write 'x in dict.values()'?       (std lib says: 0)
> > How often do you write 'for x in dict.values()'?   (std lib says: 3)
> 
> However, he did not report on occurrences of
> 
>     for k, v in dict.items()
> 
> I'm not clear exactly which files he examined in the above, or how the
> counts were obtained.  So I don't know how this compares:  I counted 188
> instances of the string ".items(" in 122 .py files, under the dist/ portion
> of current CVS.  A number of those were assignment and return stmts, others
> were dict.items() in an arglist, and at least one was in a comment.  After
> weeding those out, I was left with 153 legit "for" loops iterating over
> x.items().  In all:
> 
>     153 iterating over x.items()
>     118     "     over x.keys()
>      17     "     over x.values()
> 
> So I conclude that iterating over .items() is significantly more common
> than iterating over .keys().

I did a less sophisticated count but come to the same conclusion:
iterations over items() are (somewhat) more common than over keys(),
and values() are 1-2 orders of magnitude less common.  My numbers:

$ cd python/src/Lib
$ grep 'for .*items():' *.py | wc -l
     47
$ grep 'for .*keys():' *.py | wc -l
     43
$ grep 'for .*values():' *.py | wc -l
      2
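
For anyone without grep handy, the same count can be sketched in Python.
This is only an approximation of the commands above: the regular
expressions copy the grep patterns, while the helper name count_loops
and the sample lines are made up for illustration.

```python
import re

# One pattern per dict method, copied from the grep commands above.
PATTERNS = {
    name: re.compile(r"for .*%s\(\):" % name)
    for name in ("items", "keys", "values")
}


def count_loops(lines):
    """Count matching lines per pattern, like `grep PATTERN | wc -l`."""
    counts = dict.fromkeys(PATTERNS, 0)
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Hypothetical sample input; real use would read the lines of Lib/*.py.
sample = [
    "for k, v in d.items():",
    "    for k in d.keys():",
    "x = d.values()",
    "for name, obj in ns.items():  # comment",
]
print(count_loops(sample))
```

Like the grep version, this counts lines rather than loops, so it will
miss loops whose header is split over several lines.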

> On c.l.py about an hour ago, Thomas complained that two (out of two) of his
> coworkers guessed wrong about what
> 
>     for x in dict:
> 
> would do, but didn't say what they *did* think it would do.  Since Thomas
> doesn't work with idiots, I'm guessing they *didn't* guess it would iterate
> over either values or the lines of a freshly-opened file named "dict"
> <wink>.

I don't attach much value to the readability argument: typically, one will
write "for key in dict" or "for name in dict" and then it's obvious
what is meant.

> So if you did intend to approve "for x in dict" iterating over dict.keys(),
> maybe you want to call me out on that "approval post" I forged under your
> name.

But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
has even asked me for a has_item() method).  I can live with "x in
list" checking the values and "x in dict" checking the keys.  But I
can *not* live with "x in dict" equivalent to "dict.has_key(x)" if
"for x in dict" would mean "for x in dict.items()".  I also think that
defining "x in dict" but not "for x in dict" will be confusing.

So we need to think more.

How about:

    for key in dict: ...		# ... over keys

    for key:value in dict: ...		# ... over items

This is syntactically unambiguous (a colon is currently illegal in
that position).

This also suggests:

    for index:value in list: ...	# ... over zip(range(len(list)), list)

which doesn't strike me as bad or ugly, and would fulfill my brother's
dearest wish.

(And why didn't we think of this before?)
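
To make the intended meaning of the three loop forms concrete, here is a
sketch of what each would iterate over. The colon syntax is only a
proposal, so the code uses spellings that run in present-day Python
(dict.items() and enumerate()); those, and the sample data, are
assumptions for illustration rather than part of the proposal.

```python
d = {"spam": 1, "eggs": 2}
lst = ["a", "b", "c"]

# "for key in dict" would visit the keys.
keys = [key for key in d]

# "for key:value in dict" would visit (key, value) pairs.
pairs = [(key, value) for key, value in d.items()]

# "for index:value in list" would visit (index, value) pairs,
# i.e. the same pairs as zip(range(len(lst)), lst).
indexed = [(index, value) for index, value in enumerate(lst)]

print(keys)
print(pairs)
print(indexed)
```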

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Mon Jan 29 15:58:16 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Mon, 29 Jan 2001 15:58:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:48:22AM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <20010129155816.T962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:48:22AM -0500, Guido van Rossum wrote:

> How about:

>     for key in dict: ...		# ... over keys

>     for key:value in dict: ...		# ... over items

> This is syntactically unambiguous (a colon is currently illegal in
> that position).

I won't comment on the syntax right now, I need to look at it for a while
first :-) However, what about MAL's point about dict ordering, internally ?
Wouldn't FOR_LOOP be forced to generate a list of keys anyway, to avoid
skipping keys ? I know currently the dict implementation doesn't do any
reordering except during adds/deletes, but there is nothing in the language
ref that supports that -- it's an implementation detail. Would we make a
future enhancement where (some form of) gc would 'clean up' large
dictionaries impossible ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Mon Jan 29 16:00:38 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:00:38 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 14:28:24 +0100."
             <3A756FF8.B7185FA2@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
            <3A756FF8.B7185FA2@lemburg.com> 
Message-ID: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>

> Dictionaries are not sequences. I wonder what order a user of
> for k,v in dict: (or whatever other of this proposal you choose)
> will expect...

The same order that for k,v in dict.items() will yield, of course.

> Please also take into account that dictionaries are *mutable*
> and their internal state is not defined to e.g. not change due to
> lookups (take the string optimization for example...), so exposing
> PyDict_Next() in any way to Python will cause trouble. In the end,
> you will need to create a list or tuple to iterate over one way
> or another, so why bother overloading for-loops w/r to dictionaries ?

Actually, I was going to propose to play dangerously here: the

    for k:v in dict: ...

syntax I proposed in my previous message should indeed expose
PyDict_Next().  It should be a big speed-up, and I'm expecting (though
don't have much proof) that most loops over dicts don't mutate the
dict.

Maybe we could add a flag to the dict that issues an error when a new
key is inserted during such a for loop?  (I don't think the key order
can be affected when a key is *deleted*.)
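
The flag idea can be sketched in pure Python. This is a toy model only:
the class GuardedDict, its guarded_items() method and the error message
are hypothetical names, and a real implementation would live in the C
dict code next to PyDict_Next().

```python
class GuardedDict(dict):
    """Hypothetical dict that refuses new keys while a loop is running."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._active_loops = 0  # the "flag": number of loops in progress

    def guarded_items(self):
        """Yield (key, value) pairs with the insertion guard switched on."""
        self._active_loops += 1
        try:
            for item in dict.items(self):
                yield item
        finally:
            self._active_loops -= 1

    def __setitem__(self, key, value):
        # Replacing the value of an existing key is harmless; only a
        # brand-new key is rejected while a guarded loop is active.
        if self._active_loops and key not in self:
            raise RuntimeError("new key inserted during iteration")
        dict.__setitem__(self, key, value)


d = GuardedDict(a=1, b=2)
for k, v in d.guarded_items():
    d[k] = v + 1  # allowed: the key already exists

try:
    for k, v in d.guarded_items():
        d["new"] = 0  # rejected: would insert a new key
except RuntimeError as exc:
    print("error:", exc)
```

Deletion is deliberately left unguarded here, matching the guess above
that deleting a key does not disturb the order.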

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 29 16:30:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:30:17 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: Your message of "Mon, 29 Jan 2001 16:04:47 +0200."
             <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il> 
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>  
            <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il> 
Message-ID: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>

> I'm starting to wonder what the tests really test: the language definition,
> or accidents of the implementation?

It's good to test conformance to the language definition, but this is
also a regression test for the implementation.  The "accidents of the
implementation" definitely need to be tested.  E.g. if we decide that
repr(s) uses \n rather than \012 or \x0a, this should be tested too.
The language definition gives the implementer a choice here; but once
the implementer has made a choice, it's good to have a test that tests
that this choice is implemented correctly.

Perhaps there should be several parts to the regression test,
e.g. language conformance, library conformance, platform-specific
features, and implementation conformance?
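
As a rough sketch of such a split, the two test classes below separate
what is assumed to be a language-level promise from an implementation
choice. The class and method names are invented for illustration, and
which property belongs in which bucket is itself an assumption.

```python
import unittest


class LanguageConformanceTest(unittest.TestCase):
    """Properties assumed to hold for any implementation."""

    def test_repr_of_string_round_trips(self):
        s = "line one\nline two"
        self.assertEqual(eval(repr(s)), s)


class ImplementationConformanceTest(unittest.TestCase):
    """Choices this implementation made and wants to keep stable."""

    def test_repr_spells_newline_as_backslash_n(self):
        # Pins the \n spelling, as opposed to \012 or \x0a.
        self.assertEqual(repr("\n"), "'\\n'")
```

Another implementation could then run the first class unchanged and
skip or replace the second.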

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Mon Jan 29 16:57:12 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 10:57:12 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: Your message of "Sun, 28 Jan 2001 15:40:28 CST."
             <14964.37324.642566.602319@beluga.mojam.com> 
References: <14964.37324.642566.602319@beluga.mojam.com> 
Message-ID: <200101291557.KAA12347@cj20424-a.reston1.va.home.com>

> I just remembered Modules/Setup.local.  I peeked at mine and noticed it had
> been zeroed out.  I then copied a version of it over from another machine
> and reran make a couple times.  Makesetup ran but nothing mentioned in
> Setup.local got built.
> 
> I don't think 2.1 can be released without providing a way for users to
> recover from this change.  I didn't see anything obvious in setup.py.  Am I
> missing something?

Well, Modules/Setup is still used, so it should be trivial to add
Setup.local back too.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From nas at arctrix.com  Mon Jan 29 10:23:55 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:23:55 -0800
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <14964.37324.642566.602319@beluga.mojam.com>; from skip@mojam.com on Sun, Jan 28, 2001 at 03:40:28PM -0600
References: <14964.37324.642566.602319@beluga.mojam.com>
Message-ID: <20010129012355.A14763@glacier.fnational.com>

On Sun, Jan 28, 2001 at 03:40:28PM -0600, Skip Montanaro wrote:
> Makesetup ran but nothing mentioned in Setup.local got built.

I believe Setup.local should still work.  One possibility is that
the modules in Setup.local were marked as shared.  Shared modules
from Setup* don't get built by default.  You have to do "make
oldsharedmods".  I'm not sure why oldsharedmods is not included
in the all target.  Andrew, can you think of any reason why it
shouldn't be added?

  Neil



From dgoodger at atsautomation.com  Mon Jan 29 17:19:12 2001
From: dgoodger at atsautomation.com (Goodger, David)
Date: Mon, 29 Jan 2001 11:19:12 -0500
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
Message-ID: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>

Marc-Andre Lemburg's patch to posixpath.py clears up the path problem.
Thanks!

MACHDEP is qnxJ for QNX 4.25, qnxG for QNX 4.23. I don't know what it is for
QNX 6 (Neutrino). Perhaps test for MACHDEP[:3]=='qnx'?

I'm still stuck at 'python setup.py build':

    unable to execute ld: no such file or directory
    running build
    running build_ext
    building 'struct' extension
    skipping //5/tmp/py/Python-2.1a1/Modules/structmodule.c (build/temp.qnx-J-PCI-2.1/structmodule.o up-to-date)
    ld build/temp.qnx-J-PCI-2.1/structmodule.o -L/usr/local/lib -o build/lib.qnx-J-PCI-2.1/struct.so
    error: command 'ld' failed with exit status 1
    make: *** [sharedmods] Error 1

Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
compatible with distutils? If not, is there a workaround?

Neil Schemenauer asked, "what should LDSHARED say for QNX?". I don't know.
Python 2.0 compiled OK, and its makefile says LDSHARED=ld. However,
Modules/Setup has no uncommented "*shared*" line.

Those of us who rely on Python to get our work done, and who don't have the
bandwidth for the implementation complexities, owe a lot to everyone who
makes it possible to compile Python out-of-the-box. Very much appreciated.
Thank you!

David Goodger
Systems Administrator & Programmer, Advanced Systems
Automation Tooling Systems Inc., Automation Systems Division
direct: (519) 653-4483 ext. 7121    fax: (519) 650-6695
e-mail: dgoodger at atsautomation.com



From nas at arctrix.com  Mon Jan 29 10:40:07 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Mon, 29 Jan 2001 01:40:07 -0800
Subject: [Python-Dev] problems building Python 2.1a1 on QNX 4.25
In-Reply-To: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>; from dgoodger@atsautomation.com on Mon, Jan 29, 2001 at 11:19:12AM -0500
References: <B4A8F5EFA7E4D41184A00003470D35DE0115894B@INTERGATE>
Message-ID: <20010129014007.C14763@glacier.fnational.com>

On Mon, Jan 29, 2001 at 11:19:12AM -0500, Goodger, David wrote:
> I'm still stuck at 'python setup.py build':
...
> Armin Steinhoff said "QNX4 doesn't support dynamic loading". Is this
> compatible with distutils? If not, is there a workaround?

The setup.py script only builds shared modules.  You're going to
have to enable modules using the old Setup file.  I think
Setup.dist should go back to including all the modules
(commented out of course).  This would make it easier for people
who can't or don't want to build shared modules.

  Neil



From akuchlin at cnri.reston.va.us  Mon Jan 29 17:50:31 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 11:50:31 -0500
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>; from nas@arctrix.com on Mon, Jan 29, 2001 at 01:23:55AM -0800
References: <14964.37324.642566.602319@beluga.mojam.com> <20010129012355.A14763@glacier.fnational.com>
Message-ID: <20010129115031.B4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 01:23:55AM -0800, Neil Schemenauer wrote:
>from Setup* don't get built by default.  You have to do "make
>oldsharedmods".  I'm not sure why oldsharedmods is not included
>in the all target.  Andrew, can you think of any reason why it
>shouldn't be added?

That's an excellent idea, particularly if we add back Setup.dist, too,
and comment out all but the required modules.  

I'll try to do that today.  Note that I'm leaving on vacation
tomorrow, and will be back next Monday.  Everyone, feel free to check
in changes to setup.py that are required.

--amk




From jeremy at alum.mit.edu  Mon Jan 29 17:48:11 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 11:48:11 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A75677C.E4FA82A0@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
Message-ID: <14965.40651.233438.311104@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  MAL> Yes... after a check of the Makefile I found that I had
  MAL> compiled Python 2.0 with -O3 and 2.1a1 with -O2 -- perhaps this
  MAL> makes a difference w/r to inlining of code. I'll recompile and
  MAL> rerun the benchmark.
 
When I was working on the CALL_FUNCTION revision, I compared 2.0 final
with my development version using -O3.  At that time, I saw no
significant performance difference between the two.  And I did notice
a difference between -O2 and -O3.

The strange thing is that I notice a difference between -O2 and -O3
with 2.1a1, but in the opposite direction.  On pystone, python -O2
runs consistently faster than -O3; the difference is .05 sec on my
machine.  

Jeremy



From esr at thyrsus.com  Mon Jan 29 18:12:05 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 12:12:05 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 11:48:11AM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain>
Message-ID: <20010129121205.A8337@thyrsus.com>

Jeremy Hylton <jeremy at alum.mit.edu>:
> The strange thing is that I notice a difference between -O2 and -O3
> with 2.1a1, but in the opposite direction.  On pystone, python -O2
> runs consistently faster than -O3; the difference is .05 sec on my
> machine.  

Bizarre.  Makes me wonder if we have a C compiler problem.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

In every country and in every age, the priest has been hostile to
liberty. He is always in alliance with the despot, abetting his abuses
in return for protection to his own.
	-- Thomas Jefferson, 1814



From jeremy at alum.mit.edu  Mon Jan 29 18:27:08 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:27:08 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <20010129121205.A8337@thyrsus.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<14965.40651.233438.311104@localhost.localdomain>
	<20010129121205.A8337@thyrsus.com>
Message-ID: <14965.42988.362288.154254@localhost.localdomain>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

  ESR> Jeremy Hylton <jeremy at alum.mit.edu>:
  >> The strange thing is that I notice a difference between -O2 and
  >> -O3 with 2.1a1, but in the opposite direction.  On pystone,
  >> python -O2 runs consistently faster than -O3; the difference is
  >> .05 sec on my machine.

  ESR> Bizarre.  Makes me wonder if we have a C compiler problem.

Depends on your definition of "compiler problem" <wink>.  If you mean,
it compiles our code so it runs slower, then, yes, we've got one :-).

One of the differences between -O2 and -O3, according to the man page,
is that -O3 will perform optimizations that involve a space-speed
tradeoff.  It also includes -finline-functions.  I can imagine that
some of these optimizations hurt memory performance enough to make a
difference. 

not-really-understanding-but-not-really-expecting-too-ly y'rs,
Jeremy



From jeremy at alum.mit.edu  Mon Jan 29 18:39:05 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 12:39:05 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.40651.233438.311104@localhost.localdomain>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<14965.40651.233438.311104@localhost.localdomain>
Message-ID: <14965.43705.367236.994786@localhost.localdomain>

The recursion test in pybench is testing the performance of the nested
scopes changes, which must do some extra bookkeeping to reference the
recursive function in a nested scope.  To some extent, a performance
hit is a necessary consequence for nested functions with free
variables.

Nonetheless, there are two interesting things to say about this
situation.

First, there is a bug in the current implementation of nested scopes
that the benchmark tickles.  The problem is with code like this:

def outer():
    global f
    def f(x):
        if x > 0:
            return f(x - 1)

The compiler determines that f is free in f.  (It's recursive.)  If f
is free in f, in the absence of the global decl, the body of outer
must allocate fresh storage (a cell) for f each time outer is called
and add a reference to that cell to f's closure.

If f is declared global in outer, then it ought to be treated as a
global in nested scopes, too.  In general terms, a free variable
should use the binding found in the nearest enclosing scope.  If the
nearest enclosing scope has a global binding, then the reference is
global. 

If I fix this problem, the recursion benchmark shouldn't be any slower
than a normal function call.
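
The intended behaviour can be checked from Python itself. The sketch
below reuses the function from above unchanged and inspects it with
introspection attributes of present-day CPython (__closure__ and
__code__.co_freevars); using those attributes is an assumption about
later versions, not a description of the 2.1a1 code under discussion.

```python
def outer():
    global f

    def f(x):
        if x > 0:
            return f(x - 1)


outer()

# With the global declaration honoured in the nested scope, f looks
# itself up as an ordinary global: no cell and no closure are needed.
print(f.__closure__)            # None when f has no free variables
print(f.__code__.co_freevars)   # empty tuple when f has no free variables
print("f" in globals())
```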

The second interesting thing to say is that frame allocation and
dealloc is probably more expensive than it needs to be in the current
implementation.  The frame object has a new f_closure slot that holds
a tuple that is freshly allocated every time the frame is allocated.
(Unless the closure is empty, then f_closure is just NULL.)

The extra tuple allocation can probably be done away with by using the
same allocation strategy as locals & stack.  If the f_localsplus array
holds cells + frees + locals + stack, then a new frame will never
require more than a single malloc (and often not even that).

Jeremy



From akuchlin at cnri.reston.va.us  Mon Jan 29 18:54:37 2001
From: akuchlin at cnri.reston.va.us (Andrew M. Kuchling)
Date: Mon, 29 Jan 2001 12:54:37 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>; from jeremy@alum.mit.edu on Mon, Jan 29, 2001 at 12:27:08PM -0500
References: <mailman.980616572.26954.python-list@python.org> <200101271858.NAA04898@mira.erols.com> <3A7560EF.39D6CF@lemburg.com> <01fc01c089ef$48072230$0900a8c0@SPIFF> <3A75677C.E4FA82A0@lemburg.com> <14965.40651.233438.311104@localhost.localdomain> <20010129121205.A8337@thyrsus.com> <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <20010129125437.E4018@amarok.cnri.reston.va.us>

On Mon, Jan 29, 2001 at 12:27:08PM -0500, Jeremy Hylton wrote:
>Depends on your definition of "compiler problem" <wink>.  If you mean,
>it compiles our code so it runs slower, then, yes, we've got one :-).

Compiling with gcc and -g, with no optimization, 2.0 and 2.1cvs seem
to be very close, with 2.1 slightly slower:

2.0:
Pystone(1.1) time for 10000 passes = 1.04
This machine benchmarks at 9615.38 pystones/second
This machine benchmarks at 9345.79 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9433.96 pystones/second
This machine benchmarks at 9523.81 pystones/second

2.1cvs:
Pystone(1.1) time for 10000 passes = 1.09
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second
This machine benchmarks at 9259.26 pystones/second
This machine benchmarks at 9174.31 pystones/second
This machine benchmarks at 9090.91 pystones/second
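
For reference, the pystones/second figures are simply passes divided by
elapsed seconds, so the first line of each run can be rechecked and the
gap put in percent. A small sketch of that arithmetic (the function name
is made up; the timings are the two "time for 10000 passes" values
above):

```python
PASSES = 10000


def pystones_per_second(seconds, passes=PASSES):
    """Benchmark rate: number of passes divided by elapsed seconds."""
    return passes / seconds


time_20 = 1.04  # 2.0, seconds for 10000 passes
time_21 = 1.09  # 2.1cvs, seconds for 10000 passes

rate_20 = pystones_per_second(time_20)
rate_21 = pystones_per_second(time_21)
slowdown_percent = (time_21 - time_20) / time_20 * 100

print(round(rate_20, 2))           # 9615.38
print(round(rate_21, 2))           # 9174.31
print(round(slowdown_percent, 1))  # 4.8
```

So on these two timings 2.1cvs takes roughly 5% longer than 2.0.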

Would it be worth experimenting with platform-specific compiler
options to try to squeeze out the last bit of performance?  (This can
wait for the betas, probably.)

--amk



From jeremy at alum.mit.edu  Mon Jan 29 19:04:28 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Mon, 29 Jan 2001 13:04:28 -0500 (EST)
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <3A756DBC.8EAC42F5@lemburg.com>
References: <mailman.980616572.26954.python-list@python.org>
	<200101271858.NAA04898@mira.erols.com>
	<3A7560EF.39D6CF@lemburg.com>
	<01fc01c089ef$48072230$0900a8c0@SPIFF>
	<3A75677C.E4FA82A0@lemburg.com>
	<3A756DBC.8EAC42F5@lemburg.com>
Message-ID: <14965.45228.197778.579989@localhost.localdomain>

I hope another set of benchmarks isn't overkill for the list.  I see
different results comparing 2.1 with 2.0 (both -O3) using pybench
0.6. 

The interesting differences I see in this benchmark that I didn't see
in MAL's are:

DictCreation +15.87%
SecondImport +20.29%

Other curious differences, which show up in both benchmarks, include:
SpecialClassAttribute +17.91%     (private variables)
SpecialInstanceAttribute +15.34%  (__methods__)

Jeremy

PYBENCH 0.6

Benchmark: py21 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     305.05 ms    2.39 us   +4.77%
           BuiltinMethodLookup:     319.65 ms    0.61 us   +2.55%
                 ConcatStrings:     383.70 ms    2.56 us   +1.27%
               CreateInstances:     463.85 ms   11.04 us   +1.96%
       CreateStringsWithConcat:     381.20 ms    1.91 us   +2.39%
                  DictCreation:     508.85 ms    3.39 us  +15.87%
                      ForLoops:     577.60 ms   57.76 us   +5.65%
                    IfThenElse:     443.70 ms    0.66 us   +1.02%
                   ListSlicing:     207.50 ms   59.29 us   -4.18%
                NestedForLoops:     315.75 ms    0.90 us   +3.54%
          NormalClassAttribute:     379.80 ms    0.63 us   +7.39%
       NormalInstanceAttribute:     385.45 ms    0.64 us   +8.04%
           PythonFunctionCalls:     400.00 ms    2.42 us  +13.62%
             PythonMethodCalls:     306.25 ms    4.08 us   +5.13%
                     Recursion:     337.25 ms   26.98 us  +19.00%
                  SecondImport:     301.20 ms   12.05 us  +20.29%
           SecondPackageImport:     298.20 ms   11.93 us  +18.15%
         SecondSubmoduleImport:     339.15 ms   13.57 us  +11.40%
       SimpleComplexArithmetic:     392.70 ms    1.79 us  -10.52%
        SimpleDictManipulation:     350.40 ms    1.17 us   +3.87%
         SimpleFloatArithmetic:     300.75 ms    0.55 us   +2.04%
      SimpleIntFloatArithmetic:     347.95 ms    0.53 us   +9.01%
       SimpleIntegerArithmetic:     356.40 ms    0.54 us  +12.01%
        SimpleListManipulation:     351.85 ms    1.30 us  +11.33%
          SimpleLongArithmetic:     309.00 ms    1.87 us   -5.81%
                    SmallLists:     584.25 ms    2.29 us  +10.20%
                   SmallTuples:     442.00 ms    1.84 us  +10.33%
         SpecialClassAttribute:     406.50 ms    0.68 us  +17.91%
      SpecialInstanceAttribute:     557.40 ms    0.93 us  +15.34%
                 StringSlicing:     336.45 ms    1.92 us   +9.56%
                     TryExcept:     650.60 ms    0.43 us   +1.40%
                TryRaiseExcept:     345.95 ms   23.06 us   +2.70%
                  TupleSlicing:     266.35 ms    2.54 us   +4.70%
------------------------------------------------------------------------
            Average round time:   14413.00 ms              +7.07%

*) measured against: py20 (rounds=10, warp=20)




From skip at mojam.com  Mon Jan 29 19:07:26 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 29 Jan 2001 12:07:26 -0600 (CST)
Subject: [Python-Dev] What happened to Setup.local's functionality?
In-Reply-To: <20010129012355.A14763@glacier.fnational.com>
References: <14964.37324.642566.602319@beluga.mojam.com>
	<20010129012355.A14763@glacier.fnational.com>
Message-ID: <14965.45406.933528.53857@beluga.mojam.com>

    Neil> You have to do "make oldsharedmods".  

This did the trick.  This should be emblazoned in big red letters somewhere
if the decision is made to not include oldsharedmods as a dependency for the
all target.

Thx,

Skip




From gvwilson at ca.baltimore.com  Mon Jan 29 19:19:21 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 13:19:21 -0500
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <20010129162012.32158ED49@mail.python.org>
Message-ID: <001501c08a20$00dca2a0$770a0a0a@nevex.com>

> > > [Ping]
> > >     dict[key] = 1
> > >     if key in dict: ...
> > >     for key in dict: ...

> [Guido]
> "if (k, v) in dict" is clearly useless...
> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean "for x in dict.items()".
> I also think that defining "x in dict" but not "for x in dict" will be
> confusing.

[Greg]
Quick poll (four people): if the expression "if a in b" works,
then all four expected "for a in b" to work as well.  This is
also my intuition; are there any exceptions in really existing
Python?

> [Guido]
>     for key in dict: ...		# ... over keys
>     for key:value in dict: ...	# ... over items

[Greg]
I'm probably revealing my ignorance of Python's internals here,
but can the iteration protocol be extended so that the object
(in this case, the dict) is told the number and type(s) of the
values the loop is expecting?  With:

    for key in dict: ...

the dict would be asked for one value; with:

    for (key, value) in dict:

the dict would be told that a two-element tuple was expected,
and so on.  This would allow multi-dimensional structures
(e.g. NumPy arrays) to do things like:

    for (i, j, k) in array:		# please give me three indices

and:

    for ((i, j, k), v) in array:	# three indices and value

> [Guido]
>     for index:value in list: ...	# ... over zip(range(len(list)), list)

How do you feel about:

    for i in seq.keys():		# strings, tuples, etc.

"keys()" is kind of strange ("indices" or something would be
more natural), *but* this allows uniform iteration over all
built-in collections:

    def showem(c):
        for i in c.keys():
            print i, c[i]
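
Since sequences have no keys() method, the uniform loop can only be
approximated with a helper for now. A sketch, where the helper name
indices() is hypothetical and showem() returns the pairs instead of
printing them so the result is easy to inspect:

```python
def indices(c):
    """Keys for a mapping, positions for a sequence (hypothetical helper)."""
    if hasattr(c, "keys"):
        return list(c.keys())
    return list(range(len(c)))


def showem(c):
    """Return (index, element) pairs for any built-in collection."""
    return [(i, c[i]) for i in indices(c)]


print(showem("ab"))
print(showem((5, 6)))
print(showem({"x": 1}))
```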

Greg




From bckfnn at worldonline.dk  Mon Jan 29 19:31:48 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Mon, 29 Jan 2001 18:31:48 GMT
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
References: <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il> <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <3a75aba9.31537178@smtp.worldonline.dk>

On Mon, 29 Jan 2001 16:04:47 +0200 (IST), you wrote:

>On Mon, 29 Jan 2001 13:48:44 GMT, bckfnn at worldonline.dk (Finn Bock) wrote:
> 
>> Thanks. With this change, Jython too can complete the test_opcodes. In
>> Jython a code object can never compare equal to anything but itself.
>
>Great! I'm happy to have helped.
>I'm starting to wonder what the tests really test: the language definition,
>or accidents of the implementation?

Based on the amount of code in test_opcodes dedicated to code
comparison, I doubt this particular situation was an accident.

The problems I have had with the test suite are better described as
accidents of the tests themselves. From test_extcall:

  We expected (repr): "g() got multiple values for keyword argument 'b'"
  But instead we got: "g() got multiple values for keyword argument 'a'"

This is caused by a difference in iteration over a dictionary.

Or from test_import:

  test test_import crashed -- java.lang.ClassFormatError:
  java.lang.ClassFormatError: @test$py (Illegal Class name "@test$py")

where '@' isn't allowed in java classnames.

These are failures that have very little to do with what the tests
are about and nothing at all to do with the language definition.

regards,
finn



From cgw at alum.mit.edu  Mon Jan 29 19:35:58 2001
From: cgw at alum.mit.edu (Charles G Waldman)
Date: Mon, 29 Jan 2001 12:35:58 -0600 (CST)
Subject: [Python-Dev] Re: Re: Sets: elt in dict, lst.include
In-Reply-To: <001501c08a20$00dca2a0$770a0a0a@nevex.com>
References: <20010129162012.32158ED49@mail.python.org>
	<001501c08a20$00dca2a0$770a0a0a@nevex.com>
Message-ID: <14965.47118.135246.700571@sirius.net.home>

Greg Wilson writes:

 > This would allow multi-dimensional structures
 > (e.g. NumPy arrays) to do things like:
 > 
 >     for (i, j, k) in array:		# please give me three indices
 > 
 > and:
 > 
 >     for ((i, j, k), v) in array:	# three indices and value

And what if I had, for example, a 3-dimensional array where the values
are 3-tuples?  Would "for (i,j,k) in array" refer to the indices or the
values?




From mal at lemburg.com  Mon Jan 29 20:03:41 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:03:41 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
	            <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75BE8D.1B7673EE@lemburg.com>

With all this confusion about how to actually write the
iteration on dictionary items, wouldn't it make more sense
to implement an extension module which then provides a __getitem__
style iterator for dictionaries by interfacing to PyDict_Next() ?

The module could have three different iterators:

1. iterate over items
2.     ... over keys
3.     ... over values

The reasoning behind this is that the __getitem__ interface
is well established and this doesn't introduce any new
syntax while still providing speed and flexibility.
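
A pure-Python sketch of the idea, to show the shape of the interface.
Everything here is an assumption for illustration: the class and
function names are invented, and the Python-level iterator stands in
for the PyDict_Next() calls the real extension module would make.

```python
class DictWalker:
    """Sequential-only __getitem__ iterator over a dictionary view."""

    def __init__(self, view):
        self._it = iter(view)   # stand-in for a PyDict_Next() cursor
        self._expected = 0      # next index the for-loop should ask for

    def __getitem__(self, index):
        if index != self._expected:
            raise IndexError("sequential access only")
        try:
            element = next(self._it)
        except StopIteration:
            raise IndexError(index) from None  # ends the for-loop
        self._expected += 1
        return element


def iteritems(d):
    return DictWalker(d.items())


def iterkeys(d):
    return DictWalker(d.keys())


def itervalues(d):
    return DictWalker(d.values())


d = {"a": 1, "b": 2}
for key, value in iteritems(d):
    print(key, value)
```

The for-loop drives the object through __getitem__ with 0, 1, 2, ...
until IndexError is raised, so no list of items has to be built first.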

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Mon Jan 29 19:08:16 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 19:08:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>  
	            <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3A75B190.3FD2A883@lemburg.com>

Guido van Rossum wrote:
> 
> > Dictionaries are not sequences. I wonder what order a user of
> > for k,v in dict: (or whatever other of this proposal you choose)
> > will expect...
> 
> The same order that for k,v in dict.items() will yield, of course.

And then people find out that the order has some sorting
properties and start to use it... "how to sort a dictionary?"
comes up again, every now and then.
 
> > Please also take into account that dictionaries are *mutable*
> > and their internal state is not defined to e.g. not change due to
> > lookups (take the string optimization for example...), so exposing
> > PyDict_Next() in any way to Python will cause trouble. In the end,
> > you will need to create a list or tuple to iterate over one way
> > or another, so why bother overloading for-loops w/r to dictionaries ?
> 
> Actually, I was going to propose to play dangerously here: the
> 
>     for k:v in dict: ...
> 
> syntax I proposed in my previous message should indeed expose
> PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> don't have much proof) that most loops over dicts don't mutate the
> dict.
> 
> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

You mean: mark it read-only ? That would be a "nice to have"
property for a lot of mutable types indeed -- sort of like
low-level locks. This would be another candidate for an object flag
(much like the one Fred wants to introduce for weak referenced
objects).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Mon Jan 29 20:22:07 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 14:22:07 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 19:08:16 +0100."
             <3A75B190.3FD2A883@lemburg.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>  
            <3A75B190.3FD2A883@lemburg.com> 
Message-ID: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>

> > > Dictionaries are not sequences. I wonder what order a user of
> > > for k,v in dict: (or whatever other of this proposal you choose)
> > > will expect...
> > 
> > The same order that for k,v in dict.items() will yield, of course.
> 
> And then people find out that the order has some sorting
> properties and start to use it... "how to sort a dictionary?"
> comes up again, every now and then.

I don't understand why you bring this up.  We're not revealing
anything new here, the random order of dict items has always been part
of the language.  The answer to "how to sort a dict" should be "copy
it into a list and sort that."
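
A small sketch of that advice, using a made-up dict named counts. It is
written for a current Python, where sorted() and key functions are
available:

```python
counts = {"spam": 1, "eggs": 3, "ham": 2}

# Copy the items into a list, then sort the list (by key here).
by_key = list(counts.items())
by_key.sort()

# Sorting by value needs an explicit key function.
by_value = sorted(counts.items(), key=lambda pair: pair[1])

print(by_key)
print(by_value)
```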

Or am I missing something?

> > > Please also take into account that dictionaries are *mutable*
> > > and their internal state is not defined to e.g. not change due to
> > > lookups (take the string optimization for example...), so exposing
> > > PyDict_Next() in any way to Python will cause trouble. In the end,
> > > you will need to create a list or tuple to iterate over one way
> > > or another, so why bother overloading for-loops w/r to dictionaries ?
> > 
> > Actually, I was going to propose to play dangerously here: the
> > 
> >     for k:v in dict: ...
> > 
> > syntax I proposed in my previous message should indeed expose
> > PyDict_Next().  It should be a big speed-up, and I'm expecting (though
> > don't have much proof) that most loops over dicts don't mutate the
> > dict.
> > 
> > Maybe we could add a flag to the dict that issues an error when a new
> > key is inserted during such a for loop?  (I don't think the key order
> > can be affected when a key is *deleted*.)
> 
> You mean: mark it read-only ? That would be a "nice to have"
> property for a lot of mutable types indeed -- sort of like
> low-level locks. This would be another candidate for an object flag
> (much like the one Fred wants to introduce for weak referenced
> objects).

Yes.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From gvwilson at ca.baltimore.com  Mon Jan 29 20:38:50 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Mon, 29 Jan 2001 14:38:50 -0500
Subject: [Python-Dev] RE: Python-Dev digest, Vol 1 #1124 - 13 msgs
In-Reply-To: <20010129193101.7BF83EF62@mail.python.org>
Message-ID: <001a01c08a2b$1ba5a040$770a0a0a@nevex.com>

> Greg Wilson writes:
>  > This would allow multi-dimensional structures
>  > (e.g. NumPy arrays) to do things like:
>  >     for (i, j, k) in array:
>  > and:
>  >     for ((i, j, k), v) in array:	# three indices and value

> Charles Waldman asks:
> And what if I had, for example, a 3-dimensional array where the values
> are 3-tuples?  Would "for (i,j,k) in array" refer to the 
> indices or the values?

Greg Wilson writes:
That would be up to the module's implementer --- my idea was to have
the 'for' loop provide more information to the object being iterated
over, so that it could "do the right thing" (just as objects do right
now with "x[i]").

Greg



From mal at lemburg.com  Mon Jan 29 20:45:46 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Mon, 29 Jan 2001 20:45:46 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com>  
	            <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <3A75C86A.3A4236E8@lemburg.com>

Guido van Rossum wrote:
> 
> > > > Dictionaries are not sequences. I wonder what order a user of
> > > > for k,v in dict: (or whatever other of this proposal you choose)
> > > > will expect...
> > >
> > > The same order that for k,v in dict.items() will yield, of course.
> >
> > And then people find out that the order has some sorting
> > properties and start to use it... "how to sort a dictionary?"
> > comes up again, every now and then.
> 
> I don't understand why you bring this up.  We're not revealing
> anything new here, the random order of dict items has always been part
> of the language.  The answer to "how to sort a dict" should be "copy
> it into a list and sort that."
> 
> Or am I missing something?

I just wanted to hint at a problem which iterating over items
in an unordered set can cause. Especially new Python users will find 
it confusing that the order of the items in an iteration can change
from one run to the next.

Not much of an argument, but I like explicit programming more
than magic under the cover. What we really want is iterators for
dictionaries, so why not implement these instead of tweaking
for-loops.

If you are looking for speedups w/r to for-loops, applying a
different indexing technique in for-loops would go a lot further
and provide better performance not only to dictionary loops,
but also to other sequences.

I have had some good experience with a special counter object 
(sort of like a mutable integer) which is used instead of the 
iteration index integer in the current implementation. 

Using an iterator object instead of the integer + __getitem__
call machinery would allow more flexibility for all kinds of
sequences or containers. There could be an iterator type for
dictionaries, one for generic __getitem__ style sequences,
one for lists and tuples, etc. All of these could include
special logic to get the most out of the targeted datatype.

Well, just a thought...
-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From esr at thyrsus.com  Mon Jan 29 21:02:47 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:02:47 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291922.OAA13321@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 02:22:07PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>
Message-ID: <20010129150247.B10191@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> > > Maybe we could add a flag to the dict that issues an error when a new
> > > key is inserted during such a for loop?  (I don't think the key order
> > > can be affected when a key is *deleted*.)
> > 
> > You mean: mark it read-only ? That would be a "nice to have"
> > property for a lot of mutable types indeed -- sort of like
> > low-level locks. This would be another candidate for an object flag
> > (much like the one Fred wants to introduce for weak referenced
> > objects).
> 
> Yes.

For different reasons, I'd like to be able to set a constant flag on an
object instance.  Simple semantics: if you try to assign to a
member or method, it throws an exception.

Application?  I have a large Python program that goes to a lot of effort
to build elaborate context structures in core.  It would be nice to know
they can't be even inadvertently trashed without throwing an exception I 
can watch for.
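
One way such a flag could be approximated in pure Python today is a
mix-in that overrides attribute assignment. The names Freezable and
freeze() are hypothetical, and this sketch guards only the instance's
own attributes, not objects reachable through them:

```python
class Freezable:
    """Hypothetical mix-in: after freeze(), attribute changes raise."""

    _frozen = False  # class-level default; set per instance by freeze()

    def freeze(self):
        # Bypass our own __setattr__ so the flag itself can be set.
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError("cannot assign %r: object is frozen" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._frozen:
            raise AttributeError("cannot delete %r: object is frozen" % name)
        object.__delattr__(self, name)


class Context(Freezable):
    pass


ctx = Context()
ctx.depth = 3   # allowed while the object is still mutable
ctx.freeze()    # from here on, ctx.depth = 4 raises AttributeError
```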
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

No one is bound to obey an unconstitutional law and no courts are bound
to enforce it.  
	-- 16 Am. Jur. Sec. 177 late 2d, Sec 256



From esr at thyrsus.com  Mon Jan 29 21:09:14 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 15:09:14 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>; from mal@lemburg.com on Mon, Jan 29, 2001 at 08:45:46PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <20010129150914.C10191@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.

Which reminds me...

There's not much I miss from C these days, but one thing I wish Python
had is a more general for-loop.  The C semantics that let you have 
any initialization, any termination test, and any iteration you like
are rather cool.

Yes, I realize that

	for (<init>; <test>; <step>) {<body>}

can be simulated with:

	<init>
	while 1:
		if not <test>:
			break
		<body>
		<step>

Still, having them spatially grouped the way a C for does it is nice.
Makes it easier to see invariants, I think.
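
As a concrete instance of this translation, here is the C loop
for (i = n; i > 0; i /= 2) total += i; written out in Python. The
function name halving_sum is made up for illustration. Note that the
loop exits when the test is false, and the step runs after the body:

```python
def halving_sum(n):
    """Sum n, n//2, n//4, ... down to 1, mimicking a C for-loop."""
    total = 0
    i = n                  # <init>
    while True:
        if not (i > 0):    # leave the loop once <test> is false
            break
        total += i         # <body>
        i //= 2            # <step>
    return total


print(halving_sum(10))
```

A continue inside the body would skip the step in this form, which is
one place where the simulation differs from C.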
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Rightful liberty is unobstructed action, according to our will, within limits
drawn around us by the equal rights of others."
	-- Thomas Jefferson



From moshez at zadka.site.co.il  Mon Jan 29 21:29:53 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Mon, 29 Jan 2001 22:29:53 +0200 (IST)
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>
References: <200101291530.KAA12037@cj20424-a.reston1.va.home.com>, <3a75747e.17414620@smtp.worldonline.dk>, <14964.63948.492662.775413@anthem.wooz.org>, <20010127191428.D71ADA840@darjeeling.zadka.site.co.il> <20010129062625.3A35DA840@darjeeling.zadka.site.co.il>  
            <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <20010129202953.D1498A840@darjeeling.zadka.site.co.il>

On Mon, 29 Jan 2001 10:30:17 -0500, Guido van Rossum <guido at digicool.com> wrote:

> It's good to test conformance to the language definition, but this is
> also a regression test for the implementation.  The "accidents of the
> implementation" definitely need to be tested.  E.g. if we decide that
> repr(s) uses \n rather than \012 or \x0a, this should be tested too.
> The language definition gives the implementer a choice here; but once
> the implementer has made a choice, it's good to have a test that tests
> that this choice is implemented correctly.

I agree.

> Perhaps there should be several parts to the regression test,
> e.g. language conformance, library conformance, platform-specific
> features, and implementation conformance?

This sounds like a good idea...probably for the 2.2 timeline.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Mon Jan 29 22:51:56 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 16:51:56 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <20010129140447.AE0E4A840@darjeeling.zadka.site.co.il>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>

[Moshe Zadka]
> ...
> I'm starting to wonder what the tests really test: the language
> definition, or accidents of the implementation?

You'd be amazed (appalled?) at how hard it is to separate them.

In two previous lives as a Big Iron compiler hacker, we routinely had to get
our compilers validated by a govt agency before any US govt account would be
allowed to buy our stuff; e.g.,

    http://www.itl.nist.gov/div897/ctg/vpl/language.htm

This usually *started* as a two-day process, flying the inspector to our
headquarters, taking perhaps 2 minutes of machine time to run the test
suite, then sitting around that day and into the next arguing about whether
the "failures" were due to non-standard assumptions in the tests, or
compiler bugs.  It was almost always the former, but sometimes that didn't
get fully resolved for months (if the inspector was being particularly
troublesome, it could require getting an Official Interpretation from the
relevant stds body -- not swift!).  (BTW, this is one reason huge customers
are often very reluctant to move to a new release:  the validation process
can be very expensive and drag on for months)

>>> def f():
...     global g
...     g += 1
...     return g
...
>>> g = 0
>>> d = {f(): f()}
>>> d
{2: 1}
>>>

The Python Lang Ref doesn't really say whether {2: 1} or {1: 2} "should be"
the result, nor does it say it's implementation-defined.  If you *asked*
Guido what he thought it should do, he'd probably say {1: 2} (not much of a
guess:  I asked him in the past, and that's what he did say <wink>).
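
A sketch for checking what a given interpreter does. The helper tick() is
hypothetical; it records whether the key or the value expression of the
dict display runs first. The session above shows the value being evaluated
first, and other versions may differ, so no particular outcome is assumed:

```python
calls = []


def tick(label):
    """Record which operand is being evaluated and return its position."""
    calls.append(label)
    return len(calls)


d = {tick("key"): tick("value")}

print(calls)  # the order in which the two expressions were evaluated
print(d)      # {1: 2} if the key ran first, {2: 1} if the value ran first
```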

Something "like that" can show up in the test suite, but buried under layers
of obfuscating accidents.  Nobody is likely to realize it in the absence of
a failure motivating people to search for it.

Which is a trap:  sometimes ours was the only compiler (of dozens and
dozens) that had *ever* "failed" a particular test.  This was most often the
case at Cray Research, which had bizarre (but exceedingly fast -- which is
what Cray's customers valued most) floating-point arithmetic.  I recall one
test in particular that failed because Cray's was the only box on earth that
set I to 1 in

    INTEGER I
    I = 6.0/3.0

Fortran doesn't define that the result must be 2.  But-- you guessed
it --neither does Python.

Cute:  at KSR, INT(6.0/3.0) did return 2 -- but INT(98./49.) did not <wink>.

then-again-the-python-test-suite-is-still-shallow-ly y'rs  - tim




From hughett at mercur.uphs.upenn.edu  Mon Jan 29 23:05:22 2001
From: hughett at mercur.uphs.upenn.edu (Paul Hughett)
Date: Mon, 29 Jan 2001 17:05:22 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
References: <LNBBLJKPBEHFEDALKOLCEEBBIMAA.tim.one@home.com>
Message-ID: <200101292205.RAA18790@mercur.uphs.upenn.edu>

tim says:

> Cray's was the only box on earth that set I to 1 in

>    INTEGER I
>    I = 6.0/3.0

> Fortran doesn't define that the result must be 2.  But-- you guessed
> it --neither does Python.

I would _guess_ that the IEEE 754 floating point standard does require
that, but I haven't actually gotten my hands on a copy of the standard
yet.  If it doesn't, I may have to stop writing code that depends on
the assumption that floating point computation is exact for exactly
representable integers.  If so, then we're reasonably safe; there
aren't many non-IEEE machines left these days.

Un-lurking-ly yours,

Paul Hughett



From tim.one at home.com  Mon Jan 29 23:53:43 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 17:53:43 -0500
Subject: [Python-Dev] Function Hash: Check it in?
In-Reply-To: <200101292205.RAA18790@mercur.uphs.upenn.edu>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBEIMAA.tim.one@home.com>

[Paul Hughett]
> I would _guess_ that the IEEE 754 floating point standard does require
> that [6./3. == 2.],

It does, but 754 is silent on how languages may or may not *bind* to its
semantics.  The C99 std finally addresses that (15 years after 754), and
Java does too (albeit in a way Kahan despises), but that's about it for
"name brand" <wink> languages.

> ...
> If it doesn't, I may have to stop writing code that depends on
> the assumption that floating point computation is exact for exactly
> representable integers.  If so, then we're reasonably safe; there
> aren't many non-IEEE machines left these days.

I'm afraid you've got no guarantees even on a box with 100% conforming 754
hardware.  One of the last "mystery bugs" I helped track down at my
previous employer only showed up under Intel's C++ compiler.  It turned out
the compiler was looking for code of the form:

    double *a, *b, scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] / scale;
    }

and rewriting it as:

    double __temp = 1./scale;
    for (i=0; i < n; ++i) {
        a[i] = b[i] * __temp;
    }

for speed.  As time goes on, PC compilers are becoming more and more like
Cray's and KSR's in this respect:  float division is much more expensive
than float mult, and so variations of "so multiply by the reciprocal
instead" are hard for vendors to resist.  And, e.g., under 754 double rules,

   (17. * 123.) * (1./123.)

must *not* yield exactly 17.0 if done wholly in 754 double (but then 754
says nothing about how any language maps that string to 754 operations).
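
The effect can be tried from Python, assuming the interpreter's float is
an IEEE 754 double with round-to-nearest (true of common builds, though
nothing in the language definition promises it). The variable names are
only for illustration:

```python
from fractions import Fraction

reciprocal = 1.0 / 123.0

# 1/123 has no finite binary expansion, so the stored double is only
# the nearest representable neighbour of the true quotient.
rounding_error = Fraction(reciprocal) - Fraction(1, 123)

by_division = (17.0 * 123.0) / 123.0           # a single rounded division
by_reciprocal = (17.0 * 123.0) * reciprocal    # two roundings instead of one

print(float(rounding_error))
print(by_division == 17.0, by_reciprocal == 17.0)
```

The division form rounds an exactly representable quotient, while the
reciprocal form inherits the rounding error already present in 1./123.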

if-you-like-logic-chopping-you'll-love-arguing-stds<wink>-ly y'rs  - tim




From guido at digicool.com  Tue Jan 30 00:59:34 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 18:59:34 -0500
Subject: [Python-Dev] Does autoconfig detect INSTALL incorrectly?
In-Reply-To: Your message of "Tue, 23 Jan 2001 00:30:56 PST."
             <20010123003056.A28309@glacier.fnational.com> 
References: <20010123003056.A28309@glacier.fnational.com> 
Message-ID: <200101292359.SAA20364@cj20424-a.reston1.va.home.com>

> Why is the configure.in file set to always use "install-sh"?
> There is a comment that says:
> 
>     # Install just never works :-(
> 
> I don't think that statement is accurate.  /usr/bin/install works
> quite well on my machine.  The only comments I can find in the
> changelog are:
> 
>     revision 1.16
>     date: 1995/01/20 14:12:16;  author: guido;  state: Exp;  lines: +27 -2
>     add INSTALL_PROGRAM and INSTALL_DATA; check for getopt
> 
> and:
> 
>     revision 1.5
>     date: 1994/08/19 15:33:51;  author: guido;  state: Exp;  lines: +14 -6
>     Simplify value of INSTALL (always 'cp').
> 
> Is there any reason why the autoconf macro AC_PROG_INSTALL is not used?  The
> documentation seems to indicate that it does what we want.

Neil,

It's too long for me to remember, and I bet this was before
AC_PROG_INSTALL.  If there's a reason to prefer a working "install"
over install-sh, feel free to do the right thing!  (You're in charge
of the Makefile anyway now, it seems. :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From skip at mojam.com  Tue Jan 30 01:17:25 2001
From: skip at mojam.com (Skip Montanaro)
Date: Mon, 29 Jan 2001 18:17:25 -0600 (CST)
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
Message-ID: <14966.2069.950895.627663@beluga.mojam.com>

After reading through this thread and noticing (but not paying close
attention to) all the related posts on c.l.py (subject: "in for dicts"), it
seems to me that the whole "if/for something in dict" thing needs to be
hashed out in a PEP.  There were a fair amount of "Python's changing too
fast" rants when 2.0 was released.  Adding a major feature such as this at
the 2.1 stage is only going to generate that many more rants.  The fact that
it was easy for Thomas to implement "if key in dict" doesn't make the
overall concept less controversial.  There are apparently lots of varying
opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
well warrant its own.

That said, I have plenty enough on my plate trying to keep Mojam afloat
these days, so I can't step into the crevasse, just observe that it looks to
me like a very long ways to the bottom... ;-)

Skip



From guido at digicool.com  Tue Jan 30 01:22:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 19:22:58 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Mon, 29 Jan 2001 18:17:25 CST."
             <14966.2069.950895.627663@beluga.mojam.com> 
References: <14966.2069.950895.627663@beluga.mojam.com> 
Message-ID: <200101300022.TAA21244@cj20424-a.reston1.va.home.com>

> After reading through this thread and noticing (but not paying close
> attention to) all the related posts on c.l.py (subject: "in for dicts"), it
> seems to me that the whole "if/for something in dict" thing needs to be
> hashed out in a PEP.  There were a fair amount of "Python's changing too
> fast" rants when 2.0 was released.  Adding a major feature such as this at
> the 2.1 stage is only going to generate that many more rants.  The fact that
> it was easy for Thomas to implement "if key in dict" doesn't make the
> overall concept less controversial.  There are apparently lots of varying
> opinions about what's reasonable.  This topic seems related to PEP 212 (Loop
> Counter Iteration) and PEP 218 (Adding a Built-In Set Object Type), but may
> well warrant its own.

Excellent.  Good reminder also that this shouldn't go into 2.1 --
clearly the design space is too complicated for a quick decision.

> That said, I have plenty enough on my plate trying to keep Mojam afloat
> these days, so I can't step into the crevasse, just observe that it looks to
> me like a very long ways to the bottom... ;-)

I'm not able to lead such a PEP effort myself either, but I hope
*someone* will be.  This PEP has a good chance for 2.2 though (what
with BDFL approval and all :-).

--Guido van Rossum (home page: http://www.python.org/~guido/)




From tim.one at home.com  Tue Jan 30 02:39:17 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:39:17 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291448.JAA11473@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>

[Guido]
> I did a less sophisticated count but come to the same conclusion:
> iterations over items() are (somewhat) more common than over keys(),
> and values() are 1-2 orders of magnitude less common.  My numbers:
>
> $ cd python/src/Lib
> $ grep 'for .*items():' *.py | wc -l
>      47
> $ grep 'for .*keys():' *.py | wc -l
>      43
> $ grep 'for .*values():' *.py | wc -l
>       2

I like my larger sample and anal methodology better <wink>.  A closer look
showed that it may have been unduly biased by the mass of files in
Lib/encodings/, where

encoding_map = {}
for k,v in decoding_map.items():
    encoding_map[v] = k

is at the end of most files (btw, MAL, that's the answer to your question:
people would expect "the same" ordering you expected there, i.e. none in
particular).

> ...
> I don't give much value to the readability argument: typically, one will
> write "for key in dict" or "for name in dict" and then it's obvious
> what is meant.

Well, "fiddlesticks" comes to mind <0.9 wink>.  If I've got a dict mapping
phone numbers to names, "for name in dict" is dead backwards.

    for vevent in keydefs.keys():
    for x in self.subdirs.keys():
    for name in lsumdict.keys():
    for locale in self.descriptions.keys():
    for name in attrs.keys():
    for func in other.top_level.keys():
    for func in target.keys():
    for i in u2.keys():
    for s in d.keys():
    for url in self.bad.keys():

are other cases in the CVS tree where I don't think the name makes it
obvious in the absence of ".keys()".

But I don't personally give any weight to whether people can guess what
something does at first glance.  My rule is that it doesn't matter, provided
it's (a) easy to learn; and (especially), (b) hard to *forget* once you've
learned it.  A classic example is Python's "points between elements"
treatment of slice indices:  few people guess right what that does at first
glance, but once they "get it" they're delighted and rarely mess up again.

And I think this is "like that".

> ...
> But here's my dilemma.  "if (k, v) in dict" is clearly useless (nobody
> has even asked me for a has_item() method).

Yup.

> I can live with "x in list" checking the values and "x in dict"
> checking the keys.  But I can *not* live with "x in dict" equivalent
> to "dict.has_key(x)" if "for x in dict" would mean
> "for x in dict.items()".

That's why I brought it up -- it's not entirely clear what's to be done
here.

> I also think that defining "x in dict" but not "for x in dict" will
> be confusing.
>
> So we need to think more.

The hoped-for next step indeed.

> How about:
>
>     for key in dict: ...		# ... over keys
>
>     for key:value in dict: ...		# ... over items
>
> This is syntactically unambiguous (a colon is currently illegal in
> that position).

Cool!  Can we resist adding

    if key:value in dict

for "parallelism"?  (I know I can ...)  2/3rd of these are marginally more
attractive:

    for key: in dict:    # over dict.keys()
    for :value in dict:  # over dict.values()
    for : in dict:       # a delay loop

> This also suggests:
>
>     for index:value in list: ...	# ... over zip(range(len(list)), list)
>
> which doesn't strike me as bad or ugly, and would fulfill my brother's
> dearest wish.

You mean besides the one that you fry in hell for not adding "for ...
indexing"?  Ya, probably.

> (And why didn't we think of this before?)

Best guess:  we were focused exclusively on sequences, and a colon just
didn't suggest itself in that context.  Second-best guess:  having finally
approved one of these gimmicks, you finally got desperate enough to make it
work <wink>.

ponderingly y'rs  - tim




From tim.one at home.com  Tue Jan 30 02:58:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 20:58:59 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCEBPIMAA.tim.one@home.com>

[Guido]
> ...
> I'm expecting (though don't have much proof) that most loops over
> dicts don't mutate the dict.

Safe bet!  I do recall writing one once:  it del'ed keys for which the
associated count was 1, because the rest of the algorithm was only
interested in duplicates.

> Maybe we could add a flag to the dict that issues an error when a new
> key is inserted during such a for loop?  (I don't think the key order
> can be affected when a key is *deleted*.)

That latter is true but specific to this implementation.  "Can't mutate the
dict period" is easier to keep straight, and probably harmless in practice
(if not, it could be relaxed later).  Recall that a similar trick is played
during list.sort(), replacing the list's type pointer for the duration (to
point to an internal "immutable list" type, same as the list type except the
"dangerous" slots point to a function that raises an "immutable list"
TypeError).  Then no runtime expense is incurred for regular lists to keep
checking flags.  I thought of this as an elegant use for switching types at
runtime; you may still be appalled by it, though!
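
The type-pointer swap happens in C and is not reproduced here. As a
loose pure-Python analogue of "can't mutate the dict, period", the sketch
below uses a hypothetical GuardedDict with a read_only() context manager.
Only item assignment and deletion are guarded; update(), pop() and the
like would need the same treatment:

```python
from contextlib import contextmanager


class GuardedDict(dict):
    """Hypothetical dict subclass that can be made temporarily read-only."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._readers = 0  # number of active read_only() blocks

    @contextmanager
    def read_only(self):
        self._readers += 1
        try:
            yield self
        finally:
            self._readers -= 1

    def _check_writable(self):
        if self._readers:
            raise RuntimeError("dict is read-only while being iterated")

    def __setitem__(self, key, value):
        self._check_writable()
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._check_writable()
        super().__delitem__(key)


counts = GuardedDict(spam=1, eggs=2)
with counts.read_only():
    for key, value in counts.items():
        pass  # counts[key] = 0 or del counts[key] here would raise RuntimeError
```

Unlike the list.sort() trick, this version pays for a flag check on every
write, which is exactly the cost the type swap avoids.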




From tim.one at home.com  Tue Jan 30 03:07:36 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:07:36 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75B190.3FD2A883@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECAIMAA.tim.one@home.com>

[Guido]
> The same order that for k,v in dict.items() will yield, of course.

[MAL]
> And then people find out that the order has some sorting
> properties and start to use it...

Except that it has none.  dict insertion has never used any comparison
outcome beyond "equal"/"not equal", so any ordering you think you see is--
and always was --an illusion.




From guido at digicool.com  Tue Jan 30 03:06:35 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:06:35 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Mon, 29 Jan 2001 20:39:17 EST."
             <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com> 
Message-ID: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>

This is all PEP material now.  Tim, do you want to own the PEP?  It
seems just up your alley!

> Cool!  Can we resist adding
> 
>     if key:value in dict
> 
> for "parallelism"?  (I know I can ...)

That's easy to resist because, unlike ``for key:value in dict'', it's
not unambiguous: ``if key:value in dict'' is already legal syntax
currently, with 'key' as the condition and 'value in dict' as the (not
particularly useful) body of the if statement.
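
The ambiguity is easy to confirm with the ast module of a current Python.
This sketch only parses the two spellings and does not run them:

```python
import ast

# "if key:value in dict" parses as an if statement whose one-line body
# is the expression "value in dict".
tree = ast.parse("if key:value in dict")
statement = tree.body[0]
is_if_statement = isinstance(statement, ast.If)
body_is_comparison = isinstance(statement.body[0].value, ast.Compare)

# The for spelling is not accepted by the grammar at all.
try:
    ast.parse("for key:value in dict: pass")
    for_form_parses = True
except SyntaxError:
    for_form_parses = False

print(is_if_statement, body_is_comparison, for_form_parses)
```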

> > (And why didn't we think of this before?)
> 
> Best guess:  we were focused exclusively on sequences, and a colon just
> didn't suggest itself in that context.  Second-best guess:  having finally
> approved one of these gimmicks, you finally got desperate enough to make it
> work <wink>.

I'm certainly more comfortable with just ``for key in dict'' than with
the whole slew of extensions using colons.

But, again, that's for the PEP to fight over.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 03:15:04 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:15:04 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:09:14 EST."
             <20010129150914.C10191@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com>  
            <20010129150914.C10191@thyrsus.com> 
Message-ID: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>

[ESR]
> There's not much I miss from C these days, but one thing I wish Python
> had is a more general for-loop.  The C semantics that let you have 
> any initialization, any termination test, and any iteration you like
> are rather cool.
> 
> Yes, I realize that
> 
> 	for (<init>; <test>; <step>) {<body>}
> 
> can be simulated with:
> 
> 	<init>
> 	while 1:
> 		if not <test>:
> 			break
> 		<body>
> 		<step>
> 
> Still, having them spatially grouped the way a C for does it is nice.
> Makes it easier to see invariants, I think.

Hm, I've seen too many ugly C for loops to have much appreciation for
it.  I can recognize and appreciate the few common forms that clearly
iterate over an array; most other forms look rather contorted to me.
Check out the Python C sources; if you find anything more complicated
than ``for (i = n; i > 0; i--)'' I probably didn't write
it. :-)

Common abominations include:

- writing a while loop as for(;<test>;)

- putting arbitrary initialization code in <init>

- having an empty condition, so the <step> becomes an arbitrary
  extension of the body that's written out-of-sequence
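
For reference, the while-loop simulation quoted above can be written out
in runnable Python.  This is only a sketch: c_style_for is a hypothetical
helper invented for illustration, not anything proposed in this thread,
and it ignores C's ``continue'' semantics.

```python
def c_style_for(init, test, step, body):
    """Mimic C's  for (<init>; <test>; <step>) {<body>}  with callables."""
    state = init()                # <init> runs once
    while test(state):            # <test> is checked before every pass
        body(state)               # <body>
        state = step(state)       # <step> runs after the body


seen = []
# Equivalent of the C loop: for (i = 3; i > 0; i--) record(i);
c_style_for(lambda: 3, lambda i: i > 0, lambda i: i - 1, seen.append)
```

Keeping the three clauses together in one call is the "spatial grouping"
being argued for; the price is that each clause has to be a callable.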

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 30 03:19:12 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:19:12 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A75C86A.3A4236E8@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>

[MAL]
> I just wanted to hint at a problem which iterating over items
> in an unordered set can cause.  Especially new Python users will find
> it confusing that the order of the items in an iteration can change
> from one run to the next.

Do they find "for k, v in dict.items()" confusing now?  Would be the same.

> ...
> What we really want is iterators for dictionaries, so why not
> implement these instead of tweaking for-loops.

Seems an unrelated topic:  would "iterators for dictionaries" solve the
supposed problem with iteration order?

> If you are looking for speedups w/r to for-loops, applying a
> different indexing technique in for-loops would go a lot further
> and provide better performance not only to dictionary loops,
> but also to other sequences.
>
> I have had good experiences with a special counter object
> (sort of like a mutable integer) which is used instead of the
> iteration index integer in the current implementation.

Please quantify, if possible.  My belief (based on past experiments) is that
in loops fancier than

    for i in range(n):
        pass

the loop overhead quickly falls into the noise even now.
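
One rough way to check this on a given interpreter is to time an empty
loop against a loop that does a little work.  The sketch below uses the
standard timeit module; the function names and loop sizes are arbitrary
choices, and no timings are quoted because they vary by machine and
Python version.

```python
import timeit


def empty_loop(n=1000):
    # Pure loop overhead: nothing but the iteration itself.
    for i in range(n):
        pass


def loop_with_work(n=1000):
    # The same loop with a small amount of arithmetic in the body.
    total = 0
    for i in range(n):
        total += i * i
    return total


t_empty = timeit.timeit(empty_loop, number=200)
t_work = timeit.timeit(loop_with_work, number=200)
print(f"empty loop: {t_empty:.4f}s   loop with work: {t_work:.4f}s")
```

If the first figure is small next to the second, the loop machinery is
already close to the noise for that body.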

> Using an iterator object instead of the integer + __getitem__
> call machinery would allow more flexibility for all kinds of
> sequences or containers. ...

This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
iteration *protocol* could have major attractions.
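
To make the contrast concrete, here is a sketch of the two mechanisms as
they exist in present-day Python 3: the integer index + __getitem__
fallback, and an explicit iterator object.  The class names are made up
for illustration, and the __iter__/__next__ spelling is the protocol
Python later adopted, not something specified in this thread.

```python
class Countdown:
    """Iterated via the legacy index + __getitem__ machinery."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # The for-loop calls this with 0, 1, 2, ... until IndexError.
        if index >= self.start:
            raise IndexError(index)
        return self.start - index


class CountdownIter:
    """Iterated via an explicit iterator object."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return _CountdownCursor(self.start)


class _CountdownCursor:
    """Holds the iteration state; no integer index is involved."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


by_index = [x for x in Countdown(3)]
by_iterator = [x for x in CountdownIter(3)]
```

Both loops visit the same values; only the second lets the container
decide how its own traversal state is kept.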




From guido at digicool.com  Tue Jan 30 03:17:27 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:17:27 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 15:02:47 EST."
             <20010129150247.B10191@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>  
            <20010129150247.B10191@thyrsus.com> 
Message-ID: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>

[ESR]
> For different reasons, I'd like to be able to set a constant flag on a
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.
> 
> Application?  I have a large Python program that goes to a lot of effort
> to build elaborate context structures in core.  It would be nice to know
> they can't be even inadvertently trashed without throwing an exception I 
> can watch for.

Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:

- How to spell it?  x.freeze()?  x.readonly()?

- Should this be reversible?  I.e. should there be an x.unfreeze()?

- Should we support something like this for instances too?  Sometimes
  it might be cool to be able to freeze changing attribute values...

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Tue Jan 30 03:29:25 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:29:25 -0500
Subject: [Python-Dev] C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCKECBIMAA.tim.one@home.com>

Check out SETL's loop statement.  I think Perl5 is a subset of it <0.9
wink>.




From esr at thyrsus.com  Tue Jan 30 03:34:01 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:34:01 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300215.VAA21955@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:15:04PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>
Message-ID: <20010129213401.A17235@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Common abominations include:
> 
> - writing a while loop as for(;<test>;)

Agreed. Bletch.
 
> - putting arbitrary initialization code in <init>

Not sure what's "arbitrary", unless you mean unrelated to the 
iteration variable.

> - having an empty condition, so the <step> becomes an arbitrary
>   extension of the body that's written out-of-sequence

Again agreed.  Double bletch.

I guess my archetype of the cute C for-loop is the idiom for 
pointer-list traversal:

	struct foo {int data; struct foo *next;} *ptr, *head; 

	for (ptr = head; ptr; ptr = ptr->next)
		do_something_with(ptr->data);

This is elegant.  It separates the logic for list traversal from the
operation on the list element.

Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
a patch for that once, and the discussion sort of died.  Were you dead
set against it, or should I revive this proposal?
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"The bearing of arms is the essential medium through which the
individual asserts both his social power and his participation in
politics as a responsible moral being..."
        -- J.G.A. Pocock, describing the beliefs of the founders of the U.S.



From esr at thyrsus.com  Tue Jan 30 03:49:59 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 21:49:59 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:17:27PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <20010129214959.B17235@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I like "freeze", it's a clear imperative where "readonly()" sounds
like a test (e.g. "is this readonly()?")
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Moshe Zadka sent me a hack that handles instances:

> class MarkableAsConstant:
> 
> 	def __init__(self):
> 		self.mark_writable()
> 
> 	def __setattr__(self, name, value):
> 		if self._writable:
> 			self.__dict__[name] = value
> 		else:
> 			raise ValueError, "object is read only"
> 
> 	def mark_writable(self):
> 		self.__dict__['_writable'] = 1
> 
> 	def mark_readonly(self):
> 		self.__dict__['_writable'] = 0
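
The quoted class uses the old ``raise X, "msg"'' spelling.  A sketch of
the same idea in current Python 3 follows, with a short usage check; the
ctx object and its depth attribute are illustrative only.

```python
class MarkableAsConstant:

    def __init__(self):
        self.mark_writable()

    def __setattr__(self, name, value):
        # Every ordinary attribute assignment passes through here.
        if self._writable:
            self.__dict__[name] = value
        else:
            raise ValueError("object is read only")

    def mark_writable(self):
        # Write through __dict__ so the flag itself bypasses __setattr__.
        self.__dict__['_writable'] = 1

    def mark_readonly(self):
        self.__dict__['_writable'] = 0


ctx = MarkableAsConstant()
ctx.depth = 1
ctx.mark_readonly()
try:
    ctx.depth = 2           # rejected while the object is read only
except ValueError as exc:
    error = str(exc)
```

Note that this guards attribute assignment only: deleting attributes, or
mutating a list or dict stored on the instance, is not intercepted.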

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

I gave this some thought earlier today.  There are advantages to either
way.  Making freeze a one-way operation would make it possible to use
freezing to get certain kinds of security and integrity guarantees that
you can't have if freezing is reversible.

Fortunately, there's a semantics that captures both.  If we allow
freeze to take an optional key argument, and require that an unfreeze
call must supply the same key or fail, we get both worlds.  We can
even one-way-hash the keys so they don't have to be stored in the
bytecode.

Want to lock a structure permanently?  Pick a random long key.  Freeze
with it.  Then throw that key away...
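
A minimal sketch of the keyed scheme, assuming keys are byte strings and
using SHA-256 as the one-way hash.  KeyedFreezable, its method names and
the choice of hash are all assumptions made for illustration; in pure
Python the lock is advisory, since object.__setattr__ can still be
called directly.

```python
import hashlib
import hmac
import secrets


class KeyedFreezable:

    def __init__(self):
        # None means "not frozen"; otherwise the digest of the freeze key.
        object.__setattr__(self, "_lock_digest", None)

    def __setattr__(self, name, value):
        if self._lock_digest is not None:
            raise TypeError("object is frozen")
        object.__setattr__(self, name, value)

    def freeze(self, key=b""):
        if self._lock_digest is not None:
            raise TypeError("object is already frozen")
        # Only the digest is kept, never the key itself.
        digest = hashlib.sha256(key).digest()
        object.__setattr__(self, "_lock_digest", digest)

    def unfreeze(self, key=b""):
        digest = hashlib.sha256(key).digest()
        locked = self._lock_digest
        if locked is None or not hmac.compare_digest(digest, locked):
            raise ValueError("wrong key or object not frozen")
        object.__setattr__(self, "_lock_digest", None)


config = KeyedFreezable()
config.name = "context"
config.freeze(b"secret")            # reversible: the key is remembered
config.unfreeze(b"secret")
config.freeze(secrets.token_bytes(32))   # "permanent": the key is discarded
```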
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Strict gun laws are about as effective as strict drug laws...It pains
me to say this, but the NRA seems to be right: The cities and states
that have the toughest gun laws have the most murder and mayhem.
        -- Mike Royko, Chicago Tribune



From tim.one at home.com  Tue Jan 30 03:57:59 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 21:57:59 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGECDIMAA.tim.one@home.com>

[Guido]
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
>
> - How to spell it?  x.freeze()?  x.readonly()?

See below.

> - Should this be reversible?

Of course.  Or x.freeze(solid=1) to default to permanent rigidity, but not
require it.

>  I.e. should there be an x.unfreeze()?

That conveniently answers the first question, since x.unreadonly() reads
horribly <wink>.

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

"Should be" supported for every mutable object.  Next step:  as in endless
C++ debates, endless Python debates about "representation freeze" vs
"logical freeze" ("well, yes, I'm changing this member, but it's just an
invisible cache so I *should* be able to tag the object as const anyway
..."; etc etc etc).

keep-it-simple-ly y'rs  - tim




From guido at digicool.com  Tue Jan 30 03:57:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:57:24 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:34:01 EST."
             <20010129213401.A17235@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com>  
            <20010129213401.A17235@thyrsus.com> 
Message-ID: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>

> > - putting arbitrary initialization code in <init>
> 
> Not sure what's "arbitrary", unless you mean unrelated to the 
> iteration variable.

Yes, that.

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:
> 
> 	struct foo {int data; struct foo *next;} *ptr, *head; 
> 
> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);
> 
> This is elegant.  It separates the logic for list traversal from the
> operation on the list element.

And it rarely happens in Python, because sequences are rarely
represented as linked lists.

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Not dead set against something like it, but dead set against the ?:
syntax because then : becomes too overloaded for the human reader, e.g.:

    if foo ? bar : bletch : spam = eggs

If you want to revive this, I strongly suggest writing a PEP first
before posting here.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 03:59:17 2001
From: guido at digicool.com (Guido van Rossum)
Date: Mon, 29 Jan 2001 21:59:17 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: Your message of "Mon, 29 Jan 2001 21:49:59 EST."
             <20010129214959.B17235@thyrsus.com> 
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>  
            <20010129214959.B17235@thyrsus.com> 
Message-ID: <200101300259.VAA22208@cj20424-a.reston1.va.home.com>

> > - How to spell it?  x.freeze()?  x.readonly()?
> 
> I like "freeze", it's a clear imperative where "readonly()" sounds
> like a test (e.g. "is this readonly()?")

Agreed.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> Moshe Zadka sent me a hack that handles instances:
[...]

OK, so no special support needed there.

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> I gave this some thought earlier today.  There are advantages to either
> way.  Making freeze a one-way operation would make it possible to use
> freezing to get certain kinds of security and integrity guarantees that
> you can't have if freezing is reversible.
> 
> Fortunately, there's a semantics that captures both.  If we allow
> freeze to take an optional key argument, and require that an unfreeze
> call must supply the same key or fail, we get both worlds.  We can
> even one-way-hash the keys so they don't have to be stored in the
> bytecode.
> 
> Want to lock a structure permanently?  Pick a random long key.  Freeze
> with it.  Then throw that key away...

Way too cute.  My suggestion: freeze(0) freezes forever, freeze(1)
can be unfrozen.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From esr at thyrsus.com  Tue Jan 30 04:06:19 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 22:06:19 -0500
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <200101300257.VAA22186@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Mon, Jan 29, 2001 at 09:57:24PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com> <200101300257.VAA22186@cj20424-a.reston1.va.home.com>
Message-ID: <20010129220619.A17713@thyrsus.com>

Guido van Rossum <guido at digicool.com>:
> Not dead set against something like it, but dead set against the ?:
> syntax because then : becomes too overloaded for the human reader, e.g.:
> 
>     if foo ? bar : bletch : spam = eggs
> 
> If you want to revive this, I strongly suggest writing a PEP first
> before posting here.

Noted.  Will do.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

Such are a well regulated militia, composed of the freeholders,
citizen and husbandman, who take up arms to preserve their property,
as individuals, and their rights as freemen.
        -- "M.T. Cicero", in a newspaper letter of 1788 touching the "militia" 
            referred to in the Second Amendment to the Constitution.



From tim.one at home.com  Tue Jan 30 04:18:47 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:18:47 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
In-Reply-To: <20010129214959.B17235@thyrsus.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>

Note that even adding a "frozen" flag would add 4 bytes to every freezable
object on most machines.  That's why I'd rather .freeze() replace the type
pointer and .unfreeze() restore it.  No time or space overhead; no
cluttering up the normal-case (i.e., unfrozen) type implementations with new
tests.
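
The same trick can be imitated at the Python level by reassigning
__class__, which is the closest pure-Python analogue of swapping the C
type pointer.  This is a sketch under that assumption only; Bag and
FrozenBag are hypothetical names, and a real implementation would live in
the C type objects.

```python
class Bag(list):
    """Ordinary mutable type: carries no 'frozen' flag and no extra tests."""

    def freeze(self):
        # Swap the type; the instance data is left untouched.
        self.__class__ = FrozenBag


def _refuse(self, *args, **kwargs):
    raise TypeError("object is frozen")


class FrozenBag(Bag):
    """Same layout as Bag, with every mutating operation disabled."""

    append = extend = insert = pop = remove = clear = sort = reverse = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse

    def freeze(self):
        pass                      # already frozen

    def unfreeze(self):
        # Restore the original type.
        self.__class__ = Bag


bag = Bag([1, 2])
bag.append(3)
bag.freeze()
try:
    bag.append(4)                 # refused by the frozen type
except TypeError:
    refused = True
bag.unfreeze()
bag.append(4)
```

The unfrozen class pays nothing for the feature: the checks exist only in
the frozen type's method table.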




From tim.one at home.com  Tue Jan 30 04:57:07 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 22:57:07 -0500
Subject: [Python-Dev] Re: Python 2.1 slower than 2.0
In-Reply-To: <14965.42988.362288.154254@localhost.localdomain>
Message-ID: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>

Note that optimizing compilers use a pile of linear-time heuristics to
attempt to solve exponential-time optimization problems (from optimal
register assignment to optimal instruction scheduling, they're all formally
intractable even in isolation).

When code gets non-trivial, not even a compiler's chief designer can
reliably outguess what optimization may do.  It's really not unusual for a
higher optimization level to yield slower code, and especially not when the
source code is pushing or exceeding machine limits (# of registers, # of
instruction pipes, size of branch-prediction buffers; I-cache structure;
dynamic restrictions on execution units; ...).

[Jeremy]
> ...
> One of the differences between -O2 and -O3, according to the man page,
> is that -O3 will perform optimizations that involve a space-speed
> tradeoff.  It also includes -finline-functions.  I can imagine that
> some of these optimizations hurt memory performance enough to make a
> difference.

One of the time-consuming ongoing tasks at my last employer was running
profiles and using them to override counterproductive compiler inlining
decisions (in both directions).  It's not just memory that excessive
inlining can screw up, but also things like running out of registers and so
inserting gobs of register spill/restore code, and inlining so much code
that the instruction scheduler effectively gives up (under many compilers, a
sure sign of this is when you look at the generated code for a function, and
it looks beautiful "at the top" but terrible "at the bottom"; some clever
optimizers tried to get around that by optimizing "bottom-up", and then it
looks beautiful at the bottom but terrible at the top <0.5 wink>; others
work middle-out or burn the candle at both ends, with visible consequences
you should be able to recognize now!).

optimization-is-easier-than-speech-recog-but-the-latter-doesn't-work-
    all-that-well-either-ly y'rs  - tim




From barry at digicool.com  Tue Jan 30 05:13:24 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:13:24 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <14966.16228.548177.112853@anthem.wooz.org>

>>>>> "SM" == Skip Montanaro <skip at mojam.com> writes:

    SM> it seems to me that the whole "if/for something in dict" thing
    SM> needs to be hashed out in a PEP.
    
    SM> There are apparently lots of varying opinions about what's
    SM> reasonable.  This topic seems related to PEP 212 (Loop Counter
    SM> Iteration) and PEP 218 (Adding a Built-In Set Object Type),
    SM> but may well warrant its own.

As keeper of PEP0, I have to agree.  I personally would vastly prefer
a new iterator protocol than syntax such as "for key:value in dict".
I'd really like to see a PEP on an iterator protocol for Python, but
like Skip, I'm too busy at the moment to do it myself.  If nobody
takes it on before then, I might be willing to champion such a PEP for
the 2.2 time frame.  Until then, I'm decidedly -1 on "for/if in dict".

-Barry



From barry at digicool.com  Tue Jan 30 05:25:09 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:25:09 -0500
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <14966.16933.209494.214183@anthem.wooz.org>

>>>>> "GvR" == Guido van Rossum <guido at digicool.com> writes:

    GvR> Yes, this is a good thing.  Easy to do on lists and dicts.
    GvR> Questions:

    GvR> - How to spell it?  x.freeze()?  x.readonly()?

    GvR> - Should this be reversible?  I.e. should there be an
    GvR> x.unfreeze()?

    GvR> - Should we support something like this for instances too?
    GvR> Sometimes it might be cool to be able to freeze changing
    GvR> attribute values...

lock(x) ...? :)

-Barry



From barry at digicool.com  Tue Jan 30 05:26:50 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Mon, 29 Jan 2001 23:26:50 -0500
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<20010129214959.B17235@thyrsus.com>
Message-ID: <14966.17034.721204.305315@anthem.wooz.org>

>>>>> "ESR" == Eric S Raymond <esr at thyrsus.com> writes:

    ESR> Fortunately, there's a semantics that captures both.  If we
    ESR> allow freeze to take an optional key argument, and require
    ESR> that an unfreeze call must supply the same key or fail, we
    ESR> get both worlds.  We can even one-way-hash the keys so they
    ESR> don't have to be stored in the bytecode.

    ESR> Want to lock a structure permanently?  Pick a random long
    ESR> key.  Freeze with it.  Then throw that key away...

Clever!



From esr at thyrsus.com  Tue Jan 30 05:32:16 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 29 Jan 2001 23:32:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <14966.16933.209494.214183@anthem.wooz.org>; from barry@digicool.com on Mon, Jan 29, 2001 at 11:25:09PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <14966.16933.209494.214183@anthem.wooz.org>
Message-ID: <20010129233215.A18533@thyrsus.com>

Barry A. Warsaw <barry at digicool.com>:
> lock(x) ...? :)

I was thinking that myself, Barry.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Boys who own legal firearms have much lower rates of delinquency and
drug use and are even slightly less delinquent than nonowners of guns."
	-- U.S. Department of Justice, National Institute of
	   Justice, Office of Juvenile Justice and Delinquency Prevention,
	   NCJ-143454, "Urban Delinquency and Substance Abuse," August 1995.



From tim.one at home.com  Tue Jan 30 05:56:09 2001
From: tim.one at home.com (Tim Peters)
Date: Mon, 29 Jan 2001 23:56:09 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
Message-ID: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com>

I tried to open an SF bug for the following msg from c.l.py, but SF balked:

    ERROR
    ERROR getting bug_id

Logged out, logged in, tried it again, same outcome.

Intended bug report content:

Good question from c.l.py, assigned to Guido cuz he's a Socket Guy:

From: Clarence Gardner <clarence at netlojix.com>
Subject: RE: Thread Safety
Date: Mon, 29 Jan 2001 09:51:03 -0800

...

I'm going to repeat a question that I posted about a week ago that passed
without comment on the newsgroup. The issue is the SSL support in the socket
module, which raises an exception when the reading socket is at EOF, rather
than returning an empty string. I'm hesitant to call it a "bug", but I
wouldn't have implemented it this way.  There are the names of two people
mentioned at the top of socketmodule.c, but no contact information, so I'm
suggesting here that it be changed to conform to normal file/socket
practice. (SSL was actually added at 2.0, so I'm late to the party with
this; mea culpa, mea culpa.  I delayed trying Python2 because of the
extension rebuilding.)




From thomas at xs4all.net  Tue Jan 30 07:14:20 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:14:20 +0100
Subject: [Python-Dev] Re: C's for statement
In-Reply-To: <20010129213401.A17235@thyrsus.com>; from esr@thyrsus.com on Mon, Jan 29, 2001 at 09:34:01PM -0500
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <3A75C86A.3A4236E8@lemburg.com> <20010129150914.C10191@thyrsus.com> <200101300215.VAA21955@cj20424-a.reston1.va.home.com> <20010129213401.A17235@thyrsus.com>
Message-ID: <20010130071420.U962@xs4all.nl>

On Mon, Jan 29, 2001 at 09:34:01PM -0500, Eric S. Raymond wrote:

> I guess my archetype of the cute C for-loop is the idiom for 
> pointer-list traversal:

> 	struct foo {int data; struct foo *next;} *ptr, *head; 

> 	for (ptr = head; ptr; ptr = ptr->next)
> 		do_something_with(ptr->data);

Note two things: in Python, you would use a list, so 'for x in list' does
exactly what you want here ;) And if you really need it, you could use
iterators for exactly this (once we have them, of course): you are inventing
a new storage type. Quite common in C, since the only one it has is useless
for anything other than strings<wink>, but not so common in Python.
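
As a sketch of what that could look like with the iterators found in
present-day Python (here written as a generator), a hand-rolled linked
list can expose its traversal once and let every for-loop reuse it.  Node
and LinkedList are illustrative names, not an existing library.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next


class LinkedList:
    def __init__(self, head=None):
        self.head = head

    def __iter__(self):
        # Counterpart of: for (ptr = head; ptr; ptr = ptr->next)
        ptr = self.head
        while ptr is not None:
            yield ptr.data
            ptr = ptr.next


items = LinkedList(Node(1, Node(2, Node(3))))
collected = [data for data in items]
```

The traversal logic stays separate from the per-element operation, which
is the property the C idiom was being praised for.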

> Not the highest on my list of wants -- I'd sooner have ?: back.  I submitted
> a patch for that once, and the discussion sort of died.  Were you dead
> set against it, or should I revive this proposal?

Triple blech. Guido will never go for it! (There, increased your chance of
getting it approved! :) Seriously though, I wouldn't like it much, it's too
cryptic a syntax. I notice I use it less and less in C, too.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 30 07:18:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:18:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>; from tim.one@home.com on Mon, Jan 29, 2001 at 08:39:17PM -0500
References: <200101291448.JAA11473@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCKEBIIMAA.tim.one@home.com>
Message-ID: <20010130071825.V962@xs4all.nl>

On Mon, Jan 29, 2001 at 08:39:17PM -0500, Tim Peters wrote:

>     for key: in dict:    # over dict.keys()
>     for :value in dict:  # over dict.values()
>     for : in dict:       # a delay loop

Wot's the last one supposed to do ? 'for unused_var in range(len(dict)):' ?

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From tim.one at home.com  Tue Jan 30 07:25:51 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 01:25:51 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010130071825.V962@xs4all.nl>
Message-ID: <LNBBLJKPBEHFEDALKOLCMECLIMAA.tim.one@home.com>

>>     for key: in dict:    # over dict.keys()
>>     for :value in dict:  # over dict.values()
>>     for : in dict:       # a delay loop

[Thomas Wouters]
> Wot's the last one supposed to do ? 'for unused_var in
> range(len(dict)):' ?

Well, as the preceding line said in the original:

>>    2/3rd of these are marginally more attractive [than
>>    "if key:value in dict"]:

I think you've guessed which 2/3 those are <wink>.  I don't see that the
last line has any visible semantics whatsoever, so Python can do whatever it
likes, provided it doesn't do anything visible.

You still hang out on c.l.py!  So you gotta know that if something of the
form

    x:y

is suggested, people will line up to suggest meanings for the 3 obvious
variations, along with

    x::y

and

    x:-:y

and

    x lambda y

too <0.9 wink>.




From thomas at xs4all.net  Tue Jan 30 07:26:48 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:26:48 +0100
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14966.2069.950895.627663@beluga.mojam.com>; from skip@mojam.com on Mon, Jan 29, 2001 at 06:17:25PM -0600
References: <14966.2069.950895.627663@beluga.mojam.com>
Message-ID: <20010130072648.W962@xs4all.nl>

On Mon, Jan 29, 2001 at 06:17:25PM -0600, Skip Montanaro wrote:

> The fact that it was easy for Thomas to implement "if key in dict" doesn't
> make the overall concept less controversial.

Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
posts on python-list.) In fact, *while implementing it*, I grew from +0 to
-0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
subject of the patch was a weak attempt at 5AM humour, not a venting of an
ancient desire :)

More-5AM-humour-ly y'rs,
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Tue Jan 30 07:55:16 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 07:55:16 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>; from jhylton@users.sourceforge.net on Mon, Jan 29, 2001 at 05:27:30PM -0800
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <20010130075515.X962@xs4all.nl>

On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:

> add note about two kinds of illegal imports that are now checked

> + - The compiler will report a SyntaxError if "from ... import *" occurs
> +   in a function or class scope or if a name bound by the import
> +   statement is declared global in the same scope.  The language
> +   reference has also documented that these cases are illegal, but
> +   they were not enforced.

Woah. Is this really a good idea ? I have seen 'from ... import *' in a
function scope put to good (relatively -- we're talking 'import *' here)
use. I also thought of 'import' as yet another assignment statement, so to
me it's both logical and consistent if 'import' would listen to 'global'.
Otherwise we have to re-invent 'import spam; eggs = spam' if we want eggs to
be global. 

Is there really a reason to enforce this, or are we enforcing the wording of
the language reference for the sake of enforcing the wording of the language
reference ? When writing 'import as' for 2.0, I fixed some of the
inconsistencies in import, making it adhere to 'global' statements in as
many cases as possible (all except 'from ... import *') but I was apparently
not aware of the wording of the language reference. I'd suggest updating the
wording in the language reference, not the implementation, unless there is a
good reason to disallow this.
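
For concreteness, here is how the two cases behave in current Python 3,
which is not necessarily what the 2.1 checkin under discussion did: an
import honours a ``global'' declaration in the same scope, while
'from ... import *' inside a function is rejected at compile time.  The
names load_codec and codec are made up for the example.

```python
def load_codec():
    global codec              # the import below binds a module-level name
    import json as codec


load_codec()
encoded = codec.dumps({"spam": 1})

# 'from ... import *' in a function body fails to compile.
try:
    compile("def f():\n    from os import *\n", "<example>", "exec")
    star_import_allowed = True
except SyntaxError:
    star_import_allowed = False
```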

I also have another issue with your recent patches, Jeremy, also in the
backwards-compatibility department :) You gave new.code two new,
non-optional arguments, in the middle of the long argument list. I sent a
note about it to python-checkins instead of python-dev by accident, but Fred
seemed to agree with me there.

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From mwh21 at cam.ac.uk  Tue Jan 30 09:30:15 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 30 Jan 2001 08:30:15 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "Tim Peters"'s message of "Mon, 29 Jan 2001 22:57:07 -0500"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com>
Message-ID: <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>

In the interest of generating some numbers (and filling up my hard
drive), last night I wrote a script to build lots & lots of versions
of python (many of which turned out to be redundant - eg. -O6 didn't
seem to do anything different to -O3 and pybench doesn't work with
1.5.2), and then run pybench with them.  Summarised results below;
first a key:

src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
        (only built this with -O3)
src: CVS from yesterday afternoon
src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc 
        patch applied.  More on this later...
Python-2.0: you can guess what this is.

All runs are compared against Python-2.0-O2:

Benchmark: src-n-O3 (rounds=10, warp=20)
            Average round time:   49029.00 ms              -0.86%
Benchmark: src (rounds=10, warp=20)
            Average round time:   67141.00 ms             +35.76%
Benchmark: src-O (rounds=10, warp=20)
            Average round time:   50167.00 ms              +1.44%
Benchmark: src-O2 (rounds=10, warp=20)
            Average round time:   49641.00 ms              +0.37%
Benchmark: src-O3 (rounds=10, warp=20)
            Average round time:   49104.00 ms              -0.71%
Benchmark: src-O6 (rounds=10, warp=20)
            Average round time:   49131.00 ms              -0.66%
Benchmark: src-obmalloc (rounds=10, warp=20)
            Average round time:   63276.00 ms             +27.94%
Benchmark: src-obmalloc-O (rounds=10, warp=20)
            Average round time:   46927.00 ms              -5.11%
Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
            Average round time:   46146.00 ms              -6.69%
Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
            Average round time:   46456.00 ms              -6.07%
Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
            Average round time:   46450.00 ms              -6.08%
Benchmark: Python-2.0 (rounds=10, warp=20)
            Average round time:   68933.00 ms             +39.38%
Benchmark: Python-2.0-O (rounds=10, warp=20)
            Average round time:   49542.00 ms              +0.17%
Benchmark: Python-2.0-O3 (rounds=10, warp=20)
            Average round time:   48262.00 ms              -2.41%
Benchmark: Python-2.0-O6 (rounds=10, warp=20)
            Average round time:   48273.00 ms              -2.39%

My conclusion?  Python 2.1 is slower than Python 2.0, but not by
enough to care about.

Interestingly, adding obmalloc speeds things up.  Let's take a closer
look:

$ python pybench.py -c src-obmalloc-O3 -s src-O3      
PYBENCH 0.7

Benchmark: src-O3 (rounds=10, warp=20)

Tests:                              per run    per oper.  diff *
------------------------------------------------------------------------
          BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
           BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
                 ConcatStrings:    1068.80 ms    7.13 us   -1.22%
                 ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
               CreateInstances:    1433.55 ms   34.13 us   +9.06%
       CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
       CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
                  DictCreation:    1275.80 ms    8.51 us  +44.22%
                      ForLoops:    1415.90 ms  141.59 us   -0.64%
                    IfThenElse:    1152.70 ms    1.71 us   -0.15%
                   ListSlicing:     397.40 ms  113.54 us   -0.53%
                NestedForLoops:     789.75 ms    2.26 us   -0.37%
          NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
       NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
           PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
             PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
                     Recursion:     838.50 ms   67.08 us   -0.00%
                  SecondImport:     741.20 ms   29.65 us  +25.57%
           SecondPackageImport:     744.25 ms   29.77 us  +18.66%
         SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
       SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
        SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
         SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
      SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
       SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
        SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
          SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
                    SmallLists:    1657.65 ms    6.50 us   +6.63%
                   SmallTuples:    1143.95 ms    4.77 us   +2.90%
         SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
      SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
                StringMappings:    1161.00 ms    9.21 us   +7.30%
              StringPredicates:    1069.65 ms    3.82 us   -5.30%
                 StringSlicing:     846.30 ms    4.84 us   +8.61%
                     TryExcept:    1590.40 ms    1.06 us   -0.49%
                TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
                  TupleSlicing:     681.10 ms    6.49 us   -3.13%
               UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
             UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
             UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
                UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
------------------------------------------------------------------------
            Average round time:   49104.00 ms              +5.70%

*) measured against: src-obmalloc-O3 (rounds=10, warp=20)

Words fail me slightly, but maybe some tuning of the memory allocation
of longs & complex numbers would be in order?

Time for lectures - I don't think algebraic geometry is going to make
my head hurt as much as trying to explain benchmarks...

Cheers,
M.

-- 
  ARTHUR:  But which is probably incapable of drinking the coffee.
                    -- The Hitch-Hikers Guide to the Galaxy, Episode 6




From ping at lfw.org  Tue Jan 30 09:38:12 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:38:12 -0800 (PST)
Subject: [Python-Dev] Read-only function attributes
Message-ID: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>

Hi there.

I see that the function attribute feature specifically allows
assignment to func_code and func_defaults, but no other special
attributes.  This seems really suspect to me.  Why would we want
to allow the reassignment of special attributes at all?

Functions have always been immutable objects, and i can see some
motivation for attaching mutable dictionaries to them, but it's
a more serious move to make the functions mutable themselves.

I don't recall any discussion about changing special attributes;
i don't see a clear purpose to them; and i do see a danger in
making it harder to be certain that a program is safe and predictable.

(Yes, i did notice that function attributes can't be set in
restricted mode, but the addition of extra features requiring
extra security checks makes me uneasy.)


-- ?!ng




From ping at lfw.org  Tue Jan 30 09:52:43 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Tue, 30 Jan 2001 00:52:43 -0800 (PST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org>

Eric S. Raymond wrote:
> For different reasons, I'd like to be able to set a constant flag on a
> object instance.  Simple semantics: if you try to assign to a
> member or method, it throws an exception.

Guido van Rossum wrote:
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

I'm not so sure.  There seem to be many issues here.  More questions:

What's the difference between a frozen list and a tuple?

Is a frozen list hashable?

> - Should this be reversible?  I.e. should there be an x.unfreeze()?

What if two threads lock and then unlock the same structure?

> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

If you do this, i bet people will immediately want to freeze
individual attributes.  Some might be confused by

    a.x = [1, 2, 3]
    lock(a.x)        # intend to lock the attribute, not the list
    a.x = 3          # hey, why is this allowed?

What does locking an extension object do?

What happens when you lock an object that implements list or dict
semantics?  Do we care that locking a UserList accomplishes nothing?

Should unfreeze/unlock() be disallowed in restricted mode?


-- ?!ng

No software is totally secure, but using [Microsoft] Outlook is like
hanging a sign on your back that reads "PLEASE MESS WITH MY COMPUTER."
    -- Scott Rosenberg, Salon Magazine




From fredrik at effbot.org  Tue Jan 30 10:05:47 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 10:05:47 +0100
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <01d701c08a9b$d7a9fe60$e46940d5@hagrid>

Ka-Ping Yee wrote:
> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

to allow an IDE to "patch" a running program?

</F>




From gvwilson at ca.baltimore.com  Tue Jan 30 14:08:42 2001
From: gvwilson at ca.baltimore.com (Greg Wilson)
Date: Tue, 30 Jan 2001 08:08:42 -0500 (EST)
Subject: [Python-Dev] re: Making mutable objects readonly
In-Reply-To: <20010130085202.18E71EAC4@mail.python.org>
Message-ID: <Pine.LNX.4.10.10101300804330.14867-100000@akbar.nevex.com>

> Barry Warsaw:
> lock(x) ...? :)

Greg Wilson:

-1 --- everyone will assume it's mutual exclusion, rather than immutability.






From guido at digicool.com  Tue Jan 30 15:01:15 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 09:01:15 -0500
Subject: [Python-Dev] Read-only function attributes
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:38:12 PST."
             <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org> 
Message-ID: <200101301401.JAA25600@cj20424-a.reston1.va.home.com>

> I see that the function attribute feature specifically allows
> assignment to func_code and func_defaults, but no other special
> attributes.  This seems really suspect to me.  Why would we want
> to allow the reassignment of special attributes at all?

As Effbot said, this is useful in certain circumstances where a
development environment wants to implement a "better reload".  For
this same reason you can assign to a class's __bases__ and __dict__
and to an instance's __class__ and __dict__.
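A minimal sketch of the kind of in-place patching described here, assuming a modern Python 3 interpreter, where func_code is spelled __code__. The function names are made up for illustration and are not taken from any IDE:

```python
# Sketch: hot-patch a function in place, as a "better reload" might.
# Assumes Python 3, where func_code is spelled __code__.

def greet(name):
    return "Hello, " + name

# Some other part of the program already holds a reference to the function.
callbacks = [greet]

def greet_v2(name):
    return "Hi there, " + name + "!"

# Swap the code object in place; existing references see the new behaviour
# because the function object itself is unchanged.
greet.__code__ = greet_v2.__code__

result = callbacks[0]("world")
print(result)
```

Rebinding the name greet would not have this effect: callbacks would still point at the old function object, which is why assigning to the special attribute is the useful operation here.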

--Guido van Rossum (home page: http://www.python.org/~guido/)




From guido at digicool.com  Tue Jan 30 16:00:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:00:58 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: Your message of "Tue, 30 Jan 2001 00:52:43 PST."
             <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org> 
References: <Pine.LNX.4.10.10101300043260.7769-100000@skuld.kingmanhall.org> 
Message-ID: <200101301500.KAA25733@cj20424-a.reston1.va.home.com>

> Guido van Rossum wrote:
> > Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> > 
> > - How to spell it?  x.freeze()?  x.readonly()?

Ping:
> I'm not so sure.  There seem to be many issues here.  More questions:
> 
> What's the difference between a frozen list and a tuple?

A frozen list can be unfrozen (maybe)?

> Is a frozen list hashable?

Yes -- that's what started this thread (using dicts as dict keys,
actually).
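For comparison, a sketch of how the "dict as dict key" use case can be spelled without any freeze protocol, assuming Python 3 and assuming every value in the dict is itself hashable. The variable names are illustrative only:

```python
# Sketch: use an immutable snapshot of a dict as a dictionary key.
# Assumes all values in the dict are hashable.

settings = {"host": "localhost", "port": 8080}

# frozenset of (key, value) pairs is hashable and order-independent.
snapshot = frozenset(settings.items())
cache = {snapshot: "connection-1"}

# An equal dict built in a different order finds the same entry.
same = frozenset({"port": 8080, "host": "localhost"}.items())
lookup = cache[same]

# A plain list, by contrast, is rejected as a key.
try:
    hash([1, 2, 3])
    list_hashable = True
except TypeError:
    list_hashable = False

print(lookup, list_hashable)
```

The snapshot is a copy, so later changes to settings do not affect the key. That sidesteps the question of what a hash means for an object that can be unfrozen again.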

> > - Should this be reversible?  I.e. should there be an x.unfreeze()?
> 
> What if two threads lock and then unlock the same structure?

That's up to the threads -- it's no different than other concurrent
access.

> > - Should we support something like this for instances too?  Sometimes
> >   it might be cool to be able to freeze changing attribute values...
> 
> If you do this, i bet people will immediately want to freeze
> individual attributes.  Some might be confused by
> 
>     a.x = [1, 2, 3]
>     lock(a.x)        # intend to lock the attribute, not the list
>     a.x = 3          # hey, why is this allowed?

That's a matter of API.  I wouldn't make this a built-in, but rather a
method on freezable objects (please don't call it lock()!).
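A rough sketch of what "a method on freezable objects" could look like, written for Python 3. FreezableList and its freeze() method are hypothetical names invented for this example, and only a few of the mutating methods are guarded, so this is nowhere near a complete implementation:

```python
# Sketch: a hypothetical freezable list. Only some mutators are guarded.

class FreezableList(list):
    _frozen = False

    def freeze(self):
        self._frozen = True

    def _check_mutable(self):
        if self._frozen:
            raise TypeError("list is frozen")

    def __setitem__(self, index, value):
        self._check_mutable()
        super().__setitem__(index, value)

    def __delitem__(self, index):
        self._check_mutable()
        super().__delitem__(index)

    def append(self, value):
        self._check_mutable()
        super().append(value)


class Holder:
    pass

a = Holder()
a.x = FreezableList([1, 2, 3])
a.x.freeze()              # freezes the list object, not the attribute

try:
    a.x.append(4)
    blocked = False
except TypeError:
    blocked = True

a.x = 3                   # rebinding the attribute is still allowed
print(blocked, a.x)
```

Spelling it as a method makes it clearer that the object is what gets frozen. The attribute a.x is just a name and can still be rebound.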

> What does locking an extension object do?

What does adding 1 to an extension object do?

> What happens when you lock an object that implements list or dict
> semantics?  Do we care that locking a UserList accomplishes nothing?

Who says it doesn't?

> Should unfreeze/unlock() be disallowed in restricted mode?

I don't see why not.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 16:06:57 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:06:57 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:55:16 +0100."
             <20010130075515.X962@xs4all.nl> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net>  
            <20010130075515.X962@xs4all.nl> 
Message-ID: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>

> On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> 
> > add note about two kinds of illegal imports that are now checked
> 
> > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > +   in a function or class scope or if a name bound by the import
> > +   statement is declared global in the same scope.  The language
> > +   reference has also documented that these cases are illegal, but
> > +   they were not enforced.

> Woah. Is this really a good idea ? I have seen 'from ... import *'
> in a function scope put to good (relatively -- we're talking 'import
> *' here) use. I also thought of 'import' as yet another assignment
> statement, so to me it's both logical and consistent if 'import'
> would listen to 'global'.  Otherwise we have to re-invent 'import
> spam; eggs = spam' if we want eggs to be global.

Note that Jeremy is only raising errors for "from M import *".

> Is there really a reason to enforce this, or are we enforcing the
> wording of the language reference for the sake of enforcing the
> wording of the language reference ? When writing 'import as' for
> 2.0, I fixed some of the inconsistencies in import, making it adhere
> to 'global' statements in as many cases as possible (all except
> 'from ... import *') but I was apparently not aware of the wording
> of the language reference. I'd suggest updating the wording in the
> language reference, not the implementation, unless there is a good
> reason to disallow this.

I think Jeremy has an excellent reason.  Compilers want to do analysis
of name usage at compile time.  The value of * cannot be determined at
compile time (you cannot know what module will actually be imported at
run time).  Up till now, we were able to fudge this, but Jeremy's new
compiler needs to know exactly which names are defined in all local
scopes, in order to do nested scopes right.
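The compile-time nature of this check can be observed directly. The sketch below assumes a Python 3 interpreter, where "from M import *" inside a function is rejected by the compiler; the 2.1-era behaviour under discussion may have differed in detail:

```python
# Sketch: ask the compiler about "from M import *" inside a function.
# Assumes Python 3. Nothing is imported or executed here, only compiled.

source = (
    "def f():\n"
    "    from os import *\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError as exc:
    rejected = True
    print("rejected at compile time:", exc.msg)

# The same statement at module level is still accepted by the compiler.
module_code = compile("from os import *\n", "<example>", "exec")
module_level_ok = module_code is not None
```

The error is raised before any module is looked up, which matches the argument that the set of names bound by * cannot be known until run time.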

> I also have another issue with your recent patches, Jeremy, also in
> the backwards-compatibility department :) You gave new.code two
> new, non-optional arguments, in the middle of the long argument
> list. I sent a note about it to python-checkins instead of
> python-dev by accident, but Fred seemed to agree with me there.

(Tim will love this. :-)

I don't know what those new arguments represent.  If they can
reasonably be assumed to be empty for code that doesn't use the new
features, I'd say move them to the end and default them properly.  If
they must be specified, I'd say too bad, the new module is an accident
of the implementation anyway, and its users should update their code.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Tue Jan 30 16:08:39 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 10:08:39 -0500
Subject: [Python-Dev] Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: Your message of "Tue, 30 Jan 2001 07:26:48 +0100."
             <20010130072648.W962@xs4all.nl> 
References: <14966.2069.950895.627663@beluga.mojam.com>  
            <20010130072648.W962@xs4all.nl> 
Message-ID: <200101301508.KAA25825@cj20424-a.reston1.va.home.com>

> Note that the fact I implemented it doesn't mean I'm +1 on it (witness my
> posts on python-list.) In fact, *while implementing it*, I grew from +0 to
> -0 and maybe even to a weak -1 (all in 5 minutes :) The enthusiastic
> subject of the patch was a weak attempt at 5AM humour, not a venting of an
> ancient desire :)

Can you say "PEP time"? :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)



From barry at digicool.com  Tue Jan 30 16:29:43 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 10:29:43 -0500
Subject: [Python-Dev] Read-only function attributes
References: <Pine.LNX.4.10.10101300013030.7769-100000@skuld.kingmanhall.org>
Message-ID: <14966.56807.288840.7850@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping at lfw.org> writes:

    KY> I see that the function attribute feature specifically allows
    KY> assignment to func_code and func_defaults, but no other
    KY> special attributes.  This seems really suspect to me.  Why
    KY> would we want to allow the reassignment of special attributes
    KY> at all?

... and actually, none of that changed w/ the function attribute
patch.  You've been able to assign to func_code and func_defaults
since Python 1.6!

-Barry



From thomas at xs4all.net  Tue Jan 30 16:52:04 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 16:52:04 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101301506.KAA25763@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 10:06:57AM -0500
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>
Message-ID: <20010130165204.I962@xs4all.nl>

On Tue, Jan 30, 2001 at 10:06:57AM -0500, Guido van Rossum wrote:
> > On Mon, Jan 29, 2001 at 05:27:30PM -0800, Jeremy Hylton wrote:
> > 
> > > add note about two kinds of illegal imports that are now checked
> > 
> > > + - The compiler will report a SyntaxError if "from ... import *" occurs
> > > +   in a function or class scope or if a name bound by the import
> > > +   statement is declared global in the same scope.  The language
> > > +   reference has also documented that these cases are illegal, but
> > > +   they were not enforced.

> > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > in a function scope put to good (relatively -- we're talking 'import
> > *' here) use. I also thought of 'import' as yet another assignment
> > statement, so to me it's both logical and consistent if 'import'
> > would listen to 'global'.  Otherwise we have to re-invent 'import
> > spam; eggs = spam' if we want eggs to be global.

> Note that Jeremy is only raising errors for "from M import *".

No, he says he's also raising errors for 'import spam' if 'spam' is declared
global, like so:

def viking():
    global spam
    import spam

> > Is there really a reason to enforce this, or are we enforcing the
> > wording of the language reference for the sake of enforcing the
> > wording of the language reference ? When writing 'import as' for
> > 2.0, I fixed some of the inconsistencies in import, making it adhere
> > to 'global' statements in as many cases as possible (all except
> > 'from ... import *') but I was apparently not aware of the wording
> > of the language reference. I'd suggest updating the wording in the
> > language reference, not the implementation, unless there is a good
> > reason to disallow this.

> I think Jeremy has an excellent reason.  Compilers want to do analysis
> of name usage at compile time.  The value of * cannot be determined at
> compile time (you cannot know what module will actually be imported at
> run time).  Up till now, we were able to fudge this, but Jeremy's new
> compiler needs to know exactly which names are defined in all local
> scopes, in order to do nested scopes right.

Hrrmm.... I guess I have to agree with that. None the less, I wish we could
have a "ack! this is stupid code! it uses 'from larch import *'! All bets
are off, we do a lot of slow complicated runtime checking now!" mode. The
thing I still enjoy most about Python is that it always does what I want,
and though I'd never want to do 'from different import *' in a local scope,
I do want other, less wise people to have the same experience, where
possible :)

And I also want to be able to do:

def fill_me(with):
    global me
    if with == 1:
        import me
    elif with == 2:
        import me_too as me
    elif with == 3:
        from me.Tools import me_me as me
    elif with == 4:
        me = FakeModule()
        sys.modules['me'] = me
    else:
        raise ValueError

And I can't quite argue that away with 'the compiler needs to know ...' --
it's all there!
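For what it's worth, a cut-down variant of this pattern can be tried on a modern Python 3 interpreter. This is only a sketch: the parameter is renamed to "which" because "with" has since become a reserved word, and standard-library modules stand in for the made-up ones above:

```python
# Sketch: a name declared global and then bound by import statements.
# Assumes Python 3; json and os.path stand in for the hypothetical modules.
import os

def fill_me(which):
    global me
    if which == 1:
        import json as me
    elif which == 2:
        from os import path as me
    else:
        raise ValueError(which)

fill_me(1)
first_name = me.__name__      # module-level name 'me' is now the json module

fill_me(2)
second_is_os_path = me is os.path

print(first_name, second_is_os_path)
```

Every binding of me is visible to the compiler in the function body, which is the point being made: nothing here depends on what the imported modules happen to contain.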

> > I also have another issue with your recent patches, Jeremy, also in
> > the backwards-compatibility department :) You gave new.code two
> > new, non-optional arguments, in the middle of the long argument
> > list. I sent a note about it to python-checkins instead of
> > python-dev by accident, but Fred seemed to agree with me there.

> (Tim will love this. :-)

> I don't know what those new arguments represent.  If they can
> reasonably be assumed to be empty for code that doesn't use the new
> features, I'd say move them to the end and default them properly.  If
> they must be specified, I'd say too bad, the new module is an accident
> of the implementation anyway, and its users should update their code.

Okay, I can live with that. It's sure to cause some gripes though. Then
again, from looking at the code I'd say those arguments (freevars and
cellvars) can easily default to empty tuples.
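As an illustration of why empty tuples are a natural default, here is a small sketch assuming Python 3, where the code object of a function is reached through __code__. The function names are invented for the example:

```python
# Sketch: inspect co_cellvars and co_freevars on code objects.

def outer():
    counter = 0            # captured by inner(), so it is a cell variable
    def inner():
        return counter     # free variable: defined in the enclosing scope
    return inner

def plain(a, b):
    return a + b           # no nested scopes involved

inner = outer()
outer_cellvars = outer.__code__.co_cellvars
inner_freevars = inner.__code__.co_freevars
plain_vars = (plain.__code__.co_cellvars, plain.__code__.co_freevars)

print(outer_cellvars, inner_freevars, plain_vars)
```

Code that does not use nested scopes has nothing in either tuple, so callers constructing code objects by hand for such code would lose nothing if both arguments were optional.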

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From bckfnn at worldonline.dk  Tue Jan 30 18:34:10 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Tue, 30 Jan 2001 17:34:10 GMT
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>   <3A756FF8.B7185FA2@lemburg.com>  <200101291500.KAA11569@cj20424-a.reston1.va.home.com>
Message-ID: <3a76df10.22007715@smtp.worldonline.dk>

[Guido]

>Maybe we could add a flag to the dict that issues an error when a new
>key is inserted during such a for loop?  

FWIW, some of the java2 collections decided to throw a
ConcurrentModificationException in the iterator if the collection was
modified during the iteration. Generally, none of the java2 collections
can be modified while being iterated over (the exception is calling
.remove() on the iterator object, and not all collections support that).

>(I don't think the key order can be affected when a key is *deleted*.)

Probably also true for the Hashtable which backs our PyDictionary,
but I'd rather not depend too much on it being true.
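For reference, a sketch of how this kind of check looks from the Python side on a current CPython 3 interpreter, which raises an error when a dict changes size during iteration. The behaviour of the 2001 interpreters discussed in this thread was different:

```python
# Sketch: inserting a key while iterating over a dict (modern CPython 3).

d = {"a": 1, "b": 2}
error = None
try:
    for key in d:
        d[key + "_copy"] = d[key]    # inserts a new key mid-iteration
except RuntimeError as exc:
    error = str(exc)
print(error)

# Iterating over a snapshot of the keys avoids the problem.
for key in list(d):
    d[key + "_again"] = d[key]
print(len(d))
```

The check fires on the next step of the iterator rather than on the insertion itself, so the first new key is already in the dict by the time the exception is raised.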

[Tim]

>That latter is true but specific to this implementation.  "Can't mutate the
>dict period" is easier to keep straight, and probably harmless in practice
>(if not, it could be relaxed later).  

Agree.

>Recall that a similar trick is played
>during list.sort(), replacing the list's type pointer for the duration (to
>point to an internal "immutable list" type, same as the list type except the
>"dangerous" slots point to a function that raises an "immutable list"
>TypeError).  Then no runtime expense is incurred for regular lists to keep
>checking flags.  I thought of this as an elegant use for switching types at
>runtime; you may still be appalled by it, though!

Changing the type of a type? Yuck! 

I might very likely be reading the CPython sources wrongly, but it seems
this trick will cause a BadInternalCall if some other C extension is
trying to modify a list while it is frozen by the type switching trick.
I imagine this would happen if the extension called:

  PyList_SetItem(myList, 0, aValue);

I guess Jython could support this from the python side, but it's hard to
ensure from the java side without adding an additional PyList_Check(..)
to all list methods. It just doesn't feel like the right thing to do
since it would cause slower access to all mutable objects.
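The list.sort() protection mentioned in the quoted text can be poked at from Python. The sketch below assumes a current CPython 3, which is documented as raising ValueError if it can detect that the list was mutated during a sort. That mechanism is not the type-pointer switch described above, and other implementations may behave differently:

```python
# Sketch: mutate a list from inside its own sort (CPython-specific outcome).

lst = [3, 1, 2]

def meddling_key(x):
    lst.append(0)      # try to modify the list while it is being sorted
    return x

try:
    lst.sort(key=meddling_key)
    outcome = "no error"
except ValueError as exc:
    outcome = "ValueError: " + str(exc)

print(outcome)
```

Either way the cost is paid only while a sort is in progress, which is the property the type-switching trick was after.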

regards,
finn



From guido at digicool.com  Tue Jan 30 21:42:58 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 15:42:58 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Tue, 30 Jan 2001 16:52:04 +0100."
             <20010130165204.I962@xs4all.nl> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>  
            <20010130165204.I962@xs4all.nl> 
Message-ID: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>

> > > Woah. Is this really a good idea ? I have seen 'from ... import *'
> > > in a function scope put to good (relatively -- we're talking 'import
> > > *' here) use. I also thought of 'import' as yet another assignment
> > > statement, so to me it's both logical and consistent if 'import'
> > > would listen to 'global'.  Otherwise we have to re-invent 'import
> > > spam; eggs = spam' if we want eggs to be global.
> 
> > Note that Jeremy is only raising errors for "from M import *".
> 
> No, he says he's also raising errors for 'import spam' if 'spam' is declared
> global, like so:
> 
> def viking():
>     global spam
>     import spam

Yeah, this was just brought to my attention at our group meeting
today.  I'm with you on this one -- there really isn't a good reason
why this shouldn't work.  (I wonder why that constraint was ever added
to the reference manual; maybe I was just upset that someone would
*do* something as ugly as that, or maybe there was a J[P]ython
reason???.)

> > I think Jeremy has an excellent reason.  Compilers want to do analysis
> > of name usage at compile time.  The value of * cannot be determined at
> > compile time (you cannot know what module will actually be imported at
> > run time).  Up till now, we were able to fudge this, but Jeremy's new
> > compiler needs to know exactly which names are defined in all local
> > scopes, in order to do nested scopes right.
> 
> Hrrmm.... I guess I have to agree with that. None the less, I wish we could
> have a "ack! this is stupid code! it uses 'from larch import *'! All bets
> are off, we do a lot of slow complicated runtime checking now!" mode. The
> thing I still enjoy most about Python is that it always does what I want,
> and though I'd never want to do 'from different import *' in a local scope,
> I do want other, less wise people to have the same experience, where
> possible :)

Hm, maybe, just *maybe* Jeremy can do this if there are no nested
scopes in sight.  But I don't think it's a big deal as long as the
error message is clear -- it's bad style.

> And I also want to be able to do:
> 
> def fill_me(with):
>     global me
>     if with == 1:
>         import me
>     elif with == 2:
>         import me_too as me
>     elif with == 3:
>         from me.Tools import me_me as me
>     elif with == 4:
>         me = FakeModule()
>         sys.modules['me'] = me
>     else:
>         raise ValueError
> 
> And I can't quite argue that away with 'the compiler needs to know ...' --
> it's all there!

Sort of, although I would prefer to do a two-stager here: first some
variation of "import me as meohmy", and then "global me; me = meohmy" .

> > > I also have another issue with your recent patches, Jeremy, also in
> > > the backwards-compatibility department :) You gave new.code two
> > > new, non-optional arguments, in the middle of the long argument
> > > list. I sent a note about it to python-checkins instead of
> > > python-dev by accident, but Fred seemed to agree with me there.
> 
> > (Tim will love this. :-)
> 
> > I don't know what those new arguments represent.  If they can
> > reasonably be assumed to be empty for code that doesn't use the new
> > features, I'd say move them to the end and default them properly.  If
> > they must be specified, I'd say too bad, the new module is an accident
> > of the implementation anyway, and its users should update their code.
> 
> Okay, I can live with that. It's sure to cause some gripes though. Then
> again, from looking at the code I'd say those arguments (freevars and
> cellvars) can easily default to empty tuples.

OK.  I hope Jeremy can fix this when he gets home.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Tue Jan 30 23:30:25 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Tue, 30 Jan 2001 23:30:25 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Tue, Jan 30, 2001 at 05:34:10PM +0000
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <20010130233025.J962@xs4all.nl>

On Tue, Jan 30, 2001 at 05:34:10PM +0000, Finn Bock wrote:

> >Recall that a similar trick is played during list.sort(), replacing the
> >list's type pointer for the duration (to point to an internal "immutable
> >list" type, same as the list type except the "dangerous" slots point to a
> >function that raises an "immutable list" TypeError).  Then no runtime
> >expense is incurred for regular lists to keep checking flags.  I thought
> >of this as an elegant use for switching types at runtime; you may still
> >be appalled by it, though!

> Changing the type of a type? Yuck! 

No, the typeobject itself isn't changed -- that would freeze *all*
dicts/lists/whatever, not just the one we want. We'd be changing the type of
an object (or 'type instance', if you want, but not "type 'instance'"), not
the type of a type.

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause a BadInternalCall if some other C extension is
> trying to modify a list while it is frozen by the type switching trick.
> I imagine this would happen if the extension called:

>   PyList_SetItem(myList, 0, aValue);

Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
(or whatever), and methods/operations that modify the listobject would have
to check if the list is frozen, and raise an appropriate error if so. This
might throw 'unexpected' errors, but only in situations that can't happen
right now!

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From fredrik at effbot.org  Tue Jan 30 23:45:16 2001
From: fredrik at effbot.org (Fredrik Lundh)
Date: Tue, 30 Jan 2001 23:45:16 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl>
Message-ID: <003501c08b0e$51f975c0$e46940d5@hagrid>

> Only if PyList_SetItem refuses to handle 'frozen' lists. In my eyes,
> 'frozen' lists should still pass PyList_Check(), but also PyList_Frozen()
> (or whatever), and methods/operations that modify the listobject would have
> to check if the list is frozen, and raise an appropriate error if so. This
> might throw 'unexpected' errors.

did someone just subscribe me to the perl-porters list?

-1 on "modal freeze" (it's madness)
-0 on an "immutable dictionary" type in the core




From tim.one at home.com  Wed Jan 31 00:53:45 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 18:53:45 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>

[Guido]
> This is all PEP material now.

Yup.

> Tim, do you want to own the PEP?

Not really.  Available time is finite, and this isn't at the top of the list
of things I'd like to see (resuming the discussion of generators +
coroutines + iteration protocol comes to mind first).

>> Cool!  Can we resist adding
>>
>>     if key:value in dict
>>
>> for "parallelism"?  (I know I can ...)

> That's easy to resist because, unlike ``for key:value in dict'', it's
> not unambiguous:

But

    if (key:value) in dict

is.  Just trying to help whoever *does* want the PEP <wink>.

> ...
> I'm certainly more comfortable with just ``for key in dict'' than with
> the whole slew of extensions using colons.

What about just the

    for key:value in dict
    for index:value in sequence

extensions?  The degenerate forms (omitting x or y or both in x:y) are
mechanical variations so are likely to get raised.

> But, again, that's for the PEP to fight over.

PEPs are easier if you Pronounce on things you hate early so that those can
get recorded in the "BDFL Pronouncements" section without further ado.

whatever-this-may-look-like-it's-not-a-pep-discussion<wink>-ly y'rs  - tim




From nas at arctrix.com  Tue Jan 30 18:12:15 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:12:15 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <003501c08b0e$51f975c0$e46940d5@hagrid>; from fredrik@effbot.org on Tue, Jan 30, 2001 at 11:45:16PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3a76df10.22007715@smtp.worldonline.dk> <20010130233025.J962@xs4all.nl> <003501c08b0e$51f975c0$e46940d5@hagrid>
Message-ID: <20010130091215.C18319@glacier.fnational.com>

On Tue, Jan 30, 2001 at 11:45:16PM +0100, Fredrik Lundh wrote:
> did someone just subscribe me to the perl-porters list?
> 
> -1 on "modal freeze" (it's madness)
> -0 on an "immutable dictionary" type in the core

I'm glad I'm not the only one who had that feeling.  I agree with
your votes too.

  Neil



From nas at arctrix.com  Tue Jan 30 18:24:54 2001
From: nas at arctrix.com (Neil Schemenauer)
Date: Tue, 30 Jan 2001 09:24:54 -0800
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>; from tim.one@home.com on Tue, Jan 30, 2001 at 06:53:45PM -0500
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
Message-ID: <20010130092454.D18319@glacier.fnational.com>

[Tim Peters on adding yet more syntactic sugar]
> Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of
> generators + coroutines + iteration protocol comes to mind
> first).

What's the chances of getting generators into 2.2?  The
implementation should not be hard.  Didn't Steven Majewski have
something years ago?  Why do we always get sidetracked on trying
to figure out how to do coroutines and continuations?

Generators would add real power to the language and are simple
enough that most users could benefit from them.  Also, it should be
possible to design an interface that does not preclude the
addition of coroutines or continuations later.

I'm not volunteering to champion the cause just yet.  I just want
to know if there is some issue I'm missing.

  Neil



From barry at digicool.com  Wed Jan 31 01:24:05 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Tue, 30 Jan 2001 19:24:05 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com>
	<LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>
	<20010130092454.D18319@glacier.fnational.com>
Message-ID: <14967.23333.57259.347222@anthem.wooz.org>

>>>>> "NS" == Neil Schemenauer <nas at arctrix.com> writes:

    NS> What's the chances of getting generators into 2.2?  The
    NS> implementation should not be hard.  Didn't Steven Majewski
    NS> have something years ago?  Why do we always get sidetracked on
    NS> trying to figure out how to do coroutines and continuations?

I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
renaming it just "Generators" and filling it out for the 2.2 time
frame.  If we want to address coroutines and continuations later, we
can write separate PEPs for them.

Send me a draft.

-Barry



From guido at digicool.com  Wed Jan 31 01:28:44 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:28:44 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 18:53:45 EST."
             <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> 
Message-ID: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>

> Not really.  Available time is finite, and this isn't at the top of the list
> of things I'd like to see (resuming the discussion of generators +
> coroutines + iteration protocol comes to mind first).

OK, get going on that one then!

> >> Cool!  Can we resist adding
> >>
> >>     if key:value in dict
> >>
> >> for "parallelism"?  (I know I can ...)
> 
> > That's easy to resist because, unlike ``for key:value in dict'', it's
> > not unambiguous:
> 
> But
> 
>     if (key:value) in dict
> 
> is.  Just trying to help whoever *does* want the PEP <wink>.

OK, I'll pronounce -1 on this one.  It looks ugly to me -- too
reminiscent of C's if (...) required parentheses.  Also it suggests
that (key:value) is a new tuple notation that might be useful in other
contexts -- which it's not.

> > ...
> > I'm certainly more comfortable with just ``for key in dict'' than with
> > the whole slew of extensions using colons.
> 
> What about just the
> 
>     for key:value in dict
>     for index:value in sequence
> 
> extensions?

I'm not against these -- I'd say +0.5.

> The degenerate forms (omitting x or y or both in x:y) are
> mechanical variations so are likely to get raised.

For those, +0.2.

> > But, again, that's for the PEP to fight over.
> 
> PEPs are easier if you Pronounce on things you hate early so that those can
> get recorded in the "BDFL Pronouncements" section without further ado.

At your service -- see above.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 31 01:49:24 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:49:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 09:24:54 PST."
             <20010130092454.D18319@glacier.fnational.com> 
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>  
            <20010130092454.D18319@glacier.fnational.com> 
Message-ID: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>

> [Tim Peters on adding yet more syntactic sugar]
> > Available time is finite, and this isn't at the top of the list
> > of things I'd like to see (resuming the discussion of
> > generators + coroutines + iteration protocol comes to mind
> > first).
> 
> What's the chances of getting generators into 2.2?  The
> implementation should not be hard.  Didn't Steven Majewski have
> something years ago?  Why do we always get sidetracked on trying
> to figure out how to do coroutines and continuations?

I think there's a very good chance of getting them into 2.2.  But it
*is* true that coroutines are a very attractive piece of land "just
next door".  On the other hand, continuations are a mirage, so don't
try to go there. :-)

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.
> 
> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

There are different ways to do iterators.

Here is a very "tame" proposal (and definitely in the realm of 2.2),
that doesn't require any coroutine-like tricks.  Let's propose that

    for var in expr:
        ...do something with var...

will henceforth be translated into

    __iter = iterator(expr)
    while __iter.more():
        var = __iter.next()
        ...do something with var...

-- or some variation that combines more() and next() (I don't care).

Then a new built-in function iterator() is needed that creates an
iterator object.  It should try two things:

(1) If the object implements __iterator__() (or a C API equivalent),
    call that and be done; this way arbitrary iterators can be
    created.

(2) If the object smells like a sequence (how to test???), use an
    iterator sort of like this:

    class Iterator:

        def __init__(self, sequence):
            self.sequence = sequence
            self.index = 0

        def more(self):
            # Store the item so that each index is tried exactly once
            try:
                self.item = self.sequence[self.index]
            except IndexError:
                return 0
            else:
                self.index = self.index + 1
                return 1

        def next(self):
            return self.item

    (I don't necessarily mean that all those instance variables should
    be publicly available.)
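
    The translation and the two lookup rules can be tried out with a
    small sketch.  This assumes a present-day Python 3 interpreter;
    iterator() is written as an ordinary function rather than a
    built-in, for_each() stands in for the compiler's translation of
    the for statement, and Countdown is an invented example class.
    Rule (2) is applied here to anything without __iterator__(),
    since the "smells like a sequence" test is still open.

```python
class Iterator:
    """Index-based fallback iterator, following the class sketched above."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.index = 0

    def more(self):
        # Store the item so that each index is tried exactly once.
        try:
            self.item = self.sequence[self.index]
        except IndexError:
            return False
        self.index += 1
        return True

    def next(self):
        return self.item


def iterator(obj):
    """Stand-in for the proposed built-in (an assumption, not a real API)."""
    if hasattr(obj, "__iterator__"):   # rule (1): the object supplies its own
        return obj.__iterator__()
    return Iterator(obj)               # rule (2): assume it indexes like a sequence


def for_each(expr, body):
    """What 'for var in expr: body(var)' would be translated into."""
    it = iterator(expr)
    while it.more():
        body(it.next())


class Countdown:
    """Invented example of an object that implements __iterator__()."""

    def __init__(self, start):
        self.start = start

    def __iterator__(self):
        return _CountdownIterator(self.start)


class _CountdownIterator:
    def __init__(self, current):
        self.current = current

    def more(self):
        return self.current > 0

    def next(self):
        value = self.current
        self.current -= 1
        return value


letters = []
for_each("abc", letters.append)         # uses the index-based fallback
counts = []
for_each(Countdown(3), counts.append)   # uses the custom __iterator__()
```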

The built-in sequence types can use a very fast built-in iterator type
that uses a C int for the index and doesn't store the item in the
iterator.  (This should be as fast as Marc-Andre's for loop
optimization using a C counter.)

Dictionaries can define an appropriate iterator that uses
PyDict_Next().

If the argument to iterator() is itself an iterator (how to test???),
it returns the argument unchanged, so that one can also write

    for var in iterator(obj):
        ...do something with var...

Files of course should have iterators that return the next input line.

We could build filtering and mapping iterators that take an iterator
argument and do certain manipulations with the elements; this would
effectively introduce the notion of lazy evaluation on sequences.
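
As a rough illustration of such wrappers, here is a sketch for a
present-day Python 3 interpreter that uses the more()/next() pair from
this proposal.  SeqIterator, MapIterator, FilterIterator and drain()
are invented names; nothing here is an existing library.

```python
class SeqIterator:
    """Index-based iterator with the proposed more()/next() methods."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.index = 0

    def more(self):
        try:
            self.item = self.sequence[self.index]
        except IndexError:
            return False
        self.index += 1
        return True

    def next(self):
        return self.item


class MapIterator:
    """Applies a function to each element, only when next() is called."""

    def __init__(self, function, inner):
        self.function = function
        self.inner = inner

    def more(self):
        return self.inner.more()

    def next(self):
        return self.function(self.inner.next())


class FilterIterator:
    """Skips elements for which the predicate is false."""

    def __init__(self, predicate, inner):
        self.predicate = predicate
        self.inner = inner

    def more(self):
        # Advance the inner iterator until an acceptable element turns up.
        while self.inner.more():
            candidate = self.inner.next()
            if self.predicate(candidate):
                self.item = candidate
                return True
        return False

    def next(self):
        return self.item


def drain(it):
    """Collect everything an iterator produces into a list."""
    result = []
    while it.more():
        result.append(it.next())
    return result


calls = []

def square(x):
    calls.append(x)   # record when the work actually happens
    return x * x

pipeline = MapIterator(square,
                       FilterIterator(lambda x: x % 2 == 0,
                                      SeqIterator([1, 2, 3, 4, 5, 6])))
calls_before = list(calls)   # still empty: building the pipeline does no work
squares = drain(pipeline)
```

Nothing is computed until the pipeline is driven, and square() is only
ever called on the elements that survive the filter.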

Etc., etc.

This does not come close to Icon generators -- but it doesn't require
any coroutine-like capabilities, unlike those.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From tim.one at home.com  Wed Jan 31 01:55:10 2001
From: tim.one at home.com (Tim Peters)
Date: Tue, 30 Jan 2001 19:55:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3a76df10.22007715@smtp.worldonline.dk>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEEJIMAA.tim.one@home.com>

[Finn Bock]
> Changing the type of a type? Yuck!

No, it temporarily changes the type of the single list being sorted, like
so, where "self" is a pointer to a PyListObject (which is a list, not a list
*type* object):

	self->ob_type = &immutable_list_type;
	err = samplesortslice(self->ob_item,
			      self->ob_item + self->ob_size,
			      compare);
	self->ob_type = &PyList_Type;

immutable_list_type is "just like" PyList_Type, except that the slots for
mutating methods point to a function that raises a TypeError.
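
The same trick can be imitated in pure Python, as a loose analogy only.
This sketch assumes a present-day Python 3 interpreter, where __class__
may be reassigned between compatible user-defined subclasses;
FreezableList, _FrozenList, _refuse() and sort_frozen() are invented
names, and the real code is the C fragment above.

```python
class FreezableList(list):
    """List whose class is swapped for the duration of a sort (illustrative)."""

    def sort_frozen(self, key=None):
        original = self.__class__
        self.__class__ = _FrozenList        # mutators now raise TypeError
        try:
            result = sorted(self, key=key)  # the key function may run user code
        finally:
            self.__class__ = original       # always thaw, even on error
        self[:] = result


def _refuse(self, *args, **kwargs):
    raise TypeError("list is frozen while it is being sorted")


class _FrozenList(FreezableList):
    # "Just like" FreezableList, except that the mutating methods all
    # point at a function that raises TypeError.
    append = extend = insert = pop = remove = reverse = sort = clear = _refuse
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _refuse


data = FreezableList([3, 1, 2])

def meddling_key(x):
    data.append(0)   # tries to mutate the list in the middle of the sort
    return x

try:
    data.sort_frozen(key=meddling_key)
    mutation_refused = False
except TypeError:
    mutation_refused = True

type_restored = type(data) is FreezableList
contents_after_failure = list(data)
data.sort_frozen()   # a well-behaved sort still works
```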

Before this drastic step came years of increasingly ugly hacks trying to
stop core dumps when people mutated a list during the sort.  Python's sort
is very complex, and lots of pointers are tucked away -- having the size of
the array, or its position in memory, or the set of objects it contains,
change as a side effect of doing a compare, would be difficult and expensive
to recover from -- and by "difficult" read "nobody ever managed to get it
right before this" <0.5 wink>.

> I might very likely be reading the CPython sources wrongly, but it seems
> this trick will cause an BadInternalCall if some other C extension are
> trying to modify a list while it is freezed by the type switching trick.
> I imagine this would happen if the extension called:
>
>   PyList_SetItem(myList, 0, aValue);

Well, in CPython it's not "legal" for any other thread to use the C API
while the sort is in progress, because the thread doing the sort holds the
global interpreter lock for the duration.  So this could happen "legally"
only if a comparison function called by the sort called out to a C extension
attempting to mutate the list.  In that case, fine, it *is* a bad call:
mutation is not allowed during list sorting, so they deserve whatever they
get -- and far better a "bad internal call" than a core dump.

If the immutable_list_type were used more generally, it would require more
general support (but I see Thomas already talked about that -- thanks).




From guido at digicool.com  Wed Jan 31 01:55:19 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 19:55:19 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: Your message of "Tue, 30 Jan 2001 19:24:05 EST."
             <14967.23333.57259.347222@anthem.wooz.org> 
References: <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <20010130092454.D18319@glacier.fnational.com>  
            <14967.23333.57259.347222@anthem.wooz.org> 
Message-ID: <200101310055.TAA30250@cj20424-a.reston1.va.home.com>

> I'd be +1 on someone wrestling PEP 220 from Gordon's icy claws,
> renaming it just "Generators" and filling it out for the 2.2 time
> frame.  If we want to address coroutines and continuations later, we
> can write separate PEPs for them.

I think it's better not to re-use PEP 220 for that.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From thomas at xs4all.net  Wed Jan 31 01:58:32 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 01:58:32 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>; from guido@digicool.com on Tue, Jan 30, 2001 at 07:28:44PM -0500
References: <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com> <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <20010131015832.K962@xs4all.nl>

On Tue, Jan 30, 2001 at 07:28:44PM -0500, Guido van Rossum wrote:

> > What about just the

> >     for key:value in dict
> >     for index:value in sequence

> > extensions?

> I'm not against these -- I'd say +0.5.

What, fractions ? Isn't that against the whole idea of (+|-)(0|1) ? :)
But since we are voting, I'm -0 on this right now, and might end up -1 or
+0, depending on the implementation; I still can't *see* this, though I
wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
into some fairly mind-boggling issues. The worst bit is 'how the f*ck
does FOR_LOOP know if something's a dict or a list'. And the
almost-as-bad bit is 'WTF to do for user classes, extension types and
almost-list/almost-dict practically-builtin types (arrays, the *dbm's,
etc.)'. After some sleep-deprived consideration I gave up and decided we
need an iteration/generator protocol first.

However, my life's been busy (or rather, my work has been) with all kinds
of small and not so small details, and I haven't been getting much sleep
in the last week or so, so I might be overlooking something very simple.
That's why I can go either way based on implementation -- it might prove
me wrong :) Until my boss is back and I stop being 'responsible' (end of
this week, start of next week) and I get a chance to get rid of about 2
months of work backlog (the time he was away) I won't have time to
champion or even contribute to such a PEP. Then again, by that time I
might be preparing for IPC9 (_if_ my boss sends me there) or even my
ApacheCon US presentation (which got accepted today, yay!)

So, if that other message was an attempt to drop the PEP on me, Guido,
the answer is the same as I tend to give to suits that show up next to my
desk wanting to discuss something important (to them) right away:
"b'gg'r 'ff" :)

I'll-save-my-answer-to-PR-officers-doing-the-same-for-when-you-do-something-
 -*really*-offensive-ly <wink> y'rs
-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From guido at digicool.com  Wed Jan 31 02:16:51 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:16:51 -0500
Subject: [Python-Dev] Let's release 2.1a2 Thursday night
Message-ID: <200101310116.UAA30386@cj20424-a.reston1.va.home.com>

Things look good for a release of 2.1a2 this week; we're aiming for
Thursday night.  I won't be in town (speaking to the press at
LinuxWorld Expo in New York) but Jeremy will handle the release
process and the other PythonLabs folks will assist him.

Tomorrow Fred will check in his weak references after making some
changes (mostly making it more Spartan :-) that I suggested in a code
review.

After that, I think we're good for the second (and last!) alpha
release; and enough has changed (e.g. nested scopes, lots of setup.py
changes, flat Makefile) to warrant going ahead now.

Now is the time for those last-minute bugfixes that you're all so
famous for!

I propose a checkin freeze for non-PythonLabs folks Wednesday midnight
US west coast time, to give Jeremy c.s. enough time to build the
release and give it a good work-out.  (An internal freeze is up to
Jeremy to declare, but should probably take Tim's sleep cycle into
account.)

--Guido van Rossum (home page: http://www.python.org/~guido/)

PS. I'll be out of reach from noon US east coast time tomorrow
(Wednesday), traveling to New York by train.  I probably won't check
my email while out there; I'll be back Friday night.



From guido at digicool.com  Wed Jan 31 02:35:25 2001
From: guido at digicool.com (Guido van Rossum)
Date: Tue, 30 Jan 2001 20:35:25 -0500
Subject: [Python-Dev] SSL socket read at EOF; SourceForge problem
In-Reply-To: Your message of "Mon, 29 Jan 2001 23:56:09 EST."
             <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com> 
References: <LNBBLJKPBEHFEDALKOLCEECJIMAA.tim.one@home.com> 
Message-ID: <200101310135.UAA30629@cj20424-a.reston1.va.home.com>

> I'm going to repeat a question that I posted about a week ago that passed
> without comment on the newsgroup. The issue is the SSL support in the socket
> module, which raises an exception when the reading socket is at EOF, rather
> than returning an empty string. I'm hesitant to call it a "bug", but I
> wouldn't have implemented it this way.  There are the names of two people
> mentioned at the top of socketmodule.c, but no contact information, so I'm
> suggesting here that it be changed to conform to normal file/socket
> practice. (SSL was actually added at 2.0, so I'm late to the party with
> this; mea culpa, mea culpa.  I delayed trying Python2 because of the
> extension rebuilding.)

I agree that it makes more sense if a read at EOF returns an empty
string, since that's what other file-like objects in Python do.  I
can't do much about this right now, but I'd love to see a patch.  It
could go into 2.1a2 if small enough.

Note that input() and raw_input() are specifically excepted because
they are intended for use in interactive mode by newbies mostly; and
because "" as return value for EOF would be ambiguous for these.

--Guido van Rossum (home page: http://www.python.org/~guido/)



From greg at cosc.canterbury.ac.nz  Wed Jan 31 05:12:23 2001
From: greg at cosc.canterbury.ac.nz (Greg Ewing)
Date: Wed, 31 Jan 2001 17:12:23 +1300 (NZDT)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310028.TAA30090@cj20424-a.reston1.va.home.com>
Message-ID: <200101310412.RAA03140@s454.cosc.canterbury.ac.nz>

<someone whose attribution has been lost>:

>     for index:value in sequence

-1, because we only construct dicts using that
notation, not sequences.

Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,	   | A citizen of NewZealandCorp, a	  |
Christchurch, New Zealand	   | wholly-owned subsidiary of USA Inc.  |
greg at cosc.canterbury.ac.nz	   +--------------------------------------+



From guido at digicool.com  Wed Jan 31 06:21:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 00:21:37 -0500
Subject: [Python-Dev] codecity.com
Message-ID: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>

Should I spread this word, or is this a joke?  The Python quiz
category is laughable.

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Sat, 27 Jan 2001 23:16:02 -0800
From:    "Jeff Cordova" <jeffc at codecity.com>
To:      <guido at python.org>
Subject: New, fun way to learn Python.

Hi Guido,

  I wanted to let you know about www.codecity.com After several years of
managing large software projects in Silicon Valley, I realized that I was
spending a lot of time teaching jr. programmers how to write code. So, I
created CodeCity to help me automate some of that. If you go to the site,
you'll see that I've created a category for Python. There's not much depth
to the Python content yet (the site is only a week old) but I'm expecting
the Python community to add their wisdom over a period of time. If you could
spread the word, it would be highly appreciated.

Thank you,

Jeff C.

------- End of Forwarded Message




From tim.one at home.com  Wed Jan 31 07:16:48 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:16:48 -0500
Subject: [Python-Dev] codecity.com
In-Reply-To: <200101310521.AAA31653@cj20424-a.reston1.va.home.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEFJIMAA.tim.one@home.com>

[Guido, on www.codecity.com]
> Should I spread this word, or is this a joke?  The Python quiz
> category is laughable.

While the Python section still seems to have only one question, the first
day this was announced the third choice wasn't today's:

    Python is Open Source code, so it doesn't have a creator

but:

    Martha Stewart

I liked it better before <0.9 wink>.




From moshez at zadka.site.co.il  Wed Jan 31 07:30:07 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 08:30:07 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>
References: <200101310049.TAA30197@cj20424-a.reston1.va.home.com>, <200101300206.VAA21925@cj20424-a.reston1.va.home.com> <LNBBLJKPBEHFEDALKOLCOEEIIMAA.tim.one@home.com>  
            <20010130092454.D18319@glacier.fnational.com>
Message-ID: <20010131063007.536ACA83E@darjeeling.zadka.site.co.il>

On Tue, 30 Jan 2001 19:49:24 -0500, Guido van Rossum <guido at digicool.com> wrote:

> There are different ways to do iterators.
> 
> Here is a very "tame" proposal (and definitely in the realm of 2.2),
> that doesn't require any coroutine-like tricks.  Let's propose that
> 
>     for var in expr:
> 	...do something with var...
> 
> will henceforth be translated into
> 
>     __iter = iterator(expr)
>     while __iter.more():
>         var = __iter.next()
>         ...do something with var...

I'm +1 on that...but Tim's "try to use that to write something that
will return the nodes of a binary tree" still haunts me.

Personally, though, I'd thin down the interface to

while 1:
	try:
		var = __iter.next()
	except NoMoreError:
		break # pseudo-break?

With the usual caveat that this is a lie as far as "else" is concerned
(IOW, pseudo-break gets into the else)

> Then a new built-in function iterator() is needed that creates an
> iterator object.  It should try two things:
> 
> (1) If the object implements __iterator__() (or a C API equivalent),
>     call that and be done; this way arbitrary iterators can be
>     created.
 
> (2) If the object smells like a sequence (how to test???), use an
>     iterator sort of like this:

Why not, "if the object doesn't have __iterator__, try this. If it 
won't work, we'll find out by the exception that will be thrown in
our face".

class Iterator:

	def __init__(self, seq):
		self.seq = seq
		self.index = 0

	def next(self):
		try:
			try:
				return self.seq[self.index] # <- smells like
			except IndexError:
				raise NoMoreError(self.index)
		finally:
			self.index += 1
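
A runnable sketch of this exception-based variant, for a present-day
Python 3 interpreter.  NoMoreError is defined here as an ordinary
exception class and for_each() stands in for the translated for loop;
both are assumptions made for the example.

```python
class NoMoreError(Exception):
    """Hypothetical exception signalling that the iterator is exhausted."""


class Iterator:
    def __init__(self, seq):
        self.seq = seq
        self.index = 0

    def next(self):
        try:
            try:
                return self.seq[self.index]
            except IndexError:
                raise NoMoreError(self.index)
        finally:
            # Runs whether the lookup succeeded or not.
            self.index += 1


def for_each(seq, body):
    """Stand-in for the translated 'for var in seq: body(var)'."""
    it = Iterator(seq)
    while True:
        try:
            var = it.next()
        except NoMoreError:
            break
        body(var)


seen = []
for_each([10, 20, 30], seen.append)

# An object that cannot be indexed at all is found out by the exception
# it throws on the first next() call.
try:
    Iterator(42).next()
    rejected = False
except TypeError:
    rejected = True
```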

>     (I don't necessarily mean that all those instance variables should
>     be publicly available.)

But what about your poor brother? <wink> Er....I mean, this would make
implementing "indexing" really about just getting the index from the
iterator.

> If the argument to iterator() is itself an iterator (how to test???),

No idea, and this looks problematic. I see your point -- but it's
still problematic.

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From tim.one at home.com  Wed Jan 31 07:57:26 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 01:57:26 -0500
Subject: [Python-Dev] Can't enter new Python bugs on SourceForge?
Message-ID: <LNBBLJKPBEHFEDALKOLCGEFLIMAA.tim.one@home.com>

Reported this earlier.  Still can't create a new bug.  Guido either.  Here's
the SF Support request opened on this:

http://sourceforge.net/support/
    index.php?func=detailsupport&support_id=113100&group_id=1

The good(?) news is that Python isn't the only project to report this
problem.




From tim.one at home.com  Wed Jan 31 08:50:18 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 02:50:18 -0500
Subject: [Python-Dev] FW: Python programmer needed (addition to urllib2 and HTTPS support)
Message-ID: <LNBBLJKPBEHFEDALKOLCCEFPIMAA.tim.one@home.com>

Get rich quick!

-----Original Message-----
From: python-list-admin at python.org
[mailto:python-list-admin at python.org]On Behalf Of Albert Chin-A-Young
Sent: Wednesday, January 31, 2001 2:31 AM
To: python-list at python.org
Subject: Python programmer needed (addition to urllib2 and HTTPS
support)


We're in need of a contract Python programmer for the following:
  1. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy (via urllib2.py).
     This should address bug #125217:
     http://sourceforge.net/bugs/?func=detailbug&bug_id=125217&group_id=5470
  2. Allow connecting to a host with urlopen() which requires
     BASIC HTTP authentication with a proxy that requires
     BASIC HTTP authentication (via urllib2.py).
  3. Support for non-authenticated clients to connect to a
     HTTPS server
  4. Support for a client to authenticate the HTTPS host (to
     verify that its certificate is valid)

What we might consider adding (depends on cost):
  1. Support for authenticated clients to connect to a HTTPS server.

Please note that solutions to the four items above must be rolled back
into the main Python distribution (implies the "community" and the
Python developers need to agree on the adopted solution).

--
albert chin (china at thewrittenword dot com)
--
http://mail.python.org/mailman/listinfo/python-list




From ping at lfw.org  Wed Jan 31 10:47:10 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 01:47:10 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
Message-ID: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>

On Tue, 30 Jan 2001, Guido van Rossum wrote:
> 
> Can you say "PEP time"? :-)

Okay, i have written a draft PEP that tries to combine the
"elt in dict", custom iterator, and "for k:v" issues into a
coherent proposal.  Have a look:

    http://www.lfw.org/python/pep-iterators.txt
    http://www.lfw.org/python/pep-iterators.html

Could i get a number for this please?


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From moshez at zadka.site.co.il  Wed Jan 31 11:14:49 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 12:14:49 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>

On Wed, 31 Jan 2001 01:47:10 -0800 (PST), Ka-Ping Yee <ping at lfw.org> wrote:

> Okay, i have written a draft PEP that tries to combine the
> "elt in dict", custom iterator, and "for k:v" issues into a
> coherent proposal.  Have a look:
> 
>     http://www.lfw.org/python/pep-iterators.txt
>     http://www.lfw.org/python/pep-iterators.html

Er....one problem with first reading: you forgot to mention in the
while loop description that 'else:' would be executed if the exception
is raised, so the 'break' is a 'pseudo-break'.

Basic response: I *love* the iter(), sq_iter and __iter__ parts.
I tremble at seeing the rest.
Why not add a method to dictionaries .iteritems() and do

for (k, v) in dict.iteritems():
	pass

(dict.iteritems() would return an iterator to the items)
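
A small sketch of what such a method would do, written as a free
function for a present-day Python 3 interpreter (where generator
functions are available).  The name iteritems() follows the suggestion
above; the rest is an assumption for illustration.

```python
def iteritems(mapping):
    """Lazily yield (key, value) pairs without building a list of items."""
    for key in mapping:
        yield key, mapping[key]


prices = {"spam": 3, "eggs": 5}
pairs = iteritems(prices)   # no pairs have been produced yet
total = 0
for (k, v) in pairs:
    total += v
```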

-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From MarkH at ActiveState.com  Wed Jan 31 11:34:01 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 21:34:01 +1100
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
Message-ID: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>

Hi all,
	In an attempt to solve "[ Bug #129293 ] zlib library used for binary win32
distribution can crash"
(https://sourceforge.net/bugs/?func=detailbug&group_id=5470&bug_id=129293),
Tim and I have decided that we should fix the build process of zlib.pyd on
windows.

The current process requires that the builder download _2_ zlib archives - a
binary distribution for zlib.lib, and the source archive for the headers.
We believe that slight differences between the 2 are causing the above bug.
A particular warning-light is that the current process defines ZLIB_DLL even
though we are _not_ currently using the DLL but the static lib.  Removing
this #define generates linker errors.

The new process is very simple, but may break some people's builds.  In theory
it _should_ still work for everyone, but if it fails to build, please check
your directory structure.


From ping at lfw.org  Wed Jan 31 12:00:48 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 03:00:48 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <20010131015832.K962@xs4all.nl>
Message-ID: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Thomas Wouters wrote:
> I still can't *see* this, though I
> wouldn't be myself if I hadn't tried to implement it anyway :) And I ran
> into some fairly mind-boggling issues. The worst bit is 'how the f*ck
> does FOR_LOOP know if something's a dict or a list'.

I believe the Pythonic answer to that is "see if the appropriate
method is available".

The best definition of "sequence-like" or "mapping-like" i can
come up with is:

    x is sequence-like if it provides __getitem__() but not keys()
    x is mapping-like if it provides __getitem__() and keys()

But in our case, since we need iteration, we can look for specific
methods that have to do with just what we need for iteration and
nothing else.  Thus, e.g. a mapping-like class without a values()
method is no problem if we never ask to iterate over values.
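
The two definitions above translate directly into attribute checks.  A
minimal sketch for a present-day Python 3 interpreter; the function
names and the PseudoDict class are invented for the example.

```python
def is_sequence_like(x):
    """Provides __getitem__() but not keys()."""
    return hasattr(x, "__getitem__") and not hasattr(x, "keys")


def is_mapping_like(x):
    """Provides __getitem__() and keys()."""
    return hasattr(x, "__getitem__") and hasattr(x, "keys")


class PseudoDict:
    """Invented mapping-like class with no values() method."""

    def __init__(self):
        self._data = {"a": 1}

    def __getitem__(self, key):
        return self._data[key]

    def keys(self):
        return list(self._data)


list_is_sequence = is_sequence_like([1, 2, 3])
dict_is_mapping = is_mapping_like({"a": 1})
pseudo_is_mapping = is_mapping_like(PseudoDict())
set_is_neither = not is_sequence_like({1}) and not is_mapping_like({1})
```

Note that by this test a string also counts as sequence-like, and a
set counts as neither, since it has no __getitem__().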

> And the
> almost-as-bad bit is 'WTF to do for user classes, extension types and
> almost-list/almost-dict practically-builtin types

I think it can be done; the draft PEP at

    http://www.lfw.org/python/pep-iterators.html

is a best-attempt at supporting everything just as you would expect.
Let me know if you think there are important cases it doesn't cover.

I know, the table

    mp_iteritems    __iteritems__, __iter__, items, __getitem__
    mp_iterkeys     __iterkeys__, __iter__, keys, __getitem__
    mp_itervalues   __itervalues__, __iter__, values, __getitem__
    sq_iter         __iter__, __getitem__

might look a little frightening, but it's not so bad, and i think
it's about as simple as you can make it while continuing to support
existing pseudo-lists and pseudo-dictionaries.  No instance should
ever provide __iter__ at the same time as any of the other __iter*__
methods anyway.
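
One way to read the table is as an ordered fallback: for each slot,
try the names from left to right and use the first one the object
provides.  The sketch below assumes that reading; pick_method and
PseudoDict are hypothetical names used only for illustration, not
part of the draft PEP.

```python
# Slot name -> method names to try, in the order given in the table.
FALLBACKS = {
    "mp_iteritems": ("__iteritems__", "__iter__", "items", "__getitem__"),
    "mp_iterkeys": ("__iterkeys__", "__iter__", "keys", "__getitem__"),
    "mp_itervalues": ("__itervalues__", "__iter__", "values", "__getitem__"),
    "sq_iter": ("__iter__", "__getitem__"),
}


def pick_method(obj, slot):
    """Return the first method name for `slot` that obj provides, else None."""
    for name in FALLBACKS[slot]:
        if hasattr(obj, name):
            return name
    return None


class PseudoDict:
    """An existing-style pseudo-dictionary: keys() and __getitem__ only."""

    def __init__(self, data):
        self._data = dict(data)

    def keys(self):
        return list(self._data)

    def __getitem__(self, key):
        return self._data[key]


d = PseudoDict({"spam": 1})
print(pick_method(d, "mp_iterkeys"), pick_method(d, "mp_iteritems"))
```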


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From mal at lemburg.com  Wed Jan 31 12:56:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 12:56:12 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
Message-ID: <3A77FD5C.DE8729DC@lemburg.com>

> Update of /cvsroot/python/python/dist/src/Python
> In directory usw-pr-cvs1:/tmp/cvs-serv17061/Python
> 
> Modified Files:
>         compile.c 
> Log Message:
> Enforce two illegal import statements that were outlawed in the
> reference manual but not checked: Names bound by import statements may
> not occur in global statements in the same scope. The from ... import *
> form may only occur in a module scope.
> 
> I guess these changes could break code, but the reference manual
> warned about them.

Jeremy, your code breaks all uses of "from package import submodule"
inside packages.

Try distutils for example or setup.py....

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:01:24 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:01:24 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com>  
	            <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com>
Message-ID: <3A77FE94.E5082136@lemburg.com>

Guido van Rossum wrote:
> 
> [ESR]
> > For different reasons, I'd like to be able to set a constant flag on a
> > object instance.  Simple semantics: if you try to assign to a
> > member or method, it throws an exception.
> >
> > Application?  I have a large Python program that goes to a lot of effort
> > to build elaborate context structures in core.  It would be nice to know
> > they can't be even inadvertently trashed without throwing an exception I
> > can watch for.
> 
> Yes, this is a good thing.  Easy to do on lists and dicts.  Questions:
> 
> - How to spell it?  x.freeze()?  x.readonly()?

How about .lock() and .unlock() ?
 
> - Should this be reversible?  I.e. should there be an x.unfreeze()?

Yes. These low-level locks could be used in thread programming
since the above calls are C level functions and thus thread safe
w/r to the global interpreter lock.
 
> - Should we support something like this for instances too?  Sometimes
>   it might be cool to be able to freeze changing attribute values...

Sure :)

Eric, could you write a PEP for this ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:08:15 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:08:15 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCMECAIMAA.tim.one@home.com>
Message-ID: <3A78002F.DC8F0582@lemburg.com>

Tim Peters wrote:
> 
> [MAL]
> > ...
> > What we really want is iterators for dictionaries, so why not
> > implement these instead of tweaking for-loops.
> 
> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> supposed problem with iteration order?

No, but it would solve the problem in a more elegant and
generalized way. Besides, it also allows writing code which
is thread safe, since the iterator can take special actions
to assure that the dictionary doesn't change during the
iteration phase (see the other thread about "making mutable objects
readonly").
 
> > If you are looking for speedups w/r to for-loops, applying a
> > different indexing technique in for-loops would go a lot further
> > and provide better performance not only to dictionary loops,
> > but also to other sequences.
> >
> > I have had some good experiences with a special counter object
> > (sort of like a mutable integer) which is used instead of the
> > iteration index integer in the current implementation.
> 
> Please quantify, if possible.  My belief (based on past experiments) is that
> in loops fancier than
> 
>     for i in range(n):
>         pass
> 
> the loop overhead quickly falls into the noise even now.

I don't remember the figures, but these micro-optimizations do
speed up loops by a noticeable amount. Just compare the performance
of stock Python 1.5 against my patched version.
 
> > Using an iterator object instead of the integer + __getitem__
> > call machinery would allow more flexibility for all kinds of
> > sequences or containers. ...
> 
> This is yet another abrupt change of topic, yes <0.9 wink>?  I agree a new
> iteration *protocol* could have major attractions.

Not really... the counter object is just a special case of
an iterator -- in this case iteration is over the IN.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 13:10:43 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:10:43 +0100
Subject: [Python-Dev] Re: Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCEECEIMAA.tim.one@home.com>
Message-ID: <3A7800C3.B5D3203F@lemburg.com>

Tim Peters wrote:
> 
> Note that even adding a "frozen" flag would add 4 bytes to every freezable
> object on most machines.  That's why I'd rather .freeze() replace the type
> pointer and .unfreeze() restore it.  No time or space overhead; no
> cluttering up the normal-case (i.e., unfrozen) type implementations with new
> tests.

Note that Fred's weak ref implementation also needs a flag on every
weakly referenceable object (at least last time I looked at his patches).

Why not add a flag byte or word to these objects -- then we'd have
8 or 16 choices of what to do with them ;-)
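
The type-pointer swap that Tim describes in the quoted text can be
imitated at the Python level by reassigning __class__.  This is only
a sketch of the idea, not the proposed C implementation; the class
names are hypothetical, and it relies on CPython allowing __class__
assignment between two list subclasses with the same layout.

```python
class FreezableList(list):
    """A normal list; freeze() swaps in the read-only type."""

    def freeze(self):
        self.__class__ = FrozenList


class FrozenList(FreezableList):
    """Same storage, but every mutating method raises."""

    def _readonly(self, *args, **kwargs):
        raise TypeError("list is frozen")

    append = extend = insert = pop = remove = _readonly
    sort = reverse = clear = _readonly
    __setitem__ = __delitem__ = __iadd__ = __imul__ = _readonly

    def freeze(self):
        pass  # already frozen

    def unfreeze(self):
        self.__class__ = FreezableList  # restore the original type


x = FreezableList([1, 2])
x.freeze()
try:
    x.append(3)
except TypeError as exc:
    print("refused:", exc)
x.unfreeze()
x.append(3)
```

No per-object flag is stored, and the unfrozen class carries no extra
tests, which is the property Tim is after.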

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From MarkH at ActiveState.com  Wed Jan 31 13:18:12 2001
From: MarkH at ActiveState.com (Mark Hammond)
Date: Wed, 31 Jan 2001 23:18:12 +1100
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
Message-ID: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>

MAL writes:

> > - How to spell it?  x.freeze()?  x.readonly()?
>
> How about .lock() and .unlock() ?

I'm with Greg here - lock() and unlock() imply an operation similar to
threading.Lock() - ie, exclusivity rather than immutability.

I don't have a strong opinion on the other names, but definitely prefer any
of the others over lock() for this operation.

Mark.




From mal at lemburg.com  Wed Jan 31 13:26:07 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 13:26:07 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com>
Message-ID: <3A78045F.7DB50871@lemburg.com>

Mark Hammond wrote:
> 
> MAL writes:
> 
> > > - How to spell it?  x.freeze()?  x.readonly()?
> >
> > How about .lock() and .unlock() ?
> 
> I'm with Greg here - lock() and unlock() imply an operation similar to
> threading.Lock() - ie, exclusivity rather than immutability.
> 
> I don't have a strong opinion on the other names, but definitely prefer any
> of the others over lock() for this operation.

Funny, I thought that .lock() and .unlock() could be used to
implement exactly what threading.Lock() does...

Anyway, names really don't matter much, so how about: 

.mutable([flag]) -> integer

  If called without argument, returns 1/0 depending on whether
  the object is mutable or not. When called with a flag argument,
  sets the mutable state of the object to the value indicated
  by flag and returns the previous flag state.

The semantics of this interface would be in sync with many other
state APIs in Python and C (e.g. setlocale()).

The advantage of making this a method should be clear: it allows
writing polymorphic code.
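
A minimal sketch of that interface, assuming a wrapper class rather
than a change to the built-in types (FlaggedList and its append()
guard are hypothetical, chosen only to show the calling convention):

```python
class FlaggedList:
    """List-like wrapper with a .mutable([flag]) state method."""

    def __init__(self, items=()):
        self._items = list(items)
        self._mutable = True

    def mutable(self, flag=None):
        """Query the state, or set it and return the previous state."""
        previous = self._mutable
        if flag is not None:
            self._mutable = bool(flag)
        return int(previous)

    def append(self, item):
        if not self._mutable:
            raise TypeError("object is read-only")
        self._items.append(item)


x = FlaggedList([1, 2])
old = x.mutable(0)   # make read-only, remember the old state
# ... code that must not modify x ...
x.mutable(old)       # restore whatever state was in effect before
```

Returning the previous state is what makes the save-and-restore
pattern in the last three lines possible.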

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From pedroni at inf.ethz.ch  Wed Jan 31 13:34:32 2001
From: pedroni at inf.ethz.ch (Samuele Pedroni)
Date: Wed, 31 Jan 2001 13:34:32 +0100 (MET)
Subject: [Python-Dev] weak refs and jython
Message-ID: <200101311234.NAA24584@core.inf.ethz.ch>

Hi.

I have read the weak ref PEP, maybe too late.
I don't know if portability of code using weak refs between Python and Jython
was a goal or could be one, and to what extent the actual implementation will
correspond to the PEP.

But about

    The callbacks registered with weak references must accept a single
    parameter, which will be the weak-ly referenced object itself.
    The object can be resurrected by creating some other reference to
    the object in the callback, in which case the weak reference
    generating the callback will still be cleared but no remaining
    weak references to the object will be cleared.
    
AFAIK using java weak refs (which I think is a natural choice) I see
no way (at least no worth-the-effort way) to implement this in jython.
Java weak refs cannot be resurrected.

regards, Samuele Pedroni.


PS: Mr. X  is a jython developer.




From bckfnn at worldonline.dk  Wed Jan 31 13:49:22 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 12:49:22 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com>   <20010130165204.I962@xs4all.nl>  <200101302042.PAA29301@cj20424-a.reston1.va.home.com>
Message-ID: <3a7809c0.14839067@smtp.worldonline.dk>

>> > Note that Jeremy is only raising errors for "from M import *".
>> 
>> No, he says he's also raising errors for 'import spam' if 'spam' is declared
>> global, like so:
>> 
>> def viking():
>>     global spam
>>     import spam
>
>Yeah, this was just brought to my attention at our group meeting
>today.  I'm with you on this one -- there really isn't a good reason
>why this shouldn't work.  (I wonder why that constraint was ever added
>to the reference manual; maybe I was just upset that someone would
>*do* something as ugly as that, or maybe there was a J[P]ython
>reason???.)

Previously Jython has had problems with "from .. import *" in function
scope, and still has problems when used with the python -> java
compiler:

http://sourceforge.net/bugs/?func=detailbug&bug_id=122834&group_id=12867

Using global on an import name is currently ignored by Jython because
the name assignment is done by the runtime, not the compiler.

regards,
finn



From thomas at xs4all.net  Wed Jan 31 13:59:14 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 13:59:14 +0100
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <3a7809c0.14839067@smtp.worldonline.dk>; from bckfnn@worldonline.dk on Wed, Jan 31, 2001 at 12:49:22PM +0000
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk>
Message-ID: <20010131135914.N962@xs4all.nl>

On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:

> Using global on an import name is currently ignored by Jython because
> the name assignment is done by the runtime, not the compiler.

So it's impossible to do, in Jython, something like:

def fillme():
    global me
    import me

but it is possible to do:

def fillme():
    global me
    import me as _me
    me = _me

? I have to say I don't like that; we're always claiming 'import' (and
'def' and 'class' for that matter) are 'just another way of writing
assignment'. All these special cases break that.
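
For comparison, here is a small sketch of the "import is just
assignment" view, assuming a current CPython in which both spellings
bind the module-level name.  The standard math module stands in for
the hypothetical module "me".

```python
def fillme_direct():
    global math
    import math          # binds the *global* name math
    return math


def fillme_workaround():
    global me
    import math as _me   # binds a local name first...
    me = _me             # ...then an explicit global assignment
    return me


# Both routes end up with the same module object bound globally.
print(fillme_direct() is fillme_workaround())
```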

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From bckfnn at worldonline.dk  Wed Jan 31 14:35:36 2001
From: bckfnn at worldonline.dk (Finn Bock)
Date: Wed, 31 Jan 2001 13:35:36 GMT
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: <20010131135914.N962@xs4all.nl>
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>
Message-ID: <3a780eda.16144995@smtp.worldonline.dk>

On Wed, 31 Jan 2001 13:59:14 +0100, you wrote:

>On Wed, Jan 31, 2001 at 12:49:22PM +0000, Finn Bock wrote:
>
>> Using global on an import name is currently ignored by Jython because
>> the name assignment is done by the runtime, not the compiler.
>
>So it's impossible to do, in Jython, something like:
>
>def fillme():
>    global me
>    import me
>
>but it is possible to do:
>
>def fillme():
>    global me
>    import me as _me
>    me = _me
>
>?

Yes, only the second example will make a global variable.

> I have to say I don't like that; we're always claiming 'import' (and
>'def' and 'class' for that matter) are 'just another way of writing
>assignment'. All these special cases break that.

I don't like it either; I was only reporting what Jython currently does.
The current design used by Jython does not lend itself directly towards a
solution, but I don't see anything that makes it impossible to solve.

regards,
finn



From mal at lemburg.com  Wed Jan 31 15:34:19 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:34:19 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A78226B.2E177EFE@lemburg.com>

Michael Hudson wrote:
> 
> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them.  Summarised results below;
> first a key:
> 
> src-n: this morning's CVS (with Jeremy's f_localsplus optimisation)
>         (only built this with -O3)
> src: CVS from yesterday afternoon
> src-obmalloc: CVS from yesterday afternoon with Vladimir's obmalloc
>         patch applied.  More on this later...
> Python-2.0: you can guess what this is.
> 
> All runs are compared against Python-2.0-O2:
> 
> Benchmark: src-n-O3 (rounds=10, warp=20)
>             Average round time:   49029.00 ms              -0.86%
> Benchmark: src (rounds=10, warp=20)
>             Average round time:   67141.00 ms             +35.76%
> Benchmark: src-O (rounds=10, warp=20)
>             Average round time:   50167.00 ms              +1.44%
> Benchmark: src-O2 (rounds=10, warp=20)
>             Average round time:   49641.00 ms              +0.37%
> Benchmark: src-O3 (rounds=10, warp=20)
>             Average round time:   49104.00 ms              -0.71%
> Benchmark: src-O6 (rounds=10, warp=20)
>             Average round time:   49131.00 ms              -0.66%
> Benchmark: src-obmalloc (rounds=10, warp=20)
>             Average round time:   63276.00 ms             +27.94%
> Benchmark: src-obmalloc-O (rounds=10, warp=20)
>             Average round time:   46927.00 ms              -5.11%
> Benchmark: src-obmalloc-O2 (rounds=10, warp=20)
>             Average round time:   46146.00 ms              -6.69%
> Benchmark: src-obmalloc-O3 (rounds=10, warp=20)
>             Average round time:   46456.00 ms              -6.07%
> Benchmark: src-obmalloc-O6 (rounds=10, warp=20)
>             Average round time:   46450.00 ms              -6.08%
> Benchmark: Python-2.0 (rounds=10, warp=20)
>             Average round time:   68933.00 ms             +39.38%
> Benchmark: Python-2.0-O (rounds=10, warp=20)
>             Average round time:   49542.00 ms              +0.17%
> Benchmark: Python-2.0-O3 (rounds=10, warp=20)
>             Average round time:   48262.00 ms              -2.41%
> Benchmark: Python-2.0-O6 (rounds=10, warp=20)
>             Average round time:   48273.00 ms              -2.39%
> 
> My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> enough to care about.

What compiler did you use and on which platform ?

I have had similar experiences with -On with n>3 compared to -O2
using pgcc (gcc optimized for PC processors). BTW, the Linux
kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
as CFLAGS -- perhaps Python should too on Linux ?!
 
Does anybody know about the effect of -fomit-frame-pointer ?
Would it cause problems or produce code which is not compatible
with code compiled without this flag ?

> Interestingly, adding obmalloc speeds things up.  Let's take a closer
> look:
> 
> $ python pybench.py -c src-obmalloc-O3 -s src-O3
> PYBENCH 0.7
> 
> Benchmark: src-O3 (rounds=10, warp=20)
> 
> Tests:                              per run    per oper.  diff *
> ------------------------------------------------------------------------
>           BuiltinFunctionCalls:     843.35 ms    6.61 us   +2.93%
>            BuiltinMethodLookup:     878.70 ms    1.67 us   +0.56%
>                  ConcatStrings:    1068.80 ms    7.13 us   -1.22%
>                  ConcatUnicode:    1373.70 ms    9.16 us   -1.24%
>                CreateInstances:    1433.55 ms   34.13 us   +9.06%
>        CreateStringsWithConcat:    1031.75 ms    5.16 us  +10.95%
>        CreateUnicodeWithConcat:    1277.85 ms    6.39 us   +3.14%
>                   DictCreation:    1275.80 ms    8.51 us  +44.22%
>                       ForLoops:    1415.90 ms  141.59 us   -0.64%
>                     IfThenElse:    1152.70 ms    1.71 us   -0.15%
>                    ListSlicing:     397.40 ms  113.54 us   -0.53%
>                 NestedForLoops:     789.75 ms    2.26 us   -0.37%
>           NormalClassAttribute:     935.15 ms    1.56 us   -0.41%
>        NormalInstanceAttribute:     961.15 ms    1.60 us   -0.60%
>            PythonFunctionCalls:    1079.65 ms    6.54 us   -1.00%
>              PythonMethodCalls:     908.05 ms   12.11 us   -0.88%
>                      Recursion:     838.50 ms   67.08 us   -0.00%
>                   SecondImport:     741.20 ms   29.65 us  +25.57%
>            SecondPackageImport:     744.25 ms   29.77 us  +18.66%
>          SecondSubmoduleImport:     947.05 ms   37.88 us  +25.60%
>        SimpleComplexArithmetic:    1129.40 ms    5.13 us  +114.92%
>         SimpleDictManipulation:    1048.55 ms    3.50 us   -0.00%
>          SimpleFloatArithmetic:     746.05 ms    1.36 us   -2.75%
>       SimpleIntFloatArithmetic:     823.35 ms    1.25 us   -0.37%
>        SimpleIntegerArithmetic:     823.40 ms    1.25 us   -0.37%
>         SimpleListManipulation:    1004.70 ms    3.72 us   +0.01%
>           SimpleLongArithmetic:     865.30 ms    5.24 us  +100.65%
>                     SmallLists:    1657.65 ms    6.50 us   +6.63%
>                    SmallTuples:    1143.95 ms    4.77 us   +2.90%
>          SpecialClassAttribute:     949.00 ms    1.58 us   -0.22%
>       SpecialInstanceAttribute:    1353.05 ms    2.26 us   -0.73%
>                 StringMappings:    1161.00 ms    9.21 us   +7.30%
>               StringPredicates:    1069.65 ms    3.82 us   -5.30%
>                  StringSlicing:     846.30 ms    4.84 us   +8.61%
>                      TryExcept:    1590.40 ms    1.06 us   -0.49%
>                 TryRaiseExcept:    1104.65 ms   73.64 us  +24.46%
>                   TupleSlicing:     681.10 ms    6.49 us   -3.13%
>                UnicodeMappings:    1021.70 ms   56.76 us   +0.79%
>              UnicodePredicates:    1308.45 ms    5.82 us   -4.79%
>              UnicodeProperties:    1148.45 ms    5.74 us  +13.67%
>                 UnicodeSlicing:     984.15 ms    5.62 us   -0.51%
> ------------------------------------------------------------------------
>             Average round time:   49104.00 ms              +5.70%
> 
> *) measured against: src-obmalloc-O3 (rounds=10, warp=20)
> 
> Words fail me slightly, but maybe some tuning of the memory allocation
> of longs & complex numbers would be in order?
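
The "diff" column in the quoted table appears to be a plain relative
difference against the reference build.  As a check, the two average
round times quoted above reproduce the summary line; percent_diff is
a name made up for this sketch.

```python
def percent_diff(time_ms, reference_ms):
    """Relative difference of time_ms against reference_ms, in percent."""
    return (time_ms - reference_ms) / reference_ms * 100.0


src_o3 = 49104.00            # average round time, src-O3
src_obmalloc_o3 = 46456.00   # average round time, src-obmalloc-O3

print("%+.2f%%" % percent_diff(src_o3, src_obmalloc_o3))  # +5.70%
```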

AFAIR, Vladimir's malloc implementation favours small objects.
All number objects (except longs) fall into this category.

Perhaps we should think about adding his lib to the core ?!

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 15:39:01 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 15:39:01 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <3A782385.5B544CD5@lemburg.com>

> In the interest of generating some numbers (and filling up my hard
> drive), last night I wrote a script to build lots & lots of versions
> of python (many of which turned out to be redundant - eg. -O6 didn't
> seem to do anything different to -O3 and pybench doesn't work with
> 1.5.2), and then run pybench with them. 

FYI, I've just updated the archive to also work under Python 1.5.x:

	http://www.lemburg.com/python/pybench-0.7.zip

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mwh21 at cam.ac.uk  Wed Jan 31 16:52:23 2001
From: mwh21 at cam.ac.uk (Michael Hudson)
Date: 31 Jan 2001 15:52:23 +0000
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: "M.-A. Lemburg"'s message of "Wed, 31 Jan 2001 15:34:19 +0100"
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>

"M.-A. Lemburg" <mal at lemburg.com> writes:

> > My conclusion?  Python 2.1 is slower than Python 2.0, but not by
> > enough to care about.
> 
> What compiler did you use and on which platform ?

Argh, sorry; I meant to put this in!

$ uname -a
Linux atrus.jesus.cam.ac.uk 2.2.14-1.1.0 #1 Thu Jan 6 05:12:58 EST 2000 i686 unknown
$ gcc --version
2.95.1

It's a Dell Dimension XPS D233 (a 233MHz PII) with a reasonably fast
hard drive (two year old 10G IBM 7200rpm thingy) and quite a lot of
RAM (192Mb).

[snip]
 
> AFAIR, Vladimir's malloc implementation favours small objects.
> All number objects (except longs) fall into this category.

Well, longs & complex numbers don't do any free list handling (like
floats and ints do), so I see two conclusions:

1) Don't add obmalloc to the core, but do simple free list stuff for
   longs (might be tricky) and complex numbers (this should be a
   no-brainer).
2) Integrate obmalloc - then maybe we can ditch all of that icky
   freelist stuff.
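
For readers unfamiliar with the term in option 1: a free list parks
released objects and hands them back out on the next allocation
instead of going through malloc again.  The real thing would live in
C inside the interpreter; the Python sketch below only illustrates
the idea, and FreeList, Cell, acquire and release are hypothetical
names.

```python
class Cell:
    """Stand-in for a small fixed-size object such as a complex number."""
    __slots__ = ("real", "imag")


class FreeList:
    """Recycle released objects instead of allocating new ones."""

    def __init__(self, factory, max_size=100):
        self._factory = factory
        self._free = []
        self._max_size = max_size

    def acquire(self):
        # Reuse a parked object if there is one, else allocate.
        if self._free:
            return self._free.pop()
        return self._factory()

    def release(self, obj):
        # Keep the object for later reuse, up to a fixed cap.
        if len(self._free) < self._max_size:
            self._free.append(obj)


pool = FreeList(Cell)
a = pool.acquire()
pool.release(a)
b = pool.acquire()   # the same object comes back, no new allocation
print(a is b)
```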

> Perhaps we should think about adding his lib to the core ?!

Strikes me as the better solution.  Can anyone try this on Windows?
Seeing as windows malloc reputedly sucks, maybe the differences would
be bigger.

Cheers,
M.

-- 
  Our lecture theatre has just crashed. It will currently only
  silently display an unexplained line-drawing of a large dog
  accompanied by spookily flickering lights.
     -- Dan Sheppard, ucam.chat (from Owen Dunn's summary of the year)




From barry at digicool.com  Wed Jan 31 17:42:28 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:42:28 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310142480.8204-100000@skuld.kingmanhall.org>
	<Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.16500.594486.613828@anthem.wooz.org>

>>>>> "KY" == Ka-Ping Yee <ping at lfw.org> writes:

    KY> Could i get a number for this please?

Looks like you beat Eric to PEP 234. :)

I'll update PEP 0 and let you check in your txt file.  I may want to
do an editorial pass over it.

-Barry



From barry at digicool.com  Wed Jan 31 17:50:10 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 11:50:10 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
	<20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
Message-ID: <14968.16962.830739.920771@anthem.wooz.org>

>>>>> "MZ" == Moshe Zadka <moshez at zadka.site.co.il> writes:

    MZ> Basic response: I *love* the iter(), sq_iter and __iter__
    MZ> parts.  I tremble at seeing the rest.  Why not add a method to
    MZ> dictionaries .iteritems() and do

    | for (k, v) in dict.iteritems():
    | 	pass

    MZ> (dict.iteritems() would return an iterator to the items)

Moshe, I had exactly the same reaction and exactly the same idea.  I'm
a strong -1 on introducing new syntax for this when new methods can
handle it in a much more readable way (IMO).

Another idea would be to allow the iterator() method to take an
argument:

    for key in dict.iterator()

a.k.a.

    for key in dict.iterator(KEYS)

and also

    for value in dict.iterator(VALUES)
    for key, value in dict.iterator(ITEMS)

One problem is that the constants KEYS, VALUES, and ITEMS would either
have to be defined some place, or you'd just use values like 0, 1, 2,
which is less readable perhaps than just having iteratoritems(),
iteratorkeys(), and iteratorvalues() methods.  Alternative spellings:

    itemsiter(), keysiter(), valsiter()
    itemsiterator(), keysiterator(), valuesiterator()
    iiterator(), kiterator(), viterator()
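
A rough sketch of the iterator(KIND) spelling, written as a dict
subclass so it can be tried without touching the built-in type.  The
constants are given the values 0, 1, 2 mentioned above; IterDict is
a hypothetical name.

```python
KEYS, VALUES, ITEMS = range(3)


class IterDict(dict):
    """dict with a single iterator() method selecting what to iterate."""

    def iterator(self, kind=KEYS):
        if kind == KEYS:
            return iter(self.keys())
        if kind == VALUES:
            return iter(self.values())
        if kind == ITEMS:
            return iter(self.items())
        raise ValueError("unknown iterator kind: %r" % (kind,))


d = IterDict(spam=14)
for key in d.iterator():
    print(key)
for key, value in d.iterator(ITEMS):
    print(key, value)
```

The ValueError branch shows one cost of the constant-based spelling:
a bad argument is only caught at run time, whereas a misspelled
method name fails immediately with AttributeError.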

ad-nauseum-ly y'rs,
-Barry



From skip at mojam.com  Wed Jan 31 17:11:19 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 10:11:19 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<3A77FE94.E5082136@lemburg.com>
Message-ID: <14968.14631.419491.440774@beluga.mojam.com>

What stimulated this thread about making mutable objects (temporarily)
immutable?  Can someone give me an example where this is actually useful and
can't be handled through some existing mechanism?  I'm definitely with
Fredrik on this one.  Sounds like madness to me.

I'm just guessing here, but since the most common need for immutable objects
is as dictionary keys, I can envision having to test the lock state of a list
or dict that someone wants to use as a key everywhere you would normally
call has_key:

    if l.islocked() and d.has_key(l):
       ...

If you want immutable dicts or lists in order to use them as dictionary
keys, just serialize them first:

    survey_says = {"spam": 14, "eggs": 42}
    sl = marshal.dumps(survey_says)
    dict[sl] = "spam"
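
Spelled out as a runnable sketch (with the import the snippet needs,
and "table" in place of the name dict so the built-in is not
shadowed).  One caveat, stated as an assumption rather than a
guarantee: the serialized form may depend on the order in which the
keys were inserted, so two equal dicts are not certain to produce
the same key.  A sorted tuple of items avoids that when the keys are
sortable.

```python
import marshal

survey_says = {"spam": 14, "eggs": 42}

# Serialize the dict to an immutable value and use that as the key.
sl = marshal.dumps(survey_says)
table = {}
table[sl] = "spam"

# The original dict can be recovered from the key.
print(marshal.loads(sl) == survey_says)

# Order-independent alternative: a sorted tuple of (key, value) pairs.
frozen = tuple(sorted(survey_says.items()))
table[frozen] = "eggs"
```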

Here's another pitfall I can envision.

    survey_says = {"spam": 14, "eggs": 42}
    survey_says.lock()
    dict[survey_says] = "Richard Dawson"
    survey_says.unlock()

At this point can I safely iterate over the keys in the dictionary or not?

Skip




From skip at mojam.com  Wed Jan 31 16:57:30 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 09:57:30 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
References: <20010131015832.K962@xs4all.nl>
	<Pine.LNX.4.10.10101310252370.8204-100000@skuld.kingmanhall.org>
Message-ID: <14968.13802.22823.702114@beluga.mojam.com>

    Ping>     x is sequence-like if it provides __getitem__() but not keys()

So why does this barf?

    >>> [].__getitem__
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    AttributeError: __getitem__

(Obviously, lists *do* understand __getitem__ at some level.  Why isn't it
exposed in the method table?)

Skip



From fredrik at pythonware.com  Wed Jan 31 18:19:44 2001
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 31 Jan 2001 18:19:44 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <007301c08baa$02908220$e46940d5@hagrid>

barry wrote:
> Alternative spellings:
> 
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

shouldn't that be xitems, xkeys, xvalues?

</F>




From mal at lemburg.com  Wed Jan 31 18:21:02 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:21:02 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
		<3A756FF8.B7185FA2@lemburg.com>
		<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
		<3A75B190.3FD2A883@lemburg.com>
		<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
		<20010129150247.B10191@thyrsus.com>
		<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
		<3A77FE94.E5082136@lemburg.com> <14968.14631.419491.440774@beluga.mojam.com>
Message-ID: <3A78497E.8BCF197E@lemburg.com>

Skip Montanaro wrote:
> 
> What stimulated this thread about making mutable objects (temporarily)
> immutable?  Can someone give me an example where this is actually useful and
> can't be handled through some existing mechanism?  I'm definitely with
> Fredrik on this one.  Sounds like madness to me.

This thread is an offspring of the "for something in dict:" thread.
The problem we face when iterating over mutable objects is that
the underlying objects can change. By marking them read-only we can
safely iterate over their contents.

Another advantage of being able to mark mutable objects as read-only is
that they may become usable as dictionary keys. Optimizations such
as self-reorganizing read-only dictionaries would also become
possible (e.g. attribute dictionaries which are read-only could
calculate a second hash value to make the hashing perfect).
 
> I'm just guessing here, but since the most common need for immutable objects
> is as dictionary keys, I can envision having to test the lock state of a list
> or dict that someone wants to use as a key everywhere you would normally
> call has_key:
> 
>     if l.islocked() and d.has_key(l):
>        ...
> 
> If you want immutable dicts or lists in order to use them as dictionary
> keys, just serialize them first:
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     sl = marshal.dumps(survey_says)
>     dict[sl] = "spam"

Sure and that's what .items(), .keys() and .values() do. The idea
was to avoid the extra step of creating lists or tuples first.
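
A minimal sketch of the "take a hashable snapshot first" approach discussed
here.  The helper name freeze() is hypothetical (it is not a dict method),
and sorting the items assumes the keys are mutually comparable:

```python
def freeze(d):
    """Return a hashable, order-independent snapshot of a flat dict."""
    return tuple(sorted(d.items()))


survey_says = {"spam": 14, "eggs": 42}
answers = {}
answers[freeze(survey_says)] = "spam"

# Later mutation of the original does not disturb the stored key.
survey_says["ham"] = 7
```

The cost is exactly the extra copy that a read-only flag is meant to avoid.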
 
> Here's another pitfall I can envision.
> 
>     survey_says = {"spam": 14, "eggs": 42}
>     survey_says.lock()
>     dict[survey_says] = "Richard Dawson"
>     survey_says.unlock()
>
> At this point can I safely iterate over the keys in the dictionary or not?

Tim already pointed out that we will need two different read-only
states:

	a) temporary
	b) permanent

For dictionaries to become usable as keys in another dictionary,
they'd have to be marked permanently read-only.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From jeremy at alum.mit.edu  Wed Jan 31 05:35:58 2001
From: jeremy at alum.mit.edu (Jeremy Hylton)
Date: Tue, 30 Jan 2001 23:35:58 -0500 (EST)
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
In-Reply-To: <3A77FD5C.DE8729DC@lemburg.com>
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
	<3A77FD5C.DE8729DC@lemburg.com>
Message-ID: <14967.38446.700271.122029@localhost.localdomain>

>>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:

  >> Modified Files: compile.c Log Message: Enforce two illegal import
  >> statements that were outlawed in the reference manual but not
  >> checked: Names bound by import statements may not occur in global
  >> statements in the same scope. The from ... import * form may only
  >> occur in a module scope.
  >>
  >> I guess these changes could break code, but the reference manual
  >> warned about them.

  MAL> Jeremy, your code breaks all uses of "from package import
  MAL> submodule" inside packages.

  MAL> Try distutils for example or setup.py....

Quite aside from whether the changes should be preserved, I don't see
how "from package import submodule" is affected.  I ran setup.py
without any problem; I wouldn't have been able to build Python
otherwise.  I wrote some simple test cases and didn't have any trouble
with the form you describe.

Can you provide a concrete example?  It may be that something other
than the changes mentioned above is causing your problems.

Jeremy



From barry at digicool.com  Wed Jan 31 18:20:24 2001
From: barry at digicool.com (Barry A. Warsaw)
Date: Wed, 31 Jan 2001 12:20:24 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org>
	<20010131101449.B28C5A83E@darjeeling.zadka.site.co.il>
	<14968.16962.830739.920771@anthem.wooz.org>
	<007301c08baa$02908220$e46940d5@hagrid>
Message-ID: <14968.18776.644453.903217@anthem.wooz.org>

>>>>> "FL" == Fredrik Lundh <fredrik at pythonware.com> writes:

    FL> shouldn't that be xitems, xkeys, xvalues?

Or iitems(), ikeys(), ivalues()?

Personally, I don't much care.  If we get consensus on the more
important issue of going with methods instead of new syntax, I'm sure
Guido will pick whatever method names appeal to him most.

-Barry



From ping at lfw.org  Wed Jan 31 18:14:15 2001
From: ping at lfw.org (Ka-Ping Yee)
Date: Wed, 31 Jan 2001 09:14:15 -0800 (PST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <Pine.LNX.4.10.10101310903380.8204-100000@skuld.kingmanhall.org>

On Wed, 31 Jan 2001, Skip Montanaro wrote:
> Ping> x is sequence-like if it provides __getitem__() but not keys()
> 
> So why does this barf?
> 
>     >>> [].__getitem__

I was describing how to tell if instances are sequence-like.  Before
we get to make that judgement, first we have to look at the C method
table.  So:

    x is sequence-like if it has tp_as_sequence;
        all instances have tp_as_sequence;
            an instance is sequence-like if it has __getitem__() but not keys()

    x is mapping-like if it has tp_as_mapping;
        all instances have tp_as_mapping;
            an instance is mapping-like if it has both __getitem__() and keys()

The "in" operator is implemented this way.

    x customizes "in" if it has sq_contains;
        all instances have sq_contains;
            an instance customizes "in" if it has __contains__()

If sq_contains is missing, or if an instance has no __contains__ method,
we supply the default behaviour by comparing the operand to each member
of x in turn.  This default behaviour is implemented twice: once in
PyObject_Contains, and once in instance_contains.

So i proposed this same structure for sq_iter and __iter__.

    x customizes "for ... in x" if it has sq_iter;
        all instances have sq_iter;
            an instance customizes "for ... in x" if it has __iter__()

If sq_iter is missing, or if an instance has no __iter__ method,
we supply the default behaviour by calling PyObject_GetItem on x
and incrementing the index until IndexError.
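
The fallback described above can be sketched in Python.  This only
illustrates the protocol and is not the C implementation; the names
is_sequence_like(), default_iter() and Squares are hypothetical, and the
sketch relies on generator syntax from later Python releases:

```python
def is_sequence_like(x):
    """The rule for instances: __getitem__() but no keys()."""
    return hasattr(x, "__getitem__") and not hasattr(x, "keys")


def default_iter(x):
    """Fallback iteration: index from 0 upward until IndexError."""
    index = 0
    while True:
        try:
            item = x[index]
        except IndexError:
            return
        yield item
        index += 1


class Squares:
    """An instance that provides only __getitem__()."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, i):
        if i >= self.n:
            raise IndexError(i)
        return i * i
```

With these definitions, list(default_iter(Squares(4))) walks indices 0..3
and stops at the first IndexError.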


-- ?!ng

"The only `intuitive' interface is the nipple.  After that, it's all learned."
    -- Bruce Ediger, on user interfaces




From mal at lemburg.com  Wed Jan 31 18:57:20 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 18:57:20 +0100
Subject: [Python-Dev] Re: from ... import * ([Python-checkins] CVS: python/dist/src/Python
 compile.c,2.153,2.154)
References: <E14NPXJ-0004Re-00@usw-pr-cvs1.sourceforge.net>
		<3A77FD5C.DE8729DC@lemburg.com> <14967.38446.700271.122029@localhost.localdomain>
Message-ID: <3A785200.FFB37CAD@lemburg.com>

Jeremy Hylton wrote:
> 
> >>>>> "MAL" == M -A Lemburg <mal at lemburg.com> writes:
> 
>   >> Modified Files: compile.c Log Message: Enforce two illegal import
>   >> statements that were outlawed in the reference manual but not
>   >> checked: Names bound by import statements may not occur in global
>   >> statements in the same scope. The from ... import * form may only
>   >> occur in a module scope.
>   >>
>   >> I guess these changes could break code, but the reference manual
>   >> warned about them.
> 
>   MAL> Jeremy, your code breaks all uses of "from package import
>   MAL> submodule" inside packages.
> 
>   MAL> Try distutils for example or setup.py....
> 
> Quite aside from whether the changes should be preserved, I don't see
> how "from package import submodule" is affected.  I ran setup.py
> without any problem; I wouldn't have been able to build Python
> otherwise.  I wrote some simple test cases and didn't have any trouble
> with the form you describe.

Perhaps you still had old .pyc files in your installation dir ?
 
> Can you provide a concrete example?  It may be that something other
> than the changes mentioned above is causing your problems.

The distutils code is full of imports like these (and other
code I'm running is too):

distutils/cmd.py:

    def __init__ (self, dist):
        """Create and initialize a new Command object.  Most importantly,
        invokes the 'initialize_options()' method, which is the real
        initializer and depends on the actual command being
        instantiated.
        """
        # late import because of mutual dependence between these classes
        from distutils.dist import Distribution

This is the report I got from Benjamin Collar:

> I've gotten the newest CVS tarball, but setup.py is still not
> working; this time with a different error. I will resubmit a bug on
> sourceforge if that's the proper way to handle this. Here's the error:
> 
> ./python ./setup.py build
> Traceback (most recent call last):
>   File "./setup.py", line 12, in ?
>     from distutils.core import Extension, setup
>   File "/usr/src/python/dist/src/Lib/distutils/core.py", line 20, in ?
>     from distutils.cmd import Command
>   File "/usr/src/python/dist/src/Lib/distutils/cmd.py", line 15, in ?
>     from distutils import util, dir_util, file_util, archive_util, dep_util
> SyntaxError: 'from ... import *' may only occur in a module scope
> make: *** [sharedmods] Error 1

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From skip at mojam.com  Wed Jan 31 19:33:56 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 12:33:56 -0600 (CST)
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78497E.8BCF197E@lemburg.com>
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
	<3A756FF8.B7185FA2@lemburg.com>
	<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
	<3A75B190.3FD2A883@lemburg.com>
	<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
	<20010129150247.B10191@thyrsus.com>
	<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
	<3A77FE94.E5082136@lemburg.com>
	<14968.14631.419491.440774@beluga.mojam.com>
	<3A78497E.8BCF197E@lemburg.com>
Message-ID: <14968.23188.573257.392841@beluga.mojam.com>

    MAL> This thread is an offspring of the "for something in dict:" thread.
    MAL> The problem we face when iterating over mutable objects is that the
    MAL> underlying objects can change. By marking them read-only we can
    MAL> safely iterate over their contents.

I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
(And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
this concept in a reasonable fashion to cover (most of) the other objects
that smell like dictionaries, I think you'll just be adding needless
complications for a feature that can't be used where it's really needed.

I see no problem asking for the items() of an in-memory dictionary in order
to get a predictable list to iterate over, but doing that for disk-based
mappings would be next to impossible.  So, I'm stuck iterating over
something that can change out from under me.  In the end, the programmer will
still have to handle border cases specially.  Besides, even if you *could*
lock your disk-based mapping, are you really going to do that in situations
where it's sharable (that's what databases are there for, after all)?  I
suspect you're going to keep the database mutable and work around any
resulting problems.

If you want to implement "for key in dict:", why not just have the VM call
keys() under the covers and use that list?  It would be no worse than the
situation today where you call "for key in dict.keys():", and with the same
caveats.  If you're dumb enough to do that for an on-disk mapping object,
well, you get what you asked for.
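
A small sketch of that suggestion, written as an ordinary function rather
than a VM change.  The name iter_snapshot() is hypothetical; the point is
only that the loop walks a copy of the keys, so the live dictionary may
change underneath it:

```python
def iter_snapshot(mapping):
    """Iterate over a copy of the keys taken before the loop starts."""
    return iter(list(mapping.keys()))


counts = {"spam": 14, "eggs": 42}
seen = []
for key in iter_snapshot(counts):
    seen.append(key)
    # Safe: the loop walks the snapshot, not the live dictionary.
    counts[key + "!"] = counts.pop(key)
```

Keys added during the loop are not visited, which is the same caveat as
with "for key in dict.keys():".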

Skip



From esr at thyrsus.com  Wed Jan 31 18:55:00 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:55:00 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A78045F.7DB50871@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:26:07PM +0100
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com>
Message-ID: <20010131125500.C5151@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Anyway, names really don't matter much, so how about: 
> 
> .mutable([flag]) -> integer
> 
>   If called without argument, returns 1/0 depending on whether
>   the object is mutable or not. When called with a flag argument,
>   sets the mutable state of the object to the value indicated
>   by flag and returns the previous flag state.

I'll bear this in mind if things progress to the point where a PEP is
indicated.
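
For illustration only, the quoted .mutable([flag]) signature could behave
roughly like the sketch below.  LockableDict is a hypothetical class, not
part of any proposal text, and it guards only item assignment and
deletion; methods such as update() or pop() would need the same check:

```python
class LockableDict(dict):
    """Hypothetical dict with a .mutable([flag]) method."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._mutable = 1

    def mutable(self, flag=None):
        """Without flag: return the state.  With flag: set it and
        return the previous state."""
        previous = self._mutable
        if flag is not None:
            self._mutable = 1 if flag else 0
        return previous

    def __setitem__(self, key, value):
        if not self._mutable:
            raise TypeError("dictionary is read-only")
        super().__setitem__(key, value)

    def __delitem__(self, key):
        if not self._mutable:
            raise TypeError("dictionary is read-only")
        super().__delitem__(key)
```

Returning the previous state lets a caller restore it afterwards, e.g.
old = d.mutable(0) ... d.mutable(old).
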
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>




From tim.one at home.com  Wed Jan 31 20:49:34 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 14:49:34 -0500
Subject: [Python-Dev] WARNING: Changed build process for zlib on Windows
In-Reply-To: <LCEPIIGDJPKCOIHOBJEPMEKGDAAA.MarkH@ActiveState.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCMEHBIMAA.tim.one@home.com>

[Mark Hammond]
> ...
> The new process is very simple, but may break some people's build.
> ...
> The reason this _should_ not break your build is that you
> _probably_ already have a "..\..\zlib-1.1.3" directory installed
> in the right place so the header files can be located.

Actually, it's certain to break the build for anyone who read
PCbuild\readme.txt.  But I *want* it to break:  changing the directory name
is a strong hint that they should download the zlib source code from the
same place you did (and which is now explained in PCbuild\readme.txt, and
mentioned in the 2.1a2 NEWS file).

Other than that, worked first time, and-- even better --the second time too
<wink>.




From esr at thyrsus.com  Wed Jan 31 18:53:16 2001
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 31 Jan 2001 12:53:16 -0500
Subject: [Python-Dev] Making mutable objects readonly
In-Reply-To: <3A77FE94.E5082136@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 01:01:24PM +0100
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com> <3A756FF8.B7185FA2@lemburg.com> <200101291500.KAA11569@cj20424-a.reston1.va.home.com> <3A75B190.3FD2A883@lemburg.com> <200101291922.OAA13321@cj20424-a.reston1.va.home.com> <20010129150247.B10191@thyrsus.com> <200101300217.VAA21978@cj20424-a.reston1.va.home.com> <3A77FE94.E5082136@lemburg.com>
Message-ID: <20010131125316.B5151@thyrsus.com>

M.-A. Lemburg <mal at lemburg.com>:
> Eric, could you write a PEP for this ?

Not yet.  I'm about (at Guido's suggestion) to submit a revised ternary-select
proposal.  Let's process that first.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>

"Today, we need a nation of Minutemen, citizens who are not only prepared to
take arms, but citizens who regard the preservation of freedom as the basic
purpose of their daily life and who are willing to consciously work and
sacrifice for that freedom."
	-- John F. Kennedy



From tim.one at home.com  Wed Jan 31 21:28:00 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:28:00 -0500
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHCIMAA.tim.one@home.com>

[Samuele Pedroni]
> I have read weak ref PEP, maybe too late.
> I don't know if portability of code using weak refs between
> python and jython was a goal or could be one,

CPython generally doesn't want to do anything impossible for Jython, if it
can help it.

> and up to which extent actual impl. will correspond to the PEP.

Don't care about that.

> ...
> AFAIK using java weak refs (which I think is a natural choice) I
> see no way (at least no worth-the-effort way) to implement this
> in jython.  Java weak refs cannot be resurrected.

Thanks for bringing this up!  Fred is looking into it.




From fdrake at acm.org  Wed Jan 31 21:25:51 2001
From: fdrake at acm.org (Fred L. Drake, Jr.)
Date: Wed, 31 Jan 2001 15:25:51 -0500 (EST)
Subject: [Python-Dev] weak refs and jython
In-Reply-To: <200101311234.NAA24584@core.inf.ethz.ch>
References: <200101311234.NAA24584@core.inf.ethz.ch>
Message-ID: <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>

Samuele Pedroni writes:
 > AFAIK using java weak refs (which I think is a natural choice) I see
 > no way (at least no worth-the-effort way) to implement this in jython.
 > Java weak refs cannot be resurrected.

  This is certainly annoying.
  How about this: the callback receives the weak reference object or
proxy which it was registered on as a parameter.  Since the reference
has already been cleared, there's no way to get the object back, so we
don't need to get it from Java either.
  Would that be workable?  (I'm adjusting my patch now.)
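
  A short sketch of that callback shape, using the weakref module as it
later shipped with CPython: the callback is handed the weak reference
itself, never the dead object.  The names Node and on_clear are
hypothetical, and the timing of the callback is implementation dependent
(CPython usually fires it at "del"; other implementations may wait for a
collection):

```python
import gc
import weakref


class Node:
    """Plain class; instances support weak references."""


cleared = []


def on_clear(dead_ref):
    # The callback receives the weak reference, not the referent.
    cleared.append(dead_ref)


node = Node()
ref = weakref.ref(node, on_clear)
del node
gc.collect()  # not needed on CPython, but harmless elsewhere
```

  Inside the callback, calling dead_ref() returns None, so there is
nothing to resurrect.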


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Digital Creations




From tim.one at home.com  Wed Jan 31 21:56:52 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 15:56:52 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <14968.13802.22823.702114@beluga.mojam.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>

[Ping]
> x is sequence-like if it provides __getitem__() but not keys()

[Skip]
> So why does this barf?
>
>     >>> [].__getitem__
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in ?
>     AttributeError: __getitem__
>
> (Obviously, lists *do* understand __getitem__ at some level.  Why
> isn't it exposed in the method table?)

The old type/class split:  list is a type, and types spell their "method
tables" in ways that have little in common with how classes do it.

See PyObject_GetItem in abstract.c for gory details (e.g., dicts spell their
version of getitem via ->tp_as_mapping->mp_subscript(...), while lists spell
it ->tp_as_sequence->sq_item(...); neither has any binding to the attr
"__getitem__"; instance objects fill in both the tp_as_mapping and
tp_as_sequence slots, then map both the mp_subscript and sq_item slots to
classobject.c's instance_item, which in turn looks up "__getitem__").

bet-you're-sorry-you-asked<wink>-ly y'rs  - tim




From tim.one at home.com  Wed Jan 31 22:24:53 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:24:53 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCEEHFIMAA.tim.one@home.com>

[M.-A. Lemburg]
> AFAIR, Vladimir's malloc implementation favours small objects.

It favors the memory alloc/dealloc patterns Vlad recorded while running an
instrumented Python.  Which is mostly good news.  The flip side is that it
favors the specific programs he ran, and who knows whether those are
"typical".  OTOH, vendor mallocs favor the programs *they* ran, which
probably didn't include Python at all <wink>.

> ...
> Perhaps we should think about adding his lib to the core ?!

It's patch 101104 on SF.  I pushed Vlad to push this for 2.0, but he wisely
decided it was too big a change at the time.  It's certainly too much a
change to slam into 2.1 at this late stage too.  There are many reasons to
want this (e.g., list.append() calls realloc every time today, because,
despite over-allocating, it has no idea how much storage *has* already been
allocated; any malloc has to know this info under the covers, but there's no
way for us to know that too unless we add another N bytes to every list
object to record it, or use our own malloc which *can* tell us that info).

list.append()-behavior-varies-wildly-across-platforms-today-
    when-the-list-gets-large-because-of-that-ly y'rs  - tim




From tim.one at home.com  Wed Jan 31 22:49:31 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 16:49:31 -0500
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <3A78002F.DC8F0582@lemburg.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>

[Tim]
>> Seems an unrelated topic:  would "iterators for dictionaries" solve the
>> supposed problem with iteration order?

[MAL]
> No, but it would solve the problem in a more elegant and
> generalized way.

I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
problem], but it would solve the problem ...".  Can only assume we're
switching topics within single sentences now <wink>.

> Besides, it also allows writing code which is thread safe, since
> the iterator can take special actions to assure that the dictionary
> doesn't change during the iteration phase (see the other thread
> about "making mutable objects readonly").

Sorry, but immutability has nothing to do with thread safety (the latter has
to do with "doing a right thing" in the presence of multiple threads, to
keep data structures internally consistent; raising an exception is never "a
right thing" unless the user is violating the advertised semantics, and if
mutation during iteration is such a violation, the presence or absence of
multiple threads has nothing to do with that).  IOW, perhaps, a critical
section is an area of non-exceptional serialization, not a landmine that
makes other threads *blow up* if they touch it.
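
A minimal sketch of that distinction, assuming the standard threading
module: contending threads wait their turn at the lock, and none of them
sees an exception.  The names counts, counts_lock and record() are
hypothetical:

```python
import threading

counts = {}
counts_lock = threading.Lock()


def record(words):
    for word in words:
        # Other threads wait here; nobody gets an exception.
        with counts_lock:
            counts[word] = counts.get(word, 0) + 1


threads = [
    threading.Thread(target=record, args=(["spam", "eggs"] * 1000,))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```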

> ...
> I don't remember the figures, but these micro optimizations

That's plural, but I thought you were talking specifically about the mutable
counter object.  I don't know which, but the two statements don't jibe.

> do speed up loops by a noticeable amount. Just compare the performance
> of stock Python 1.5 against my patched version.

No time now, but after 2.1 is out, sure, wrt it (not 1.5).




From tim.one at home.com  Wed Jan 31 23:10:12 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:10:12 -0500
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <m3itmv7m88.fsf@atrus.jesus.cam.ac.uk>
Message-ID: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>

[Michael Hudson]
> ...
> Can anyone try this on Windows?  Seeing as windows malloc
> reputedly sucks, maybe the differences would be bigger.

No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
Windows.  Helped significantly.  Unclear how much was simply due to
exploiting the global interpreter lock, though.  "Windows" is also a
multiheaded beast (e.g., NT has very different memory performance
characteristics than 95).




From tim.one at home.com  Wed Jan 31 23:43:59 2001
From: tim.one at home.com (Tim Peters)
Date: Wed, 31 Jan 2001 17:43:59 -0500
Subject: generators (was RE: [Python-Dev] Re: Sets: elt in dict, lst.include)
In-Reply-To: <20010130092454.D18319@glacier.fnational.com>
Message-ID: <LNBBLJKPBEHFEDALKOLCOEHIIMAA.tim.one@home.com>

[Neil Schemenauer]
> What's the chances of getting generators into 2.2?

Unknown.  IMO it has more to do with generalizing the iteration protocol
than with generators per se (a generator object that doesn't play nice with
"for" is unpleasant to use; otoh, a generator object that can't be used
divorced from "for" is frustrating too (like when comparing the fringes of
two trees efficiently, which requires interleaving two distinct traversals,
each naturally recursive on its own)).

> The implementation should not be hard.  Didn't Steven Majewski have
> something years ago?

Yes, but Guido also sketched out a nearly complete implementation within the
last year or so.

> Why do we always get sidetracked on trying to figure out how to
> do coroutines and continuations?

Sorry, I've been failing to find a good answer to that question for a decade
<0.4 wink>.  I should note, though, that Guido's current notion of
"generator" is stronger than Icon/CLU/Sather's (which are "strictly
stack-like"), and requires machinery more elaborate than StevenM (or Guido)
sketched before.

> Generators would add real power to the language and are simple
> enough that most users could benefit from them.  Also, it should be
> possible to design an interface that does not preclude the
> addition of coroutines or continuations later.

Agreed.

> I'm not volunteering to champion the cause just yet.  I just want
> to know if there is some issue I'm missing.

microthreads have an enthusiastic and possibly growing audience.  That gets
into (C) stacklessness, though, as do coroutines.  I'm afraid that once you
go beyond "simple" (Icon) generators, a whole world of other stuff gets
pulled in.  The key trick to implementing simple generators in current
Python is simply to decline decrementing the frame's refcount upon a
"suspend" (of course the full details are more involved than *just* that,
but they mostly follow *from* just that).
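
The fringe comparison mentioned above, as a sketch in the generator syntax
Python adopted later ("yield", "yield from" and itertools.zip_longest did
not exist when this was written).  Representing trees as nested lists is
an assumption made for brevity, and fringe() and same_fringe() are
hypothetical names:

```python
from itertools import zip_longest


def fringe(tree):
    """Yield the leaves of a nested-list tree, left to right."""
    for node in tree:
        if isinstance(node, list):
            yield from fringe(node)
        else:
            yield node


def same_fringe(a, b):
    """Interleave two traversals, stopping at the first difference."""
    missing = object()
    for x, y in zip_longest(fringe(a), fringe(b), fillvalue=missing):
        if x is missing or y is missing or x != y:
            return False
    return True
```

Each traversal is written recursively on its own, yet the comparison
advances both one leaf at a time and never builds either fringe in full.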

everything-is-the-enemy-of-something-ly y'rs  - tim




From skip at mojam.com  Wed Jan 31 23:27:38 2001
From: skip at mojam.com (Skip Montanaro)
Date: Wed, 31 Jan 2001 16:27:38 -0600 (CST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
In-Reply-To: <LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
References: <14968.13802.22823.702114@beluga.mojam.com>
	<LNBBLJKPBEHFEDALKOLCOEHDIMAA.tim.one@home.com>
Message-ID: <14968.37210.886842.820413@beluga.mojam.com>

>>>>> "Tim" == Tim Peters <tim.one at home.com> writes:

    >> (Obviously, lists *do* understand __getitem__ at some level.  Why
    >> isn't it exposed in the method table?)

    Tim> The old type/class split: list is a type, and types spell their
    Tim> "method tables" in ways that have little in common with how classes
    Tim> do it.

The problem that rolls around in the back of my mind from time-to-time is
that since Python doesn't currently support interfaces, checking for
specific methods seems to be the only reasonable way to determine if a
object does what you want or not.

What would break if we decided to simply add __getitem__ (and other sequence
methods) to list object's method table?  Would they foul something up or
would simply sit around quietly waiting for hasattr to notice them?

Skip




From pedroni at inf.ethz.ch  Wed Jan 31 23:29:37 2001
From: pedroni at inf.ethz.ch (Samuele Pedroni)
Date: Wed, 31 Jan 2001 23:29:37 +0100
Subject: [Python-Dev] weak refs and jython
References: <200101311234.NAA24584@core.inf.ethz.ch> <14968.29903.183882.41485@cj42289-a.reston1.va.home.com>
Message-ID: <001f01c08bd5$4c9c9900$7c5821c0@newmexico>

Hi.

[Fred L. Drake, Jr.]

>  > Java weak refs cannot be resurrected.
>
>   This is certainly annoying.
>   How about this: the callback receives the weak reference object or
> proxy which it was registered on as a parameter.  Since the reference
> has already been cleared, there's no way to get the object back, so we
> don't need to get it from Java either.
>   Would that be workable?  (I'm adjusting my patch now.)

Yes, it is workable: clearly we can implement weak refs only under java2, but
this is not (really) an issue.
We can register the refs in a java reference queue, and poll it lazily or
through a low-priority thread in order to invoke the callbacks.

-- Some remarks
I have used java weak/soft refs to implement some of the internal tables of
jython in order to avoid memory leaks, at least under java2.

I imagine that the idea behind callbacks plus resurrection was to enable the
construction of sophisticated caches.

My intuition is that these features are not present under java because they
would interfere too much with gc and carry a performance penalty.
On the other hand java offers reference queues and soft references; the latter
cover the common case of caches that should be cleared when there is little
memory left. (I have never tried them seriously, so I don't know if the actual
impl is fair, or will just wait too long before starting to discard things =>
behavior like primitive gc).

The main difference I see between the callback and queue approaches is that
with queues it is left to the user when to do the actual cleanup of his
tables/caches, and handling queues internally has a "low" overhead.
With callbacks, what happens really depends on the collection times/patterns,
and the overhead is related to call overhead and to how non-trivial the code
the user puts in the callbacks is. Clearly general performance will not be
easily predictable.
(From a theoretical viewpoint one can more or less simulate queues with
callbacks and the other way around).

Resurrection makes little sense with queues, but I can easily see that lacking
both resurrection and soft refs limits what can be done with weak-like refs.

Last thing: one of the things that is really missing in java's ref features is
that one cannot put conditions of the form "as long as A is not collected, B
should not be collected either". Clearly I'm referring to situations where one
cannot modify the class of A in order to add a field, which is quite typical in
java. This should not be a problem with python and its open/dynamic
way-of-life.

regards, Samuele Pedroni.




From mal at lemburg.com  Wed Jan 31 20:03:12 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 20:03:12 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LNBBLJKPBEHFEDALKOLCIEPHILAA.tim.one@home.com>
		<3A756FF8.B7185FA2@lemburg.com>
		<200101291500.KAA11569@cj20424-a.reston1.va.home.com>
		<3A75B190.3FD2A883@lemburg.com>
		<200101291922.OAA13321@cj20424-a.reston1.va.home.com>
		<20010129150247.B10191@thyrsus.com>
		<200101300217.VAA21978@cj20424-a.reston1.va.home.com>
		<3A77FE94.E5082136@lemburg.com>
		<14968.14631.419491.440774@beluga.mojam.com>
		<3A78497E.8BCF197E@lemburg.com> <14968.23188.573257.392841@beluga.mojam.com>
Message-ID: <3A786170.CD65B8A4@lemburg.com>

Skip Montanaro wrote:
> 
>     MAL> This thread is an offspring of the "for something in dict:" thread.
>     MAL> The problem we face when iterating over mutable objects is that the
>     MAL> underlying objects can change. By marking them read-only we can
>     MAL> safely iterate over their contents.
> 
> I suspect you'll find it difficult to mark dbm/bsddb/gdbm files read-only.
> (And what about Andy Dustman's cool sqldict stuff?)  If you can't extend
> this concept in a reasonable fashion to cover (most of) the other objects
> that smell like dictionaries, I think you'll just be adding needless
> complications for a feature that can't be used where it's really needed.

We are currently only talking about Python dictionaries here, even though
other objects could also benefit from this.
 
> I see no problem asking for the items() of an in-memory dictionary in order
> to get a predictable list to iterate over, but doing that for disk-based
> mappings would be next to impossible.  So, I'm stuck iterating over
> something that can change out from under me.  In the end, the programmer will
> still have to handle border cases specially.  Besides, even if you *could*
> lock your disk-based mapping, are you really going to do that in situations
> where it's sharable (that's what databases are there for, after all)?  I
> suspect you're going to keep the database mutable and work around any
> resulting problems.
> 
> If you want to implement "for key in dict:", why not just have the VM call
> keys() under the covers and use that list?  It would be no worse than the
> situation today where you call "for key in dict.keys():", and with the same
> caveats.  If you're dumb enough to do that for an on-disk mapping object,
> well, you get what you asked for.

That's why iterators do a much better job here. In DB design
these are usually called cursors, which allow moving around inside
large result sets. But this really is a different topic...
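
To make the two options under discussion concrete, here is a small sketch of how they behave in present-day Python 3 (not in the interpreters available when this was written): iterating over a snapshot of the keys tolerates mutation, while iterating over the dictionary itself detects a size change and reports it.

```python
d = {"a": 1, "b": 2, "c": 3}

# Snapshot style: list(d) copies the keys first, so the dict can be
# changed freely while looping.
for key in list(d):
    if d[key] % 2:
        del d[key]

# Direct style: the dict iterator notices that the size changed and
# raises RuntimeError on the next step.
e = {"a": 1, "b": 2}
try:
    for key in e:
        e[key + "_copy"] = e[key]
except RuntimeError as exc:
    error = str(exc)
else:
    error = None

print(d, error)
```

The snapshot costs memory proportional to the number of keys, which is exactly the objection raised above for large or disk-based mappings.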

Readonlyness could be put to good use in optimizing data
structures which you know won't change anymore.
Temporary readonlyness has the nice side effect of allowing low-level
lock implementations, and it makes thread-safe code easier
to write, because you can make assertions about the immutability
of an object during a certain period of time explicit in your
code.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 21:36:54 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 21:36:54 +0100
Subject: [Python-Dev] Making mutable objects readonly
References: <LCEPIIGDJPKCOIHOBJEPEEKJDAAA.MarkH@ActiveState.com> <3A78045F.7DB50871@lemburg.com> <20010131125500.C5151@thyrsus.com>
Message-ID: <3A787766.35453597@lemburg.com>

"Eric S. Raymond" wrote:
> 
> M.-A. Lemburg <mal at lemburg.com>:
> > Anyway, names really don't matter much, so how about:
> >
> > .mutable([flag]) -> integer
> >
> >   If called without argument, returns 1/0 depending on whether
> >   the object is mutable or not. When called with a flag argument,
> >   sets the mutable state of the object to the value indicated
> >   by flag and returns the previous flag state.
> 
> I'll bear this in mind if things progress to the point where a PEP is
> indicated.

Great :)
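
A minimal sketch of how the quoted `.mutable([flag])` protocol could look on a dict subclass. The class name `FreezableDict` is hypothetical, and raising `TypeError` on a write to a read-only object is an assumption of this sketch; the thread does not settle what such a write should do. Only item assignment and deletion are guarded here.

```python
class FreezableDict(dict):
    """Hypothetical dict subclass with the proposed .mutable([flag])."""

    _UNSET = object()       # sentinel meaning "called without argument"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._mutable = True

    def mutable(self, flag=_UNSET):
        """Return 1/0 for the current state; if flag is given, set the
        state to it and return the previous state."""
        previous = int(self._mutable)
        if flag is not FreezableDict._UNSET:
            self._mutable = bool(flag)
        return previous

    def _check(self):
        # Assumption of this sketch: writes while read-only raise.
        if not self._mutable:
            raise TypeError("dict is currently read-only")

    def __setitem__(self, key, value):
        self._check()
        super().__setitem__(key, value)

    def __delitem__(self, key):
        self._check()
        super().__delitem__(key)


d = FreezableDict(x=1)
was = d.mutable(0)          # freeze; returns the previous state
try:
    d["y"] = 2
    blocked = False
except TypeError:
    blocked = True
d.mutable(was)              # restore the earlier state
d["y"] = 2
```

Returning the previous state makes the freeze/restore pair easy to nest, much like saving and restoring a signal mask. Methods such as `update()` and `pop()` bypass the guards in this sketch and would need the same treatment.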

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From guido at digicool.com  Wed Jan 31 17:23:37 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:23:37 -0500
Subject: [Python-Dev] Re: [Python-checkins] CVS: python/dist/src/Misc NEWS,1.108,1.109
In-Reply-To: Your message of "Wed, 31 Jan 2001 13:35:36 GMT."
             <3a780eda.16144995@smtp.worldonline.dk> 
References: <E14NPZy-0004WU-00@usw-pr-cvs1.sourceforge.net> <20010130075515.X962@xs4all.nl> <200101301506.KAA25763@cj20424-a.reston1.va.home.com> <20010130165204.I962@xs4all.nl> <200101302042.PAA29301@cj20424-a.reston1.va.home.com> <3a7809c0.14839067@smtp.worldonline.dk> <20010131135914.N962@xs4all.nl>  
            <3a780eda.16144995@smtp.worldonline.dk> 
Message-ID: <200101311623.LAA01774@cj20424-a.reston1.va.home.com>

[Finn]
> >> Using global on an import name is currently ignored by Jython because
> >> the name assignment is done by the runtime, not the compiler.

[Thomas]
> >So it's impossible to do, in Jython, something like:
> >
> >def fillme():
> >    global me
> >    import me
> >
> >but it is possible to do:
> >
> >def fillme():
> >    global me
> >    import me as _me
> >    me = _me
> >
> >?

[Finn again]
> Yes, only the second example will make a global variable.
> 
> > I have to say I don't like that; we're always claiming 'import' (and
> >'def' and 'class' for that matter) are 'just another way of writing
> >assignment'. All these special cases break that.
> 
> I don't like it either, I was only reporting what Jython currently does.
> The current design used by Jython does not lend itself directly towards a
> solution, but I don't see anything that makes it impossible to solve.

Tentatively, I'd say that this should be documented as a Jython
difference and Jython should strive to fix this.  So I see no good
reason to rule it out in CPython.

That doesn't mean I like Thomas's example!  It should probably be
redesigned along the lines of

    def fillme():
        import me
        return me

    me = fillme()

to avoid needing side effects on globals.
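
The two styles can be tried side by side in CPython. In this sketch the standard-library modules `fractions` and `decimal` stand in for the placeholder module `me`; the function names are made up for the example.

```python
def fill_global():
    # Side-effect style: 'global' makes the import statement bind a
    # module-level name instead of a local one.
    global fractions
    import fractions


def load():
    # Suggested style: return the module and let the caller bind it.
    import decimal
    return decimal


fill_global()
dec = load()

print(fractions.Fraction(1, 3) + fractions.Fraction(1, 6))   # 1/2
print(dec.Decimal("1.5") * 2)                                # 3.0
```

The second form keeps the binding visible at the call site, so a reader does not have to look inside the function to learn which global names it creates.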

--Guido van Rossum (home page: http://www.python.org/~guido/)



From guido at digicool.com  Wed Jan 31 17:26:11 2001
From: guido at digicool.com (Guido van Rossum)
Date: Wed, 31 Jan 2001 11:26:11 -0500
Subject: [Python-Dev] The 2nd Korea Python Users Seminar
Message-ID: <200101311626.LAA01799@cj20424-a.reston1.va.home.com>

Wow...!

Way to go, Christian!

--Guido van Rossum (home page: http://www.python.org/~guido/)

------- Forwarded Message

Date:    Wed, 31 Jan 2001 22:46:06 +0900
From:    "Changjune Kim" <junaftnoon at yahoo.com>
To:      <guido at python.org>
Subject: The 2nd Korea Python Users Seminar

Dear Mr. Guido van Rossum,

First of all, I can't thank you more for your great contribution to the
presence of Python. It is not a mere computer programming language but a whole
culture, I think.

I am proud to tell you that we are having the 2nd Korea Python Users Seminar
which is wide open to the public. There are already more than 400 people who
registered ahead, and we expect a few more at the site. The seminar will be
held in Seoul, South Korea on Feb 2.

With the effort of Korea Python Users Group, there has been quite a boom or
phenomenon for Python among developers in Korea. Several magazines are
_competitively_ carrying regular articles about Python -- I'm one of the
authors -- and there was even an article in a _normal_ newspaper, one of the
four major newspapers in Korea, which described the sprouting of Python in
Korea and pointed out how extremely easy it is to learn. (Moreover, it's the year of
the snake in the 12 zodiac animals)

The seminar is mainly about:

Python 2.0, intro for newbies, Python coding style, ZOPE, internationalization
of Zope for Korean, GUIs such as wxPython, PyQt, Internet programming in
Python, Python with UML, Python C/API, XML with Python, and Stackless Python.

Christian Tismer is coming for SPC presentation with me, and Hostway CEO Lucas
Roh will give a talk about how they are using Python, and one of the Python
evangelists, Brian Lee, CTO of Linuxkorea will give a brief intro to Python
and Python C/API.

I'm so excited and happy to tell you this great news. If there is any message
you want to give to Korea Python Users Group and the audience, it'd be
great -- I could translate it and post it at the site for all the audience.

Thank you again for your wonderful snake.

Best regards,

June from Korea.





------- End of Forwarded Message




From moshez at zadka.site.co.il  Wed Jan 31 21:32:45 2001
From: moshez at zadka.site.co.il (Moshe Zadka)
Date: Wed, 31 Jan 2001 22:32:45 +0200 (IST)
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <007301c08baa$02908220$e46940d5@hagrid>
References: <007301c08baa$02908220$e46940d5@hagrid>, <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org><20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131203245.E813BA83E@darjeeling.zadka.site.co.il>

[Barry]
>     itemsiter(), keysiter(), valsiter()
>     itemsiterator(), keysiterator(), valuesiterator()
>     iiterator(), kiterator(), viterator()

[/F]
> shouldn't that be xitems, xkeys, xvalues?

I'm so hoping I missed a <wink> there somewhere. Please, no more
of the dreaded 'x'.

thinking-of-ripping-x-from-my-keyboard-ly y'rs, Z.
-- 
Moshe Zadka <sig at zadka.site.co.il>
This is a signature anti-virus. 
Please stop the spread of signature viruses!
Fingerprint: 4BD1 7705 EEC0 260A 7F21  4817 C7FC A636 46D0 1BD6



From thomas at xs4all.net  Wed Jan 31 22:00:33 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 22:00:33 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
In-Reply-To: <3A78226B.2E177EFE@lemburg.com>; from mal@lemburg.com on Wed, Jan 31, 2001 at 03:34:19PM +0100
References: <LNBBLJKPBEHFEDALKOLCCECHIMAA.tim.one@home.com> <m3y9vt7888.fsf@atrus.jesus.cam.ac.uk> <3A78226B.2E177EFE@lemburg.com>
Message-ID: <20010131220033.O962@xs4all.nl>

On Wed, Jan 31, 2001 at 03:34:19PM +0100, M.-A. Lemburg wrote:

> I have had similar experiences with -On with n>3 compared to -O2
> using pgcc (gcc optimized for PC processors). BTW, the Linux
> kernel uses "-Wall -Wstrict-prototypes -O3 -fomit-frame-pointer"
> as CFLAGS -- perhaps Python should too on Linux ?!

Maybe, but the Linux kernel can be quite specific in what version of gcc you
need, and knows in advance on what platform you are using it :) The
stability and actual speedup of gcc's optimization options can and does vary
across platforms. In the above example, -Wall and -Wstrict-prototypes are
just warnings, and -O3 is the same as "-O2 -finline-functions". As for
-fomit-frame-pointer....

> Does anybody know about the effect of -fomit-frame-pointer ?
> Would it cause problems or produce code which is not compatible
> with code compiled without this flag ?

The effect of -fomit-frame-pointer is that the compilation of frame-pointer
handling code is avoided. It doesn't have any effect on compatibility, since
it doesn't matter that other parts/functions/libraries do have such code,
but it does make debugging impossible (on most machines, in any case.) From
GCC's info docs:

`-fomit-frame-pointer'
     Don't keep the frame pointer in a register for functions that
     don't need one.  This avoids the instructions to save, set up and
     restore frame pointers; it also makes an extra register available
     in many functions.  *It also makes debugging impossible on some
     machines.*

     On some machines, such as the Vax, this flag has no effect, because
     the standard calling sequence automatically handles the frame
     pointer and nothing is saved by pretending it doesn't exist.  The
     machine-description macro `FRAME_POINTER_REQUIRED' controls
     whether a target machine supports this flag.  *Note Registers::.

Obviously, for the Linux kernel this is a very good thing: you don't debug
the Linux kernel like a normal program anyway (contrary to some other UNIX
kernels, I might add.) I believe -g turns off -fomit-frame-pointer itself,
but the docs for -g and -fomit-frame-pointer don't mention it.

One other thing I noted in the gcc docs is that gcc doesn't do loop
unrolling even with -O3, though I thought it would at -O2. You need to add
-funroll-loops to enable loop unrolling, and that might squeeze out some more
performance. This only works for loops with a fixed number of iterations,
though, so I'm not sure if it matters.
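
To see which of these flags a given interpreter was actually built with, the standard `sysconfig` module can be queried. This is only a sketch: the values reflect the build of whatever Python runs it, and both variables may be `None` on platforms that do not build through a Makefile (Windows, for example).

```python
import sysconfig

# Compiler flags recorded when this interpreter was built.
cflags = sysconfig.get_config_var("CFLAGS")
opt = sysconfig.get_config_var("OPT")

for name, value in (("CFLAGS", cflags), ("OPT", opt)):
    flags = value.split() if value else []
    print(name, "=", value)
    print("  optimization flags:", [f for f in flags if f.startswith("-O")])
    print("  omits frame pointer:", "-fomit-frame-pointer" in flags)
    print("  unrolls loops:", "-funroll-loops" in flags)
```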

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From thomas at xs4all.net  Wed Jan 31 20:14:58 2001
From: thomas at xs4all.net (Thomas Wouters)
Date: Wed, 31 Jan 2001 20:14:58 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include - really begs for a PEP
In-Reply-To: <14968.16962.830739.920771@anthem.wooz.org>; from barry@digicool.com on Wed, Jan 31, 2001 at 11:50:10AM -0500
References: <Pine.LNX.4.10.10101310147020.8204-100000@skuld.kingmanhall.org> <20010131101449.B28C5A83E@darjeeling.zadka.site.co.il> <14968.16962.830739.920771@anthem.wooz.org>
Message-ID: <20010131201457.I922@xs4all.nl>

[ Trimming CC: line ]

On Wed, Jan 31, 2001 at 11:50:10AM -0500, Barry A. Warsaw wrote:

> Moshe, I had exactly the same reaction and exactly the same idea.  I'm
> a strong -1 on introducing new syntax for this when new methods can
> handle it in a much more readable way (IMO).

Same here. I *might* like it if iterators were given a format string (or
tuple object, or whatever) so they knew what the iterating code expected
(so something like this:

  for x,y,z in obj

would translate into 

  iterator(obj)("(x,y,z)")

or maybe just

  iterator(obj)((None,None,None))

or maybe even just

  iterator(obj)(3) # that is, number of elements

or so) but I suspect it might be too cute (and obfuscated) for Python,
especially if it was put to use to distinguish between 'for x:y in obj' and
'for x,y in obj'.
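
A rough sketch of the last variant, `iterator(obj)(3)`, where the iterating code passes the number of loop targets. Everything here is hypothetical: the `iterator` hook, and the choice that a dictionary yields keys for one target and (key, value) pairs for two.

```python
def iterator(obj):
    """Hypothetical hook: return a factory that is told how many
    targets the 'for' statement unpacks into."""
    def make(count):
        if isinstance(obj, dict):
            if count == 1:
                return iter(obj.keys())
            if count == 2:
                return iter(obj.items())
            raise ValueError("a dict can supply 1 or 2 values per step")
        return iter(obj)    # other objects ignore the hint in this sketch
    return make


d = {"a": 1, "b": 2}
keys = list(iterator(d)(1))     # what 'for k in d' would receive
pairs = list(iterator(d)(2))    # what 'for k, v in d' would receive
print(keys, pairs)
```

The sketch also shows why this may be too cute: the same `for` loop over the same object changes meaning with the shape of its target list.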

-- 
Thomas Wouters <thomas at xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!



From sjoerd at oratrix.nl  Wed Jan 31 21:05:06 2001
From: sjoerd at oratrix.nl (Sjoerd Mullender)
Date: Wed, 31 Jan 2001 21:05:06 +0100
Subject: [Python-Dev] python setup.py fails with illegal import (+ fix)
Message-ID: <20010131200507.A106931E1AD@bireme.oratrix.nl>

With the current CVS version, running python setup.py as part of the
build process fails with a syntax error:
Traceback (most recent call last):
  File "../setup.py", line 12, in ?
    from distutils.core import Extension, setup
  File "/usr/people/sjoerd/src/python/Lib/distutils/core.py", line 20, in ?
    from distutils.cmd import Command
  File "/usr/people/sjoerd/src/python/Lib/distutils/cmd.py", line 15, in ?
    from distutils import util, dir_util, file_util, archive_util, dep_util
SyntaxError: 'from ... import *' may only occur in a module scope

The fix is to change the from ... import * that the compiler complains
about:
Index: file_util.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/distutils/file_util.py,v
retrieving revision 1.7
diff -u -c -r1.7 file_util.py
*** file_util.py 2000/09/30 17:29:35	1.7
--- file_util.py 2001/01/31 20:01:56
***************
*** 106,112 ****
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import *
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):
--- 106,112 ----
      # changing it (ie. it's not already a hard/soft link to src OR
      # (not update) and (src newer than dst).
  
!     from stat import ST_ATIME, ST_MTIME, ST_MODE, S_IMODE
      from distutils.dep_util import newer
  
      if not os.path.isfile(src):

I didn't check this in because distutils is Greg Ward's baby.
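
The restriction can be checked without touching distutils. This sketch compiles two hypothetical function bodies, one with the star import and one with the explicit names from the patch; in present-day Python 3 the star form inside a function is always rejected at compile time.

```python
# Function-level star import, as in the original file_util.py code.
bad_source = """
def copy_file():
    from stat import *
    return ST_MODE
"""

# The explicit import list used in the patch.
good_source = """
def copy_file():
    from stat import ST_ATIME, ST_MTIME, ST_MODE, S_IMODE
    return ST_MODE
"""


def compiles(source):
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


star_ok = compiles(bad_source)
explicit_ok = compiles(good_source)
print(star_ok, explicit_ok)
```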

-- Sjoerd Mullender <sjoerd.mullender at oratrix.com>



From mal at lemburg.com  Wed Jan 31 23:24:43 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:24:43 +0100
Subject: [Python-Dev] Benchmarking "fun" (was Re: Python 2.1 slower than 2.0)
References: <LNBBLJKPBEHFEDALKOLCAEHIIMAA.tim.one@home.com>
Message-ID: <3A7890AB.69B893F9@lemburg.com>

Tim Peters wrote:
> 
> [Michael Hudson]
> > ...
> > Can anyone try this on Windows?  Seeing as windows malloc
> > reputedly sucks, maybe the differences would be bigger.
> 
> No time now (pymalloc is a non-starter for 2.1).  Was tried in the past on
> Windows.  Helped significantly.  Unclear how much was simply due to
> exploiting the global interpreter lock, though.  "Windows" is also a
> multiheaded beast (e.g., NT has very different memory performance
> characteristics than 95).

We're still in alpha, no ?  

Adding pymalloc is not a big deal, since it fits nicely with the
Python malloc macros, and giving the package a spin by putting it
into a Python alpha release would surely create more confidence in
this nice piece of work. We can always take it out again before
going into the beta phase.

Or do we have a 2.1 feature freeze already ?

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/



From mal at lemburg.com  Wed Jan 31 23:15:50 2001
From: mal at lemburg.com (M.-A. Lemburg)
Date: Wed, 31 Jan 2001 23:15:50 +0100
Subject: [Python-Dev] Re: Sets: elt in dict, lst.include
References: <LNBBLJKPBEHFEDALKOLCGEHGIMAA.tim.one@home.com>
Message-ID: <3A788E96.AB823FAE@lemburg.com>

Tim Peters wrote:
> 
> [Tim]
> >> Seems an unrelated topic:  would "iterators for dictionaries" solve the
> >> supposed problem with iteration order?
> 
> [MAL]
> > No, but it would solve the problem in a more elegant and
> > generalized way.
> 
> I'm lost.  "Would [it] solve the ... problem?" "No [it wouldn't solve the
> problem], but it would solve the problem ...".  Can only assume we're
> switching topics within single sentences now <wink>.

Sorry, not my brightest day today... what I wanted to say is that
iterators would solve the problem of defining "something" in
"for something in dict" nicely. 

Since iterators can define the order in which a data structure is 
traversed, this would also do away with the second (supposed) 
problem.
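
As a small illustration of an iterator fixing the traversal order, here is a sketch written as a present-day generator function. The name `iter_sorted` is made up for the example, and sorted key order is just one possible choice of order.

```python
def iter_sorted(d):
    """Yield (key, value) pairs of d in sorted key order."""
    for key in sorted(d):
        yield key, d[key]


stock = {"pear": 3, "apple": 1, "fig": 2}
ordered = list(iter_sorted(stock))
print(ordered)
```

The dictionary itself stays unordered as far as the caller is concerned; the order is a property of the iterator chosen for the loop.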

> > Besides, it also allows writing code which is thread safe, since
> > the iterator can take special actions to assure that the dictionary
> > doesn't change during the iteration phase (see the other thread
> > about "making mutable objects readonly").
> 
> Sorry, but immutability has nothing to do with thread safety (the latter has
> to do with "doing a right thing" in the presence of multiple threads, to
> keep data structures internally consistent; raising an exception is never "a
> right thing" unless the user is violating the advertised semantics, and if
> mutation during iteration is such a violation, the presence or absence of
> multiple threads has nothing to do with that).  IOW, perhaps, a critical
> section is an area of non-exceptional serialization, not a landmine that
> makes other threads *blow up* if they touch it.

Who said that an exception is raised ? The method I posted
on the mutability thread allows querying the current state just
like you would query the availability of a resource.

> > ...
> > I don't remember the figures, but these micro optimizations
> 
> That's plural, but I thought you were talking specifically about the mutable
> counter object.  I don't know which, but the two statements don't jibe.

The counter object patch is a micro-optimization and as such will
only give you a gain of a few percent. What makes the difference
is the sum of these micro optimizations.

Here's the patch for Python 1.5 which includes the optimizations:

	http://www.lemburg.com/python/mxPython-1.5.patch.gz
 
> > do speed up loops by a noticeable amount. Just compare the performance
> > of stock Python 1.5 against my patched version.
> 
> No time now, but after 2.1 is out, sure, wrt it (not 1.5).

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company:                                        http://www.egenix.com/
Consulting:                                    http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/